Test-Driven Development with AI Assistance
You ask the AI to build a rate limiter. It generates something that looks right. You deploy it. Two days later, your API goes down because the rate limiter does not handle concurrent requests correctly — a race condition the AI never thought to test because you never told it what “working” actually means.
TDD flips this dynamic. When you write the tests first, the AI has a precise, machine-verifiable definition of success. It does not have to guess what you want. It runs the tests, sees the failures, and iterates until they pass. This is the single highest-leverage technique for getting reliable output from AI coding assistants.
What You’ll Walk Away With
Section titled “What You’ll Walk Away With”- A workflow for writing test specs first and letting the AI implement against them
- Prompts for generating comprehensive test cases from requirements
- Strategies for handling the AI’s tendency to make tests pass by weakening assertions
- A clear understanding of when TDD with AI saves time versus when it adds overhead
The TDD-AI Feedback Loop
Section titled “The TDD-AI Feedback Loop”Traditional TDD follows the red-green-refactor cycle. With AI, this cycle becomes dramatically faster because the AI handles the green and refactor phases while you focus on writing meaningful red tests.
- You write a failing test that captures the expected behavior
- The AI writes code to make the test pass
- The AI refactors the implementation while keeping tests green
- You review the implementation and write the next test
The critical insight: the test is your specification. A well-written test communicates intent far more precisely than any natural language prompt.
Writing Tests First
Section titled “Writing Tests First”Start by defining the behavior you want, not the implementation. You can write the test yourself or collaborate with the AI to generate test cases from requirements — but you must review and approve the tests before asking for implementation.
Open Agent mode. Ask Cursor to help you generate test cases, then review them before requesting implementation:
I need to build a rate limiter middleware for our Express API.Requirements:- 100 requests per minute per API key- Returns 429 with a Retry-After header when exceeded- Tracks limits in Redis- Handles concurrent requests correctly
Write the test file first at src/middleware/__tests__/rateLimiter.test.ts.Use our existing test patterns from @src/middleware/__tests__/auth.test.ts.Do NOT write the implementation yet.Review the tests. Add edge cases the AI missed. Then:
Good tests. Now implement src/middleware/rateLimiter.ts to makeall tests pass. Run the tests after implementation and fix anyfailures.Cursor will run the tests in the integrated terminal, see failures, and iterate until they pass.
Use Plan Mode to generate the test specification, then switch to Normal Mode for implementation:
I need a rate limiter middleware. Requirements:- 100 requests/minute per API key- 429 response with Retry-After header when exceeded- Redis-backed tracking- Thread-safe under concurrent requests
Write comprehensive tests at src/middleware/__tests__/rateLimiter.test.ts.Follow existing test patterns in the codebase. Do not implement yet.After reviewing and approving the tests:
Implement src/middleware/rateLimiter.ts to pass all the tests.Run npm test -- --testPathPattern=rateLimiter after each changeuntil all tests pass. Address root causes, not symptoms.Claude Code will run the tests, read the output, fix issues, and re-run automatically. This agentic loop is where TDD with AI truly shines.
Ask Codex to generate the test file first. Codex verifies its own work by running tests in its sandbox:
Write tests for a rate limiter middleware atsrc/middleware/__tests__/rateLimiter.test.ts.
Requirements:- 100 requests/minute per API key- 429 + Retry-After header when exceeded- Redis-backed- Handles concurrent requests safely
Follow existing test patterns in the codebase. Do not implement yet.After reviewing, start a new thread for implementation:
Implement src/middleware/rateLimiter.ts to pass all tests insrc/middleware/__tests__/rateLimiter.test.ts.Run the test suite and iterate until all tests pass.Using separate threads keeps context clean. The implementation thread focuses only on making tests green, without the test-generation conversation cluttering context.
Guiding the Implementation Phase
Section titled “Guiding the Implementation Phase”Once your tests are in place, the implementation phase is where the AI delivers the most value. The tests act as a continuous feedback signal that keeps the AI on track.
Reference both the test file and any relevant existing code:
Implement the rate limiter to pass @src/middleware/__tests__/rateLimiter.test.ts.Reference @src/middleware/auth.ts for our middleware patterns.Run the tests after each significant change.If a test fails, read the error carefully and fix the root cause.Use Cursor’s checkpoints to snapshot known-good states. If the AI breaks something that was passing, rewind to the checkpoint instead of debugging forward.
Give Claude a clear implementation directive with verification:
Implement rateLimiter.ts to pass all tests. After implementation:1. Run npm test -- --testPathPattern=rateLimiter2. If any test fails, read the full error output3. Fix the root cause (don't modify the tests)4. Re-run until all tests pass5. Then run the full test suite to check for regressionsIf Claude gets stuck in a loop of failing tests, use /clear and start a fresh session with a more targeted prompt that describes the specific failure.
Codex’s sandbox automatically runs verification:
Implement src/middleware/rateLimiter.ts to pass all existing tests.Follow the middleware pattern from src/middleware/auth.ts.Run the test suite after implementation. Do not modify test files.If tests fail, fix the implementation, not the tests.In the Codex App, you can watch the implementation happen in real-time and see test results as Codex iterates.
Catching the “Test Weakening” Anti-Pattern
Section titled “Catching the “Test Weakening” Anti-Pattern”The most dangerous failure mode with AI-assisted TDD is when the AI modifies your tests to make them pass instead of fixing the implementation. Watch for these signs:
- Assertions become less specific.
expect(result).toBe(429)becomesexpect(result).toBeDefined(). - Tests get deleted. The AI removes “flaky” tests instead of fixing the code.
- Mocks replace real behavior. The AI mocks out the exact thing you wanted to test.
When This Breaks
Section titled “When This Breaks”The AI generates trivial tests. If your test prompts are vague (“write tests for this function”), the AI will write tests that verify the function exists and returns something. Be specific about the behaviors you want tested. Include concrete input-output examples.
Tests are too tightly coupled to implementation. If your tests check internal implementation details (private method calls, specific data structures), the AI cannot refactor freely. Write tests against the public API and expected behaviors, not internal mechanics.
The test suite is too slow. AI-assisted TDD works best when tests run in seconds. If your full test suite takes 10 minutes, the AI will not iterate effectively. Use --testPathPattern or --grep to run only the relevant tests during development.
You are writing too many tests upfront. Start with 3-5 tests covering the core behavior. Let the implementation reveal which edge cases matter, then add more tests iteratively. Trying to write 30 tests before any implementation leads to analysis paralysis.