Skip to content

Test-Driven Development with AI Assistance

You ask the AI to build a rate limiter. It generates something that looks right. You deploy it. Two days later, your API goes down because the rate limiter does not handle concurrent requests correctly — a race condition the AI never thought to test because you never told it what “working” actually means.

TDD flips this dynamic. When you write the tests first, the AI has a precise, machine-verifiable definition of success. It does not have to guess what you want. It runs the tests, sees the failures, and iterates until they pass. This is the single highest-leverage technique for getting reliable output from AI coding assistants.

  • A workflow for writing test specs first and letting the AI implement against them
  • Prompts for generating comprehensive test cases from requirements
  • Strategies for handling the AI’s tendency to make tests pass by weakening assertions
  • A clear understanding of when TDD with AI saves time versus when it adds overhead

Traditional TDD follows the red-green-refactor cycle. With AI, this cycle becomes dramatically faster because the AI handles the green and refactor phases while you focus on writing meaningful red tests.

  1. You write a failing test that captures the expected behavior
  2. The AI writes code to make the test pass
  3. The AI refactors the implementation while keeping tests green
  4. You review the implementation and write the next test

The critical insight: the test is your specification. A well-written test communicates intent far more precisely than any natural language prompt.

Start by defining the behavior you want, not the implementation. You can write the test yourself or collaborate with the AI to generate test cases from requirements — but you must review and approve the tests before asking for implementation.

Open Agent mode. Ask Cursor to help you generate test cases, then review them before requesting implementation:

I need to build a rate limiter middleware for our Express API.
Requirements:
- 100 requests per minute per API key
- Returns 429 with a Retry-After header when exceeded
- Tracks limits in Redis
- Handles concurrent requests correctly
Write the test file first at src/middleware/__tests__/rateLimiter.test.ts.
Use our existing test patterns from @src/middleware/__tests__/auth.test.ts.
Do NOT write the implementation yet.

Review the tests. Add edge cases the AI missed. Then:

Good tests. Now implement src/middleware/rateLimiter.ts to make
all tests pass. Run the tests after implementation and fix any
failures.

Cursor will run the tests in the integrated terminal, see failures, and iterate until they pass.

Once your tests are in place, the implementation phase is where the AI delivers the most value. The tests act as a continuous feedback signal that keeps the AI on track.

Reference both the test file and any relevant existing code:

Implement the rate limiter to pass @src/middleware/__tests__/rateLimiter.test.ts.
Reference @src/middleware/auth.ts for our middleware patterns.
Run the tests after each significant change.
If a test fails, read the error carefully and fix the root cause.

Use Cursor’s checkpoints to snapshot known-good states. If the AI breaks something that was passing, rewind to the checkpoint instead of debugging forward.

Catching the “Test Weakening” Anti-Pattern

Section titled “Catching the “Test Weakening” Anti-Pattern”

The most dangerous failure mode with AI-assisted TDD is when the AI modifies your tests to make them pass instead of fixing the implementation. Watch for these signs:

  • Assertions become less specific. expect(result).toBe(429) becomes expect(result).toBeDefined().
  • Tests get deleted. The AI removes “flaky” tests instead of fixing the code.
  • Mocks replace real behavior. The AI mocks out the exact thing you wanted to test.

The AI generates trivial tests. If your test prompts are vague (“write tests for this function”), the AI will write tests that verify the function exists and returns something. Be specific about the behaviors you want tested. Include concrete input-output examples.

Tests are too tightly coupled to implementation. If your tests check internal implementation details (private method calls, specific data structures), the AI cannot refactor freely. Write tests against the public API and expected behaviors, not internal mechanics.

The test suite is too slow. AI-assisted TDD works best when tests run in seconds. If your full test suite takes 10 minutes, the AI will not iterate effectively. Use --testPathPattern or --grep to run only the relevant tests during development.

You are writing too many tests upfront. Start with 3-5 tests covering the core behavior. Let the implementation reveal which edge cases matter, then add more tests iteratively. Trying to write 30 tests before any implementation leads to analysis paralysis.