Skip to content

Test Generation & Execution

Testing is the safety net that lets you refactor fearlessly, deploy confidently, and sleep peacefully. Yet most developers treat it as an afterthought - tedious boilerplate written grudgingly after the “real” work is done. Claude Code flips this script, making test creation as natural as describing what your code should do.

Scenario: You’ve just finished implementing a complex payment processing module. It handles multiple currencies, applies discounts, validates credit cards, and integrates with three payment providers. Now you need comprehensive tests. The traditional approach? Hours of writing test cases, mocking dependencies, and catching edge cases you forgot. With Claude Code? Let’s see.

// 1. Write test (manually think of cases)
test('should apply 10% discount', () => {
// Spend time setting up mocks
// Write assertion
// Run test - it fails
// Implement feature
// Run test - hopefully passes
});
// Repeat for dozens of test cases...
// Miss edge cases you didn't think of
  1. Create testing commands Save in .claude/commands/test-driven.md:

    Follow strict TDD workflow for: $ARGUMENTS
    1. Write failing tests first
    2. Run tests to confirm they fail
    3. Implement minimal code to pass
    4. Refactor while keeping tests green
    5. Add edge cases and error scenarios
    Use our testing stack:
    - Jest for unit tests
    - React Testing Library for components
    - Supertest for API endpoints
    - Playwright for E2E tests
  2. Set up test configuration

    > Create a comprehensive testing setup for our Node.js project.
    > Include Jest config, test helpers, mock utilities, and CI integration
  3. Configure IDE integration

    Terminal window
    # Enable test watching
    claude "Set up Jest in watch mode with coverage reporting"
  4. Install testing MCP servers

    Terminal window
    # For browser testing
    claude mcp add playwright
    # For API testing
    claude mcp add postman

Transform requirements directly into test cases:

> We need a user authentication system with these requirements:
> - Email/password login
> - OAuth (Google, GitHub)
> - Session management with JWT
> - Password reset via email
> - Account lockout after 5 failed attempts
> - 2FA optional
>
> Generate comprehensive tests for all these features

Claude generates:

Unit Tests

auth.service.test.js
describe('AuthService', () => {
describe('login', () => {
it('should authenticate valid credentials');
it('should reject invalid password');
it('should reject non-existent user');
it('should increment failed attempts');
it('should lock account after 5 failures');
it('should handle case-insensitive emails');
});
describe('password reset', () => {
it('should generate secure reset token');
it('should expire tokens after 1 hour');
it('should invalidate token after use');
it('should prevent enumeration attacks');
});
});

Integration Tests

auth.integration.test.js
describe('Auth API', () => {
it('POST /login returns JWT for valid user');
it('POST /login rate limits requests');
it('GET /oauth/google redirects correctly');
it('POST /reset-password sends email');
it('validates JWT in protected routes');
it('refreshes tokens correctly');
});

E2E Tests

auth.e2e.test.js
describe('Auth Flow', () => {
it('user can complete login flow');
it('user can reset forgotten password');
it('OAuth login creates new account');
it('2FA flow works correctly');
it('session persists across pages');
it('logout clears all sessions');
});

Security Tests

auth.security.test.js
describe('Security', () => {
it('prevents SQL injection in login');
it('prevents XSS in user inputs');
it('enforces password complexity');
it('prevents timing attacks');
it('masks user existence on login fail');
it('prevents session fixation');
});

Claude doesn’t just write test descriptions - it implements complete, runnable tests:

// Example: Complete test implementation
describe('AuthService', () => {
let authService;
let mockUserRepo;
let mockEmailService;
let mockTokenGenerator;
beforeEach(() => {
// Claude sets up all mocks
mockUserRepo = {
findByEmail: jest.fn(),
updateFailedAttempts: jest.fn(),
lockAccount: jest.fn()
};
mockEmailService = {
sendPasswordReset: jest.fn()
};
authService = new AuthService(
mockUserRepo,
mockEmailService,
mockTokenGenerator
);
});
describe('login', () => {
it('should authenticate valid credentials', async () => {
// Arrange
const credentials = {
email: 'user@example.com',
password: 'SecurePass123!'
};
const hashedPassword = await bcrypt.hash(credentials.password, 10);
const user = {
id: 1,
email: credentials.email,
password: hashedPassword,
failedAttempts: 0,
isLocked: false
};
mockUserRepo.findByEmail.mockResolvedValue(user);
// Act
const result = await authService.login(credentials);
// Assert
expect(result).toHaveProperty('token');
expect(result).toHaveProperty('refreshToken');
expect(result.user).not.toHaveProperty('password');
expect(mockUserRepo.updateFailedAttempts).toHaveBeenCalledWith(1, 0);
});
it('should lock account after 5 failed attempts', async () => {
// Claude implements complete test with edge cases
const user = {
id: 1,
failedAttempts: 4,
isLocked: false
};
mockUserRepo.findByEmail.mockResolvedValue(user);
await expect(authService.login({
email: 'user@example.com',
password: 'wrong'
})).rejects.toThrow('Account locked');
expect(mockUserRepo.lockAccount).toHaveBeenCalledWith(1);
expect(mockUserRepo.updateFailedAttempts).toHaveBeenCalledWith(1, 5);
});
});
});
  1. Run the failing tests

    > Run the auth tests. They should all fail since we haven't
    > implemented anything yet
  2. Implement minimal code

    > Implement just enough code to make the first test pass.
    > Don't add any functionality not required by the tests
  3. Iterate until green

    > Continue implementing until all tests pass. Show me the
    > test output after each change
  4. Refactor with confidence

    > Now that tests pass, refactor the implementation for better
    > structure and performance. Keep running tests to ensure nothing breaks

Beyond specific examples, test properties that should always hold:

> Generate property-based tests for our sorting algorithm.
> It should maintain these properties regardless of input:
> - Output length equals input length
> - Output contains same elements as input
> - Each element is <= the next element
> - Sorting twice gives same result

Claude creates:

import fc from 'fast-check';
describe('Sorting Algorithm Properties', () => {
it('maintains array length', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
const sorted = sortArray([...arr]);
expect(sorted.length).toBe(arr.length);
})
);
});
it('preserves all elements', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
const sorted = sortArray([...arr]);
const sortedCopy = [...sorted].sort((a, b) => a - b);
const originalSorted = [...arr].sort((a, b) => a - b);
expect(sortedCopy).toEqual(originalSorted);
})
);
});
it('is idempotent', () => {
fc.assert(
fc.property(fc.array(fc.integer()), (arr) => {
const once = sortArray([...arr]);
const twice = sortArray([...once]);
expect(twice).toEqual(once);
})
);
});
});

Ensure your tests actually catch bugs:

> Use mutation testing to verify our test suite quality.
> Introduce small bugs and confirm tests catch them

Claude implements:

// Original function
function calculateDiscount(price, discountPercent) {
if (discountPercent < 0 || discountPercent > 100) {
throw new Error('Invalid discount');
}
return price * (1 - discountPercent / 100);
}
// Claude generates mutations and verifies tests catch them
const mutations = [
{
name: 'Boundary mutation',
change: 'discountPercent > 100',
to: 'discountPercent >= 100'
},
{
name: 'Operation mutation',
change: '1 - discountPercent',
to: '1 + discountPercent'
},
{
name: 'Constant mutation',
change: '/ 100',
to: '/ 1000'
}
];
// Verifies each mutation causes test failures
mutations.forEach(mutation => {
console.log(`Testing mutation: ${mutation.name}`);
// Apply mutation and run tests
// Ensure at least one test fails
});

For microservices, ensure API contracts are maintained:

> Generate contract tests between our Order Service and Payment Service.
> Order Service expects payment responses in a specific format
// Consumer (Order Service) contract test
describe('Payment Service Contract', () => {
it('should return expected payment response structure', async () => {
const payment = await paymentService.processPayment({
amount: 100,
currency: 'USD',
method: 'credit_card'
});
// Contract expectations
expect(payment).toMatchObject({
id: expect.any(String),
status: expect.stringMatching(/^(pending|completed|failed)$/),
amount: expect.any(Number),
currency: expect.any(String),
processedAt: expect.any(String),
provider: expect.objectContaining({
name: expect.any(String),
transactionId: expect.any(String)
})
});
});
});
// Provider (Payment Service) contract test
describe('Payment Service Provider Contract', () => {
it('should fulfill Order Service expectations', async () => {
const response = await request(app)
.post('/payments')
.send({ amount: 100, currency: 'USD', method: 'credit_card' });
// Verify response matches consumer expectations
expect(response.body).toMatchSnapshot('payment-response-contract');
});
});
> Write comprehensive tests for a React shopping cart component that:
> - Displays items with quantities
> - Calculates totals with tax
> - Handles quantity updates
> - Removes items
> - Shows empty state
> - Persists to localStorage

Claude generates:

import { render, screen, fireEvent, waitFor } from '@testing-library/react';
import userEvent from '@testing-library/user-event';
import { ShoppingCart } from './ShoppingCart';
describe('ShoppingCart Component', () => {
const mockItems = [
{ id: 1, name: 'Widget', price: 10, quantity: 2 },
{ id: 2, name: 'Gadget', price: 20, quantity: 1 }
];
beforeEach(() => {
localStorage.clear();
});
it('renders items with correct quantities and prices', () => {
render(<ShoppingCart items={mockItems} taxRate={0.08} />);
expect(screen.getByText('Widget')).toBeInTheDocument();
expect(screen.getByText('$10.00 × 2')).toBeInTheDocument();
expect(screen.getByText('Gadget')).toBeInTheDocument();
expect(screen.getByText('$20.00 × 1')).toBeInTheDocument();
});
it('calculates total with tax correctly', () => {
render(<ShoppingCart items={mockItems} taxRate={0.08} />);
const subtotal = 10 * 2 + 20 * 1; // $40
const tax = subtotal * 0.08; // $3.20
const total = subtotal + tax; // $43.20
expect(screen.getByText('Subtotal: $40.00')).toBeInTheDocument();
expect(screen.getByText('Tax: $3.20')).toBeInTheDocument();
expect(screen.getByText('Total: $43.20')).toBeInTheDocument();
});
it('updates quantity when changed', async () => {
const onUpdate = jest.fn();
render(
<ShoppingCart
items={mockItems}
taxRate={0.08}
onUpdateQuantity={onUpdate}
/>
);
const quantityInput = screen.getByLabelText('Quantity for Widget');
await userEvent.clear(quantityInput);
await userEvent.type(quantityInput, '5');
expect(onUpdate).toHaveBeenCalledWith(1, 5);
});
it('persists cart to localStorage', () => {
render(<ShoppingCart items={mockItems} taxRate={0.08} />);
const savedCart = JSON.parse(localStorage.getItem('cart'));
expect(savedCart).toEqual(mockItems);
});
// Claude adds accessibility tests too
it('is keyboard navigable', async () => {
render(<ShoppingCart items={mockItems} taxRate={0.08} />);
const firstButton = screen.getByLabelText('Remove Widget from cart');
firstButton.focus();
expect(document.activeElement).toBe(firstButton);
// Tab through interactive elements
await userEvent.tab();
expect(document.activeElement).toBe(
screen.getByLabelText('Quantity for Widget')
);
});
});
> Create API tests for our REST endpoints including:
> - Authentication
> - CRUD operations
> - Error handling
> - Rate limiting
> - Response validation
import request from 'supertest';
import app from '../app';
import { createTestUser, generateToken } from './helpers';
describe('User API', () => {
let authToken;
let testUser;
beforeAll(async () => {
testUser = await createTestUser();
authToken = generateToken(testUser);
});
describe('GET /api/users/:id', () => {
it('returns user data for authenticated requests', async () => {
const response = await request(app)
.get(`/api/users/${testUser.id}`)
.set('Authorization', `Bearer ${authToken}`)
.expect(200);
expect(response.body).toMatchObject({
id: testUser.id,
email: testUser.email,
name: testUser.name
});
expect(response.body).not.toHaveProperty('password');
});
it('returns 401 without authentication', async () => {
await request(app)
.get(`/api/users/${testUser.id}`)
.expect(401);
});
it('returns 404 for non-existent user', async () => {
await request(app)
.get('/api/users/99999')
.set('Authorization', `Bearer ${authToken}`)
.expect(404)
.expect(res => {
expect(res.body.error).toBe('User not found');
});
});
it('validates response schema', async () => {
const response = await request(app)
.get(`/api/users/${testUser.id}`)
.set('Authorization', `Bearer ${authToken}`)
.expect(200);
// Claude includes schema validation
expect(response.body).toMatchSchema({
type: 'object',
required: ['id', 'email', 'name', 'createdAt'],
properties: {
id: { type: 'number' },
email: { type: 'string', format: 'email' },
name: { type: 'string' },
createdAt: { type: 'string', format: 'date-time' }
}
});
});
});
describe('Rate Limiting', () => {
it('enforces rate limits', async () => {
const requests = Array(101).fill(null).map(() =>
request(app)
.get('/api/users')
.set('Authorization', `Bearer ${authToken}`)
);
const responses = await Promise.all(requests);
const rateLimited = responses.filter(r => r.status === 429);
expect(rateLimited.length).toBeGreaterThan(0);
expect(rateLimited[0].headers['x-ratelimit-limit']).toBe('100');
expect(rateLimited[0].headers['retry-after']).toBeDefined();
});
});
});
> Create load tests for our API that simulate:
> - 1000 concurrent users
> - Mixed read/write operations
> - Gradual ramp-up
> - Spike testing
> - Endurance testing over 1 hour

Claude generates:

// k6 load test script
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Rate, Trend } from 'k6/metrics';
// Custom metrics
const errorRate = new Rate('errors');
const apiLatency = new Trend('api_latency');
export const options = {
scenarios: {
// Gradual ramp-up
gradual_load: {
executor: 'ramping-vus',
startVUs: 0,
stages: [
{ duration: '5m', target: 100 },
{ duration: '10m', target: 500 },
{ duration: '10m', target: 1000 },
{ duration: '5m', target: 0 },
],
},
// Spike test
spike_test: {
executor: 'ramping-vus',
startVUs: 0,
stages: [
{ duration: '2m', target: 100 },
{ duration: '30s', target: 2000 },
{ duration: '2m', target: 100 },
{ duration: '30s', target: 0 },
],
startTime: '30m',
},
// Endurance test
endurance: {
executor: 'constant-vus',
vus: 200,
duration: '1h',
startTime: '35m',
},
},
thresholds: {
http_req_duration: ['p(95) < 500'], // 95% of requests under 500ms
errors: ['rate < 0.01'], // Error rate under 1%
},
};
export default function () {
const userId = Math.floor(Math.random() * 10000);
// Mixed operations based on realistic usage
const operation = Math.random();
if (operation < 0.7) {
// 70% read operations
const response = http.get(
`${__ENV.API_URL}/api/users/${userId}`,
{ headers: { Authorization: `Bearer ${__ENV.TOKEN}` } }
);
check(response, {
'read status is 200': (r) => r.status === 200,
'read response time < 200ms': (r) => r.timings.duration < 200,
});
errorRate.add(response.status !== 200);
apiLatency.add(response.timings.duration);
} else if (operation < 0.9) {
// 20% write operations
const payload = JSON.stringify({
name: `User ${userId}`,
email: `user${userId}@example.com`,
});
const response = http.put(
`${__ENV.API_URL}/api/users/${userId}`,
payload,
{
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${__ENV.TOKEN}`,
},
}
);
check(response, {
'write status is 200': (r) => r.status === 200,
'write response time < 500ms': (r) => r.timings.duration < 500,
});
errorRate.add(response.status !== 200);
apiLatency.add(response.timings.duration);
} else {
// 10% complex operations
const batch = http.batch([
['GET', `${__ENV.API_URL}/api/users/${userId}/orders`],
['GET', `${__ENV.API_URL}/api/users/${userId}/preferences`],
['GET', `${__ENV.API_URL}/api/users/${userId}/history`],
]);
batch.forEach(response => {
errorRate.add(response.status !== 200);
apiLatency.add(response.timings.duration);
});
}
sleep(Math.random() * 3 + 1); // Random think time 1-4s
}
export function handleSummary(data) {
return {
'summary.html': htmlReport(data),
'summary.json': JSON.stringify(data),
};
}
> Set up a comprehensive CI/CD testing pipeline that:
> - Runs on every commit
> - Parallelizes test execution
> - Generates coverage reports
> - Runs different test suites based on changes
> - Notifies on failures
> - Blocks deployment if quality gates fail

Claude creates:

.github/workflows/test-pipeline.yml
name: Comprehensive Test Pipeline
on:
pull_request:
types: [opened, synchronize]
push:
branches: [main, develop]
jobs:
# Quick tests run first
unit-tests:
runs-on: ubuntu-latest
strategy:
matrix:
node: [18, 20]
shard: [1, 2, 3, 4] # Parallel execution
steps:
- uses: actions/checkout@v3
- name: Setup Node.js
uses: actions/setup-node@v3
with:
node-version: ${{ matrix.node }}
cache: 'npm'
- name: Install dependencies
run: npm ci
- name: Run unit tests (shard ${{ matrix.shard }}/4)
run: |
npm test -- \
--shard=${{ matrix.shard }}/4 \
--coverage \
--coverageReporters=json
- name: Upload coverage
uses: actions/upload-artifact@v3
with:
name: coverage-${{ matrix.node }}-${{ matrix.shard }}
path: coverage/coverage-final.json
# Integration tests after unit tests pass
integration-tests:
needs: unit-tests
runs-on: ubuntu-latest
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: testpass
options: >-
--health-cmd pg_isready
--health-interval 10s
redis:
image: redis:7
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
steps:
- uses: actions/checkout@v3
- uses: actions/setup-node@v3
- name: Install and build
run: |
npm ci
npm run build
- name: Run integration tests
env:
DATABASE_URL: postgresql://postgres:testpass@localhost/test
REDIS_URL: redis://localhost
run: npm run test:integration
# E2E tests only on main branch
e2e-tests:
if: github.ref == 'refs/heads/main'
needs: [unit-tests, integration-tests]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Install Playwright
run: npx playwright install --with-deps
- name: Run E2E tests
run: npm run test:e2e
- name: Upload test artifacts
if: failure()
uses: actions/upload-artifact@v3
with:
name: playwright-traces
path: test-results/
# Aggregate coverage and check quality gates
quality-gates:
needs: [unit-tests, integration-tests]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Download all coverage reports
uses: actions/download-artifact@v3
with:
pattern: coverage-*
path: coverage/
- name: Merge coverage reports
run: |
npx nyc merge coverage coverage/merged.json
npx nyc report \
--reporter=text \
--reporter=html \
--reporter=json-summary \
--temp-dir=coverage
- name: Check coverage thresholds
run: |
node scripts/check-coverage.js \
--statements=80 \
--branches=75 \
--functions=80 \
--lines=80
- name: Comment PR with coverage
if: github.event_name == 'pull_request'
uses: actions/github-script@v6
with:
script: |
const coverage = require('./coverage/coverage-summary.json');
const comment = generateCoverageComment(coverage);
github.rest.issues.createComment({
...context.repo,
issue_number: context.issue.number,
body: comment
});
// Claude suggests this structure
describe('PaymentService', () => {
// Group by functionality
describe('processPayment', () => {
// Group by scenario
describe('with valid card', () => {
it('charges the correct amount');
it('returns transaction ID');
it('sends receipt email');
});
describe('with invalid card', () => {
it('throws appropriate error');
it('does not charge customer');
it('logs failed attempt');
});
});
});
> Create a test data factory system that generates:
> - Realistic user profiles
> - Valid/invalid credit cards
> - Order histories
> - Edge case data

Claude can adapt to your preferred style:

  • Jest expectations
  • Chai assertions
  • Should.js
  • Custom matchers
> Optimize our test suite. It takes 20 minutes to run.
> Make it faster without losing coverage

You’ve learned how Claude Code transforms testing from a chore into a superpower. With AI handling the boilerplate, edge cases, and framework setup, you can focus on what really matters: ensuring your code works correctly and stays that way.

Remember: Tests are not just about catching bugs - they’re about confidence. The confidence to refactor, to deploy on Friday afternoon, to hand off code to teammates. Let Claude Code help you build that confidence through comprehensive, maintainable test suites that would take days to write manually.