
API Development with AI

Your GET /posts resolver loads each post’s author with a separate query. It sailed through review and the demo, because the demo had three posts. In production a feed renders 50 posts, the resolver fires 51 queries, and the database connection pool is on fire by 9am. The AI wrote exactly what you asked for — it just didn’t know your access pattern, because the prompt never told it.
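To make the N+1 shape concrete, here is a minimal sketch that counts round trips against an in-memory stand-in for the database. Everything in it (`authorsTable`, `findAuthorById`, `findAuthorsByIds`, the `attachAuthors*` functions) is hypothetical and synchronous for brevity; a real resolver would be async and would hit your actual data layer. The posts query itself is not counted, so the naive version shows 50 author queries rather than 51.

```typescript
type Author = { id: number; name: string };
type Post = { id: number; title: string; authorId: number };

// In-memory stand-in for an authors table (assumption, not a real database).
const authorsTable: Author[] = [
  { id: 1, name: 'Ada' },
  { id: 2, name: 'Grace' },
];

// Counts simulated round trips to the database.
let queryCount = 0;

// One round trip per call: this is the N+1 shape when called inside a loop.
function findAuthorById(id: number): Author | undefined {
  queryCount += 1;
  return authorsTable.find((a) => a.id === id);
}

// One round trip for any number of ids (think `WHERE id IN (...)`).
function findAuthorsByIds(ids: number[]): Author[] {
  queryCount += 1;
  return authorsTable.filter((a) => ids.includes(a.id));
}

// Naive resolver: one author query per post.
function attachAuthorsNaive(posts: Post[]) {
  return posts.map((p) => ({ ...p, author: findAuthorById(p.authorId) }));
}

// Batched resolver: collect distinct ids, fetch once, join in memory.
function attachAuthorsBatched(posts: Post[]) {
  const ids = Array.from(new Set(posts.map((p) => p.authorId)));
  const byId = new Map<number, Author>();
  findAuthorsByIds(ids).forEach((a) => byId.set(a.id, a));
  return posts.map((p) => ({ ...p, author: byId.get(p.authorId) }));
}
```

Telling the agent which of these two shapes you expect, and how many items a typical page renders, is the access-pattern detail the prompt was missing.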

AI is genuinely fast at API work: spec-to-code, validation, error middleware, contract tests. But “fast” turns into “on fire in prod” when you let it improvise the shape of the system. The reliable workflow is spec-first: pin the contract (OpenAPI, GraphQL schema, or proto), make the AI generate against it, and let your tests — not the demo — decide when it’s done.

  • A spec-first loop where the contract drives generation, so the implementation can’t silently drift
  • Copy-paste prompts that name the stack (Express + TypeScript + Zod + Vitest, Pact for contracts, k6 for load) instead of leaving [placeholder] brackets
  • The Cursor / Claude Code / Codex variant for spec-to-code, CI contract runs, and SDK regeneration
  • The failure modes that bite AI-generated APIs: spec drift, N+1 resolvers, missing pagination cursors, auth middleware ordering

The spec-first loop has four steps:

  1. Pin the contract. Generate the OpenAPI spec, GraphQL schema, or proto file first and review it yourself before any code exists. This is the artifact everything else is checked against.

  2. Generate against the contract. Point the agent at the spec file and ask it to implement endpoints/resolvers with validation and error handling — not to invent the API as it goes.

  3. Lock behavior with tests. Generate unit, integration, and contract tests. Make the response shapes in the tests match the handlers exactly, then run them in CI.

  4. Regenerate clients. Re-run the SDK generator from the (now authoritative) spec so consumers stay in lockstep with the server.

Whatever the protocol, get the AI to produce the contract before any implementation. Be specific about the maturity and conventions you want.

The AI returns a spec you can review and version. A trimmed slice of what to expect:

paths:
  /tasks:
    get:
      summary: List tasks
      parameters:
        - { name: status, in: query, schema: { type: string, enum: [todo, in_progress, done] } }
        - { name: limit, in: query, schema: { type: integer, default: 20, maximum: 100 } }
        - { name: cursor, in: query, schema: { type: string } }
      responses:
        '200':
          description: Paginated task list
          content:
            application/json:
              schema: { $ref: '#/components/schemas/TaskList' }

For GraphQL, ask for the schema with the connection types and subscriptions spelled out; for gRPC, ask for the .proto with streaming RPCs and field masks. The discipline is the same: contract first, review, then implement.

Generate the implementation against the spec


Now point the agent at the spec and name the stack. The response shape it returns must match what your tests will assert — drift here is the number-one source of “passes locally, 500s in CI.”

A representative slice of what the agent produces:

import { Router } from 'express';
import { z } from 'zod';
import { requireAuth } from '../middleware/auth';
// Assumed location: the service module is not shown in this guide.
import { taskService } from '../services/taskService';

// Mirrors the query parameters declared in the spec for GET /tasks.
const listQuery = z.object({
  status: z.enum(['todo', 'in_progress', 'done']).optional(),
  limit: z.coerce.number().int().min(1).max(100).default(20),
  cursor: z.string().optional(),
});

const router = Router();

// requireAuth runs first and is expected to populate req.user.
router.get('/tasks', requireAuth, async (req, res, next) => {
  const parsed = listQuery.safeParse(req.query);
  if (!parsed.success) {
    // Problem-details style error body.
    return res.status(400).json({
      type: 'about:blank',
      title: 'Invalid query parameters',
      status: 400,
      errors: parsed.error.issues,
    });
  }
  try {
    const { data, nextCursor } = await taskService.list({
      ...parsed.data,
      userId: req.user.id,
    });
    res.json({ data, nextCursor }); // shape matches the spec and the tests
  } catch (err) {
    next(err); // hand off to the shared error middleware
  }
});

export default router;
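The handler above leans on `taskService.list`, which the guide does not show. As a rough sketch of the cursor contract it implies, the following in-memory version returns `{ data, nextCursor }` and uses the "fetch one extra row" trick to decide whether another page exists. The names (`Task`, `ListArgs`, `listTasks`) and the choice of the last row's id as the cursor are assumptions for illustration; a real service would push the filter and limit into SQL and would usually make the cursor opaque.

```typescript
type Task = { id: number; status: 'todo' | 'in_progress' | 'done'; userId: string };
type ListArgs = { userId: string; status?: Task['status']; limit: number; cursor?: string };
type TaskPage = { data: Task[]; nextCursor: string | null };

// Hypothetical in-memory version of a cursor-paginated list.
function listTasks(rows: Task[], args: ListArgs): TaskPage {
  // The cursor is the id of the last row the client has already seen.
  const afterId = args.cursor === undefined ? 0 : Number(args.cursor);

  const matching = rows
    .filter((t) => t.userId === args.userId)
    .filter((t) => args.status === undefined || t.status === args.status)
    .filter((t) => t.id > afterId)
    .sort((a, b) => a.id - b.id);

  // Take one row more than requested to learn whether another page exists.
  const fetched = matching.slice(0, args.limit + 1);
  const hasMore = fetched.length > args.limit;
  const data = hasMore ? fetched.slice(0, args.limit) : fetched;

  const last = data[data.length - 1];
  const nextCursor = hasMore && last ? String(last.id) : null;
  return { data, nextCursor };
}
```

A client keeps passing `nextCursor` back as `cursor` until it receives `null`, which is the behavior the integration tests should pin down.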

Schema-aware generation with an MCP server


For database-backed endpoints, the single biggest quality jump comes from giving the agent your real schema instead of making it guess. A Postgres MCP server turns “generate a tasks endpoint” from blind scaffolding into schema-accurate code with the right column names, types, and indexes.

Without it: the AI invents taskService.list() and you spend a round correcting field names against your actual tables.

With it: the agent reads the live schema, generates queries that match it, and flags the missing index behind your status filter. For TypeScript teams, the Prisma Postgres MCP is built into the Prisma CLI and also manages migrations:

# Claude Code — register the Prisma Postgres MCP (schema + migrations)
claude mcp add prisma -- npx prisma mcp

The same server registers in Cursor (Settings -> MCP) and Codex (~/.codex/config.toml). The server command is the same across all three tools; only where you register it differs. If you only need lightweight, single-purpose augmentation (say, linting the OpenAPI spec rather than holding a persistent DB connection), an Agent Skill is the lighter fit: install one from skills.sh with npx skills add <owner/repo> (the universal CLI from vercel-labs/skills), which works across Claude Code, Cursor, and Codex.

Generate the cross-cutting middleware once, and be explicit that ordering matters.
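Why ordering matters is easier to see in a framework-free sketch. The tiny `run` dispatcher below is a hypothetical stand-in for Express's middleware chain, and `requireAuth`, `validateQuery`, and `handler` are simplified illustrations rather than the real middleware. With auth first, an unauthenticated caller gets a 401 and nothing else runs; with validation first, the same caller gets a 400 that describes your query contract before proving who they are.

```typescript
type Ctx = {
  headers: Record<string, string>;
  query: Record<string, string>;
  log: string[]; // records which middleware actually ran, in order
  status?: number;
};
type Middleware = (ctx: Ctx, next: () => void) => void;

// Minimal dispatcher: each middleware decides whether to call next().
function run(chain: Middleware[], ctx: Ctx): Ctx {
  const dispatch = (i: number): void => {
    const mw = chain[i];
    if (mw) mw(ctx, () => dispatch(i + 1));
  };
  dispatch(0);
  return ctx;
}

const requireAuth: Middleware = (ctx, next) => {
  ctx.log.push('auth');
  if (!ctx.headers['authorization']) {
    ctx.status = 401; // stop the chain: nothing after this runs
    return;
  }
  next();
};

const validateQuery: Middleware = (ctx, next) => {
  ctx.log.push('validate');
  const limit = ctx.query['limit'];
  if (limit !== undefined && Number.isNaN(Number(limit))) {
    ctx.status = 400;
    return;
  }
  next();
};

const handler: Middleware = (ctx) => {
  ctx.log.push('handler');
  ctx.status = 200;
};

// Unauthenticated request with a bad query string.
const anonymous = (): Ctx => ({ headers: {}, query: { limit: 'abc' }, log: [] });

const authFirst = run([requireAuth, validateQuery, handler], anonymous());
const validateFirst = run([validateQuery, requireAuth, handler], anonymous());
// authFirst.status is 401 and only 'auth' ran;
// validateFirst.status is 400 and only 'validate' ran.
```

When you prompt for middleware, spell out the order you want (for example auth, then validation, then handlers, with the error handler registered last) instead of letting the agent pick one.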

The point of tests here is to freeze the response shape and the error contract so a later AI edit can’t quietly change them.

Use Agent mode to generate the integration suite, then run it inline. In Settings -> Cursor Settings -> Agents -> Auto-Run, allowlist npx vitest so the suite runs without prompting, and watch the diff: reject any change where the test’s asserted body diverges from the handler’s actual response. Cursor’s multi-file edit is the sweet spot for “regenerate the handler and its test together so the shapes stay in sync.”

The generated integration test must mirror the handler’s { data, nextCursor } shape:

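import { describe, it, expect } from 'vitest';
import request from 'supertest';
// Assumed paths: adjust to wherever your app and test helpers live.
import { app } from '../src/app';
import { getAuthToken } from './helpers/auth';

describe('GET /tasks', () => {
  it('returns a cursor-paginated list', async () => {
    const token = await getAuthToken();
    const res = await request(app)
      .get('/tasks?status=todo&limit=10')
      .set('Authorization', `Bearer ${token}`)
      .expect(200);
    expect(res.body.data).toBeInstanceOf(Array);
    // Exactly string or null; a bare typeof check would also accept arrays and objects.
    const { nextCursor } = res.body;
    expect(nextCursor === null || typeof nextCursor === 'string').toBe(true);
  });
});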
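If you want the frozen shapes to be reusable across tests, one option is a pair of small runtime guards. This is only a sketch: `isTaskListBody` and `isProblemBody` are hypothetical helpers, and the fields they check are taken from the handler shown earlier (`data`, `nextCursor`, and the `title` / `status` / `errors` error body).

```typescript
type TaskListBody = { data: unknown[]; nextCursor: string | null };

// True only for the success shape the handler returns: { data: [...], nextCursor: string | null }.
function isTaskListBody(body: unknown): body is TaskListBody {
  if (typeof body !== 'object' || body === null) return false;
  const b = body as Record<string, unknown>;
  const cursor = b['nextCursor'];
  return Array.isArray(b['data']) && (cursor === null || typeof cursor === 'string');
}

// True only for the 400 error shape: title, numeric status, and an errors array.
function isProblemBody(body: unknown): boolean {
  if (typeof body !== 'object' || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    typeof b['title'] === 'string' &&
    typeof b['status'] === 'number' &&
    Array.isArray(b['errors'])
  );
}
```

A later edit that renames `nextCursor` or drops it entirely fails the guard, because a missing field is `undefined` rather than `null` or a string.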

For load, generate a k6 script with explicit thresholds so a regression fails the run rather than just looking slow:

import http from 'k6/http';
import { check } from 'k6';

export const options = {
  stages: [
    { duration: '2m', target: 100 }, // ramp up to 100 virtual users
    { duration: '5m', target: 100 }, // hold
    { duration: '2m', target: 0 }, // ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<500'], // 95th percentile under 500 ms
    http_req_failed: ['rate<0.01'], // under 1% failed requests
  },
};

export default function () {
  const res = http.get(`${__ENV.BASE_URL}/tasks`, {
    headers: { Authorization: `Bearer ${__ENV.TOKEN}` },
  });
  check(res, { 'status is 200': (r) => r.status === 200 });
}

When you cut a v2, generate the version middleware and emit a deprecation signal with a clearly future sunset date.

// RFC 9745: Deprecation is an sf-date (RFC 9651), an @-prefixed Unix timestamp, not "true".
const deprecateV1 = (_req, res, next) => {
  res.setHeader('Deprecation', '@1780617600'); // 2026-06-05T00:00:00Z, the date v1 was deprecated
  res.setHeader('Sunset', 'Thu, 31 Dec 2026 23:59:59 GMT'); // RFC 8594: HTTP-date
  res.setHeader('Link', '<https://docs.example.com/migration-v2>; rel="deprecation"');
  next();
};

// Mount the deprecation middleware ahead of the v1 routes so every v1 response carries the headers.
app.use('/api/v1', deprecateV1, v1Routes);
app.use('/api/v2', v2Routes);
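Hand-typed timestamps and weekdays are easy to get wrong, so it can help to derive both header values from `Date` objects. The sketch below is one way to do that; `toSfDate`, `toHttpDate`, and `deprecationHeaders` are hypothetical helper names, and the guard reflects the expectation that a sunset date should not precede the deprecation date.

```typescript
// Structured-field Date (RFC 9651): "@" followed by whole seconds since the Unix epoch.
const toSfDate = (d: Date): string => `@${Math.floor(d.getTime() / 1000)}`;

// HTTP-date: Date.prototype.toUTCString() produces "Thu, 31 Dec 2026 23:59:59 GMT".
const toHttpDate = (d: Date): string => d.toUTCString();

// Builds the three headers from real dates so the weekday and timestamp cannot drift apart.
function deprecationHeaders(
  deprecatedAt: Date,
  sunsetAt: Date,
  migrationUrl: string,
): Record<string, string> {
  if (sunsetAt.getTime() < deprecatedAt.getTime()) {
    throw new Error('Sunset must not be earlier than Deprecation');
  }
  return {
    Deprecation: toSfDate(deprecatedAt),
    Sunset: toHttpDate(sunsetAt),
    Link: `<${migrationUrl}>; rel="deprecation"`,
  };
}

const v1Headers = deprecationHeaders(
  new Date('2026-06-05T00:00:00Z'),
  new Date('2026-12-31T23:59:59Z'),
  'https://docs.example.com/migration-v2',
);
// v1Headers.Deprecation === '@1780617600'
// v1Headers.Sunset === 'Thu, 31 Dec 2026 23:59:59 GMT'
```

In the middleware you would loop over the returned record and call `res.setHeader` for each entry.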

After any spec change, regenerate the clients so consumers move with you:

npx @openapitools/openapi-generator-cli generate -i openapi.yaml -g typescript-axios -o ./sdk/typescript