Database Migration Patterns
You shipped an innocent-looking ALTER TABLE products ALTER COLUMN price TYPE numeric at 2pm. Postgres took an ACCESS EXCLUSIVE lock to rewrite a 40-million-row table, every checkout query queued behind it, and the storefront started returning 500s for nine minutes while your PM watched the error graph climb. The migration was “correct” — it just wasn’t safe.
The fix isn’t to write SQL more carefully by hand. It’s to let an AI agent generate the migration, then make it review its own work against the exact hazards that cause incidents: long locks, unbatched backfills, and indexes built without CONCURRENTLY. This recipe shows the workflow across Cursor, Claude Code, and Codex.
What You’ll Walk Away With
Section titled “What You’ll Walk Away With”- A reusable expand-contract prompt that turns a risky one-shot
ALTERinto three safe, numbered migrations plus the dual-write application code - A migration safety-review prompt that flags
ACCESS EXCLUSIVElocks, missingCONCURRENTLY, and unbatchedUPDATEs before you run anything - A read-only Postgres MCP server wired in so the agent reads your live schema instead of guessing column types
- A rollback recipe and a “When This Breaks” runbook for lock contention, half-finished backfills, and partial migration state
The Expand-Contract Workflow
Section titled “The Expand-Contract Workflow”The single most useful pattern for zero-downtime change is expand-contract (a.k.a. parallel change): never mutate a column in place. Add the new shape, dual-write to both, backfill in batches, cut reads over, then drop the old shape in a later deploy. Each step is independently deployable and reversible.
The change-the-column-type incident above becomes three migrations instead of one:
-- 001_expand_add_price_cents.sql (deploy 1: add nullable column, no rewrite, no lock)ALTER TABLE products ADD COLUMN price_cents integer;
-- 002_backfill_price_cents.sql (deploy 1+: run OUTSIDE a txn, in batches)-- Repeat until 0 rows are updated. Each statement is a short, separate transaction.UPDATE productsSET price_cents = round(price * 100)WHERE price_cents IS NULL AND id IN ( SELECT id FROM products WHERE price_cents IS NULL ORDER BY id LIMIT 1000 );
-- 003_contract_drop_price.sql (deploy 3, days later: only after dual-write is live)ALTER TABLE products ALTER COLUMN price_cents SET NOT NULL;ALTER TABLE products DROP COLUMN price;Between deploy 1 and deploy 3 the application dual-writes and reads the new column with a fallback, so old and new code versions coexist safely:
// Dual-write while both columns exist (transition phase)async function setProductPrice(productId, priceCents) { await db.query( `UPDATE products SET price_cents = $1, price = $1 / 100.0 WHERE id = $2`, [priceCents, productId] );}
// Read the new column, fall back to the old one until backfill completesasync function getProductPriceCents(productId) { const { rows } = await db.query( 'SELECT price, price_cents FROM products WHERE id = $1', [productId] ); return rows[0].price_cents ?? Math.round(rows[0].price * 100);}You do not have to author this from memory. Hand the agent your real schema and let it produce the whole sequence.
Let the Agent See the Real Schema (Postgres MCP)
Section titled “Let the Agent See the Real Schema (Postgres MCP)”The prompt above works, but you had to paste the schema by hand — and if you get a column type wrong, the agent generates a migration against a table that doesn’t exist. A Postgres MCP server closes that gap: it gives the agent a connection so it can inspect live table definitions, row counts, and existing indexes before writing a single line. Use a maintained server — Postgres MCP Pro (crystaldba/postgres-mcp) is the actively supported choice, and its --access-mode=restricted flag enforces read-only access at the server level. (Avoid the old @modelcontextprotocol/server-postgres package: it’s deprecated and archived, and carries a known SQL-injection issue that lets a stacked statement escape its read-only transaction.)
Without MCP: you copy \d products output into the prompt and hope it’s current.
With MCP: the agent runs the equivalent of SELECT … FROM information_schema.columns itself, sees that price is numeric(10,2) and the table has 40M rows, and tailors the batch size and lock strategy accordingly.
Postgres MCP Pro is a Python tool — install it with pipx install postgres-mcp (or uv pip install postgres-mcp). Setup is then identical across all three tools, since they all speak MCP. Add the server to your MCP config, passing the connection string via DATABASE_URI and forcing read-only with --access-mode=restricted:
{ "mcpServers": { "postgres": { "command": "postgres-mcp", "args": ["--access-mode=restricted"], "env": { "DATABASE_URI": "postgresql://readonly@localhost:5432/mydb" } } }}For Claude Code you can add the same server from the CLI:
claude mcp add --transport stdio \ --env DATABASE_URI=postgresql://readonly@localhost:5432/mydb \ postgres -- postgres-mcp --access-mode=restrictedIf all you need is “let the agent run safe read-only queries” without a persistent connection, a lighter-weight alternative is an Agent Skill such as supabase/postgres-best-practices, installed with npx skills add supabase/agent-skills. Skills are single-purpose augmentation; the MCP server is the better fit when you want the agent to repeatedly introspect a live database across a whole session.
Driving It With Each Tool
Section titled “Driving It With Each Tool”The migration files and dual-write code are produced by the same prompt everywhere — but how you run that loop, review the diff, and ship it differs.
Open the migrations directory and switch the model picker to Claude Opus 4.8 (or Claude Fable 5 for the most complex, multi-file refactors where budget matters less than correctness). In Agent mode, paste the expand-contract prompt above. Cursor proposes the three SQL files plus the dual-write changes as a multi-file edit; review them inline in the diff view before accepting.
The visual diff is the point here: read the WHERE … LIMIT 1000 batch clause and confirm the contract migration is in a separate file from expand, so you can deploy them in different releases. Use a checkpoint before accepting so you can roll the edit back if the agent over-reaches. Then run the migration through your normal tool (npm run db:migrate) in the integrated terminal and watch for lock waits.
Generate and verify in one headless pass — ideal for a CI step or a repeatable script. Claude Code writes the files, then runs your migration test suite via the Bash tool:
claude -p "Generate an expand-contract migration replacing products.price (numeric) \with price_cents (integer) in db/migrations/. Three numbered files: expand, batched \backfill (runs outside a transaction), contract. Then run 'npm run db:migrate' against \the local test database and 'npm test -- migrations' and report failures." \ --allowedTools "Read" "Write" "Edit" "Bash(npm run db:migrate)" "Bash(npm test *)" \ --output-format json--allowedTools pre-authorizes exactly these patterns to run without a permission prompt, so in this headless -p run the agent can prove the migration applies and the suite stays green unattended — anything outside the list would stop for approval that never comes. (To hard-restrict the available tool set rather than just skip prompts, use --tools instead.) Pipe --output-format json into your CI logs for an auditable record of what ran.
Run the change on an isolated worktree so it never disturbs your working checkout. In the Codex app (or codex CLI), start a thread on a worktree based on main, then paste the expand-contract prompt.
For a scripted run, use codex exec non-interactively:
codex exec --cd ../app-migrations \ "Add an expand-contract migration replacing products.price with price_cents. \Three numbered files, batched backfill outside a transaction, contract separate. \Run the migration test suite and summarize."When the diff looks right, hit Create branch here on the worktree, push it, and open a pull request on GitHub straight from the thread. The worktree isolation means a long backfill experiment never blocks the feature branch you’re editing locally.
Make the Agent Audit Its Own Migration
Section titled “Make the Agent Audit Its Own Migration”Generating the migration is half the job. The other half is catching the production hazards a human reviewer skims past at 2pm. The two failure modes from the opening incident — full-table locks and unbatched writes — are exactly what an LLM is good at spotting when you ask it to look.
A few corrections this review reliably surfaces:
- Index builds:
CREATE INDEX CONCURRENTLY idx_products_sku ON products(sku);instead of a bareCREATE INDEX. The concurrent form avoids theACCESS EXCLUSIVElock — at the cost that it cannot run inside a transaction block, so it must be its own migration step. - New constraints: add as
NOT VALID, thenALTER TABLE … VALIDATE CONSTRAINTin a separate statement. Validation then takes only aSHARE UPDATE EXCLUSIVElock and doesn’t block writes. - Backfills: batched in a loop (as in the expand-contract files above), never one giant
UPDATEthat holds row locks and bloats the table with dead tuples.
If you wrap online schema changes yourself, prefer a battle-tested tool over hand-rolled triggers. For MySQL, pt-online-schema-change drives the copy-and-swap with triggers, getting the explicit column lists right where hand-written INSERT … SELECT NEW.* triggers go wrong (MySQL NEW is a row reference, not a table you can SELECT from). gh-ost takes the opposite approach — it’s triggerless, tailing the binary log to replay writes onto the ghost table, which keeps trigger overhead off your source under heavy write traffic. Ask the agent to invoke one of these rather than reinventing it.
A Validated Data Migration You Can Actually Run
Section titled “A Validated Data Migration You Can Actually Run”When you’re reshaping data — not just schema — pair the migration with a validating, batched copy. Here is a short, self-contained version centered on the loop the agent generates; keep it small and let the prompt produce the table-specific transform/validate:
// Batched, validated copy from a legacy table. db is a node-postgres-style client.async function migrateUsers({ batchSize = 1000 } = {}) { let cursor = 0; let migrated = 0; const errors = [];
for (;;) { const { rows: batch } = await db.query( 'SELECT * FROM legacy_users WHERE id > $1 ORDER BY id LIMIT $2', [cursor, batchSize] ); if (batch.length === 0) break;
let successful = 0; // let, not const — this gets incremented below for (const row of batch) { const email = row.email_address?.toLowerCase(); if (!email || !email.includes('@')) { errors.push({ id: row.id, reason: 'invalid email' }); continue; } await db.query( 'INSERT INTO users (email, legacy_id) VALUES ($1, $2) ON CONFLICT (email) DO NOTHING', [email, row.id] ); successful++; }
migrated += successful; cursor = batch[batch.length - 1].id; console.info(`migrated ${migrated} (batch ${batch.length}, ${errors.length} skipped)`); }
return { migrated, errors, ok: errors.length === 0 };}Note the deliberately boring details that make it production-safe: keyset pagination by id (not OFFSET, which slows down linearly), ON CONFLICT DO NOTHING for idempotent re-runs, a validation gate that records bad rows instead of crashing, and let successful = 0 so the per-batch counter can actually increment. Ask the agent to wrap each batch in a transaction if you need all-or-nothing semantics per batch.
When This Breaks
Section titled “When This Breaks”What’s Next
Section titled “What’s Next”- SQL Patterns — query optimization the agent can apply to your slow paths
- ORM Patterns — generating safe migrations through Drizzle, Prisma, and Knex
- NoSQL Patterns — document-database migrations and backfills