# Database Operations with Codex
Your product manager wants a reporting dashboard that queries across five tables with aggregations, date ranges, and user-level filters. The existing queries are already slow on the main tables, and adding complex joins will make things worse. You need a new schema, optimized queries, proper indexes, and a migration that runs safely on a production database with 50 million rows. Getting this wrong means downtime. Codex can design the schema, generate the migration, write the queries, and test them against a real database in a cloud environment — all before you touch production.
## What You’ll Walk Away With

- Prompts for schema design, migration generation, and query optimization with real ORMs
- A cloud environment workflow for testing migrations against production-sized data
- An automation recipe for weekly query performance audits
- Techniques for using MCP servers to give Codex direct database access during development
## The Workflow

### Step 1: Design the Schema

Start in the CLI or IDE extension where you can iterate quickly. Give Codex the requirements and your existing schema as context.
Review the design, ask follow-up questions, and iterate. Schema changes are expensive to undo — invest time here.
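For reference, here is the kind of Drizzle schema such a session might converge on: a pre-aggregated daily summary table so dashboard queries avoid scanning the 50M-row orders table. This is a sketch only; every table and column name below is an assumption, not part of the article's design.

```ts
import {
  pgTable,
  serial,
  integer,
  date,
  uniqueIndex,
} from "drizzle-orm/pg-core";

// Hypothetical pre-aggregated table: one row per user per day, populated
// from the raw orders table by a separate backfill step.
export const dailyOrderSummary = pgTable(
  "daily_order_summary",
  {
    id: serial("id").primaryKey(),
    userId: integer("user_id").notNull(),
    day: date("day", { mode: "date" }).notNull(),
    orderCount: integer("order_count").notNull().default(0),
    totalCents: integer("total_cents").notNull().default(0),
  },
  (t) => ({
    // Enforces one summary row per user per day and serves the dashboard's
    // user-level filters combined with date ranges.
    userDay: uniqueIndex("daily_summary_user_day_idx").on(t.userId, t.day),
  }),
);
```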
### Step 2: Generate the Migration

Once the schema is finalized, generate the migration in a worktree to keep it isolated:
```
Generate a Drizzle migration for the reporting dashboard tables we just designed.

Requirements:
- The migration must be safe to run on a production database with 50M+ rows
- Add tables and indexes in the correct order (tables before indexes, referenced tables before referencing tables)
- Use IF NOT EXISTS for safety
- Include a DOWN migration that cleanly drops everything
- Follow the migration naming convention in drizzle/migrations/

After generating the migration, run it against the dev database to verify it applies cleanly.
Do NOT populate data. We will handle backfill separately.
```

### Step 3: Write Optimized Queries

Database queries are where Codex’s ability to read your entire schema pays off. It can write queries that account for indexes, join strategies, and the ORM’s actual API.
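A sketch of the kind of query Codex might produce, assuming the file lives in src/lib/db/queries/ and the hypothetical summary table from Step 1; the `db` client and schema imports are assumptions:

```ts
import { and, gte, lt, sql } from "drizzle-orm";
import { db } from "../client";
import { dailyOrderSummary } from "../schema";

// Daily revenue across all users for a UTC date range, answered from the
// pre-aggregated summary table instead of the raw 50M-row orders table.
export async function revenueByDay(from: Date, to: Date) {
  return db
    .select({
      day: dailyOrderSummary.day,
      revenueCents: sql<number>`sum(${dailyOrderSummary.totalCents})`,
      orders: sql<number>`sum(${dailyOrderSummary.orderCount})`,
    })
    .from(dailyOrderSummary)
    // Half-open range [from, to) avoids double-counting boundary days.
    .where(and(gte(dailyOrderSummary.day, from), lt(dailyOrderSummary.day, to)))
    .groupBy(dailyOrderSummary.day)
    .orderBy(dailyOrderSummary.day);
}
```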
### Step 4: Seed Test Data

Comprehensive test data makes it possible to verify queries and catch performance issues before production. Generate a seeding script:
```
Create a database seeding script at scripts/seed-reports.ts that:

1. Creates 100 test users
2. Generates 6 months of order data (varying volume per user, per day)
3. Includes realistic patterns: weekday vs weekend variation, seasonal trends, some users with refunds
4. Populates the daily summary table from the generated orders
5. Is idempotent (safe to run multiple times)

Use Drizzle for inserts. Use batched inserts (chunks of 1000) for performance.
The script should be runnable via: npx tsx scripts/seed-reports.ts
```
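The batching and idempotency the prompt asks for reduce to a small pattern. A sketch, assuming an `orders` table export and a seed script living in scripts/:

```ts
import { db } from "../src/lib/db/client";
import { orders } from "../src/lib/db/schema";

const BATCH_SIZE = 1000;

// Insert rows in chunks of 1000 so each INSERT statement stays small, and
// skip rows that already exist so re-running the script is safe.
export async function insertOrdersInBatches(
  rows: (typeof orders.$inferInsert)[],
) {
  for (let i = 0; i < rows.length; i += BATCH_SIZE) {
    await db
      .insert(orders)
      .values(rows.slice(i, i + BATCH_SIZE))
      .onConflictDoNothing();
  }
}
```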
### Step 5: Test Migrations in Cloud

Before running migrations on staging or production, test them in a cloud environment that mirrors your production setup. Cloud environments support setup scripts where you can install your database, run migrations, and seed data.
Configure your cloud environment with a setup script:
```bash
# Install dependencies
npm install

# Start PostgreSQL
pg_ctl start

# Run all existing migrations
npm run db:migrate

# Seed with production-scale test data
npx tsx scripts/seed-reports.ts
```

Then submit a cloud task:
```bash
codex cloud exec --env db-test "Run the new reporting dashboard migration. After it completes, execute each of the 5 dashboard queries from src/lib/db/queries/reports.ts against the seeded data. Report:
1. Whether each query returns correct results
2. Execution time for each query
3. EXPLAIN ANALYZE output for the two slowest queries
4. Any missing indexes or full table scans

If any query takes longer than 500ms, suggest optimizations."
```

## Performance Auditing with Automations

Set up a weekly automation to monitor query performance:
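One way to phrase the weekly prompt, reusing the query file and the 500ms threshold from the cloud task above; the exact scheduling mechanism depends on how you run automations, and the wording here is illustrative:

```
Run EXPLAIN ANALYZE on each dashboard query in src/lib/db/queries/reports.ts
against the dev database. Flag any query that takes longer than 500ms or that
performs a sequential scan, and propose an index or query rewrite for each
flagged query.
```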
## Using MCP for Direct Database Access

If you want Codex to query your database directly during development (not just generate code), configure a PostgreSQL MCP server:
```toml
[mcp_servers.postgres]
type = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost:5432/mydb"]
```

With the MCP server connected, Codex can execute queries during a session, inspect actual data, and verify results without you running the queries manually.
## When This Breaks

**Migration works on dev but fails on production-sized data.** Adding an index on a 50M-row table can take minutes and lock the table. Tell Codex: “Generate the index creation with CONCURRENTLY for PostgreSQL so it does not lock the table. Include a note about expected duration based on table size.”
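The statement that request should produce looks roughly like this; the index and column names are illustrative. Note that CONCURRENTLY cannot run inside a transaction block, so your migration tooling must not wrap this statement in one:

```sql
-- Builds the index without holding an exclusive lock on the table; reads
-- and writes continue while it runs, at the cost of a longer build time.
CREATE INDEX CONCURRENTLY IF NOT EXISTS orders_user_id_created_at_idx
  ON orders (user_id, created_at);
```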
**ORM-generated queries are inefficient.** Drizzle (and other ORMs) sometimes generate suboptimal SQL. Include “check the generated SQL for each query using .toSQL() and verify it uses the expected indexes” in your prompt. For critical queries, consider raw SQL wrapped in a Drizzle execute call.
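Checking the generated SQL yourself is a one-liner. A sketch, assuming your own `db` client and `orders` schema exports:

```ts
import { eq } from "drizzle-orm";
import { db } from "../client";
import { orders } from "../schema";

// .toSQL() returns the SQL text and bound parameters without executing the
// query, so you can confirm the WHERE clause can use the expected index.
const query = db.select().from(orders).where(eq(orders.userId, 42));
console.log(query.toSQL()); // { sql: "select ... where ... = $1", params: [42] }
```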
**Cloud environment does not have enough data to reveal performance issues.** The seeding script generates 100 users with 6 months of data, but production has 100K users with 3 years of data. Scale your seed script proportionally, or explicitly test with EXPLAIN ANALYZE against production table statistics.
**Timezone handling in date aggregations.** Date-range queries that do not account for timezones will produce incorrect results for users in different zones. Include “all date-range queries must accept and handle timezone-aware timestamps. Store in UTC, convert for display” in your constraints.
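The store-in-UTC, convert-at-query-time pattern looks like this in a Drizzle query; the `orders.createdAt` column and the import paths are assumptions:

```ts
import { sql } from "drizzle-orm";
import { db } from "../client";
import { orders } from "../schema";

// Group UTC timestamps into the caller's local days at query time. `tz` is
// an IANA zone name such as "America/New_York", bound as a parameter.
export async function ordersPerLocalDay(tz: string) {
  const localDay = sql`date_trunc('day', ${orders.createdAt} AT TIME ZONE ${tz})`;
  return db
    .select({
      day: localDay.as("local_day"),
      orders: sql<number>`count(*)`,
    })
    .from(orders)
    .groupBy(localDay)
    .orderBy(localDay);
}
```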