Database Operations with Codex

Your product manager wants a reporting dashboard that queries across five tables with aggregations, date ranges, and user-level filters. The existing queries are already slow on the main tables, and adding complex joins will make things worse. You need a new schema, optimized queries, proper indexes, and a migration that runs safely on a production database with 50 million rows. Getting this wrong means downtime. Codex can design the schema, generate the migration, write the queries, and test them against a real database in a cloud environment — all before you touch production.

What You’ll Walk Away With

Prompts for schema design, migration generation, and query optimization with real ORMs
A cloud environment workflow for testing migrations against production-sized data
An automation recipe for weekly query performance audits
Techniques for using MCP servers to give Codex direct database access during development

The Workflow

Step 1: Design the Schema

Start in the CLI or IDE extension where you can iterate quickly. Give Codex the requirements and your existing schema as context.

Copy-paste prompt for schema design:

Design database tables for a reporting dashboard. Existing schema is in src/lib/db/schema.ts.

Requirements:
- Users need pre-computed daily summaries of their activity (orders, revenue, refunds)
- Dashboard queries should not hit the main orders table directly
- Support date range filtering (last 7 days, 30 days, custom range)
- Support aggregation by day, week, and month
- Data should be backfillable from existing order records

Design the new tables using Drizzle ORM schema definitions. Include:
1. Table definitions with proper types and constraints
2. Indexes optimized for the dashboard query patterns
3. Foreign key relationships to existing tables
4. Explanation of why you chose this structure over alternatives

Do not generate the migration file yet. Just the schema design for review.

Review the design, ask follow-up questions, and iterate. Schema changes are expensive to undo — invest time here.

Step 2: Generate the Migration

Once the schema is finalized, generate the migration in a worktree to keep it isolated:

Generate a Drizzle migration for the reporting dashboard tables we just designed.

Requirements:
- The migration must be safe to run on a production database with 50M+ rows
- Add tables and indexes in the correct order (tables before indexes, referenced tables before referencing tables)
- Use IF NOT EXISTS for safety
- Include a DOWN migration that cleanly drops everything
- Follow the migration naming convention in drizzle/migrations/

After generating the migration, run it against the dev database to verify it applies cleanly.
Do NOT populate data. We will handle backfill separately.

Step 3: Write Optimized Queries

Database queries are where Codex’s ability to read your entire schema pays off. It can write queries that account for indexes, join strategies, and the ORM’s actual API.

Copy-paste prompt for query generation:

Write Drizzle ORM queries for the reporting dashboard in src/lib/db/queries/reports.ts:

1. getDailySummary(userId, startDate, endDate) - returns daily aggregates
2. getWeeklySummary(userId, startDate, endDate) - returns weekly aggregates
3. getMonthlyTrend(userId, months) - returns month-over-month comparison
4. getTopProducts(userId, startDate, endDate, limit) - top products by revenue
5. getRefundRate(userId, startDate, endDate) - refund percentage

For each query:
- Use the Drizzle query builder (not raw SQL)
- Use the indexes we defined (check the schema for available indexes)
- Include proper type annotations for return values
- Handle edge cases: no data in range, timezone handling, null aggregates

Read our existing query patterns in src/lib/db/queries/ for style conventions.

Step 4: Seed Test Data

Comprehensive test data makes it possible to verify queries and catch performance issues before production. Generate a seeding script:

Create a database seeding script at scripts/seed-reports.ts that:

1. Creates 100 test users
2. Generates 6 months of order data (varying volume per user, per day)
3. Includes realistic patterns: weekday vs weekend variation, seasonal trends, some users with refunds
4. Populates the daily summary table from the generated orders
5. Is idempotent (safe to run multiple times)

Use Drizzle for inserts. Use batched inserts (chunks of 1000) for performance.
The script should be runnable via: npx tsx scripts/seed-reports.ts

Step 5: Test Migrations in Cloud

Before running migrations on staging or production, test them in a cloud environment that mirrors your production setup. Cloud environments support setup scripts where you can install your database, run migrations, and seed data.

Configure your cloud environment with a setup script:

# Install dependencies
npm install

# Start PostgreSQL
pg_ctl start

# Run all existing migrations
npm run db:migrate

# Seed with production-scale test data
npx tsx scripts/seed-reports.ts

Then submit a cloud task:

codex cloud exec --env db-test "Run the new reporting dashboard migration. After it completes, execute each of the 5 dashboard queries from src/lib/db/queries/reports.ts against the seeded data. Report:
1. Whether each query returns correct results
2. Execution time for each query
3. EXPLAIN ANALYZE output for the two slowest queries
4. Any missing indexes or full table scans

If any query takes longer than 500ms, suggest optimizations."

Performance Auditing with Automations

Set up a weekly automation to monitor query performance:

Copy-paste automation prompt for query performance audit:

Run all database queries in src/lib/db/queries/ against the dev database with EXPLAIN ANALYZE. For each query:

1. Report execution time
2. Flag any sequential scans on tables with more than 10,000 rows
3. Flag any queries that do not use available indexes
4. Flag any N+1 query patterns in the service layer

If performance degraded compared to last week's audit, investigate which recent code changes caused the regression. Report findings with specific file paths and line numbers.

Using MCP for Direct Database Access

If you want Codex to query your database directly during development (not just generate code), configure a PostgreSQL MCP server:

[mcp_servers.postgres]
type = "stdio"
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost:5432/mydb"]

With the MCP server connected, Codex can execute queries during a session, inspect actual data, and verify results without you running the queries manually.

When This Breaks

Migration works on dev but fails on production-sized data. Adding an index on a 50M-row table can take minutes and lock the table. Tell Codex: “Generate the index creation with CONCURRENTLY for PostgreSQL so it does not lock the table. Include a note about expected duration based on table size.”

ORM-generated queries are inefficient. Drizzle (and other ORMs) sometimes generate suboptimal SQL. Include “check the generated SQL for each query using .toSQL() and verify it uses the expected indexes” in your prompt. For critical queries, consider raw SQL wrapped in a Drizzle execute call.

Cloud environment does not have enough data to reveal performance issues. The seeding script generates 100 users with 6 months of data, but production has 100K users with 3 years of data. Scale your seed script proportionally, or explicitly test with EXPLAIN ANALYZE against production table statistics.

Timezone handling in date aggregations. Date-range queries that do not account for timezones will produce incorrect results for users in different zones. Include “all date-range queries must accept and handle timezone-aware timestamps. Store in UTC, convert for display” in your constraints.

What’s Next

Deployment Automation with Codex Deploy your database changes alongside application updates

Performance Optimization Profile and optimize the queries and systems you built