Skip to content

Database Design and Queries

You are staring at a migration that touches millions of rows, has to keep referential integrity, and cannot take the app offline. The schema has eighteen tables you half-remember and a foreign-key graph nobody has drawn since 2022. Paste a stripped-down dump into a chat window and the AI guesses at relationships it cannot see; the SQL it returns references columns that do not exist.

A database MCP server fixes the blind spot. It gives Cursor, Claude Code, and Codex a live, structured connection to your database so they can introspect the real schema, read actual EXPLAIN output, and check constraints before suggesting a single line of DDL. The work shifts from describing your database to reasoning about it together.

  • A read-only database MCP server wired into Cursor, Claude Code, and Codex (setup is nearly identical across all three).
  • A reusable prompt for designing a new schema that fits your existing tables.
  • A query-optimization workflow that reads real execution plans instead of guessing at indexes.
  • A zero-downtime migration template for column changes on large tables.
  • A short list of the failure modes that bite first, and how to recover.

Why a database MCP server beats pasting schema dumps

Section titled “Why a database MCP server beats pasting schema dumps”

Live schema, not a snapshot

The model sees real tables, columns, types, and constraints through the connection, so it stops inventing foreign keys that do not exist.

Real execution plans

A maintained server can run EXPLAIN/EXPLAIN ANALYZE and read pg_stat_statements, so index advice is based on selectivity, not vibes.

Safer migrations

With the actual schema in context, the AI can plan additive, reversible steps and generate the rollback alongside the forward script.

Scoped access

Point the server at a read-only role on a dev or staging replica. The model gets full introspection without write access to anything that matters.

Setup is the same shape on all three tools: a stdio server launched with uvx (Python servers) or npx (Node servers), with the connection string passed through an environment variable so it never lands in shell history or a checked-in config.

The recommended server is Crystal DBA’s Postgres MCP Pro (postgres-mcp on PyPI). On top of plain SQL it adds EXPLAIN analysis, index-tuning recommendations, and database health checks, and it ships an --access-mode restricted flag that enforces read-only, single-statement execution.

Add to .cursor/mcp.json (project) or ~/.cursor/mcp.json (global):

{
"mcpServers": {
"postgres": {
"command": "uvx",
"args": ["postgres-mcp", "--access-mode", "restricted"],
"env": { "DATABASE_URI": "postgresql://readonly:secret@localhost:5432/mydb" }
}
}
}

If you are on Supabase, the Supabase MCP (@supabase/mcp-server-supabase) is a drop-in alternative that authenticates with a personal access token and supports --read-only.

The other three follow the exact same pattern shown above for Postgres: a command/args/env block in .cursor/mcp.json, a claude mcp add ... -- ... line, or an [mcp_servers.<name>] table in ~/.codex/config.toml. Only the package name and the connection environment variable change. The Claude Code form is shown below; the Cursor and Codex equivalents map one-to-one.

Server: mysql-mcp-server (PyPI). It reads discrete environment variables, not a single connection URL.

Terminal window
claude mcp add mysql \
-e MYSQL_HOST=localhost -e MYSQL_PORT=3306 \
-e MYSQL_USER=readonly -e MYSQL_PASSWORD=secret -e MYSQL_DATABASE=mydb \
-- uvx mysql-mcp-server

For Cursor, put the same five MYSQL_* keys in the env object alongside "command": "uvx", "args": ["mysql-mcp-server"]. A lone MYSQL_URL does nothing — the server will start with no credentials and fail to connect.

With the schema in context, design stops being isolated planning and becomes a dialogue. The model examines your real users and orders tables, then proposes additions that match your naming conventions, types, and constraints.

A typical result, grounded in your actual structure:

-- Required for the EXCLUDE constraint below: GiST equality on uuid/text
-- columns needs the btree_gist operator classes.
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE subscription_plans (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(100) NOT NULL,
price DECIMAL(10,2) NOT NULL,
billing_cycle INTERVAL NOT NULL, -- e.g. '1 month'::interval
features JSONB NOT NULL DEFAULT '{}',
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE user_subscriptions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
plan_id UUID NOT NULL REFERENCES subscription_plans(id),
status VARCHAR(20) NOT NULL DEFAULT 'active',
current_period_start TIMESTAMPTZ NOT NULL,
current_period_end TIMESTAMPTZ NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
-- One active subscription per user, enforced at the DB level
CONSTRAINT one_active_subscription
EXCLUDE (user_id WITH =) WHERE (status = 'active')
);

If you prefer to skip the extension, the same rule can be a partial unique index: CREATE UNIQUE INDEX one_active_sub ON user_subscriptions(user_id) WHERE status = 'active';. The value of working through the MCP server is that the model already knows users(id) is a UUID and matches it, instead of guessing BIGINT.

Schemas rarely emerge perfect. Because the server keeps context across turns, you can iterate: “Now add an audit trail that records every plan change with the previous and new plan, a reason (upgrade, downgrade, renewal), and the proration amount,” and the follow-up DDL will reference user_subscriptions(id) correctly.

The strongest part of database MCP integration is turning a business question into efficient SQL, then validating it against a real execution plan.

SELECT
c.id,
c.email,
COUNT(o.id) AS order_count,
SUM(o.total_amount) AS total_revenue,
AVG(o.total_amount) AS avg_order_value,
MAX(o.created_at) AS last_order_at
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.created_at >= date_trunc('quarter', CURRENT_DATE)
AND o.status <> 'cancelled'
AND c.is_test_account = false
GROUP BY c.id, c.email
ORDER BY total_revenue DESC
LIMIT 10;

Because the server can read the plan, the index advice is concrete rather than generic:

-- Supports the join + WHERE + sort, scoped to live orders
CREATE INDEX CONCURRENTLY idx_orders_customer_created_active
ON orders (customer_id, created_at)
WHERE status <> 'cancelled';

When something is slow in production, the workflow is investigation, not guesswork.

A grounded response walks the plan and then proposes targeted fixes:

  1. Pre-aggregate reviews in a CTE so the rating average is computed once per product instead of multiplied across the join.
  2. Add a covering index for the price + created_at filter on products.
  3. Re-run EXPLAIN ANALYZE to confirm the sequential scan became an index scan.
CREATE INDEX CONCURRENTLY idx_products_price_created
ON products (price, created_at)
INCLUDE (name, category_id)
WHERE created_at > CURRENT_DATE - INTERVAL '2 years';
WITH product_stats AS (
SELECT product_id, AVG(rating) AS avg_rating, COUNT(*) AS review_count
FROM reviews
GROUP BY product_id
)
SELECT p.id, p.name, p.price, c.name AS category_name,
COALESCE(ps.avg_rating, 0) AS avg_rating,
COALESCE(ps.review_count, 0) AS review_count
FROM products p
LEFT JOIN categories c ON p.category_id = c.id
LEFT JOIN product_stats ps ON p.id = ps.product_id
WHERE p.price BETWEEN 100 AND 500
AND p.created_at > CURRENT_DATE - INTERVAL '2 years'
ORDER BY ps.avg_rating DESC NULLS LAST, ps.review_count DESC NULLS LAST;

The MongoDB server brings the same pattern to aggregation. Ask for “users grouped by signup month, with average orders per user and a count of users with no order in the last 90 days,” and it generates the pipeline plus the indexes that make it fast (db.orders.createIndex({ user_id: 1, created_at: -1 })).

Migrations on large tables need additive, reversible steps. With the real schema in context, the model can plan the phases and generate the rollback alongside the forward script.

  1. Add the new column with a CHECK constraint (no table rewrite, no lock).
  2. Dual-write from the application so new rows populate both columns.
  3. Backfill existing rows in batches during low traffic.
  4. Add the unique index with CREATE INDEX CONCURRENTLY (no write lock).
  5. Cut over reads, then drop the old column once validated.
-- Phase 1: additive, no rewrite
ALTER TABLE users
ADD COLUMN email_new VARCHAR(320)
CHECK (email_new ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');
-- Phase 3: batched backfill, keeps locks short
DO $$
DECLARE batch INTEGER := 10000; affected INTEGER;
BEGIN
LOOP
UPDATE users SET email_new = email
WHERE id IN (
SELECT id FROM users WHERE email_new IS NULL LIMIT batch
);
GET DIAGNOSTICS affected = ROW_COUNT;
EXIT WHEN affected = 0;
PERFORM pg_sleep(0.1); -- breathe between batches
END LOOP;
END $$;
-- Phase 4: enforce without blocking writes
CREATE UNIQUE INDEX CONCURRENTLY idx_users_email_new ON users (email_new);
ALTER TABLE users ALTER COLUMN email_new SET NOT NULL;
-- Phase 5 (after app cutover)
ALTER TABLE users DROP COLUMN email;
ALTER TABLE users RENAME COLUMN email_new TO email;

Throughout, ask the model to generate validation queries — duplicate detection, constraint-violation counts, foreign-key orphan checks — so you can prove integrity before each phase.

Converting a single-tenant app to row-level security is a classic high-stakes change. With the schema in context, the model can propose tenant columns plus RLS policies that match your roles.

ALTER TABLE orders ADD COLUMN tenant_id UUID REFERENCES tenants(id);
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation_orders ON orders
FOR ALL TO application_role
USING (tenant_id = current_setting('app.tenant_id')::UUID);
CREATE INDEX idx_orders_tenant_customer
ON orders (tenant_id, customer_id, created_at);

For high-volume time-series data, ask for a PARTITION BY RANGE table with monthly partitions and an automated creation function. Keep the example partitions on current, forward-looking months so the boundaries stay valid:

CREATE TABLE sensor_readings (
id BIGSERIAL,
device_id INTEGER NOT NULL,
metric_type VARCHAR(50) NOT NULL,
value DOUBLE PRECISION NOT NULL,
recorded_at TIMESTAMPTZ NOT NULL DEFAULT now()
) PARTITION BY RANGE (recorded_at);
CREATE TABLE sensor_readings_2026_06 PARTITION OF sensor_readings
FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');
CREATE TABLE sensor_readings_2026_07 PARTITION OF sensor_readings
FOR VALUES FROM ('2026-07-01') TO ('2026-08-01');

Follow least privilege when creating the role the MCP server connects as:

CREATE ROLE mcp_readonly WITH LOGIN PASSWORD 'use-a-secret-manager';
GRANT CONNECT ON DATABASE myapp TO mcp_readonly;
GRANT USAGE ON SCHEMA public TO mcp_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO mcp_readonly;
-- Let the AI read plans and stats without touching data
GRANT pg_read_all_stats TO mcp_readonly;

Combined with the server’s own read-only mode (--access-mode restricted for postgres-mcp, --read-only for Supabase, --readOnly for MongoDB), this gives the model full introspection and zero write access.