Your startup is building a multi-tenant SaaS platform for project management. You need to design a database that handles: user organizations, projects, tasks, time tracking, file attachments, real-time comments, and activity feeds. The system must scale to thousands of organizations with millions of tasks while maintaining sub-100ms query performance.
By completing this lesson, you’ll master:
Design a database that:
Start with Ask mode to clarify requirements:
"Help me design a database for a project management SaaS:
- Multi-tenant (B2B)
- Organizations have multiple users
- Projects contain tasks with subtasks
- Time tracking per task
- File attachments (S3 URLs)
- Real-time comments
- Activity feeds
- Full audit trail
What questions should I answer first?"
"Based on these features, analyze the access patterns:
- User dashboard (projects, recent tasks, notifications)
- Project view (tasks, milestones, team members)
- Task details (comments, attachments, time logs)
- Reports (time by project, user productivity)
- Search (across all entities)
Identify read/write ratios and hot paths"
Switch to Agent mode:
"Compare database options for this use case:
- PostgreSQL with JSONB
- MySQL with proper indexing
- MongoDB for flexibility
- Hybrid approach (PostgreSQL + Redis)
Consider: ACID needs, query complexity, scaling strategy"
Configure database connections via MCP:
{
  "mcpServers": {
    "postgres-dev": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-postgres",
        "postgresql://user:pass@localhost:5432/projectdb"
      ]
    },
    "sqlite-local": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-sqlite", "./data/local.db"]
    },
    "redis-cache": {
      "command": "npx",
      "args": ["-y", "@redis/mcp-redis"],
      "env": { "REDIS_URL": "redis://localhost:6379" }
    }
  }
}
// With PostgreSQL MCP active:
"Using PostgreSQL MCP, analyze the current schema and:
- Show all tables and relationships
- Identify missing indexes
- Suggest normalization improvements
- Check for anti-patterns"
// Direct schema creation:
"Create the organizations and users tables with:
- UUID primary keys
- Proper constraints
- Audit columns (created_at, updated_at)
- Soft delete support"
"Using PostgreSQL MCP:
- List all foreign key relationships
- Find tables without primary keys
- Identify unused indexes
- Show table sizes and row counts"
"Analyze slow queries with MCP:
- Run EXPLAIN ANALYZE on this query
- Suggest index improvements
- Show query execution plan
- Identify table scan issues"
"Using SQLite MCP for prototyping:
- Export schema to PostgreSQL format
- Generate migration scripts
- Validate data integrity
- Create rollback procedures"
"Using Redis MCP, implement caching for:
- User session data (SET/GET with TTL)
- Dashboard query results (with invalidation)
- Real-time presence tracking (using Redis Sets)
- Rate limiting counters (using INCR with EXPIRE)"
// Example Redis operations via MCP:
"Set up a rate limiter using Redis:
- Use INCR for request counting
- Set 60-second expiration
- Return current count
- Handle race conditions"
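The INCR/EXPIRE counter this prompt describes amounts to a fixed-window limiter. Here is a sketch of the logic in TypeScript, with an in-memory Map standing in for Redis; in real Redis, INCR is atomic, which is what handles the race condition server-side, and the limit and window values below are illustrative.

```typescript
// Fixed-window rate limiter mirroring Redis INCR + EXPIRE semantics.
// `counters` maps a key to { count, resetAt }; Redis would hold this state.
type Window = { count: number; resetAt: number };

class FixedWindowLimiter {
  private counters = new Map<string, Window>();

  constructor(private limit: number, private windowMs: number) {}

  // Increments and returns the current count for the key, the same shape
  // INCR gives you. The caller rejects the request when allowed is false.
  hit(key: string, now: number = Date.now()): { count: number; allowed: boolean } {
    const win = this.counters.get(key);
    if (!win || now >= win.resetAt) {
      // First request of a new window: INCR creates the key, EXPIRE arms the TTL.
      this.counters.set(key, { count: 1, resetAt: now + this.windowMs });
      return { count: 1, allowed: this.limit >= 1 };
    }
    win.count += 1; // subsequent INCR within the window
    return { count: win.count, allowed: win.count <= this.limit };
  }
}
```

In a distributed setup the check-then-set above would need a Lua script or MULTI/EXEC; single-node Redis gets atomicity from INCR for free.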
For AI-powered features:
{
  "mcpServers": {
    "chroma": {
      "command": "npx",
      "args": ["-y", "chroma-mcp"],
      "env": { "CHROMA_URL": "http://localhost:8000" }
    }
  }
}
"Using Chroma MCP, implement:
- Semantic search for tasks
- Similar project recommendations
- Content-based filtering
- Embedding storage for ML features"
Security Considerations:
Performance Tips:
"Design the core database schema with:
- Proper normalization (3NF where appropriate)
- Multi-tenancy strategy (shared schema vs separate)
- Soft delete implementation
- Audit fields (created_by, updated_at, etc.)
- UUID vs serial IDs
Create SQL DDL statements"
Example schema:
-- Organizations (tenants)
CREATE TABLE organizations (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  name VARCHAR(255) NOT NULL,
  slug VARCHAR(100) UNIQUE NOT NULL,
  plan VARCHAR(50) DEFAULT 'free',
  settings JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);
-- Users
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email VARCHAR(255) UNIQUE NOT NULL,
  name VARCHAR(255) NOT NULL,
  password_hash VARCHAR(255),
  settings JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);
-- Organization membership
CREATE TABLE organization_members (
  organization_id UUID REFERENCES organizations(id),
  user_id UUID REFERENCES users(id),
  role VARCHAR(50) NOT NULL DEFAULT 'member',
  joined_at TIMESTAMPTZ DEFAULT NOW(),
  PRIMARY KEY (organization_id, user_id)
);
-- Projects
CREATE TABLE projects (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  organization_id UUID REFERENCES organizations(id) NOT NULL,
  name VARCHAR(255) NOT NULL,
  description TEXT,
  status VARCHAR(50) DEFAULT 'active',
  settings JSONB DEFAULT '{}',
  created_by UUID REFERENCES users(id),
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

-- Composite index for tenant isolation
-- (PostgreSQL has no inline INDEX clause; create it separately)
CREATE INDEX idx_org_projects ON projects (organization_id, status, created_at DESC);
-- Tasks with hierarchical structure
CREATE TABLE tasks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  organization_id UUID REFERENCES organizations(id) NOT NULL,
  project_id UUID REFERENCES projects(id) NOT NULL,
  parent_task_id UUID REFERENCES tasks(id),
  title VARCHAR(500) NOT NULL,
  description TEXT,
  status VARCHAR(50) DEFAULT 'todo',
  priority INTEGER DEFAULT 0,
  due_date DATE,
  assignee_id UUID REFERENCES users(id),
  estimated_hours DECIMAL(5,2),
  metadata JSONB DEFAULT '{}',
  created_by UUID REFERENCES users(id),
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW(),
  deleted_at TIMESTAMPTZ
);

-- Indexes for common queries (created separately; inline INDEX is not PostgreSQL syntax)
CREATE INDEX idx_org_tasks ON tasks (organization_id, project_id, status);
CREATE INDEX idx_assignee_tasks ON tasks (assignee_id, status, due_date);
CREATE INDEX idx_parent_tasks ON tasks (parent_task_id);
@schema.sql
"Add supporting tables for:
- Time tracking entries
- Comments (polymorphic)
- Attachments (polymorphic)
- Activity feed
- Notifications
Include proper indexes and constraints"
"Implement advanced database features:
- Row Level Security for multi-tenancy
- Materialized views for dashboards
- Full-text search configuration
- Trigger-based audit logging
- Partitioning strategy for large tables"
@schema.sql
"Analyze query patterns and create indexes:
- Covering indexes for hot queries
- Partial indexes for filtered queries
- GIN indexes for JSONB search
- Composite indexes for sorts
- Consider index maintenance cost"
Example indexing strategy:
-- Dashboard queries (most frequent)
CREATE INDEX idx_user_recent_tasks
ON tasks (assignee_id, updated_at DESC)
WHERE deleted_at IS NULL AND status != 'done';
-- Project task list with filters
CREATE INDEX idx_project_tasks_filtered
ON tasks (project_id, status, priority DESC, created_at DESC)
WHERE deleted_at IS NULL;
-- Full-text search on tasks
CREATE INDEX idx_tasks_search
ON tasks USING gin(
  to_tsvector('english', title || ' ' || COALESCE(description, ''))
);
-- JSONB metadata search
CREATE INDEX idx_tasks_metadata
ON tasks USING gin(metadata);
-- Time tracking aggregations
CREATE INDEX idx_time_entries_reporting
ON time_entries (organization_id, user_id, date)
INCLUDE (hours);
"Design caching strategy with Redis:
- Cache warming for dashboards
- Invalidation patterns
- Query result caching
- Session storage
- Real-time presence data
Create implementation plan"
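The query-result caching piece can be sketched as cache-aside with TTL plus prefix invalidation. The in-memory Map below stands in for Redis, and the key names and TTL are illustrative, not part of any real API:

```typescript
// Cache-aside with TTL and explicit invalidation. On a hit within the
// TTL the cached value is returned; on a miss the loader (the real DB
// query) runs and its result is stored.
type Entry<T> = { value: T; expiresAt: number };

class QueryCache<T> {
  private store = new Map<string, Entry<T>>();

  constructor(private ttlMs: number) {}

  async getOrLoad(key: string, load: () => Promise<T>, now = Date.now()): Promise<T> {
    const hit = this.store.get(key);
    if (hit && now < hit.expiresAt) return hit.value; // cache hit
    const value = await load();                       // cache miss: run the query
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
    return value;
  }

  // Invalidate on write: e.g. after a task update, drop that org's dashboard keys.
  invalidatePrefix(prefix: string): void {
    for (const key of this.store.keys()) {
      if (key.startsWith(prefix)) this.store.delete(key);
    }
  }
}
```

With Redis, the same shape maps to GET/SET with EX for the TTL and a key-naming convention (for example `dash:{orgId}:*`) so writes know which keys to delete.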
// Method 1: Using Database MCP
"Using PostgreSQL MCP, optimize this dashboard query:
SELECT * FROM tasks WHERE assignee_id = $1 AND status = 'active'
- Run EXPLAIN ANALYZE
- Suggest better indexes
- Show query cost reduction"
// Method 2: Manual optimization
@queries/dashboard.sql
"Optimize these critical queries:
- User dashboard (projects + recent tasks)
- Project overview (stats + team + progress)
- Task search with filters
- Time report aggregations
Use EXPLAIN ANALYZE and suggest improvements"
// MCP can directly test optimizations:
"Using PostgreSQL MCP:
1. Create the suggested index
2. Re-run EXPLAIN ANALYZE
3. Compare before/after performance
4. Rollback if not improved"
"Implement PostgreSQL RLS for multi-tenancy:
- Policies for each table
- Performance implications
- Testing strategies
- Bypass for admin operations
Ensure complete tenant isolation"
Example RLS implementation:
-- Enable RLS
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
ALTER TABLE tasks ENABLE ROW LEVEL SECURITY;
-- Create policies
CREATE POLICY tenant_isolation_projects ON projects
  FOR ALL TO application_user
  USING (organization_id = current_setting('app.current_org_id')::UUID);
CREATE POLICY tenant_isolation_tasks ON tasks
  FOR ALL TO application_user
  USING (organization_id = current_setting('app.current_org_id')::UUID);
-- Function to set current tenant
CREATE OR REPLACE FUNCTION set_current_tenant(org_id UUID)
RETURNS void AS $$
BEGIN
  PERFORM set_config('app.current_org_id', org_id::text, true);
END;
$$ LANGUAGE plpgsql;
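On the application side the setting must be applied per transaction, because `set_config(..., true)` in the SQL function above is transaction-local. A minimal sketch, assuming a pg-style client exposing `query(sql, params)`; the `DbClient` interface and `withTenant` helper here are illustrative stand-ins, not a real driver API:

```typescript
// Minimal client interface standing in for a pg PoolClient.
interface DbClient {
  query(sql: string, params?: unknown[]): Promise<unknown>;
}

// Run a unit of work with RLS scoped to one tenant. Because the setting
// is transaction-local, everything happens inside BEGIN ... COMMIT.
async function withTenant<T>(
  client: DbClient,
  orgId: string,
  work: (client: DbClient) => Promise<T>
): Promise<T> {
  await client.query("BEGIN");
  try {
    await client.query("SELECT set_current_tenant($1)", [orgId]);
    const result = await work(client);
    await client.query("COMMIT");
    return result;
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  }
}
```

Every query inside `work` then sees only rows where `organization_id` matches, with no per-query WHERE clause needed.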
"Create complete tenant isolation strategy:
- Connection pooling per tenant
- Query monitoring and limits
- Storage quotas
- Backup strategies
- Data export capabilities"
"Set up performance monitoring:
- Slow query logging
- Index usage statistics
- Table bloat monitoring
- Connection pool metrics
- Tenant resource usage"
"Set up database migration system:
- Version control for schema
- Rollback capabilities
- Zero-downtime migrations
- Data migration strategies
- Testing migrations"
Example migration:
export const up = async (db: Database) => {
  await db.schema.createTable('task_dependencies', (table) => {
    table.uuid('id').primary().defaultTo(db.raw('gen_random_uuid()'));
    table.uuid('task_id').references('id').inTable('tasks').notNullable();
    table.uuid('depends_on_task_id').references('id').inTable('tasks').notNullable();
    table.enum('dependency_type', ['blocks', 'relates_to', 'duplicates']);
    table.timestamps(true, true);

    table.unique(['task_id', 'depends_on_task_id']);
    table.index(['depends_on_task_id']);
  });

  // Add check constraint to prevent self-dependencies
  await db.raw(`
    ALTER TABLE task_dependencies
    ADD CONSTRAINT no_self_dependency
    CHECK (task_id != depends_on_task_id)
  `);
};

export const down = async (db: Database) => {
  await db.schema.dropTable('task_dependencies');
};
"Plan for future schema changes:
- Adding custom fields per tenant
- Archival strategies
- Sharding preparation
- Read replica setup
- Schema versioning"
"Generate realistic test data:
- Multiple organizations
- Varying data volumes
- Edge cases (deep nesting, etc)
- Performance test scenarios
- Data integrity verification"
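A seeded generator keeps test data reproducible while still varying volume per tenant. A sketch under those assumptions; the mulberry32 PRNG and the field shapes are illustrative and not tied to the schema above:

```typescript
// Tiny seeded PRNG (mulberry32) so every run produces identical data.
function mulberry32(seed: number): () => number {
  return () => {
    seed |= 0;
    seed = (seed + 0x6d2b79f5) | 0;
    let t = Math.imul(seed ^ (seed >>> 15), 1 | seed);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface SeedTask { orgIndex: number; title: string; status: string; }

// Each organization gets between 1 and maxPerOrg tasks, so small and
// large tenants are both exercised in the same data set.
function generateTasks(orgCount: number, maxPerOrg: number, seed = 1): SeedTask[] {
  const rand = mulberry32(seed);
  const statuses = ["todo", "in_progress", "done"];
  const tasks: SeedTask[] = [];
  for (let org = 0; org < orgCount; org++) {
    const n = 1 + Math.floor(rand() * maxPerOrg); // varying volume per tenant
    for (let i = 0; i < n; i++) {
      tasks.push({
        orgIndex: org,
        title: `Task ${org}-${i}`,
        status: statuses[Math.floor(rand() * statuses.length)],
      });
    }
  }
  return tasks;
}
```

The rows would then be bulk-inserted (e.g. COPY or multi-row INSERT), with the deterministic seed making performance runs comparable across iterations.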
Leverage MCP for rapid iteration:
// Real-time schema exploration
"Using PostgreSQL MCP:
- Show me the current schema
- Find all tables with > 1M rows
- List indexes not used in 30 days"
// Direct query execution
"Using PostgreSQL MCP, run this query:
SELECT table_name,
       pg_size_pretty(pg_total_relation_size(table_name::regclass))
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY pg_total_relation_size(table_name::regclass) DESC"
// Performance analysis
"Analyze this slow query with PostgreSQL MCP:
- Show execution plan
- Identify bottlenecks
- Suggest optimizations"
Use proven patterns:
-- Polymorphic associations for flexibility
CREATE TABLE comments (
  id UUID PRIMARY KEY,
  commentable_type VARCHAR(50), -- 'task', 'project'
  commentable_id UUID,
  content TEXT
);

-- Partial indexes for each type (created separately; inline INDEX is not PostgreSQL syntax)
CREATE INDEX idx_task_comments ON comments (commentable_id)
  WHERE commentable_type = 'task';
-- Hierarchical data with ltree
CREATE EXTENSION ltree;

CREATE TABLE categories (
  id SERIAL PRIMARY KEY,
  path ltree NOT NULL,
  name VARCHAR(255)
);

CREATE INDEX idx_category_path ON categories USING gist(path);
Optimize complex queries:
-- Use CTEs for readability and performance
WITH user_projects AS (
  SELECT p.* FROM projects p
  JOIN organization_members om ON p.organization_id = om.organization_id
  WHERE om.user_id = $1 AND p.deleted_at IS NULL
),
recent_tasks AS (
  SELECT t.* FROM tasks t
  WHERE t.project_id IN (SELECT id FROM user_projects)
    AND t.updated_at > NOW() - INTERVAL '7 days'
  ORDER BY t.updated_at DESC
  LIMIT 10
)
SELECT * FROM recent_tasks;
Leverage JSONB wisely:
-- Settings with defaults and validation
CREATE TABLE tenant_settings (
  organization_id UUID PRIMARY KEY,
  settings JSONB NOT NULL DEFAULT '{
    "features": { "timeTracking": true, "customFields": false },
    "limits": { "maxProjects": 10, "maxUsersPerProject": 50 }
  }',
  CONSTRAINT valid_settings CHECK (
    settings ? 'features' AND settings ? 'limits'
  )
);
-- Query JSONB efficiently
CREATE INDEX idx_premium_tenants
  ON tenant_settings ((settings->'features'->>'customFields'))
  WHERE (settings->'features'->>'customFields')::boolean = true;
Problem: Loading related data in loops
Solution:
// Bad: N+1 queries
const projects = await db.query('SELECT * FROM projects');
for (const project of projects) {
  project.tasks = await db.query(
    'SELECT * FROM tasks WHERE project_id = $1',
    [project.id]
  );
}
// Good: Single query with join
const projectsWithTasks = await db.query(`
  SELECT
    p.*,
    COALESCE(
      json_agg(t.*) FILTER (WHERE t.id IS NOT NULL),
      '[]'
    ) as tasks
  FROM projects p
  LEFT JOIN tasks t ON t.project_id = p.id
  GROUP BY p.id
`);
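When the join gets unwieldy (several child collections per project, say), another common fix is one batched query per entity type, such as `SELECT * FROM tasks WHERE project_id = ANY($1)`, then grouping in memory. A sketch of the grouping step; the `Task`/`Project` shapes are trimmed to what the example needs:

```typescript
// Attach child rows from a single batched query to their parents.
// Two queries total, regardless of how many projects are on the page.
interface Task { id: string; project_id: string; }
interface Project { id: string; tasks?: Task[]; }

function attachTasks(projects: Project[], tasks: Task[]): Project[] {
  // Index tasks by parent id in one pass.
  const byProject = new Map<string, Task[]>();
  for (const t of tasks) {
    const list = byProject.get(t.project_id) ?? [];
    list.push(t);
    byProject.set(t.project_id, list);
  }
  // Parents with no children get an empty array, not undefined.
  return projects.map((p) => ({ ...p, tasks: byProject.get(p.id) ?? [] }));
}
```

This is the same batching idea DataLoader-style libraries automate, and it keeps each query simple enough for the planner to use the `idx_org_tasks`-style indexes directly.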
Problem: Slow queries on large tables
Solution:
-- Analyze query performance
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM tasks
WHERE organization_id = $1
  AND status = 'active'
ORDER BY created_at DESC;
-- Add appropriate index
CREATE INDEX CONCURRENTLY idx_active_tasks
ON tasks (organization_id, created_at DESC)
WHERE status = 'active' AND deleted_at IS NULL;
Problem: Deadlocks and slow updates
Solution:
-- Use advisory locks for complex operations
SELECT pg_advisory_lock(hashtext('project_' || $1));
-- Perform operations
SELECT pg_advisory_unlock(hashtext('project_' || $1));
-- Or use SKIP LOCKED for queues
UPDATE jobs SET status = 'processing'
WHERE id = (
  SELECT id FROM jobs
  WHERE status = 'pending'
  ORDER BY created_at
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING *;
Take your database design further:
Advanced Features
Scaling Strategies
Performance Tuning
Your database design is successful when:
Teams using these patterns achieve:
Before going to production:
You’ve mastered database design. Ready for more?
Graph Databases
Explore Neo4j for complex relationships
Time Series
Design for IoT and metrics data
Event Sourcing
Implement event-driven architecture
Continue to Frontend UI from Design →