You inherited a three-million-line monolith. The original architects left two years ago, the docs describe a system that no longer exists, and your first ticket touches a PaymentProcessor that twelve other services import. Loading the whole thing into an AI assistant just blows the context window and produces confident nonsense. This guide shows the workflows that actually scale: semantic search instead of grep, layered context instead of “read everything,” and incremental refactors instead of big-bang rewrites.
Semantic, not textual
With a vector index, “find all authentication flows” surfaces OAuth, JWT, and session code even when none of them share a keyword.
Dependency tracing
The AI follows imports and call sites across module boundaries far faster than you can click through “find usages.”
Characterization tests
For undocumented legacy code, the AI drafts tests that pin current behavior so you can refactor without fear.
You stay the architect
The AI does the mechanical scanning and boilerplate. You make the domain and architecture calls it cannot.
Text search fails at scale because related code rarely shares vocabulary. A semantic index built on vector embeddings fixes that. The maintained server is Zilliz Claude Context (@zilliz/claude-context-mcp — previously published as code-context). MCP setup is nearly identical across all three tools; only the registration command differs.
Add to ~/.cursor/mcp.json:

    {
      "mcpServers": {
        "claude-context": {
          "command": "npx",
          "args": ["-y", "@zilliz/claude-context-mcp@latest"],
          "env": {
            "EMBEDDING_PROVIDER": "OpenAI",
            "OPENAI_API_KEY": "your-api-key",
            "MILVUS_TOKEN": "your-zilliz-key"
          }
        }
      }
    }

    claude mcp add claude-context \
      -e OPENAI_API_KEY=your-api-key \
      -e MILVUS_TOKEN=your-zilliz-key \
      -- npx -y @zilliz/claude-context-mcp@latest

    codex mcp add claude-context \
      --env OPENAI_API_KEY=your-api-key \
      --env MILVUS_TOKEN=your-zilliz-key \
      -- npx -y @zilliz/claude-context-mcp@latest

Or add it directly to ~/.codex/config.toml:

    [mcp_servers.claude-context]
    command = "npx"
    args = ["-y", "@zilliz/claude-context-mcp@latest"]
    env = { EMBEDDING_PROVIDER = "OpenAI", OPENAI_API_KEY = "your-api-key", MILVUS_TOKEN = "your-zilliz-key" }

Once indexed, you ask for concepts and the server returns the relevant files regardless of naming. For sensitive codebases that can’t reach a cloud embedding API, LuotoCompany/cursor-local-indexing runs an on-premise ChromaDB index and exposes it over a local SSE endpoint:
Add to ~/.cursor/mcp.json:

    {
      "mcpServers": {
        "workspace-code-search": {
          "url": "http://localhost:8978/sse"
        }
      }
    }

    claude mcp add --transport sse workspace-code-search http://localhost:8978/sse

    codex mcp add workspace-code-search --url http://localhost:8978/sse

This keeps source code on your own infrastructure, which is the right call for financial services, healthcare, or defense work where code can’t leave the network.
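
To make “semantic, not textual” concrete, here is a toy sketch of what an embedding index does at query time: rank entries by cosine similarity to the query vector rather than by shared keywords. The file names and three-dimensional vectors are made up for illustration; a real embedding model produces vectors with hundreds or thousands of dimensions, and this is not the code either server above actually runs.

```javascript
// Toy illustration of embedding-based retrieval (all data is hypothetical).

function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical index entries: a file path plus a hand-written "embedding".
const index = [
  { file: 'auth/oauth_callback.js', vector: [0.9, 0.1, 0.0] },
  { file: 'auth/jwt_verify.js', vector: [0.8, 0.2, 0.1] },
  { file: 'billing/invoice_pdf.js', vector: [0.1, 0.1, 0.9] },
];

// Score every entry against the query and keep the topK best matches.
function search(queryVector, entries, topK) {
  return entries
    .map((e) => ({ file: e.file, score: cosineSimilarity(queryVector, e.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Pretend this vector is the embedding of "find all authentication flows".
const results = search([1.0, 0.0, 0.0], index, 2);
console.log(results.map((r) => r.file));
```

Neither auth file name contains the word “authentication,” yet both outrank the billing file because their vectors point in nearly the same direction as the query.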
Where do you start with an unfamiliar monolith? Top-down. Get the AI to build a mental model before you touch anything, then drill into the area your ticket actually concerns.
Cursor’s Agent self-gathers context from the indexed codebase — just describe what you want. Use @Folders to scope a question to one area and @Code to point at a specific snippet:

    @Folders services/auth
    Explain the authentication and authorization architecture: where tokens
    are issued, how refresh works, and which services validate them.

For a precise reference, select a function in the editor and add it with @Code before asking the agent to trace its call sites.
Scope the session to the directory you care about with the --add-dir flag at startup (or /add-dir <path> mid-session), then ask broad-to-narrow. Use @-path mentions to pull a specific file into context:

    Analyze this codebase and build a mental model of the system architecture.
    Cover: core business domains, service boundaries, data-flow patterns, and
    external dependencies. Present it as an overview for a new senior engineer.

Then drill in with semantic search via the MCP server:

    Using claude-context, find all payment-processing flows. I need entry
    points, state management during processing, external provider integration,
    and the retry/error-handling logic. Reference @services/payment as you go.

Drop an AGENTS.md at the repo root (Codex’s project-context file, the equivalent of CLAUDE.md or .cursor/rules) describing the domains and conventions, then run /init inside the TUI to have Codex bootstrap it. For a large refactor, work in a dedicated git worktree so the exploration never touches your main checkout:

    Map this codebase top-down: business domains, service boundaries, data
    flow, and external dependencies. Then locate the payment-processing flow
    and summarize its entry points and retry logic.

The biggest mistake with large codebases is loading everything at once. Your assistant doesn’t need all three million lines; it needs the right slice at the right moment. Think of it as zooming on a map: continent, country, city, street.
Domain level (10,000 ft)

    What are the main bounded contexts in this system, and how do the payment,
    user, and inventory domains interact?

Service level (1,000 ft)

    Within the payment domain, explain the service architecture and the main
    APIs each service exposes.

Component level (100 ft)

    Show me how PaymentProcessor handles credit-card transactions and what its
    retry strategy is for failed charges.

Implementation level (ground)

    In PaymentProcessor.processCard(), why is there a 30-second timeout, and is
    the synchronized block safe to remove?

Each tool has its own mechanism for scoping what the AI sees. The principle is identical: load narrow, expand only when the answer requires it.
Scope with @Folders and @Code, and encode standing rules as a Project Rule so you don’t repeat them every prompt:

    # In .cursor/rules/payment.mdc (a Project Rule with glob: services/payment/**)
    When working with payment code:
    - All monetary amounts are integer cents, never floats
    - Mutations require an idempotency key
    - Never log full card numbers (PCI)
    - Add audit logging for every state transition

For straightforward projects, an AGENTS.md at the repo root works as a simpler alternative to structured rules.
Use a hierarchy of CLAUDE.md files — each directory’s file layers onto its parents, giving the AI focused context as it moves through the tree:

    /CLAUDE.md                    # System-wide conventions
    /services/CLAUDE.md           # Service-layer patterns
    /services/payment/CLAUDE.md   # Payment-specific rules

Switch cleanly between unrelated tasks with /clear and /add-dir:

    /clear
    /add-dir services/payment
    Analyze the payment-processing flow.

    /clear
    /add-dir services/users
    Review the authentication implementation.

Codex reads AGENTS.md from the repo root and any subdirectory, so place focused instructions next to the code they govern:

    # In services/payment/AGENTS.md
    This service handles all payment processing.
    - Amounts are integer cents to avoid floating-point error
    - Idempotency keys required on all transactions
    - PCI: never log full card numbers

For parallel exploration, spin up a worktree per task so contexts stay isolated and your main branch is untouched.
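
If the layering of per-directory context files is hard to picture, the sketch below lists which files would apply to a given directory, outermost first. It only illustrates the ordering idea: `contextFilesFor` is a hypothetical helper, it does not check whether the files exist, and it is not how Claude Code or Codex resolve these files internally.

```javascript
const path = require('path');

// Returns the candidate context files for a target directory, from the
// repo root down to the directory itself, so the most specific file is last.
// `targetDir` is a POSIX-style path relative to the repo root, e.g. '/services/payment'.
function contextFilesFor(targetDir, fileName = 'CLAUDE.md') {
  const parts = targetDir.split('/').filter(Boolean);
  const files = [path.posix.join('/', fileName)];
  let current = '/';
  for (const part of parts) {
    current = path.posix.join(current, part);
    files.push(path.posix.join(current, fileName));
  }
  return files;
}

console.log(contextFilesFor('/services/payment'));
console.log(contextFilesFor('/services/payment', 'AGENTS.md'));
```

Reading the list top to bottom mirrors the zoom levels above: system-wide conventions first, payment-specific rules last.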
Refactoring a million-line codebase is like renovating a hospital while surgery continues — you can’t shut everything down. The pattern that works: discover, template, migrate in small batches, verify.
Take a Node.js codebase still riddled with error-first callbacks. Manual migration to async/await would take months. Instead, have the AI categorize the work by risk, then generate one reusable transformation per category:

    // Before: error-first callback
    function loadUser(id, callback) {
      db.query('SELECT * FROM users WHERE id = ?', [id], (err, rows) => {
        if (err) return callback(err);
        callback(null, rows[0]);
      });
    }

    // After: async, with a backward-compatible callback shim.
    // Assumes db.query returns a promise when called without a callback.
    async function loadUser(id, callback) {
      let user;
      try {
        const rows = await db.query('SELECT * FROM users WHERE id = ?', [id]);
        user = rows[0];
      } catch (err) {
        if (callback) return callback(err);
        throw err;
      }
      // Invoke the callback outside the try block so an exception thrown by
      // the callback itself is not caught and reported to it a second time.
      if (callback) return callback(null, user);
      return user;
    }

The shim lets callers migrate on their own schedule. Apply the transformation directory by directory, run the existing tests after each batch, and track progress; never transform the whole tree in one pass.
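
The batch-by-batch discipline can be sketched as a small driver loop. This is a minimal illustration under stated assumptions: `migrateInBatches`, `transform`, and `runTests` are hypothetical names, and the hooks are stubs standing in for your real codemod and test runner.

```javascript
// Sketch of a batch migration loop. `transform` and `runTests` are
// hypothetical hooks you supply (for example a codemod and a test command).
async function migrateInBatches(directories, transform, runTests) {
  const progress = { migrated: [], failed: [] };
  for (const dir of directories) {
    await transform(dir);
    const passed = await runTests(dir);
    if (passed) {
      progress.migrated.push(dir);
    } else {
      // Stop at the first failing batch so the breakage stays small.
      progress.failed.push(dir);
      break;
    }
  }
  return progress;
}

// Demo with stub hooks: the second directory "fails" its tests.
migrateInBatches(
  ['services/user', 'services/payment', 'services/shared'],
  async (dir) => { /* apply the transformation to dir */ },
  async (dir) => dir !== 'services/payment'
).then((progress) => console.log(progress));
```

Stopping at the first failure keeps the diff you have to debug limited to one directory, and the returned object doubles as the progress record.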
For a large effort split across a team, have the AI partition the work to minimize cross-team conflicts, then keep the branches honest:
Partition by dependency boundaries

    Analyze module dependencies and propose how to split this refactor across
    four developers so their territories barely overlap. Flag any shared files
    that two teams would both need to edit.

Branch per territory

    git checkout -b refactor/user-services
    git checkout -b refactor/payment-services
    git checkout -b refactor/shared-utils

Detect collisions early

    Review the diffs across all refactor/* branches and identify conflicting
    or breaking changes between teams before we attempt to merge.

Every large codebase has archaeological layers: code from different eras and philosophies, some of it predating the team. The classic horror: a 15,000-line stored procedure no one understands that still processes real money daily.
The strangler-fig pattern lets you modernize without a rewrite: wrap the legacy code behind a clean interface, then extract pieces one at a time while running old and new in parallel until you trust the new path.
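
As a rough sketch of the parallel-run phase, the facade below calls both implementations, reports any disagreement, and keeps returning the legacy answer until you decide to switch. All names here (`stranglerFacade`, `onMismatch`, the tax functions) are hypothetical, and the approach assumes both paths are side-effect free; a path that writes to a database or charges a card needs a more careful shadowing strategy.

```javascript
// Strangler-fig facade sketch. `legacyImpl` and `newImpl` are hypothetical
// functions with the same signature; `onMismatch` is a hypothetical hook.
function stranglerFacade(legacyImpl, newImpl, onMismatch) {
  return function facade(input) {
    const legacyResult = legacyImpl(input);
    try {
      const newResult = newImpl(input);
      // JSON comparison is a simplification: it is sensitive to key order.
      if (JSON.stringify(newResult) !== JSON.stringify(legacyResult)) {
        onMismatch({ input, legacyResult, newResult });
      }
    } catch (error) {
      // A crash in the new path is reported but never reaches the caller.
      onMismatch({ input, legacyResult, error });
    }
    // Callers keep getting the legacy answer until the new path is trusted.
    return legacyResult;
  };
}

const mismatches = [];
const calculateTax = stranglerFacade(
  (order) => ({ tax: Math.round(order.totalCents / 5) }), // legacy behavior
  (order) => ({ tax: Math.floor(order.totalCents / 5) }), // candidate rewrite
  (report) => mismatches.push(report)
);

console.log(calculateTax({ totalCents: 100 })); // both paths agree
console.log(calculateTax({ totalCents: 104 })); // paths disagree on rounding
console.log(mismatches.length);
```

Once the mismatch log stays empty under real traffic for long enough, flipping the facade to return the new result is a one-line change.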
When documentation doesn’t exist, tests become the documentation. Ask the AI to write characterization tests that pin current behavior — including the weird parts — so any future change that alters output fails loudly:

    // processOrder is assumed to be imported from the legacy module under test.
    describe('Legacy OrderProcessor: current behavior', () => {
      it('returns status code 1 on a standard single-item order', async () => {
        const result = await processOrder({
          customerId: 123,
          items: [{ sku: 'WIDGET-1', quantity: 1 }],
        });
        expect(result.status).toBe(1); // 1 = success (undocumented magic number)
        expect(result.orderId).toMatch(/^ORD-\d{8}$/);
      });

      it('returns -99 when inventory is unavailable', async () => {
        const result = await processOrder({
          customerId: 123,
          items: [{ sku: 'OUT-OF-STOCK', quantity: 1 }],
        });
        expect(result.status).toBe(-99); // -99 = inventory error
      });
    });

In a million-line codebase, different teams own different territory. The hard part is making a change that crosses a boundary without breaking someone else’s code. Before any breaking change, get an impact report.
Pair this with auto-generated contracts. Ask the AI to produce an OpenAPI spec and event schemas for a service another team consumes — that turns “go read our code” into a stable boundary they can integrate against without spelunking through your internals.
Large-codebase AI workflows fail in specific, recognizable ways. Know the recovery for each.