
Million+ LOC Strategies

You inherited a three-million-line monolith. The original architects left two years ago, the docs describe a system that no longer exists, and your first ticket touches a PaymentProcessor that twelve other services import. Loading the whole thing into an AI assistant just blows the context window and produces confident nonsense. This guide shows the workflows that actually scale: semantic search instead of grep, layered context instead of “read everything,” and incremental refactors instead of big-bang rewrites.

  • A semantic-search MCP setup (Zilliz Claude Context) wired into Cursor, Claude Code, and Codex so the AI finds code by intent, not string match
  • A reusable architecture reconnaissance prompt for mapping an unfamiliar codebase top-down
  • A context hierarchy technique that keeps the AI focused on the right 20% instead of choking on the whole repo
  • A copy-paste dependency-impact prompt for planning breaking changes without surprising other teams
  • A strangler-fig prompt for wrapping and decomposing legacy code safely
  • Concrete recovery steps for when the index goes stale, the AI hallucinates dependency counts, or a parallel refactor collides

Semantic, not textual

With a vector index, “find all authentication flows” surfaces OAuth, JWT, and session code even when none of them share a keyword.

Dependency tracing

The AI follows imports and call sites across module boundaries far faster than you can click through “find usages.”

Characterization tests

For undocumented legacy code, the AI drafts tests that pin current behavior so you can refactor without fear.

You stay the architect

The AI does the mechanical scanning and boilerplate. You make the domain and architecture calls it cannot.

Semantic Code Search With Zilliz Claude Context


Text search fails at scale because related code rarely shares vocabulary. A semantic index built on vector embeddings fixes that. The maintained server is Zilliz Claude Context (@zilliz/claude-context-mcp, previously published as code-context). MCP setup is nearly identical across Cursor, Claude Code, and Codex; only the registration command differs.

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["-y", "@zilliz/claude-context-mcp@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "OpenAI",
        "OPENAI_API_KEY": "your-api-key",
        "MILVUS_TOKEN": "your-zilliz-key"
      }
    }
  }
}

Once indexed, you ask for concepts and the server returns the relevant files regardless of naming. For sensitive codebases that can’t reach a cloud embedding API, LuotoCompany/cursor-local-indexing runs an on-premise ChromaDB index and exposes it over a local SSE endpoint:

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "workspace-code-search": {
      "url": "http://localhost:8978/sse"
    }
  }
}

This keeps source code on your own infrastructure — the right call for financial services, healthcare, or defense work where code can’t leave the network.
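To make "semantic, not textual" concrete, here is a toy sketch of what a vector index does at query time: it ranks entries by how close their embedding is to the query's embedding. The three-dimensional vectors and file paths below are hand-made stand-ins for illustration only; a real index gets its vectors from an embedding model, and this is not how Claude Context is implemented internally.

```javascript
// Toy illustration of ranking code by embedding similarity.
// The vectors are invented; a real index computes them with an embedding model.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every entry against the query and sorts best match first.
function rankBySimilarity(queryVector, entries) {
  return entries
    .map((entry) => ({
      file: entry.file,
      score: cosineSimilarity(queryVector, entry.vector),
    }))
    .sort((x, y) => y.score - x.score);
}

// Hypothetical index: dimensions loosely mean [auth, payments, ui].
const index = [
  { file: 'services/auth/oauthCallback.js', vector: [0.9, 0.1, 0.0] },
  { file: 'services/auth/jwtVerifier.js', vector: [0.8, 0.0, 0.1] },
  { file: 'services/billing/invoice.js', vector: [0.1, 0.9, 0.0] },
  { file: 'web/components/Button.js', vector: [0.0, 0.1, 0.9] },
];

// Stand-in embedding for the query "find all authentication flows".
const query = [1.0, 0.0, 0.0];
const ranked = rankBySimilarity(query, index);
console.log(ranked.slice(0, 2).map((r) => r.file));
```

Neither top result needs to contain the word "authentication"; they rank first because their vectors point in the same direction as the query.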

Where do you start with an unfamiliar monolith? Top-down. Get the AI to build a mental model before you touch anything, then drill into the area your ticket actually concerns.

Cursor’s Agent self-gathers context from the indexed codebase — just describe what you want. Use @Folders to scope a question to one area and @Code to point at a specific snippet:

@Folders services/auth
Explain the authentication and authorization architecture: where tokens
are issued, how refresh works, and which services validate them.

For a precise reference, select a function in the editor and add it with @Code before asking the agent to trace its call sites.

The biggest mistake with large codebases is loading everything at once. Your assistant doesn’t need all three million lines — it needs the right slice at the right moment. Think of it as zooming on a map: continent, country, city, street.

  1. Domain level (10,000 ft)

    What are the main bounded contexts in this system, and how do the payment,
    user, and inventory domains interact?
  2. Service level (1,000 ft)

    Within the payment domain, explain the service architecture and the main
    APIs each service exposes.
  3. Component level (100 ft)

    Show me how PaymentProcessor handles credit-card transactions and what its
    retry strategy is for failed charges.
  4. Implementation level (ground)

    In PaymentProcessor.processCard(), why is there a 30-second timeout, and is
    the synchronized block safe to remove?

Each tool has its own mechanism for scoping what the AI sees. The principle is identical: load narrow, expand only when the answer requires it.
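One way to picture "load narrow, expand only when needed" is as a budgeted selection: start from the slice closest to your question and add broader layers until a token budget is spent. The sketch below is an illustration under assumptions, not how any of these tools work internally: `selectContext` and the slice labels are hypothetical, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer.

```javascript
// Rough token estimate: about four characters per token.
// This is a heuristic assumption, not the tokenizer of any particular model.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// slices must be ordered from narrowest (most relevant) to broadest.
function selectContext(slices, tokenBudget) {
  const selected = [];
  let used = 0;
  for (const slice of slices) {
    const cost = estimateTokens(slice.content);
    if (used + cost > tokenBudget) break; // stop expanding once the budget is hit
    selected.push(slice.label);
    used += cost;
  }
  return { selected, used };
}

// Placeholder content; only the sizes matter for this demo.
const slices = [
  { label: 'implementation: PaymentProcessor.processCard()', content: 'x'.repeat(400) },
  { label: 'component: PaymentProcessor', content: 'x'.repeat(2000) },
  { label: 'service: payment domain APIs', content: 'x'.repeat(8000) },
  { label: 'domain: bounded contexts overview', content: 'x'.repeat(40000) },
];

const picked = selectContext(slices, 1000);
console.log(picked);
```

With a 1,000-token budget only the two narrowest slices fit. Raising the budget is the explicit "zoom out" step, taken only when the answer requires it.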

Scope with @Folders and @Code, and encode standing rules as a Project Rule so you don’t have to repeat them in every prompt:

# In .cursor/rules/payment.mdc (a Project Rule with globs: services/payment/**)
When working with payment code:
- All monetary amounts are integer cents — never floats
- Mutations require an idempotency key
- Never log full card numbers (PCI)
- Add audit logging for every state transition

For straightforward projects, an AGENTS.md at the repo root works as a simpler alternative to structured rules.

Refactoring a million-line codebase is like renovating a hospital while surgery continues — you can’t shut everything down. The pattern that works: discover, template, migrate in small batches, verify.

Take a Node.js codebase still riddled with error-first callbacks. Manual migration to async/await would take months. Instead, have the AI categorize the work by risk, then generate one reusable transformation per category:

// Before: error-first callback
function loadUser(id, callback) {
  db.query('SELECT * FROM users WHERE id = ?', [id], (err, rows) => {
    if (err) return callback(err);
    callback(null, rows[0]);
  });
}

// After: async, with a backward-compatible callback shim.
// Assumes db.query returns a promise when called without a callback.
async function loadUser(id, callback) {
  let user;
  try {
    const rows = await db.query('SELECT * FROM users WHERE id = ?', [id]);
    user = rows[0];
  } catch (err) {
    if (callback) return callback(err);
    throw err;
  }
  // Call the callback outside the try block so an exception thrown by the
  // callback itself is not caught and passed back to it a second time.
  if (callback) return callback(null, user);
  return user;
}

The shim lets callers migrate on their own schedule. Apply the transformation directory by directory, run the existing tests after each batch, and track progress — never transform the whole tree in one pass.
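If many functions in a category need the same shim, the "one reusable transformation" can be a small wrapper rather than hand-edited code. The sketch below is one possible shape, not a prescribed implementation: `withCallbackShim` and `findUser` are hypothetical names, and the wrapper assumes the wrapped function never takes a function as its own last argument.

```javascript
// Wraps an async function so it serves both promise-style and callback-style callers.
// withCallbackShim is a hypothetical helper name, not part of Node.js.
function withCallbackShim(asyncFn) {
  return function shimmed(...args) {
    const maybeCallback = args[args.length - 1];
    if (typeof maybeCallback !== 'function') {
      return asyncFn.apply(this, args); // promise-style caller
    }
    const callback = args.pop();
    // Two-argument then(): an exception thrown inside the success callback is
    // not routed to the error branch, so the callback never fires twice.
    asyncFn.apply(this, args).then(
      (value) => callback(null, value),
      (err) => callback(err)
    );
    return undefined;
  };
}

// Stand-in for a promise-returning data-access function.
async function findUser(id) {
  if (id < 0) throw new Error('invalid id');
  return { id, name: `user-${id}` };
}

const loadUser = withCallbackShim(findUser);

// New callers await the promise...
const viaPromise = loadUser(7);

// ...while unmigrated callers keep passing an error-first callback.
const viaCallback = new Promise((resolve) => {
  loadUser(-1, (err, user) => resolve({ err, user }));
});
```

Node's built-in `util.callbackify` covers the callback-only direction; the wrapper here differs in that a single exported function accepts both calling styles during the migration window.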

For a large effort split across a team, have the AI partition the work to minimize cross-team conflicts, then keep the branches honest:

  1. Partition by dependency boundaries

    Analyze module dependencies and propose how to split this refactor across
    four developers so their territories barely overlap. Flag any shared files
    that two teams would both need to edit.
  2. Branch per territory

    git checkout -b refactor/user-services
    git checkout -b refactor/payment-services
    git checkout -b refactor/shared-utils
  3. Detect collisions early

    Review the diffs across all refactor/* branches and identify conflicting
    or breaking changes between teams before we attempt to merge.
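A cheap first pass at collision detection does not need the AI at all: list the files each branch changed and flag any file that appears under more than one branch. The sketch below assumes you have already collected those lists (for example with `git diff --name-only main...refactor/user-services`, where `main` as the base branch is an assumption); the branch contents shown are invented for the demo.

```javascript
// Given the files each branch changed, report files touched by more than one branch.
function findCollisions(changedFilesByBranch) {
  const branchesByFile = new Map();
  for (const [branch, files] of Object.entries(changedFilesByBranch)) {
    for (const file of new Set(files)) { // Set guards against duplicate entries
      if (!branchesByFile.has(file)) branchesByFile.set(file, []);
      branchesByFile.get(file).push(branch);
    }
  }
  return [...branchesByFile]
    .filter(([, branches]) => branches.length > 1)
    .map(([file, branches]) => ({ file, branches }));
}

// Hypothetical changed-file lists, one per refactor branch.
const changed = {
  'refactor/user-services': ['services/user/profile.js', 'lib/http/client.js'],
  'refactor/payment-services': ['services/payment/charge.js', 'lib/http/client.js'],
  'refactor/shared-utils': ['lib/format/date.js'],
};

const collisions = findCollisions(changed);
console.log(collisions);
```

This only catches two branches editing the same file. Semantic breaks, such as one branch changing a signature that another branch calls, still need the diff review described in step 3.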

Every large codebase has archaeological layers — code from different eras and philosophies, some of it predating the team. The classic horror: a 15,000-line stored procedure no one understands that still processes real money daily.

The strangler-fig pattern lets you modernize without a rewrite: wrap the legacy code behind a clean interface, then extract pieces one at a time while running old and new in parallel until you trust the new path.
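A minimal sketch of the "run old and new in parallel" step, assuming the operation is safe to execute twice (read-only or idempotent). `createFacade`, `trustModern`, and the pricing functions are hypothetical names for illustration. Legacy stays the source of truth until you flip the flag, and every disagreement is reported instead of surfacing to callers.

```javascript
// Strangler-fig facade: callers use one function while the implementation
// behind it is swapped gradually. All names here are hypothetical.
function createFacade({ legacy, modern, trustModern = false, onMismatch = () => {} }) {
  return async function facade(input) {
    if (trustModern) return modern(input); // cut-over: legacy no longer runs

    const legacyResult = await legacy(input);
    try {
      const modernResult = await modern(input);
      // JSON comparison is a simplification; it is sensitive to key order.
      if (JSON.stringify(modernResult) !== JSON.stringify(legacyResult)) {
        onMismatch({ input, legacyResult, modernResult });
      }
    } catch (error) {
      onMismatch({ input, legacyResult, error }); // new path failed outright
    }
    return legacyResult; // legacy remains the source of truth in shadow mode
  };
}

// Demo implementations with a deliberate discrepancy on bulk orders.
const legacyTotal = (order) => ({ totalCents: order.quantity * order.unitCents });
const modernTotal = async (order) => ({
  totalCents: order.quantity * order.unitCents - (order.quantity >= 10 ? 100 : 0),
});

const mismatches = [];
const computeTotal = createFacade({
  legacy: legacyTotal,
  modern: modernTotal,
  onMismatch: (m) => mismatches.push(m),
});

const demo = Promise.all([
  computeTotal({ quantity: 2, unitCents: 500 }),
  computeTotal({ quantity: 10, unitCents: 500 }),
]);
```

Do not shadow-run operations with real side effects, such as charging a card, without a dry-run mode on the new path; otherwise the comparison itself doubles the effect.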

When documentation doesn’t exist, tests become the documentation. Ask the AI to write characterization tests that pin current behavior — including the weird parts — so any future change that alters output fails loudly:

// processOrder is the legacy entry point under test; import it from wherever
// it lives in your repo.
describe('Legacy OrderProcessor: current behavior', () => {
  it('returns status code 1 on a standard single-item order', async () => {
    const result = await processOrder({
      customerId: 123,
      items: [{ sku: 'WIDGET-1', quantity: 1 }],
    });
    expect(result.status).toBe(1); // 1 = success (undocumented magic number)
    expect(result.orderId).toMatch(/^ORD-\d{8}$/);
  });

  it('returns -99 when inventory is unavailable', async () => {
    const result = await processOrder({
      customerId: 123,
      items: [{ sku: 'OUT-OF-STOCK', quantity: 1 }],
    });
    expect(result.status).toBe(-99); // -99 = inventory error
  });
});

In a million-line codebase, different teams own different territory. The hard part is making a change that crosses a boundary without breaking someone else. Before any breaking change, get an impact report.
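The core of an impact report is a reverse-dependency walk: invert the import graph, then collect everything that reaches the module you plan to change, directly or transitively. The sketch below assumes you already have an import map (from a bundler, a static-analysis tool, or the AI's own scan); the module names and graph are invented for illustration.

```javascript
// Builds a reverse index (module -> modules that import it) and walks it
// breadth-first to list everything that depends on the target.
function findDependents(importsByModule, target) {
  const importedBy = new Map();
  for (const [moduleName, imports] of Object.entries(importsByModule)) {
    for (const dep of imports) {
      if (!importedBy.has(dep)) importedBy.set(dep, []);
      importedBy.get(dep).push(moduleName);
    }
  }

  const depthByModule = new Map();
  const queue = [[target, 0]];
  while (queue.length > 0) {
    const [current, depth] = queue.shift();
    for (const dependent of importedBy.get(current) ?? []) {
      // Skip the target itself and anything already visited (cycle-safe).
      if (dependent === target || depthByModule.has(dependent)) continue;
      depthByModule.set(dependent, depth + 1);
      queue.push([dependent, depth + 1]);
    }
  }
  return [...depthByModule].map(([moduleName, depth]) => ({ module: moduleName, depth }));
}

// Hypothetical import graph: each key lists the modules it imports.
const importsByModule = {
  PaymentProcessor: ['LedgerClient'],
  CheckoutService: ['PaymentProcessor', 'CartService'],
  RefundService: ['PaymentProcessor'],
  OrderApi: ['CheckoutService'],
  CartService: [],
  LedgerClient: [],
};

const impact = findDependents(importsByModule, 'PaymentProcessor');
console.log(impact);
```

Depth 1 entries are the teams to talk to first; deeper entries inherit the change indirectly. Checking the AI's claimed dependency counts against a mechanical walk like this is a quick way to catch hallucinated numbers.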

Pair this with auto-generated contracts. Ask the AI to produce an OpenAPI spec and event schemas for a service another team consumes — that turns “go read our code” into a stable boundary they can integrate against without spelunking through your internals.

Large-codebase AI workflows fail in specific, recognizable ways. Know the recovery for each.