Million+ LOC Strategies

You inherited a three-million-line monolith. The original architects left two years ago, the docs describe a system that no longer exists, and your first ticket touches a PaymentProcessor that twelve other services import. Loading the whole thing into an AI assistant just blows the context window and produces confident nonsense. This guide shows the workflows that actually scale: semantic search instead of grep, layered context instead of “read everything,” and incremental refactors instead of big-bang rewrites.

What You’ll Walk Away With

A semantic-search MCP setup (Zilliz Claude Context) wired into Cursor, Claude Code, and Codex so the AI finds code by intent, not string match
A reusable architecture reconnaissance prompt for mapping an unfamiliar codebase top-down
A context hierarchy technique that keeps the AI focused on the right 20% instead of choking on the whole repo
A copy-paste dependency-impact prompt for planning breaking changes without surprising other teams
A strangler-fig prompt for wrapping and decomposing legacy code safely
Concrete recovery steps for when the index goes stale, the AI hallucinates dependency counts, or a parallel refactor collides

Why AI Helps at This Scale

Semantic, not textual

With a vector index, “find all authentication flows” surfaces OAuth, JWT, and session code even when none of them share a keyword.

Dependency tracing

The AI follows imports and call sites across module boundaries far faster than you can click through “find usages.”

Characterization tests

For undocumented legacy code, the AI drafts tests that pin current behavior so you can refactor without fear.

You stay the architect

The AI does the mechanical scanning and boilerplate. You make the domain and architecture calls it cannot.

Semantic Code Search With Zilliz Claude Context

Text search fails at scale because related code rarely shares vocabulary. A semantic index built on vector embeddings fixes that. The maintained server is Zilliz Claude Context (@zilliz/claude-context-mcp — previously published as code-context). MCP setup is nearly identical across all three tools; only the registration command differs.

For Cursor, add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "claude-context": {
      "command": "npx",
      "args": ["-y", "@zilliz/claude-context-mcp@latest"],
      "env": {
        "EMBEDDING_PROVIDER": "OpenAI",
        "OPENAI_API_KEY": "your-api-key",
        "MILVUS_TOKEN": "your-zilliz-key"
      }
    }
  }
}

For Claude Code, register the server from the shell:

claude mcp add claude-context \
  -e OPENAI_API_KEY=your-api-key \
  -e MILVUS_TOKEN=your-zilliz-key \
  -- npx -y @zilliz/claude-context-mcp@latest

For Codex, use the equivalent command:

codex mcp add claude-context \
  --env OPENAI_API_KEY=your-api-key \
  --env MILVUS_TOKEN=your-zilliz-key \
  -- npx -y @zilliz/claude-context-mcp@latest

Or add it directly to ~/.codex/config.toml:

[mcp_servers.claude-context]
command = "npx"
args = ["-y", "@zilliz/claude-context-mcp@latest"]
env = { EMBEDDING_PROVIDER = "OpenAI", OPENAI_API_KEY = "your-api-key", MILVUS_TOKEN = "your-zilliz-key" }

Once indexed, you ask for concepts and the server returns the relevant files regardless of naming. For sensitive codebases that can’t reach a cloud embedding API, LuotoCompany/cursor-local-indexing runs an on-premise ChromaDB index and exposes it over a local SSE endpoint:

For Cursor, add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "workspace-code-search": {
      "url": "http://localhost:8978/sse"
    }
  }
}

For Claude Code:

claude mcp add --transport sse workspace-code-search http://localhost:8978/sse

For Codex:

codex mcp add workspace-code-search --url http://localhost:8978/sse

This keeps source code on your own infrastructure — the right call for financial services, healthcare, or defense work where code can’t leave the network.
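Whichever index you run, the ranking idea is the same: code and queries become vectors, and results are ordered by how closely the vectors point in the same direction. A minimal sketch of that idea, assuming cosine similarity as the metric; the three-dimensional vectors and file paths are invented for illustration, and a real index stores model-generated embeddings with hundreds of dimensions:

```javascript
// Toy illustration of embedding-based ranking. The vectors below are made up;
// a real semantic index gets them from an embedding model.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every document against the query and sort best match first.
function rankBySimilarity(queryVector, documents) {
  return documents
    .map((doc) => ({ path: doc.path, score: cosineSimilarity(queryVector, doc.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Hypothetical files: two auth-related, one billing-related.
const documents = [
  { path: 'services/auth/oauth.js', vector: [0.9, 0.1, 0.0] },
  { path: 'services/auth/session.js', vector: [0.8, 0.2, 0.1] },
  { path: 'services/billing/invoice.js', vector: [0.1, 0.1, 0.9] },
];

// Pretend this is the embedding of "find all authentication flows".
const authQuery = [1.0, 0.0, 0.0];
const ranked = rankBySimilarity(authQuery, documents);
console.log(ranked.map((r) => r.path));
```

The two auth files rank above the billing file even though nothing in the comparison looks at file names or keywords, which is the property that makes this kind of search useful on a codebase with inconsistent naming.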

Architecture Reconnaissance

Where do you start with an unfamiliar monolith? Top-down. Get the AI to build a mental model before you touch anything, then drill into the area your ticket actually concerns.

Cursor’s Agent gathers context from the indexed codebase on its own, so you can simply describe what you want. Use @Folders to scope a question to one area and @Code to point at a specific snippet:

@Folders services/auth
Explain the authentication and authorization architecture: where tokens
are issued, how refresh works, and which services validate them.

For a precise reference, select a function in the editor and add it with @Code before asking the agent to trace its call sites.

In Claude Code, scope the session to the directory you care about with the --add-dir flag at startup (or /add-dir <path> mid-session), then ask broad-to-narrow. Use @-path mentions to pull a specific file into context:

Analyze this codebase and build a mental model of the system architecture.
Cover: core business domains, service boundaries, data-flow patterns, and
external dependencies. Present it as an overview for a new senior engineer.

Then drill in with semantic search via the MCP server:

Using claude-context, find all payment-processing flows. I need entry
points, state management during processing, external provider integration,
and the retry/error-handling logic. Reference @services/payment as you go.

With Codex, keep an AGENTS.md at the repo root (Codex’s project-context file, the equivalent of CLAUDE.md or .cursor/rules) that describes the domains and conventions; if none exists yet, run /init inside the TUI to have Codex bootstrap one. For a large refactor, work in a dedicated git worktree so the exploration never touches your main checkout. Then ask:

Map this codebase top-down: business domains, service boundaries, data
flow, and external dependencies. Then locate the payment-processing flow
and summarize its entry points and retry logic.

Copy-paste prompt — architecture recon for an unfamiliar codebase:

Analyze this codebase and create a mental model of the system architecture.
Focus on:
1. Core business domains and bounded contexts
2. Service boundaries and how they communicate (HTTP, gRPC, events)
3. Data-flow patterns and where state lives
4. External dependencies and integration points

Present it as a high-level overview suitable for a senior engineer joining
the team today. Flag anything that looks like an architectural smell.

Context Management at Scale

The biggest mistake with large codebases is loading everything at once. Your assistant doesn’t need all three million lines; it needs the right slice at the right moment. Think of it as zooming in on a map: continent, country, city, street.

Domain level (10,000 ft)

What are the main bounded contexts in this system, and how do the payment,
user, and inventory domains interact?

Service level (1,000 ft)

Within the payment domain, explain the service architecture and the main
APIs each service exposes.

Component level (100 ft)

Show me how PaymentProcessor handles credit-card transactions and what its
retry strategy is for failed charges.

Implementation level (ground)

In PaymentProcessor.processCard(), why is there a 30-second timeout, and is
the synchronized block safe to remove?

Keeping Context Focused

Each tool has its own mechanism for scoping what the AI sees. The principle is identical: load narrow, expand only when the answer requires it.

In Cursor, scope with @Folders and @Code, and encode standing rules as a Project Rule so you don’t repeat them in every prompt:

# In .cursor/rules/payment.mdc  (a Project Rule with globs: services/payment/**)
When working with payment code:
- All monetary amounts are integer cents — never floats
- Mutations require an idempotency key
- Never log full card numbers (PCI)
- Add audit logging for every state transition

For straightforward projects, an AGENTS.md at the repo root works as a simpler alternative to structured rules.

In Claude Code, use a hierarchy of CLAUDE.md files. Each directory’s file layers onto those of its parent directories, giving the AI focused context as it moves through the tree:

/CLAUDE.md                        # System-wide conventions
/services/CLAUDE.md               # Service-layer patterns
/services/payment/CLAUDE.md       # Payment-specific rules
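To see which files would apply when working in a given directory, the layering can be sketched as a walk from the repo root down to that directory. This is only an illustration of the hierarchy, not how any tool actually resolves its context files; the function name and the root-first ordering are assumptions:

```javascript
// List the context files that would layer for a directory, root first.
// Hypothetical helper: it only builds candidate paths and does not check
// whether each file exists on disk.
function contextFilesFor(directory, fileName = 'CLAUDE.md') {
  const segments = directory.split('/').filter((s) => s !== '' && s !== '.');
  const files = [fileName];
  let current = '';
  for (const segment of segments) {
    current = current ? `${current}/${segment}` : segment;
    files.push(`${current}/${fileName}`);
  }
  return files;
}

console.log(contextFilesFor('services/payment'));
// [ 'CLAUDE.md', 'services/CLAUDE.md', 'services/payment/CLAUDE.md' ]
```

Reading the list top to bottom mirrors the intent of the hierarchy: broad conventions first, the most specific rules last.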

Switch cleanly between unrelated tasks with /clear and /add-dir:

/clear
/add-dir services/payment
Analyze the payment-processing flow.

/clear
/add-dir services/users
Review the authentication implementation.

Codex reads AGENTS.md from the repo root and any subdirectory, so place focused instructions next to the code they govern:

# In services/payment/AGENTS.md
This service handles all payment processing.
- Amounts are integer cents to avoid floating-point error
- Idempotency keys required on all transactions
- PCI: never log full card numbers

For parallel exploration, spin up a worktree per task so contexts stay isolated and your main branch is untouched.

Copy-paste prompt — targeted dependency graph (avoids loading the whole repo):

Build a dependency graph for the UserService module only:
1. What services does it depend on, and why?
2. What services depend on it?
3. Are there any circular dependencies?
4. Which dependencies look tightly coupled and could be replaced with an
   interface or an event?

Do not scan unrelated modules. Cite the file and line for each dependency.
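Once the AI reports a dependency list, the circular-dependency claim is cheap to check yourself. A minimal sketch using depth-first search, assuming you have transcribed the reported edges into a plain object; the module names below are hypothetical:

```javascript
// Find one circular dependency in a module graph using depth-first search.
// `graph` maps a module name to the modules it imports.
function findCycle(graph) {
  const VISITING = 1;
  const DONE = 2;
  const state = new Map();
  const stack = [];

  function visit(node) {
    state.set(node, VISITING);
    stack.push(node);
    for (const dep of graph[node] ?? []) {
      if (state.get(dep) === VISITING) {
        // Back edge: the cycle is the stack from `dep` onward, closed by `dep`.
        return [...stack.slice(stack.indexOf(dep)), dep];
      }
      if (!state.has(dep)) {
        const cycle = visit(dep);
        if (cycle) return cycle;
      }
    }
    stack.pop();
    state.set(node, DONE);
    return null;
  }

  for (const node of Object.keys(graph)) {
    if (!state.has(node)) {
      const cycle = visit(node);
      if (cycle) return cycle;
    }
  }
  return null; // no cycle found
}

// Hypothetical edges, as they might be transcribed from the AI's report.
const graph = {
  UserService: ['AuthService', 'AuditLog'],
  AuthService: ['TokenStore'],
  TokenStore: ['UserService'],
  AuditLog: [],
};
console.log(findCycle(graph));
// [ 'UserService', 'AuthService', 'TokenStore', 'UserService' ]
```

The function returns the first cycle it encounters rather than all of them, and it recurses once per module on the current path, so a very deep graph would need an iterative version.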

Incremental Refactoring

Refactoring a million-line codebase is like renovating a hospital while surgery continues — you can’t shut everything down. The pattern that works: discover, template, migrate in small batches, verify.

Take a Node.js codebase still riddled with error-first callbacks. Manual migration to async/await would take months. Instead, have the AI categorize the work by risk, then generate one reusable transformation per category:

// Before — error-first callback
function loadUser(id, callback) {
  db.query('SELECT * FROM users WHERE id = ?', [id], (err, rows) => {
    if (err) return callback(err);
    callback(null, rows[0]);
  });
}

// After — async, with a backward-compatible callback shim
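// Assumes db.query returns a promise of rows when called without a callback.
async function loadUser(id, callback) {
  let user;
  try {
    const rows = await db.query('SELECT * FROM users WHERE id = ?', [id]);
    user = rows[0];
  } catch (err) {
    if (callback) return callback(err);
    throw err;
  }
  // Call back outside the try block: if the callback itself throws, the
  // catch above must not invoke it a second time with that error.
  if (callback) return callback(null, user);
  return user;
}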

The shim lets callers migrate on their own schedule. Apply the transformation directory by directory, run the existing tests after each batch, and track progress — never transform the whole tree in one pass.
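If many functions need the same treatment, the shim can be factored into one wrapper instead of being repeated in every function body. A minimal sketch, assuming the wrapped function is already async and that a trailing function argument always means "callback"; withCallbackShim and loadUserDemo are hypothetical names. Node's built-in util.callbackify covers the callback-only direction, and this variant simply keeps both call styles available on the same function:

```javascript
// Wrap an async function so it also accepts a trailing error-first callback.
function withCallbackShim(asyncFn) {
  return function shimmed(...args) {
    const last = args[args.length - 1];
    if (typeof last !== 'function') {
      return asyncFn.apply(this, args); // promise style: hand back the promise
    }
    const callback = args.pop();
    // Two-argument then(): an exception thrown inside the success callback is
    // not routed into the error handler, so the callback runs at most once.
    asyncFn.apply(this, args).then(
      (value) => callback(null, value),
      (err) => callback(err),
    );
    return undefined; // callback style returns nothing
  };
}

// Stand-in for a migrated function; a real one would query the database.
const loadUserDemo = withCallbackShim(async (id) => ({ id, name: 'example' }));

loadUserDemo(42).then((user) => console.log('promise style:', user.id));
loadUserDemo(42, (err, user) => console.log('callback style:', err, user.id));
```

The trade-off is that a function whose last real argument is itself a function cannot use this wrapper unchanged, so it suits the "Simple" tier better than the complex ones.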

Copy-paste prompt — plan a risk-tiered migration:

Using the claude-context index, find every error-first callback in src/ and
categorize them by migration risk:
- Simple (single async op)
- Chains (sequential ops)
- Parallel (concurrent ops)
- Complex error handling or shared closure state

For each tier, produce ONE reusable async/await transformation template with
edge cases to watch for and a testing strategy. Recommend a migration order,
lowest risk first. Do not change any code yet.

Coordinating Parallel Refactors

For a large effort split across a team, have the AI partition the work to minimize cross-team conflicts, then keep the branches honest:

Partition by dependency boundaries

Analyze module dependencies and propose how to split this refactor across
four developers so their territories barely overlap. Flag any shared files
that two teams would both need to edit.

Branch per territory

git checkout -b refactor/user-services
git checkout -b refactor/payment-services
git checkout -b refactor/shared-utils

Detect collisions early

Review the diffs across all refactor/* branches and identify conflicting
or breaking changes between teams before we attempt to merge.
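A mechanical pre-check can back up the prompt: collect the files each branch changed (for example with `git diff --name-only main...<branch>`, assuming main is the base branch) and flag any file that appears in more than one list. A minimal sketch; the file paths are hypothetical:

```javascript
// List files touched by two or more branches.
// `changedFilesByBranch` maps a branch name to the files it changed.
function findSharedFiles(changedFilesByBranch) {
  const branchesByFile = new Map();
  for (const [branch, files] of Object.entries(changedFilesByBranch)) {
    for (const file of new Set(files)) { // Set guards against duplicate entries
      if (!branchesByFile.has(file)) branchesByFile.set(file, []);
      branchesByFile.get(file).push(branch);
    }
  }
  return [...branchesByFile]
    .filter(([, branches]) => branches.length > 1)
    .map(([file, branches]) => ({ file, branches }));
}

// Hypothetical output of the git command above, one list per branch.
const changed = {
  'refactor/user-services': ['src/users/profile.js', 'src/shared/logger.js'],
  'refactor/payment-services': ['src/payment/charge.js', 'src/shared/logger.js'],
  'refactor/shared-utils': ['src/shared/dates.js'],
};
const collisions = findSharedFiles(changed);
console.log(collisions);
```

This only catches textual overlap at the file level. Two branches can still break each other through different files, which is the part the review prompt is for.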

Legacy Code Archaeology

Every large codebase has archaeological layers — code from different eras and philosophies, some of it predating the team. The classic horror: a 15,000-line stored procedure no one understands that still processes real money daily.

The strangler-fig pattern lets you modernize without a rewrite: wrap the legacy code behind a clean interface, then extract pieces one at a time while running old and new in parallel until you trust the new path.
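The "run old and new in parallel" step can be sketched as a comparison harness that feeds both implementations the same inputs and records every disagreement. This is a simplified illustration that assumes both paths are synchronous and return JSON-serializable values; the two total functions are hypothetical stand-ins:

```javascript
// Run legacy and replacement implementations on the same inputs and collect
// every case where their results differ.
function compareImplementations(legacyFn, modernFn, inputs) {
  const mismatches = [];
  for (const input of inputs) {
    const expected = legacyFn(input); // legacy behavior is the reference
    const actual = modernFn(input);
    // Note: JSON comparison is sensitive to object key order.
    if (JSON.stringify(expected) !== JSON.stringify(actual)) {
      mismatches.push({ input, expected, actual });
    }
  }
  return mismatches;
}

// Hypothetical example: the legacy code truncates fractional cents when
// adding a 20% surcharge, while the rewrite rounds them.
const legacyTotal = (order) => Math.floor((order.cents * 12) / 10);
const modernTotal = (order) => Math.round((order.cents * 12) / 10);

const mismatches = compareImplementations(legacyTotal, modernTotal, [
  { cents: 100 },
  { cents: 104 },
]);
console.log(mismatches);
// [ { input: { cents: 104 }, expected: 124, actual: 125 } ]
```

An empty mismatch list over a representative sample of real traffic is the kind of evidence that justifies cutting a piece over to the new path.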

Copy-paste prompt — wrap legacy code in a modern facade (step one of strangler-fig):

Create a modern API facade that wraps the legacy billing stored procedure
(billing_mega_proc) without changing its behavior:
- REST endpoints for each billing operation
- Internally still call the stored procedure
- Translate its magic-number error codes into typed HTTP errors
- Return consistent JSON, and add structured logging
- Generate an OpenAPI spec for the new surface

Do not reimplement any billing logic yet — only wrap and adapt.
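The error-translation bullet is the part of the facade most worth reviewing by hand. A minimal sketch of what the generated mapping might look like; the codes 1 and -99, the HTTP statuses, and the error names are all placeholders, since the real table has to come from reading the procedure:

```javascript
// Translate legacy magic-number status codes into HTTP-style results.
// Every entry here is a placeholder for illustration.
const LEGACY_STATUS_MAP = new Map([
  [1, { httpStatus: 200, error: null }],
  [-99, { httpStatus: 409, error: 'INVENTORY_UNAVAILABLE' }],
]);

function translateLegacyStatus(code) {
  const known = LEGACY_STATUS_MAP.get(code);
  if (known) return { ...known, legacyCode: code };
  // Unknown codes are surfaced rather than guessed at, so they show up in
  // logs and can be added to the table deliberately.
  return { httpStatus: 500, error: 'UNKNOWN_LEGACY_STATUS', legacyCode: code };
}

console.log(translateLegacyStatus(-99));
// { httpStatus: 409, error: 'INVENTORY_UNAVAILABLE', legacyCode: -99 }
console.log(translateLegacyStatus(7));
// { httpStatus: 500, error: 'UNKNOWN_LEGACY_STATUS', legacyCode: 7 }
```

Keeping the original code in the response makes it possible to correlate facade errors with the legacy system's own logs while both are running.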

When documentation doesn’t exist, tests become the documentation. Ask the AI to write characterization tests that pin current behavior — including the weird parts — so any future change that alters output fails loudly:

describe('Legacy OrderProcessor — current behavior', () => {
  it('returns status code 1 on a standard single-item order', async () => {
    const result = await processOrder({
      customerId: 123,
      items: [{ sku: 'WIDGET-1', quantity: 1 }],
    });
    expect(result.status).toBe(1); // 1 = success (undocumented magic number)
    expect(result.orderId).toMatch(/^ORD-\d{8}$/);
  });

  it('returns -99 when inventory is unavailable', async () => {
    const result = await processOrder({
      customerId: 123,
      items: [{ sku: 'OUT-OF-STOCK', quantity: 1 }],
    });
    expect(result.status).toBe(-99); // -99 = inventory error
  });
});

Cross-Team Coordination

In a million-line codebase, different teams own different territory. The hard part is making a change that crosses a boundary without breaking someone else. Before any breaking change, get an impact report.

Copy-paste prompt — dependency-impact analysis before a breaking change:

I need to change the signature of UserService.authenticate(). Analyze the
blast radius:
1. Every call site, grouped by owning service
2. The arguments each caller passes
3. How each handles the response shape and which errors it expects
4. Any indirect/transitive dependents

Then propose a backward-compatible migration: add authenticateV2(), route the
old method through it, and give me a deprecation timeline. Cite file and line
for each call site.

Pair this with auto-generated contracts. Ask the AI to produce an OpenAPI spec and event schemas for a service another team consumes — that turns “go read our code” into a stable boundary they can integrate against without spelunking through your internals.
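The backward-compatible migration that the impact prompt asks for might look like the sketch below. The signatures are assumptions made up for illustration (positional arguments on the old method, an options object on the new one, an mfaCode field), and the credential check is a stand-in so the example runs:

```javascript
// Route a deprecated method through its replacement and warn once.
class UserService {
  #warned = false;

  // New entry point with a hypothetical options-object signature.
  authenticateV2({ username, password, mfaCode = null }) {
    // Stand-in logic; real credential verification goes here.
    const ok = Boolean(username) && Boolean(password);
    return { ok, mfaChecked: mfaCode !== null };
  }

  /** @deprecated Use authenticateV2 instead. */
  authenticate(username, password) {
    if (!this.#warned) {
      this.#warned = true;
      process.emitWarning(
        'authenticate() is deprecated; use authenticateV2()',
        'DeprecationWarning',
      );
    }
    // Old callers keep their boolean return value.
    return this.authenticateV2({ username, password }).ok;
  }
}

const service = new UserService();
console.log(service.authenticate('demo', 'pw')); // true, plus one warning
console.log(service.authenticateV2({ username: 'demo', password: 'pw', mfaCode: '123456' }));
// { ok: true, mfaChecked: true }
```

Because the old method delegates rather than duplicating logic, behavior changes land in one place, and the warning gives callers a visible signal during the deprecation window.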

When This Breaks

Large-codebase AI workflows fail in specific, recognizable ways. Know the recovery for each.

The semantic index goes stale after a big merge. Vector indexes drift when thousands of lines change at once, so the AI cites files that moved or no longer exist. Re-index after large merges or rebases (Zilliz Claude Context re-indexes incrementally on file changes, but force a full reindex after history rewrites), and treat any file path the AI gives you as a claim to verify with a quick open, not gospel.
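Checking cited paths can be automated when the AI returns a long list. A minimal sketch, assuming the paths are relative to the repo root; findMissingPaths and the example paths are hypothetical:

```javascript
const fs = require('fs');
const path = require('path');

// Return the cited paths that do not exist under repoRoot.
function findMissingPaths(citedPaths, repoRoot) {
  return citedPaths.filter((cited) => !fs.existsSync(path.resolve(repoRoot, cited)));
}

// Paths as they might appear in an AI answer; run this from the repo root.
const cited = [
  'services/payment/PaymentProcessor.js',
  'services/legacy/billing_helpers.js',
];
console.log('Missing paths:', findMissingPaths(cited, process.cwd()));
```

Anything in the output is a sign the index is stale or the answer was invented, and either way it is a prompt to re-index before trusting the rest of the response.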

The AI invents precise-looking numbers. Ask “how many callbacks are there?” on a huge repo and you may get “47,832” — a confident hallucination, because it never actually scanned every file. Force it through a real tool: “Use ripgrep / claude-context and report the exact count with the command you ran.” Trust counts only when they come with a reproducible command.

Context-window blowups. Pasting or @-mentioning a 200k-line subtree degrades answers and burns tokens. If responses get vague or the tool truncates, you loaded too much — drop back to the context hierarchy, scope to one service, and expand only when the answer demands it.

Parallel-refactor branch conflicts. When two teams edit a shared file the partition step missed, you get silent merge breakage. Re-run the cross-branch collision prompt before each integration, and keep shared utilities on a single owner’s branch rather than splitting them.

MCP server won’t connect. If the AI can’t see the index, confirm the server is registered (claude mcp list, codex mcp list, or check ~/.cursor/mcp.json) and that OPENAI_API_KEY / MILVUS_TOKEN are set in its env. For the local SSE server, verify it’s actually listening on localhost:8978 before debugging anything else.

What’s Next

Code Quality at Scale — keeping standards consistent once you’re changing code across the monolith
Monorepo Management — scoping AI context cleanly when many packages share one repo
Unit Testing with AI — turning characterization tests into a real safety net
Essential MCP Servers — wiring up database, git, and browser MCP servers to extend these workflows