Large Codebase Management: Tips 36-50

You ask Claude to refactor the auth flow in a 400k-line monorepo, and it confidently edits the wrong service: the deprecated legacy-auth package instead of the live one, because it never loaded the right context. On a large codebase, the binding constraint is no longer “can the model write the code” but “does it have the right slice of the repo in context, and only that slice.”

These 15 tips cover how to navigate, understand, and safely modify enterprise-scale codebases with Claude Code without drowning it in irrelevant context or letting parallel sessions clobber each other.

  • A repeatable way to onboard onto an unfamiliar codebase using agentic search instead of reading files yourself
  • Copy-paste prompts for dependency-impact analysis, N+1 audits, and security sweeps across millions of lines
  • A hierarchical CLAUDE.md layout that gives Claude the right context per package without bloating every prompt
  • A parallel-instance workflow that lets you work several modules at once without context pollution
  • The failure modes that bite on large repos, and the recovery move for each

Tip 36: Leverage Claude’s Codebase Awareness

Claude Code uses agentic search to understand your project structure automatically. Instead of pasting files or explaining the layout yourself, you can ask a question like “Explain how the authentication system works” and Claude will search for the auth-related files, identify the key components, trace the dependencies, follow the flow, and give you a grounded explanation.

Agentic Search Capabilities

  • Pattern Recognition: Finds similar code patterns across files
  • Dependency Tracing: Understands import chains and relationships
  • Context Building: Automatically gathers relevant context
  • Smart Filtering: Focuses on important files, ignores noise
  • Cross-Reference: Links related functionality across modules

A typical onboarding question on a data platform looks like this:

Claude searches the repo, follows the import and call chains, and answers with the endpoint handlers, the transformation pipeline, the schema relationships, and the dashboard query patterns, instead of you spending a day reading files manually.

Claude Code can work inside files that are far too large to paste into a chat window, as long as you tell it where to look.

Suppose you have an 18,000-line legacy React component (the kind that accretes in any long-lived app). The trick is to give Claude a precise anchor instead of asking it to hold the whole file in its head:

Update the handleSubmit function in src/components/CheckoutForm.tsx
(around line 8500) so it debounces duplicate submissions. Show me
just that function before and after, not the whole file.
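
If you do not know the line range offhand, a few lines of Python can locate it before you write the prompt. This is a minimal sketch, not part of Claude Code: `find_function_span` is a hypothetical helper, it assumes brace-delimited JavaScript or TypeScript, and it does not understand braces inside strings or comments.

```python
import re
from typing import Optional, Tuple


def find_function_span(source: str, name: str) -> Optional[Tuple[int, int]]:
    """Return 1-based (start, end) lines of the first brace-delimited
    function named `name`, or None if no balanced definition is found."""
    # Matches `function name`, `async function name`, and `const name = ...`,
    # each optionally prefixed with `export`.
    pattern = re.compile(
        rf"^\s*(?:export\s+)?(?:(?:async\s+)?function\s+{re.escape(name)}\b"
        rf"|(?:const|let|var)\s+{re.escape(name)}\s*=)"
    )
    lines = source.splitlines()
    for start, line in enumerate(lines):
        if not pattern.search(line):
            continue
        depth = 0
        seen_open = False
        # Count braces from the definition line until they balance.
        for end in range(start, len(lines)):
            for ch in lines[end]:
                if ch == "{":
                    depth += 1
                    seen_open = True
                elif ch == "}":
                    depth -= 1
            if seen_open and depth == 0:
                return start + 1, end + 1
    return None
```

Feed the resulting range into the prompt ("lines 8490 to 8560") so Claude reads that slice instead of the whole file. Treat the result as a hint: a brace-free arrow function or a brace inside a string will throw the count off.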

Instead of grepping yourself, let Claude trace structure for you. These are prompts you type into the Claude Code REPL, not shell commands:

  • “Show me all places where we handle user permissions.”
  • “How do our microservices communicate?”
  • “Where is the email validation logic?”
  • “Find all React components that directly access localStorage.”
  • “Check if any frontend components import from backend modules.”
  • “Find all synchronous file operations in our async handlers.”

The two queries that pay off most on a large codebase are dependency-impact analysis (before a risky change) and a cross-cutting performance audit:

Structure large refactoring projects effectively:

  1. Initial Analysis

    "Analyze the current authentication system and identify areas for improvement"
  2. Create a Plan

    "Create a step-by-step plan to migrate from session-based to JWT authentication"
  3. Implement Incrementally

    "Step 1: Create new JWT utility functions"
    "Step 2: Update user model to support refresh tokens"
    "Step 3: Modify login endpoint"
  4. Verify Each Step

    "Write tests for the JWT utilities"
    "Verify backward compatibility"

Task Breakdown Strategy

  • Size Limit: Keep each task under 200 lines of changes
  • Test First: Write tests before implementation
  • Checkpoint: Commit after each successful step
  • Rollback Plan: Always have a way to revert
  • Document: Update CLAUDE.md with decisions
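
The size limit is easy to check mechanically before each checkpoint commit. The sketch below parses the text that `git diff --numstat` prints (added, deleted, and path, separated by tabs). The helper names and the default of 200 are illustrative assumptions taken from the rule above, not a built-in feature.

```python
def changed_lines(numstat: str) -> int:
    """Sum added + deleted lines from `git diff --numstat` output."""
    total = 0
    for row in numstat.splitlines():
        parts = row.split("\t")
        if len(parts) < 3:
            continue  # not a numstat row
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-" for both counts; skip them.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def within_budget(numstat: str, limit: int = 200) -> bool:
    """True when the change set stays at or under `limit` changed lines."""
    return changed_lines(numstat) <= limit
```

Pipe `git diff --numstat` (or `git diff --cached --numstat` for staged work) into it. If the check fails, that is usually a sign the step should be split before you commit.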

Optimize Claude’s performance with targeted context:

# Less effective: Vague request
"Optimize our application"
# More effective: Focused request
"Optimize the database queries in the UserRepository class"
# Even better: Specific context
"The getUsersWithOrders method in UserRepository has N+1 query issues.
Optimize it using eager loading."

The real constraint is the context budget, not a line count. Current models (Claude Fable 5, Opus 4.8, and Sonnet 4.6) carry a 1M-token window, but every file you pull in competes for it, and quality degrades as you fill it with irrelevant code. As a rough rule of thumb:

  • A focused module or a few related files: name them and let Claude load the lot.
  • A package spanning many files: scope the prompt to one concern (“the query layer,” “the auth middleware”) so search pulls only what’s relevant.
  • A whole monorepo: never load it all. Start Claude inside one package’s directory, use --add-dir only to pull in an extra directory the task genuinely needs, lean on hierarchical CLAUDE.md files, and work module by module.
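
To get a feel for how much of the budget a package would consume, you can estimate it before starting a session. This is a rough sketch under a loud assumption: about four characters per token, which is a common approximation and not the model's real tokenizer. The function names, the suffix list, and the 1,000,000-token default are illustrative.

```python
from pathlib import Path

CHARS_PER_TOKEN = 4  # assumption: coarse heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate for a piece of source text."""
    return len(text) // CHARS_PER_TOKEN


def budget_report(root, suffixes=(".ts", ".tsx"), window=1_000_000):
    """Return ([(relative_path, estimated_tokens), ...] largest first,
    share_of_window) for source files under `root`."""
    sizes = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            sizes.append((path.relative_to(root).as_posix(), estimate_tokens(text)))
    sizes.sort(key=lambda item: item[1], reverse=True)
    total = sum(tokens for _, tokens in sizes)
    return sizes, total / window
```

The largest files at the top of the list are the ones worth anchoring by function name instead of loading whole.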

Tip 41: Use Multiple Instances for Different Areas

Run parallel Claude Code instances, each scoped to one area of the repo. Start each one in its own terminal with the relevant working directory:

# --add-dir grants access to an extra directory; it does not narrow a session.
# To scope a session, start it from inside the directory it should own.
# Terminal 1: Frontend
cd frontend && claude
# Terminal 2: Backend API
cd backend && claude
# Terminal 3: Database migrations
cd database && claude
# Terminal 4: Tests
cd tests && claude

Then drive each session with a focused prompt, for example “Implement the new user dashboard” in the frontend instance and “Create REST endpoints for the dashboard data” in the backend one.

Benefits of parallel instances:

  • No Context Switching: Each instance stays focused
  • Team Simulation: Work like a team of developers
  • Faster Development: Complete tasks simultaneously
  • Better Organization: Clear separation of concerns

Tip 42: Leverage Filesystem as Shared Workspace

Use the filesystem as the handoff point between instances. One generates artifacts; the others consume them:

  • Instance 1: “Generate TypeScript interfaces from our API responses and write them to shared/types/api.types.ts.”
  • Instance 2: “Create React Query hooks using the types in shared/types/api.types.ts.”
  • Instance 3: “Document the types in shared/types/ with usage examples.”

Shared Workspace Patterns

project/
├── .claude/
│   ├── generated/     # AI-generated code
│   ├── templates/     # Reference implementations
│   └── workspace/     # Shared working files
└── docs/
    └── ai-sessions/   # Session documentation
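
Before pointing a consuming instance at a shared directory, it helps to confirm what the producing instance actually wrote. One way to do that, sketched here with hypothetical helpers `snapshot` and `diff_snapshots`, is to fingerprint the directory before and after the handoff and compare.

```python
import hashlib
from pathlib import Path


def snapshot(directory) -> dict:
    """Map each file under `directory` (relative POSIX path) to its SHA-256."""
    root = Path(directory)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(before: dict, after: dict) -> dict:
    """Report which files were added, removed, or changed between snapshots."""
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(p for p in before.keys() & after.keys() if before[p] != after[p]),
    }
```

If you already commit between handoffs, `git status` and `git diff` give you the same information. The sketch is mainly useful for generated directories you keep out of version control.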

Study external patterns without copying their source into your repo (which drags in licensing baggage). Have Claude summarize the approach, then design your own:

Tip 43: Implement Hierarchical CLAUDE.md Files

Structure documentation for large monorepos:

monorepo/
├── CLAUDE.md                      # Global rules and patterns
└── packages/
    ├── frontend/
    │   ├── CLAUDE.md              # Frontend-specific
    │   └── src/
    │       └── components/
    │           └── CLAUDE.md      # Component guidelines
    ├── backend/
    │   ├── CLAUDE.md              # Backend patterns
    │   └── src/
    │       ├── services/
    │       │   └── CLAUDE.md      # Service patterns
    │       └── models/
    │           └── CLAUDE.md      # Data model rules
    └── shared/
        └── CLAUDE.md              # Shared code rules
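
To see which of these files sit on the path to a given source file, you can walk from the repo root down to that file's directory. The sketch below only lists the CLAUDE.md files a reader would pass on the way; it makes no claim about how or when Claude Code itself loads them, and `claude_md_chain` is a hypothetical helper.

```python
from pathlib import Path


def claude_md_chain(repo_root, relative_file):
    """List CLAUDE.md files from the repo root down to the directory that
    contains `relative_file`, most general first."""
    root = Path(repo_root)
    directory = root
    directories = [root]
    # Collect every directory between the root and the file's parent.
    for part in Path(relative_file).parent.parts:
        directory = directory / part
        directories.append(directory)
    return [
        (d / "CLAUDE.md").relative_to(root).as_posix()
        for d in directories
        if (d / "CLAUDE.md").is_file()
    ]
```

A long chain is a prompt to check that the deeper files add package-specific rules and do not repeat the global ones.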

Example hierarchical documentation:

# Monorepo Overview
## Architecture Principles
- Microservices with shared libraries
- Event-driven communication
- TypeScript throughout
## Global Standards
- Conventional commits
- 100% test coverage for shared code
- No circular dependencies

Help Claude understand your system design:

# Architecture Documentation
## System Overview
```mermaid
graph TD
A[API Gateway] --> B[User Service]
A --> C[Product Service]
A --> D[Order Service]
B --> E[PostgreSQL]
C --> F[MongoDB]
D --> E
D --> G[Redis Cache]
B --> H[Event Bus]
C --> H
D --> H
```
## Design Patterns
1. **Repository Pattern**: All database access through repositories
2. **CQRS**: Separate read/write models for complex domains
3. **Event Sourcing**: Audit trail for critical operations
4. **Circuit Breaker**: For external service calls
5. **Saga Pattern**: For distributed transactions
## Communication Patterns
- **Sync**: REST APIs with OpenAPI specs
- **Async**: RabbitMQ for events
- **Real-time**: WebSockets for live updates
## Data Flow
1. Client → API Gateway (authentication)
2. Gateway → Microservice (authorized request)
3. Service → Database (data operation)
4. Service → Event Bus (state change)
5. Event Bus → Other Services (react to changes)

Ask sophisticated questions about code relationships:

# Dependency analysis
"Show me all services that depend on the User model"
"What would break if I change the authenticate method signature?"
"Find circular dependencies in our import structure"
# Performance analysis
"Identify all database queries in hot code paths"
"Find synchronous operations that could be async"
"Show me all uncached expensive computations"
# Security audit
"Find all user input that isn't validated"
"Show me everywhere we construct dynamic SQL"
"Identify exposed sensitive data in API responses"
# Architecture validation
"Verify all services follow our repository pattern"
"Find direct database access outside of repositories"
"Check if any frontend code imports backend modules"

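For the circular-dependency question in particular, it is worth knowing what a correct answer looks like so you can sanity-check Claude's. A minimal sketch, assuming you already have the import graph as a dictionary of module names (how you extract it is up to your tooling): a depth-first search that reports the first cycle it finds.

```python
def find_cycle(graph):
    """Return one import cycle as a list of modules (first == last),
    or None if the graph is acyclic.

    `graph` maps each module to the modules it imports."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for dep in graph.get(node, ()):
            state = color.get(dep, WHITE)
            if state == GREY:
                # `dep` is already on the current path: that closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

The recursion is fine for a package-sized graph. For a graph with many thousands of modules you would want an iterative version.
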
Manage token usage in large projects:

# 1. Clear frequently
/clear
# 2. Focus conversations
"Work only on the authentication module"
# 3. Use specific file references
@auth/login.service.ts
# Instead of: "the login service file"
# 4. Compress context
/compact
# 5. Exclude irrelevant files
"Ignore test files for this task"

Tip 47: Create Module-Specific Documentation

Document each major module comprehensively:

# Payment Module Documentation
## Overview
Handles all payment processing including credit cards,
PayPal, and cryptocurrency payments.
## Key Files
- `payment.service.ts` - Main service orchestrator
- `processors/` - Payment processor implementations
- `models/transaction.model.ts` - Transaction data model
- `webhooks/` - Payment provider webhooks
## Critical Business Logic
1. **Retry Logic**: 3 attempts with exponential backoff
2. **Idempotency**: Use transaction_id to prevent duplicates
3. **Audit Trail**: Every operation logged to audit_log table
4. **Refunds**: Max 90 days, requires manager approval
## Integration Points
- User Service: For customer data
- Order Service: For order fulfillment
- Notification Service: For payment receipts
- Audit Service: For compliance logging
## Testing Requirements
- Unit tests: Mock all external providers
- Integration tests: Use sandbox environments
- Load tests: 1000 TPS minimum
- Security tests: PCI compliance required
## Common Issues
1. Webhook timeouts - implement async processing
2. Currency conversion - cache rates for 1 hour
3. Failed payments - clear user communication
4. Partial refunds - complex state management

Approach large refactoring systematically:

  1. Analyze Current State

    "Analyze the authentication system and create a refactoring plan"
  2. Create Safety Net

    "Write comprehensive tests for current authentication behavior"
  3. Refactor in Small Steps

    "Step 1: Extract authentication logic into separate service"
    "Step 2: Create interfaces for authentication providers"
    "Step 3: Implement JWT provider"
    "Step 4: Add OAuth providers"
  4. Maintain Backward Compatibility

    "Create adapter layer for old authentication API"
  5. Migration Strategy

    "Create migration plan for existing sessions"

Incremental Refactoring Rules

  • Never break existing functionality
  • Each step should be deployable
  • Tests must pass after each change
  • Document decisions in CLAUDE.md
  • Keep PRs under 400 lines

Use Claude to find patterns and inconsistencies:

# Find inconsistent patterns
"Find all different error handling patterns in our codebase"
"Show me all the different ways we're validating email addresses"
"Identify inconsistent naming conventions"
# Locate duplicate code
"Find similar code patterns that could be refactored"
"Show me duplicate business logic across services"
# Architecture violations
"Find all places where the presentation layer directly accesses the database"
"Show me services calling other services synchronously"
# Performance patterns
"Find all N+1 query patterns"
"Locate all synchronous I/O in request handlers"
"Show me all uncached database queries"

A security sweep is one of the highest-leverage uses of pattern recognition on a large codebase. A single prompt can surface every risky query construction at once:

A sweep like this typically surfaces a handful of distinct query-building styles and points you straight at the ones that interpolate untrusted input, instead of you auditing thousands of call sites by hand.
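
A crude literal scan makes a useful second opinion on the sweep's findings. The sketch below flags two common shapes of string-built SQL in JavaScript or TypeScript source: a template literal containing a SQL keyword and a `${...}` substitution, and a quoted SQL string followed by `+`. The patterns are illustrative assumptions, work line by line, and will miss multi-line queries and query builders.

```python
import re

SQL_KEYWORD = r"(?:SELECT|INSERT|UPDATE|DELETE)\b"

RISKY_PATTERNS = [
    # Template literal with a SQL keyword followed by a ${...} substitution.
    re.compile(rf"`[^`]*{SQL_KEYWORD}[^`]*\$\{{[^`]*`", re.IGNORECASE),
    # Quoted SQL string immediately concatenated with `+`.
    re.compile(rf"""["'][^"']*{SQL_KEYWORD}[^"']*["']\s*\+""", re.IGNORECASE),
]


def find_dynamic_sql(source: str):
    """Return (line_number, stripped_line) for lines that look like
    SQL built by interpolation or concatenation."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if any(pattern.search(line) for pattern in RISKY_PATTERNS):
            hits.append((number, line.strip()))
    return hits
```

Anything this scan flags that the sweep did not mention is worth raising with Claude directly. Parameterized queries with placeholders such as `$1` are not flagged.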

Tip 50: Implement Codebase Exploration Workflows

Develop systematic approaches for understanding large codebases:

# 1. High-level understanding
"What does this project do? Explain the main purpose and architecture"
# 2. Identify entry points
"Show me the main entry points for this application"
# 3. Trace critical paths
"Trace the flow of a user login request"
# 4. Understand data model
"Explain the core data models and their relationships"
# 5. Identify key patterns
"What design patterns are used in this codebase?"

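If you run the same exploration on every new repo, the sequence can be scripted with Claude Code's non-interactive print mode (`claude -p "<prompt>"`). The sketch below is an assumption-laden convenience wrapper, not an official workflow: the function names are hypothetical, and because each `-p` call starts a fresh session, later steps do not see earlier answers.

```python
import subprocess

EXPLORATION_PROMPTS = (
    "What does this project do? Explain the main purpose and architecture",
    "Show me the main entry points for this application",
    "Trace the flow of a user login request",
    "Explain the core data models and their relationships",
    "What design patterns are used in this codebase?",
)


def build_commands(prompts=EXPLORATION_PROMPTS):
    """Build one `claude -p` argument list per exploration prompt."""
    return [["claude", "-p", prompt] for prompt in prompts]


def run_exploration(repo_dir, prompts=EXPLORATION_PROMPTS):
    """Run each prompt in order from `repo_dir`; return (prompt, answer) pairs.

    Requires the `claude` CLI on PATH; raises CalledProcessError on failure."""
    results = []
    for command in build_commands(prompts):
        completed = subprocess.run(
            command, cwd=repo_dir, capture_output=True, text=True, check=True
        )
        results.append((command[-1], completed.stdout))
    return results
```

Saving the answers under a directory such as docs/ai-sessions/ gives the next person a starting map without rerunning the whole sequence.
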
Large Codebase Checklist

  • Use hierarchical CLAUDE.md files
  • Run multiple parallel instances
  • Focus on specific modules per session
  • Clear context between unrelated tasks
  • Document architectural decisions
  • Create systematic exploration workflows
  • Use incremental refactoring approaches
  • Leverage pattern recognition
  • Maintain comprehensive tests
  • Monitor token usage with /cost

Key principles for success:

  1. Think in Systems: Understand relationships and dependencies
  2. Work Incrementally: Small, verified changes
  3. Maintain Context: Use CLAUDE.md files effectively
  4. Leverage Parallelism: Multiple instances for different concerns
  5. Trust the Search: Let Claude find patterns you might miss

Large codebases fail in specific, recognizable ways. Here’s what to watch for and how to recover.

Context-window overflow on a giant file. You ask Claude to edit a 20k-line file and it loses track, edits the wrong block, or truncates. Recovery: stop loading the whole file. Anchor the change to a function name and line range, or ask for the target region first (“show me just handleSubmit”), then edit that slice in a follow-up.

Agentic search misses dynamically-referenced code. Search finds static imports but not handlers wired up by string keys, reflection, or a registry built at runtime, so a “find every caller” sweep comes back incomplete. Recovery: name the indirection explicitly (“we register routes by string in router.config.ts; trace those too”) and cross-check with a literal grep before you trust the list for a risky refactor.
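
The literal cross-check can be as simple as a substring scan compared against the files Claude reported. A minimal sketch, with hypothetical helper names and an assumed suffix list:

```python
from pathlib import Path


def literal_matches(root, needle, suffixes=(".ts", ".tsx", ".js", ".json")):
    """Return (relative_path, line_number) for every line containing `needle`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        if "node_modules" in path.parts:
            continue  # skip vendored dependencies
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line:
                hits.append((path.relative_to(root).as_posix(), number))
    return hits


def missing_from_report(root, needle, reported_files):
    """Files that mention `needle` literally but are absent from the report."""
    found = {path for path, _ in literal_matches(root, needle)}
    return sorted(found - set(reported_files))
```

Every file in the result is either a false positive (a comment, an unrelated identifier) or a caller the search missed. Both are cheap to rule out before a risky refactor.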

Parallel instances clobber a shared file. Two sessions both edit shared/types/api.types.ts and the second silently overwrites the first. Recovery: give each instance a non-overlapping directory scope, commit (or stash) between handoffs so the filesystem is the single source of truth, and never let two instances own the same file at once.
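
Overlapping scopes are easy to catch before you open the terminals. The sketch below assumes you write down each instance's directories as repo-relative POSIX paths; `overlapping_scopes` is a hypothetical helper that reports any pair where one directory equals or contains another.

```python
from pathlib import PurePosixPath


def overlapping_scopes(scopes):
    """`scopes` maps instance name -> list of repo-relative directories.

    Return (instance_a, instance_b, dir_a, dir_b) for every pair of
    directories owned by different instances where one contains the other."""
    flat = [
        (name, PurePosixPath(directory))
        for name, directories in scopes.items()
        for directory in directories
    ]
    clashes = []
    for index, (name_a, path_a) in enumerate(flat):
        for name_b, path_b in flat[index + 1:]:
            if name_a == name_b:
                continue
            nested = path_a in path_b.parents or path_b in path_a.parents
            if path_a == path_b or nested:
                clashes.append((name_a, name_b, str(path_a), str(path_b)))
    return clashes
```

An empty list means no two instances can write to the same directory tree. It does not stop a session from editing outside its scope, so the commit-between-handoffs habit still applies.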

Stale CLAUDE.md drift. A CLAUDE.md still describes the old session-based auth after you migrated to JWT, so Claude follows instructions that no longer match reality. Recovery: treat CLAUDE.md as code, review it in PRs, and periodically ask Claude to reconcile it (“compare the auth section of this CLAUDE.md against the actual auth/ package and flag anything out of date”).
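
One cheap drift signal can be automated: paths a CLAUDE.md mentions in backticks that no longer exist on disk. This sketch catches only that kind of drift, not outdated descriptions, and it assumes paths are written relative to the CLAUDE.md file's own directory. `stale_paths` is a hypothetical helper.

```python
import re
from pathlib import Path

BACKTICKED = re.compile(r"`([^`\s]+)`")


def stale_paths(claude_md, base_dir=None):
    """Return backticked, path-like tokens in `claude_md` that do not exist
    relative to `base_dir` (default: the file's own directory)."""
    doc = Path(claude_md)
    base = Path(base_dir) if base_dir else doc.parent
    stale = []
    for token in BACKTICKED.findall(doc.read_text(encoding="utf-8")):
        # Heuristic: treat tokens containing "/" or "." as file paths.
        looks_like_path = "/" in token or "." in token
        if looks_like_path and not (base / token).exists():
            stale.append(token)
    return stale
```

Expect some false positives, since a token like `user.email` also looks like a path to this heuristic. Running it in CI alongside the PR review of CLAUDE.md keeps the obvious rot from accumulating.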

The repo is too big to reason about at all. Even scoped prompts wander because the package itself is a tangle. Recovery: don’t ask for a change, ask for a map first (“produce a dependency diagram of this package and identify the three highest-coupling modules”), then refactor against that map one module at a time.
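
Once you have the map as data, ranking modules by coupling is a few lines. A minimal sketch, assuming the import graph is available as a dictionary and using fan-in plus fan-out as the coupling score (one simple metric among several you could choose):

```python
def coupling_ranking(imports, top=3):
    """`imports` maps each module to the modules it imports.

    Return the `top` modules by fan-in + fan-out as (module, score) pairs,
    highest score first, ties broken alphabetically."""
    fan_out = {module: len(set(deps)) for module, deps in imports.items()}
    fan_in = {}
    for deps in imports.values():
        for dep in set(deps):
            fan_in[dep] = fan_in.get(dep, 0) + 1
    modules = set(fan_out) | set(fan_in)
    scored = [(m, fan_in.get(m, 0) + fan_out.get(m, 0)) for m in modules]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]
```

Modules that score high mostly on fan-in (shared utilities, a database layer) are risky to change. Modules that score high on fan-out are usually the better first target for untangling.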

With strategies for large codebases in hand, the next lever is your day-to-day loop. Continue to Workflow Optimization to turn these one-off techniques into repeatable habits, then see Performance and Cost Management for keeping token spend in check at scale.