Large Codebase Management: Tips 36-50

You ask Claude to refactor the auth flow in a 400k-line monorepo, and it confidently edits the wrong service: the deprecated legacy-auth package instead of the live one, because it never loaded the right context. On a large codebase, the binding constraint is no longer “can the model write the code” but “does it have the right slice of the repo in context, and only that slice.”

These 15 tips cover how to navigate, understand, and safely modify enterprise-scale codebases with Claude Code without drowning it in irrelevant context or letting parallel sessions clobber each other.

What You’ll Walk Away With

A repeatable way to onboard onto an unfamiliar codebase using agentic search instead of reading files yourself
Copy-paste prompts for dependency-impact analysis, N+1 audits, and security sweeps across millions of lines
A hierarchical CLAUDE.md layout that gives Claude the right context per package without bloating every prompt
A parallel-instance workflow that lets you work several modules at once without context pollution
The failure modes that bite on large repos, and the recovery move for each

Understanding Claude Code’s Strengths

Tip 36: Leverage Claude’s Codebase Awareness

Claude Code uses agentic search to understand your project structure automatically. Instead of pasting files or explaining the layout yourself, you can ask a question like “Explain how the authentication system works” and Claude will search for the auth-related files, identify the key components, trace the dependencies, follow the flow, and give you a grounded explanation.

Agentic Search Capabilities

Pattern Recognition: Finds similar code patterns across files
Dependency Tracing: Understands import chains and relationships
Context Building: Automatically gathers relevant context
Smart Filtering: Focuses on important files, ignores noise
Cross-Reference: Links related functionality across modules

A typical onboarding question on a data platform asks Claude to trace a record from the API endpoints through to the dashboards. Claude searches the repo, follows the import and call chains, and answers with the endpoint handlers, the transformation pipeline, the schema relationships, and the dashboard query patterns, sparing you a day of reading files by hand.

Tip 37: Handle Extremely Large Files

Very large files are workable in Claude Code as long as you point it at the part that matters.

Suppose you have an 18,000-line legacy React component (the kind that accretes in any long-lived app). The trick is to give Claude a precise anchor instead of asking it to hold the whole file in its head:

Update the handleSubmit function in src/components/CheckoutForm.tsx
(around line 8500) so it debounces duplicate submissions. Show me
just that function before and after, not the whole file.

Find every deprecated `useLegacyFormState` call in CheckoutForm.tsx
and list the line numbers, so I can decide which to migrate first.

First, walk me through the validation logic in this component.
Then, in a follow-up, update only the error-handling branch.
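Line numbers reported for an 18,000-line file are worth spot-checking before you plan a migration around them. A minimal sketch of a literal scan, assuming the hook is always called by its plain name; the helper `find_call_lines` is hypothetical and not part of Claude Code:

```python
import re
from typing import List


def find_call_lines(source: str, name: str) -> List[int]:
    """Return 1-based line numbers where `name(` appears as a call.

    Purely textual: it also matches calls inside comments or strings and
    misses aliased imports, so treat the result as a cross-check only.
    """
    # \b stops `useLegacyFormState` from matching inside a longer identifier.
    pattern = re.compile(r"\b" + re.escape(name) + r"\s*\(")
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if pattern.search(line)
    ]


# Made-up snippet standing in for CheckoutForm.tsx.
sample = """import { useLegacyFormState } from './legacy';
const a = useLegacyFormState(props);
const b = useFormState(props);
  const c = useLegacyFormState (other);
"""
print(find_call_lines(sample, "useLegacyFormState"))  # [2, 4]
```

If the scan and Claude's list disagree, ask Claude about the specific lines that differ instead of re-running the whole query.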

Instead of grepping yourself, let Claude trace structure for you. These are prompts you type into the Claude Code REPL, not shell commands:

“Show me all places where we handle user permissions.”
“How do our microservices communicate?”
“Where is the email validation logic?”
“Find all React components that directly access localStorage.”
“Check if any frontend components import from backend modules.”
“Find all synchronous file operations in our async handlers.”

The two queries that pay off most on a large codebase are dependency-impact analysis before a risky change (“What would break if I change the authenticate method signature?”) and a cross-cutting performance audit (“Find all N+1 query patterns”).
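Dependency-impact analysis boils down to a reverse walk of the import graph: start at the module you plan to change and collect everything that reaches it. The sketch below shows that idea on a hand-written graph. It is not how Claude Code does it internally, and the module names and the `dependents_of` helper are made up for illustration:

```python
from collections import deque
from typing import Dict, List, Set


def dependents_of(imports: Dict[str, List[str]], target: str) -> Set[str]:
    """Return every module that directly or transitively imports `target`."""
    # Invert the graph: module -> modules that import it.
    imported_by: Dict[str, Set[str]] = {}
    for module, deps in imports.items():
        for dep in deps:
            imported_by.setdefault(dep, set()).add(module)

    affected: Set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in imported_by.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    affected.discard(target)  # an import cycle could otherwise report the target itself
    return affected


# Hypothetical module names, for illustration only.
graph = {
    "api/login": ["services/auth"],
    "services/auth": ["models/user"],
    "services/billing": ["models/invoice"],
    "models/user": [],
}
print(sorted(dependents_of(graph, "models/user")))
# ['api/login', 'services/auth']
```

A static graph like this is blind to dynamic wiring, so treat it as a lower bound on what a change can break.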

Tip 39: Break Down Complex Tasks

Split a large refactor into four phases, each driven by its own prompt:

Initial Analysis

"Analyze the current authentication system and identify areas for improvement"

Create a Plan

"Create a step-by-step plan to migrate from session-based to JWT authentication"

Implement Incrementally

"Step 1: Create new JWT utility functions"
"Step 2: Update user model to support refresh tokens"
"Step 3: Modify login endpoint"

Verify Each Step

"Write tests for the JWT utilities"
"Verify backward compatibility"

Task Breakdown Strategy

Size Limit: Keep each task under 200 lines of changes
Test First: Write tests before implementation
Checkpoint: Commit after each successful step
Rollback Plan: Always have a way to revert
Document: Update CLAUDE.md with decisions
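The size limit is easy to enforce mechanically before each checkpoint commit. A minimal sketch that sums the output of `git diff --numstat`, assuming you pipe or paste that output in as text; the function names and the choice to count added plus deleted lines are assumptions, not a rule from Claude Code:

```python
from typing import Tuple


def changed_lines(numstat: str) -> Tuple[int, int]:
    """Sum added and deleted lines from `git diff --numstat` output."""
    added = deleted = 0
    for row in numstat.splitlines():
        parts = row.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed rows
        plus, minus = parts[0], parts[1]
        if plus == "-" or minus == "-":
            continue  # git prints "-" for binary files
        added += int(plus)
        deleted += int(minus)
    return added, deleted


def within_budget(numstat: str, limit: int = 200) -> bool:
    """True when added plus deleted lines stay under `limit`."""
    added, deleted = changed_lines(numstat)
    return added + deleted < limit


# Made-up numstat output: "<added>\t<deleted>\t<path>" per line.
sample = "120\t30\tsrc/auth/jwt.ts\n-\t-\tdocs/flow.png\n40\t15\tsrc/auth/jwt.test.ts\n"
print(changed_lines(sample))   # (160, 45)
print(within_budget(sample))   # False
```

When the check fails, that is the cue to ask Claude to split the current step in two before committing.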

Tip 40: Provide Code in Focused Chunks

Optimize Claude’s performance with targeted context:

# Less effective: Vague request
"Optimize our application"

# More effective: Focused request
"Optimize the database queries in the UserRepository class"

# Even better: Specific context
"The getUsersWithOrders method in UserRepository has N+1 query issues.
 Optimize it using eager loading."
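To make the N+1 problem in that last prompt concrete, here is a small sketch using Python's built-in sqlite3 module. The schema, the `CountingDb` wrapper, and both function names are invented for illustration; a real `UserRepository` would go through your ORM's eager-loading API instead:

```python
import sqlite3
from typing import Dict, List


class CountingDb:
    """Thin sqlite3 wrapper that counts the queries it runs."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            """
            CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL);
            INSERT INTO users VALUES (1, 'ada'), (2, 'lin'), (3, 'sam');
            INSERT INTO orders VALUES (1, 1, 9.5), (2, 1, 20.0), (3, 2, 5.0);
            """
        )
        self.queries = 0

    def fetch(self, sql: str, params: tuple = ()) -> list:
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def users_with_orders_n_plus_one(db: CountingDb) -> Dict[str, List[float]]:
    """One query for the users, then one more query per user."""
    result: Dict[str, List[float]] = {}
    for user_id, name in db.fetch("SELECT id, name FROM users ORDER BY id"):
        rows = db.fetch(
            "SELECT total FROM orders WHERE user_id = ? ORDER BY id", (user_id,)
        )
        result[name] = [total for (total,) in rows]
    return result


def users_with_orders_eager(db: CountingDb) -> Dict[str, List[float]]:
    """A single LEFT JOIN loads users and their orders together."""
    result: Dict[str, List[float]] = {}
    rows = db.fetch(
        """
        SELECT u.name, o.total
        FROM users u LEFT JOIN orders o ON o.user_id = u.id
        ORDER BY u.id, o.id
        """
    )
    for name, total in rows:
        result.setdefault(name, [])
        if total is not None:  # users without orders come back with a NULL total
            result[name].append(total)
    return result


slow, fast = CountingDb(), CountingDb()
assert users_with_orders_n_plus_one(slow) == users_with_orders_eager(fast)
print(slow.queries, fast.queries)  # 4 1
```

With three users the difference is 4 queries against 1; the gap grows linearly with the number of users, which is why naming the method and the symptom gets a better fix than "optimize our application."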

The real constraint is the context budget, not a line count. On the Anthropic API, Claude Fable 5, Opus 5, and Sonnet 5 carry a 1M-token window; plan and provider access can differ (notably, Opus 1M requires usage credits on Pro, and gateways can budget Sonnet 5 at 200K). Every file you pull in competes for that window, and quality degrades as you fill it with irrelevant code. As a rough rule of thumb:

A focused module or a few related files: name them and let Claude load the lot.
A package spanning many files: scope the prompt to one concern (“the query layer,” “the auth middleware”) so search pulls only what’s relevant.
A whole monorepo: never load it all. Start Claude from inside one package (reserve --add-dir for granting access to an extra directory the task needs), lean on hierarchical CLAUDE.md files, and work module by module.
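Before pointing Claude at a package, it can help to get a rough feel for how much of the window its source would occupy. The sketch below is a back-of-the-envelope estimate only: the four-characters-per-token ratio is a crude assumption (real tokenizers vary with language and content), and the function names, file extensions, and skipped directories are all illustrative choices:

```python
import os
from typing import Iterable

CHARS_PER_TOKEN = 4  # crude assumption; real tokenizers vary by language and content
SKIPPED_DIRS = {"node_modules", ".git", "dist"}


def estimate_tokens(root: str, extensions: Iterable[str] = (".ts", ".tsx")) -> int:
    """Roughly estimate the tokens needed to load matching files under `root`."""
    wanted = tuple(extensions)
    total_chars = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories that rarely belong in context.
        dirnames[:] = [d for d in dirnames if d not in SKIPPED_DIRS]
        for filename in filenames:
            if filename.endswith(wanted):
                path = os.path.join(dirpath, filename)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def share_of_window(tokens: int, window: int) -> float:
    """Fraction of a context window of `window` tokens that `tokens` would fill."""
    return tokens / window
```

For example, `share_of_window(estimate_tokens("packages/backend"), window)` with `window` set to whatever your plan actually provides. If the answer is a large fraction, scope the prompt to one concern instead of the whole package.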

Parallel Development Strategies

Tip 41: Use Multiple Instances for Different Areas

Run parallel Claude Code instances, each scoped to one area of the repo. Start each one in its own terminal with the relevant working directory:

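# Run each command from the repo root, so each session's working directory
# is its own package. (--add-dir grants access to extra directories; it does
# not narrow a session's scope.)

# Terminal 1: Frontend
cd frontend && claude

# Terminal 2: Backend API
cd backend && claude

# Terminal 3: Database migrations
cd database && claude

# Terminal 4: Tests
cd tests && claude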

Then drive each session with a focused prompt, for example “Implement the new user dashboard” in the frontend instance and “Create REST endpoints for the dashboard data” in the backend one.

Benefits of parallel instances:

No Context Switching: Each instance stays focused
Team Simulation: Work like a team of developers
Faster Development: Complete tasks simultaneously
Better Organization: Clear separation of concerns

Tip 42: Leverage Filesystem as Shared Workspace

Use the filesystem as the handoff point between instances. One generates artifacts; the others consume them:

Instance 1: “Generate TypeScript interfaces from our API responses and write them to shared/types/api.types.ts.”
Instance 2: “Create React Query hooks using the types in shared/types/api.types.ts.”
Instance 3: “Document the types in shared/types/ with usage examples.”

Shared Workspace Patterns

project/
├── .claude/
│   ├── generated/    # AI-generated code
│   ├── templates/    # Reference implementations
│   └── workspace/    # Shared working files
├── docs/
│   └── ai-sessions/  # Session documentation
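When a script of yours produces a handoff file (a type generator, say), a consuming instance can catch it half-written. One common guard is to write to a temporary file and rename it into place. A minimal sketch, assuming a POSIX-style filesystem where the rename is atomic; `write_atomically` is a hypothetical helper and says nothing about how Claude Code's own edit tools write files:

```python
import os
import tempfile


def write_atomically(path: str, content: str) -> None:
    """Write `content` to `path` so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # The temp file must sit on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(content)
        os.replace(tmp_path, path)  # swap the finished file into place
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Readers then see either the previous version or the new one, never a truncated mix of the two.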

Study external patterns without copying their source into your repo (which drags in licensing baggage). Have Claude summarize the approach, then design your own implementation from that summary.

Tip 43: Implement Hierarchical CLAUDE.md Files

Structure documentation for large monorepos:

monorepo/
├── CLAUDE.md                    # Global rules and patterns
├── packages/
│   ├── frontend/
│   │   ├── CLAUDE.md           # Frontend-specific
│   │   └── src/
│   │       └── components/
│   │           └── CLAUDE.md   # Component guidelines
│   ├── backend/
│   │   ├── CLAUDE.md           # Backend patterns
│   │   └── src/
│   │       ├── services/
│   │       │   └── CLAUDE.md   # Service patterns
│   │       └── models/
│   │           └── CLAUDE.md   # Data model rules
│   └── shared/
│       └── CLAUDE.md           # Shared code rules
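To audit a layout like this, it helps to list which CLAUDE.md files sit on the path from the repo root to a given file, ordered from general to specific. The sketch below does only that. How Claude Code itself discovers and loads nested files may differ, and `applicable_claude_files` is a hypothetical helper:

```python
from pathlib import Path
from typing import List


def applicable_claude_files(repo_root: Path, target: Path) -> List[Path]:
    """List CLAUDE.md files from the repo root down to `target`'s directory.

    Ordered general to specific, so later entries refine earlier ones.
    Raises ValueError if `target` lies outside `repo_root`.
    """
    repo_root = repo_root.resolve()
    target = target.resolve()
    directory = target if target.is_dir() else target.parent
    relative = directory.relative_to(repo_root)

    found = []
    current = repo_root
    # The empty first part checks the repo root itself before descending.
    for part in ("",) + relative.parts:
        if part:
            current = current / part
        candidate = current / "CLAUDE.md"
        if candidate.is_file():
            found.append(candidate)
    return found
```

For `packages/backend/src/services/user.service.ts` in the tree above, this would return the root file, then the backend file, then the services file. If a rule appears at several levels, that is usually a sign it belongs only in the most general one.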

Example hierarchical documentation:

Root CLAUDE.md:

# Monorepo Overview

## Architecture Principles
- Microservices with shared libraries
- Event-driven communication
- TypeScript throughout

## Global Standards
- Conventional commits
- 100% test coverage for shared code
- No circular dependencies

Service CLAUDE.md:

# User Service

## Responsibilities
- User authentication
- Profile management
- Permission handling

## Dependencies
- Shared auth library
- Database service
- Event bus

## Patterns
- Repository pattern for data access
- JWT for authentication
- Event sourcing for audit

Tip 44: Document Architecture Patterns

Help Claude understand your system design:

# Architecture Documentation

## System Overview
```mermaid
graph TD
    A[API Gateway] --> B[User Service]
    A --> C[Product Service]
    A --> D[Order Service]
    B --> E[PostgreSQL]
    C --> F[MongoDB]
    D --> E
    D --> G[Redis Cache]
    B --> H[Event Bus]
    C --> H
    D --> H
```

## Design Patterns
1. **Repository Pattern**: All database access through repositories
2. **CQRS**: Separate read/write models for complex domains
3. **Event Sourcing**: Audit trail for critical operations
4. **Circuit Breaker**: For external service calls
5. **Saga Pattern**: For distributed transactions

## Communication Patterns
- **Sync**: REST APIs with OpenAPI specs
- **Async**: RabbitMQ for events
- **Real-time**: WebSockets for live updates

## Data Flow
1. Client → API Gateway (authentication)
2. Gateway → Microservice (authorized request)
3. Service → Database (data operation)
4. Service → Event Bus (state change)
5. Event Bus → Other Services (react to changes)

Tip 45: Use Context-Aware Queries

Ask sophisticated questions about code relationships:

# Dependency analysis
"Show me all services that depend on the User model"
"What would break if I change the authenticate method signature?"
"Find circular dependencies in our import structure"

# Performance analysis
"Identify all database queries in hot code paths"
"Find synchronous operations that could be async"
"Show me all uncached expensive computations"

# Security audit
"Find all user input that isn't validated"
"Show me everywhere we construct dynamic SQL"
"Identify exposed sensitive data in API responses"

# Architecture validation
"Verify all services follow our repository pattern"
"Find direct database access outside of repositories"
"Check if any frontend code imports backend modules"

Token and Performance Optimization

Tip 46: Optimize for Token Efficiency

Manage token usage in large projects:

Token-Saving Strategies

# 1. Clear frequently
/clear

# 2. Focus conversations
"Work only on the authentication module"

# 3. Use specific file references
@auth/login.service.ts
# Instead of: "the login service file"

# 4. Compress context
/compact

# 5. Remove unnecessary files
"Ignore test files for this task"

Typical token usage by task:

- Simple bug fix: 2K-5K tokens
- Feature implementation: 10K-20K tokens
- Large refactoring: 50K-100K tokens
- Architecture analysis: 20K-50K tokens

Cost optimization:
- Use Claude Sonnet 5 for routine tasks
- Reserve Claude Opus 5 for complex refactors and architecture analysis
- Use Claude Fable 5 (/model fable) when velocity and quality matter most
- Clear between unrelated tasks
- Focus on specific modules

Tip 47: Create Module-Specific Documentation

Document each major module comprehensively:

# Payment Module Documentation

## Overview
Handles all payment processing including credit cards,
PayPal, and cryptocurrency payments.

## Key Files
- `payment.service.ts` - Main service orchestrator
- `processors/` - Payment processor implementations
- `models/transaction.model.ts` - Transaction data model
- `webhooks/` - Payment provider webhooks

## Critical Business Logic
1. **Retry Logic**: 3 attempts with exponential backoff
2. **Idempotency**: Use transaction_id to prevent duplicates
3. **Audit Trail**: Every operation logged to audit_log table
4. **Refunds**: Max 90 days, requires manager approval

## Integration Points
- User Service: For customer data
- Order Service: For order fulfillment
- Notification Service: For payment receipts
- Audit Service: For compliance logging

## Testing Requirements
- Unit tests: Mock all external providers
- Integration tests: Use sandbox environments
- Load tests: 1000 TPS minimum
- Security tests: PCI compliance required

## Common Issues
1. Webhook timeouts - implement async processing
2. Currency conversion - cache rates for 1 hour
3. Failed payments - clear user communication
4. Partial refunds - complex state management

Tip 48: Use Incremental Refactoring

Approach large refactoring systematically:

Analyze Current State

"Analyze the authentication system and create a refactoring plan"

Create Safety Net

"Write comprehensive tests for current authentication behavior"

Refactor in Small Steps

"Step 1: Extract authentication logic into separate service"
"Step 2: Create interfaces for authentication providers"
"Step 3: Implement JWT provider"
"Step 4: Add OAuth providers"

Maintain Backward Compatibility

"Create adapter layer for old authentication API"

Migration Strategy

"Create migration plan for existing sessions"

Incremental Refactoring Rules

Never break existing functionality
Each step should be deployable
Tests must pass after each change
Document decisions in CLAUDE.md
Keep PRs under 400 lines
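The adapter step is what keeps every intermediate state deployable: old callers keep their interface while the implementation behind it changes. A minimal sketch of the shape, with every class and method name invented for illustration and no real token signing or verification:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class JwtProvider:
    """Stand-in for the new provider; a real one would sign and verify tokens."""

    tokens: Dict[str, str]  # token -> user id

    def verify(self, token: str) -> Optional[str]:
        return self.tokens.get(token)


class LegacySessionAuth:
    """Adapter that keeps the old `get_user(session_id)` call working.

    Callers still pass a session id; the adapter maps it to a token and
    delegates to the new provider.
    """

    def __init__(self, provider: JwtProvider, session_to_token: Dict[str, str]) -> None:
        self._provider = provider
        self._session_to_token = session_to_token

    def get_user(self, session_id: str) -> Optional[str]:
        token = self._session_to_token.get(session_id)
        if token is None:
            return None  # unknown or expired session
        return self._provider.verify(token)


provider = JwtProvider(tokens={"tok-1": "user-42"})
auth = LegacySessionAuth(provider, session_to_token={"sess-a": "tok-1"})
print(auth.get_user("sess-a"))   # user-42
print(auth.get_user("missing"))  # None
```

Once no caller uses `get_user` any more, deleting the adapter is its own small, reviewable step.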

Tip 49: Leverage Pattern Recognition

Use Claude to find patterns and inconsistencies:

# Find inconsistent patterns
"Find all different error handling patterns in our codebase"
"Show me all the different ways we're validating email addresses"
"Identify inconsistent naming conventions"

# Locate duplicate code
"Find similar code patterns that could be refactored"
"Show me duplicate business logic across services"

# Architecture violations
"Find all places where the presentation layer directly accesses the database"
"Show me services calling other services synchronously"

# Performance patterns
"Find all N+1 query patterns"
"Locate all synchronous I/O in request handlers"
"Show me all uncached database queries"

A security sweep is one of the highest-leverage uses of pattern recognition on a large codebase. A single prompt can surface every risky query construction at once:

A sweep like this typically surfaces a handful of distinct query-building styles and points you straight at the ones that interpolate untrusted input, instead of you auditing thousands of call sites by hand.
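A cheap textual scan makes a useful second opinion on such a sweep. The sketch below flags two risky shapes in TypeScript source: a template literal that interpolates into SQL, and a SQL string joined with `+`. The regular expressions are deliberately naive assumptions (a sentence containing the word "update" can trigger a false positive, and multi-line queries are missed), so it supplements the sweep and does not replace it:

```python
import re
from typing import List, Tuple

SQL_KEYWORD = r"\b(?:SELECT|INSERT|UPDATE|DELETE)\b"

# Template literal containing a SQL keyword followed by a ${...} interpolation.
TEMPLATE_SQL = re.compile(r"`[^`]*" + SQL_KEYWORD + r"[^`]*\$\{[^`]*`", re.IGNORECASE)


def _concat(quote: str) -> str:
    """Pattern for a quoted string containing a SQL keyword, followed by `+`."""
    return f"{quote}[^{quote}]*{SQL_KEYWORD}[^{quote}]*{quote}\\s*\\+"


CONCAT_SQL = re.compile(_concat('"') + "|" + _concat("'"), re.IGNORECASE)


def risky_sql_lines(source: str) -> List[Tuple[int, str]]:
    """Return (line number, stripped line) for lines that build SQL from values."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if TEMPLATE_SQL.search(line) or CONCAT_SQL.search(line):
            hits.append((number, line.strip()))
    return hits


# Made-up TypeScript lines: 1 and 3 interpolate input, 2 is parameterized.
sample = """const a = db.query(`SELECT * FROM users WHERE id = ${id}`);
const b = db.query("SELECT * FROM users WHERE id = $1", [id]);
const c = db.query("SELECT * FROM users WHERE name = '" + name + "'");
const d = `hello ${name}`;
"""
print([number for number, _ in risky_sql_lines(sample)])  # [1, 3]
```

Anything the scan flags that the sweep missed (or the reverse) is worth a follow-up question.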

Tip 50: Implement Codebase Exploration Workflows

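Develop systematic approaches for understanding large codebases. The three checklists below cover, in order, onboarding onto a new project, investigating a single feature, and debugging: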

# 1. High-level understanding
"What does this project do? Explain the main purpose and architecture"

# 2. Identify entry points
"Show me the main entry points for this application"

# 3. Trace critical paths
"Trace the flow of a user login request"

# 4. Understand data model
"Explain the core data models and their relationships"

# 5. Identify key patterns
"What design patterns are used in this codebase?"

# 1. Locate feature code
"Where is the payment processing implemented?"

# 2. Understand dependencies
"What does the payment system depend on?"

# 3. Find related tests
"Show me all tests for payment processing"

# 4. Check documentation
"Is there documentation for the payment system?"

# 5. Identify edge cases
"What edge cases does the payment system handle?"

# 1. Reproduce understanding
"Explain how the user authentication flow works"

# 2. Locate problem area
"Where might a login failure occur?"

# 3. Check recent changes
"What changed recently in authentication?"

# 4. Find similar issues
"Are there similar patterns that might have the same bug?"

# 5. Propose fixes
"Suggest fixes for the authentication timeout issue"

Best Practices for Large Codebases

Large Codebase Checklist

Use hierarchical CLAUDE.md files
Run multiple parallel instances
Focus on specific modules per session
Clear context between unrelated tasks
Document architectural decisions
Create systematic exploration workflows
Use incremental refactoring approaches
Leverage pattern recognition
Maintain comprehensive tests
Monitor token usage with /cost

Key principles for success:

Think in Systems: Understand relationships and dependencies
Work Incrementally: Small, verified changes
Maintain Context: Use CLAUDE.md files effectively
Leverage Parallelism: Multiple instances for different concerns
Trust the Search: Let Claude find patterns you might miss

When This Breaks

Large codebases fail in specific, recognizable ways. Here’s what to watch for and how to recover.

Context-window overflow on a giant file. You ask Claude to edit a 20k-line file and it loses track, edits the wrong block, or truncates. Recovery: stop loading the whole file. Anchor the change to a function name and line range, or ask for the target region first (“show me just handleSubmit”), then edit that slice in a follow-up.

Agentic search misses dynamically-referenced code. Search finds static imports but not handlers wired up by string keys, reflection, or a registry built at runtime, so a “find every caller” sweep comes back incomplete. Recovery: name the indirection explicitly (“we register routes by string in router.config.ts; trace those too”) and cross-check with a literal grep before you trust the list for a risky refactor.
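The literal cross-check can be scripted: collect every file that mentions the name as plain text and subtract the files Claude already reported. A minimal sketch with hypothetical helper names and an assumed set of file extensions:

```python
import os
from typing import Iterable, Set


def files_containing(
    root: str, needle: str, extensions: Iterable[str] = (".ts", ".tsx", ".json")
) -> Set[str]:
    """Relative POSIX-style paths of files under `root` whose text contains `needle`."""
    wanted = tuple(extensions)
    matches: Set[str] = set()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in {"node_modules", ".git"}]
        for filename in filenames:
            if not filename.endswith(wanted):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                if needle in handle.read():
                    matches.add(os.path.relpath(path, root).replace(os.sep, "/"))
    return matches


def missed_by_report(root: str, needle: str, reported: Iterable[str]) -> Set[str]:
    """Files that mention `needle` literally but are absent from the reported list."""
    return files_containing(root, needle) - set(reported)
```

A non-empty result does not prove a missed caller, since the match may sit in a comment, but each file in it deserves a look before the refactor lands.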

Parallel instances clobber a shared file. Two sessions both edit shared/types/api.types.ts and the second silently overwrites the first. Recovery: give each instance a non-overlapping directory scope, commit (or stash) between handoffs so the filesystem is the single source of truth, and never let two instances own the same file at once.
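Non-overlapping scopes can be checked before you start the sessions. A minimal sketch, assuming you keep a small mapping from instance label to the repo-relative directories it owns; the mapping format and `overlapping_scopes` are hypothetical, not a Claude Code feature:

```python
from itertools import combinations
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def overlapping_scopes(scopes: Dict[str, List[str]]) -> List[Tuple[str, str, str, str]]:
    """Find pairs of instances whose directory scopes overlap.

    `scopes` maps an instance label to the repo-relative directories it owns.
    Returns (instance_a, dir_a, instance_b, dir_b) for every conflict.
    """
    conflicts = []
    for (name_a, dirs_a), (name_b, dirs_b) in combinations(scopes.items(), 2):
        for dir_a in dirs_a:
            for dir_b in dirs_b:
                a, b = PurePosixPath(dir_a), PurePosixPath(dir_b)
                # Equal paths, or one nested inside the other, count as overlap.
                if a == b or a in b.parents or b in a.parents:
                    conflicts.append((name_a, dir_a, name_b, dir_b))
    return conflicts


plan = {
    "frontend": ["frontend", "shared/types"],
    "backend": ["backend"],
    "docs": ["shared"],
}
print(overlapping_scopes(plan))
# [('frontend', 'shared/types', 'docs', 'shared')]
```

Here the docs instance owns all of `shared`, which swallows the frontend's `shared/types`, so one of the two needs a narrower scope.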

Stale CLAUDE.md drift. A CLAUDE.md still describes the old session-based auth after you migrated to JWT, so Claude follows instructions that no longer match reality. Recovery: treat CLAUDE.md as code, review it in PRs, and periodically ask Claude to reconcile it (“compare the auth section of this CLAUDE.md against the actual auth/ package and flag anything out of date”).
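One kind of drift is mechanical enough to catch in CI: file paths a CLAUDE.md mentions that no longer exist. The sketch below checks only backticked, path-like tokens relative to the file's own directory. That convention and the `missing_paths` helper are assumptions, and it cannot detect semantic drift such as an outdated description of the auth flow:

```python
import re
from pathlib import Path
from typing import List

BACKTICKED = re.compile(r"`([^`\s]+)`")


def missing_paths(claude_md: Path) -> List[str]:
    """Return backticked path-like references in `claude_md` that no longer exist.

    Paths are resolved relative to the directory holding the CLAUDE.md file.
    Only tokens containing "/" or ending in a file extension count as paths.
    """
    base = claude_md.parent
    missing = []
    for token in BACKTICKED.findall(claude_md.read_text(encoding="utf-8")):
        looks_like_path = "/" in token or re.search(r"\.\w+$", token) is not None
        if looks_like_path and not (base / token).exists():
            missing.append(token)
    return missing
```

Run it over every CLAUDE.md in the repo and fail the build on a non-empty result, then leave the judgment calls to the reconciliation prompt.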

The repo is too big to reason about at all. Even scoped prompts wander because the package itself is a tangle. Recovery: don’t ask for a change, ask for a map first (“produce a dependency diagram of this package and identify the three highest-coupling modules”), then refactor against that map one module at a time.

Next Steps

With strategies for large codebases in hand, the next lever is your day-to-day loop. Continue to Workflow Optimization to turn these one-off techniques into repeatable habits, then see Performance and Cost Management for keeping token spend in check at scale.