Codebase Exploration & Analysis

Imagine joining a new team and facing a 100,000-line codebase with cryptic variable names, unclear architecture, and minimal documentation. Exploration that used to take weeks of painful digging can now take hours with Claude Code. This lesson turns you into a codebase archaeologist, uncovering hidden patterns and making sense of complex systems at terminal velocity.

Scenario: You’ve just inherited a legacy e-commerce platform. The original team is gone, documentation is sparse, and you need to add a new payment provider by Friday. Where do you even begin?

  1. Launch Claude Code at the project root

    cd legacy-ecommerce-platform
    claude
  2. Get the lay of the land

    > What does this project do? Give me a high-level overview
  3. Understand the architecture

    > Explain the folder structure and main architecture patterns used
  4. Find the entry points

    > Where is the main entry point? How does the application start?
  5. Map the critical paths

    > Trace the flow of a user making a purchase from start to finish

Within minutes, Claude provides:

  • Project purpose and main features
  • Technology stack identification
  • Architectural patterns (MVC, microservices, etc.)
  • Key directories and their responsibilities
  • Data flow diagrams
  • External dependencies and integrations

Start broad, then drill down:

> Give me a comprehensive overview of this codebase, including:
> - Main technologies and frameworks
> - Project structure and organization
> - Key business domains
> - External integrations
> - Testing approach

Claude analyzes the entire project and provides a structured report:

## Project Overview: Legacy E-Commerce Platform
### Technology Stack
- Backend: Node.js with Express 4.x
- Frontend: React 16.x with Redux
- Database: PostgreSQL 12, Redis for caching
- Payment: Stripe, PayPal integrations
- Infrastructure: Docker, AWS deployment
### Architecture
- Monolithic with service-oriented modules
- RESTful API with JWT authentication
- Server-side rendering for SEO
- Event-driven order processing
### Key Domains
1. **User Management** (`/src/auth`, `/src/users`)
2. **Product Catalog** (`/src/products`, `/src/inventory`)
3. **Order Processing** (`/src/orders`, `/src/payments`)
4. **Fulfillment** (`/src/shipping`, `/src/warehouses`)
### Critical Issues Found
- Outdated dependencies (security vulnerabilities)
- Inconsistent error handling
- Missing test coverage in payment module

When searching for specific functionality:

> I need to find where discount codes are validated.
> Search for discount, coupon, promo code logic

Claude uses its agentic search capabilities to:

  1. Find all relevant files containing discount logic
  2. Trace the execution path
  3. Identify edge cases and business rules
  4. Show related database schemas
  5. Highlight potential issues

Scenario: You need to upgrade a critical library, but you’re unsure what will break.

> Analyze all dependencies of the authentication module.
> Show me what depends on it and what it depends on

Claude generates a comprehensive dependency graph:

Direct Dependencies

  • bcrypt - Password hashing
  • jsonwebtoken - JWT generation
  • passport - Authentication strategies
  • redis - Session storage

Internal Dependents

  • /api/routes/* - All API endpoints
  • /middleware/auth.js - Auth middleware
  • /services/user.js - User service
  • /workers/session-cleanup.js - Background jobs

Database Dependencies

  • users table - User data
  • sessions table - Active sessions
  • auth_logs table - Security audit
  • permissions table - Role-based access

External Integrations

  • OAuth providers (Google, GitHub)
  • Email service for password resets
  • SMS service for 2FA
  • Audit logging service
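The "what depends on it" half of a report like this can be cross-checked by walking the import graph in reverse. The sketch below is not how Claude performs its analysis. It assumes you already have a map from each file to the files it imports directly; the file names and the `imports` data are hypothetical, loosely modeled on the lists above.

```javascript
// Hypothetical import map: file -> files it imports directly.
const imports = {
  'middleware/auth.js': ['auth/index.js'],
  'services/user.js': ['auth/index.js'],
  'api/routes/orders.js': ['middleware/auth.js', 'services/user.js'],
  'workers/session-cleanup.js': ['auth/index.js'],
  'api/routes/health.js': [],
};

// Invert the map so each file points at the files that import it.
function buildReverseGraph(importMap) {
  const reverse = new Map();
  for (const [file, deps] of Object.entries(importMap)) {
    for (const dep of deps) {
      if (!reverse.has(dep)) reverse.set(dep, []);
      reverse.get(dep).push(file);
    }
  }
  return reverse;
}

// Breadth-first walk collecting every direct and transitive dependent.
function findDependents(importMap, target) {
  const reverse = buildReverseGraph(importMap);
  const seen = new Set();
  const queue = [target];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const dependent of reverse.get(current) ?? []) {
      if (!seen.has(dependent)) {
        seen.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return [...seen].sort();
}

const authDependents = findDependents(imports, 'auth/index.js');
console.log(authDependents);
```

Here `api/routes/orders.js` shows up even though it never imports the auth module directly, which is the kind of indirect dependent that is easy to miss when upgrading a library.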

Before making changes:

> If I change the user authentication to use OAuth only,
> what parts of the codebase would be affected?

Claude provides:

  • List of affected files and functions
  • Required database migrations
  • API contract changes
  • Frontend components needing updates
  • Test suites requiring modification

> Analyze this codebase for common anti-patterns and code smells.
> Focus on maintainability issues

Claude’s analysis might reveal:

// 1. God Object - OrderController doing too much
class OrderController {
  // 2000+ lines handling:
  // - Order creation
  // - Payment processing
  // - Inventory management
  // - Email notifications
  // - Analytics tracking
}
// 2. Copy-Paste Programming
// Found 15 instances of nearly identical error handling
// 3. Magic Numbers
if (order.total > 1000) { // What does 1000 represent?
  applyDiscount(0.1); // Why 10%?
}
// 4. Callback Hell
getUserData((user) => {
  getOrderHistory(user.id, (orders) => {
    calculateLoyaltyPoints(orders, (points) => {
      updateUserProfile(user.id, points, (result) => {
        // ... deeper nesting
      });
    });
  });
});
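Two of these smells have fairly mechanical fixes. The following is a minimal sketch, not the refactoring Claude would necessarily propose: the constant names are hypothetical, the four callback-style functions are stubbed so the example runs on its own, and the callbacks are assumed to receive only a result (no error argument), as in the example above.

```javascript
// Named constants replace the magic numbers. The values come from the example
// above; the names are hypothetical and should reflect the real business rule.
const BULK_ORDER_THRESHOLD = 1000;
const BULK_ORDER_DISCOUNT_RATE = 0.1;

function discountFor(order) {
  return order.total > BULK_ORDER_THRESHOLD
    ? order.total * BULK_ORDER_DISCOUNT_RATE
    : 0;
}

// Wrap a callback-style function so it returns a Promise instead.
const toPromise = (fn) => (...args) =>
  new Promise((resolve) => fn(...args, resolve));

// Hypothetical stand-ins for the callback-style functions in the example above.
function getUserData(cb) { cb({ id: 7 }); }
function getOrderHistory(userId, cb) { cb([{ total: 1200 }, { total: 300 }]); }
function calculateLoyaltyPoints(orders, cb) {
  cb(orders.reduce((sum, order) => sum + Math.floor(order.total / 100), 0));
}
function updateUserProfile(userId, points, cb) { cb({ userId, points }); }

// The same four steps as the nested version, flattened with async/await.
async function refreshLoyaltyPoints() {
  const user = await toPromise(getUserData)();
  const orders = await toPromise(getOrderHistory)(user.id);
  const points = await toPromise(calculateLoyaltyPoints)(orders);
  return toPromise(updateUserProfile)(user.id, points);
}

refreshLoyaltyPoints().then((result) => console.log(result));
```

If the real functions follow Node's error-first callback convention, `util.promisify` from the standard library is the better wrapper, since it also turns errors into rejections.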

> Identify potential performance bottlenecks in database queries

Claude examines your codebase and finds:

// ISSUE 1: N+1 Query Problem
const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [userId]);
for (const order of orders) {
  // This runs a query for EACH order!
  order.items = await db.query('SELECT * FROM order_items WHERE order_id = ?', [order.id]);
}
// CLAUDE'S SUGGESTION: Use a single join query
// (json_agg is PostgreSQL-specific; with a LEFT JOIN, orders without items come back as [null])
const ordersWithItems = await db.query(`
  SELECT o.*,
         json_agg(oi.*) AS items
  FROM orders o
  LEFT JOIN order_items oi ON o.id = oi.order_id
  WHERE o.user_id = ?
  GROUP BY o.id
`, [userId]);
// ISSUE 2: Missing Indexes
// Found queries filtering on unindexed columns:
await db.query('SELECT * FROM products WHERE sku = ?', [sku]); // 'sku' not indexed
await db.query('SELECT * FROM users WHERE email = ?', [email]); // 'email' not indexed
// ISSUE 3: Loading Unnecessary Data
const users = await db.query('SELECT * FROM users'); // Loading all columns
// But only using: user.id, user.name, user.email
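When a single join is awkward, another common fix for N+1 is to fetch all child rows in one second query and group them in application code. This is a minimal sketch of that alternative, assuming `db.query(sql, params)` resolves to an array of row objects as in the snippet above. The `attachItems` helper and the `fakeDb` stub are hypothetical and exist only so the example runs on its own.

```javascript
// Group a flat list of order_items rows under their parent orders.
function attachItems(orders, items) {
  const itemsByOrderId = new Map();
  for (const item of items) {
    if (!itemsByOrderId.has(item.order_id)) itemsByOrderId.set(item.order_id, []);
    itemsByOrderId.get(item.order_id).push(item);
  }
  // Orders with no items get an empty array rather than undefined.
  return orders.map((order) => ({
    ...order,
    items: itemsByOrderId.get(order.id) ?? [],
  }));
}

// Two queries in total, regardless of how many orders the user has.
async function loadOrdersWithItems(db, userId) {
  const orders = await db.query('SELECT * FROM orders WHERE user_id = ?', [userId]);
  if (orders.length === 0) return [];
  const orderIds = orders.map((order) => order.id);
  const placeholders = orderIds.map(() => '?').join(', ');
  const items = await db.query(
    `SELECT * FROM order_items WHERE order_id IN (${placeholders})`,
    orderIds,
  );
  return attachItems(orders, items);
}

// Hypothetical in-memory stand-in for the database, used only to run the sketch.
const fakeDb = {
  calls: 0,
  async query(sql) {
    this.calls += 1;
    if (sql.includes('FROM orders')) {
      return [{ id: 1, user_id: 42 }, { id: 2, user_id: 42 }];
    }
    return [{ id: 10, order_id: 1, sku: 'A' }, { id: 11, order_id: 1, sku: 'B' }];
  },
};

loadOrdersWithItems(fakeDb, 42).then((result) => {
  console.log(result.map((order) => order.items.length), fakeDb.calls);
});
```

Only the placeholders are interpolated into the SQL text; the order IDs themselves still travel as parameters.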

> Check for potential memory leaks or inefficient memory usage

Claude identifies problematic patterns:

// ISSUE: Event listeners not cleaned up
class OrderMonitor {
  constructor() {
    // Listeners added but never removed
    eventBus.on('order.created', this.handleOrder);
    eventBus.on('order.updated', this.handleOrder);
  }
  // Missing cleanup method
}
// ISSUE: Unbounded cache growth
const cache = {};
function cacheResult(key, value) {
  cache[key] = value; // Cache grows forever!
}
// ISSUE: Large arrays kept in memory
let allOrdersEver = []; // Grows indefinitely
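The first two patterns can be addressed along the following lines. This is a sketch under assumptions rather than a prescribed fix: it treats `eventBus` as a Node.js `EventEmitter`, and the `dispose` method, the `BoundedCache` class, and its `maxEntries` parameter are hypothetical names introduced for illustration.

```javascript
const { EventEmitter } = require('events');

// Stand-in for the application's event bus (assumed to be an EventEmitter).
const eventBus = new EventEmitter();

class OrderMonitor {
  constructor(bus) {
    this.bus = bus;
    this.seen = 0;
    // Bind once and keep the reference so the same function can be removed later.
    this.handleOrder = this.handleOrder.bind(this);
    this.bus.on('order.created', this.handleOrder);
    this.bus.on('order.updated', this.handleOrder);
  }

  handleOrder() {
    this.seen += 1;
  }

  // Cleanup method: call when the monitor is no longer needed.
  dispose() {
    this.bus.off('order.created', this.handleOrder);
    this.bus.off('order.updated', this.handleOrder);
  }
}

// Bounded cache: evicts the least recently used entry once maxEntries is exceeded.
// Relies on Map preserving insertion order.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
}

const monitor = new OrderMonitor(eventBus);
eventBus.emit('order.created', { id: 1 });
monitor.dispose();
eventBus.emit('order.updated', { id: 1 }); // no longer handled
console.log(monitor.seen, eventBus.listenerCount('order.created'));
```

The third pattern, an array that accumulates every order, usually calls for pagination or streaming rather than a data-structure change, so it is not covered here.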

Scenario: Your platform uses Node.js for the API, Python for data processing, and Go for real-time services.

> This project uses multiple languages. Give me an overview of:
> - Which language handles what functionality
> - How services communicate
> - Data flow between different language boundaries

Claude maps the polyglot architecture:

Architecture Overview:
Node.js API (./api/)
├── REST endpoints
├── GraphQL server
└── WebSocket handlers
        ↓ Redis Queue / RabbitMQ
Python Workers (./workers/)
├── Data pipeline
├── ML predictions
└── Report generation
        ↓ PostgreSQL / S3
Go Services (./services/)
├── Real-time notifications
├── WebRTC signaling
└── High-frequency trading
Communication:
- HTTP/REST between frontend and Node.js
- Message queues between Node.js and Python
- gRPC between Python and Go services
- Shared PostgreSQL for persistent data
- Redis for caching and pub/sub

> Create comprehensive documentation for the order processing module.
> Include architecture decisions, flow diagrams, and API documentation

Claude generates:

# Order Processing Module
## Overview
The order processing module handles the complete lifecycle of customer orders,
from cart to delivery.
## Architecture Decisions
### Event-Driven Design
**Decision**: Use event-driven architecture for order state changes
**Rationale**:
- Decouples order processing from other systems
- Enables real-time notifications
- Supports eventual consistency
- Allows for easy addition of new order events
### State Machine Pattern
**Decision**: Implement orders as a finite state machine
**States**: pending → confirmed → processing → shipped → delivered
**Benefits**:
- Clear state transitions
- Prevents invalid state changes
- Easier testing and debugging
## Component Diagram
```mermaid
graph TD
    A[Order API] --> B[Order Service]
    B --> C[State Machine]
    B --> D[Event Publisher]
    D --> E[Notification Service]
    D --> F[Inventory Service]
    D --> G[Payment Service]
    C --> H[(Order Database)]
```
## API Endpoints
- POST /orders - Create new order
- GET /orders/:id - Get order details
- PATCH /orders/:id/status - Update order status
- POST /orders/:id/cancel - Cancel order
## Flow Diagram
```mermaid
sequenceDiagram
    participant C as Customer
    participant API as Order API
    participant OS as Order Service
    participant PS as Payment Service
    participant IS as Inventory Service
    participant NS as Notification Service
    C->>API: Create Order
    API->>OS: Process Order
    OS->>PS: Charge Payment
    PS-->>OS: Payment Result
    alt Payment Success
        OS->>IS: Reserve Inventory
        IS-->>OS: Inventory Reserved
        OS->>NS: Send Confirmation
        OS-->>API: Order Confirmed
        API-->>C: Success Response
    else Payment Failed
        OS->>NS: Send Failure Notice
        OS-->>API: Order Failed
        API-->>C: Error Response
    end
```
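The two architecture decisions in the generated documentation above (a finite state machine that publishes events on each change) can be sketched in a few lines. This is an illustration only, not the module's actual implementation: the class name, the `order.state_changed` event name, and the payload shape are assumptions, and cancellation is left out because the documented state list does not say where it fits.

```javascript
const { EventEmitter } = require('events');

// Allowed transitions, taken from the documented sequence:
// pending -> confirmed -> processing -> shipped -> delivered
const TRANSITIONS = {
  pending: ['confirmed'],
  confirmed: ['processing'],
  processing: ['shipped'],
  shipped: ['delivered'],
  delivered: [],
};

class OrderStateMachine {
  constructor(orderId, events) {
    this.orderId = orderId;
    this.events = events;
    this.state = 'pending';
  }

  canTransitionTo(next) {
    return TRANSITIONS[this.state].includes(next);
  }

  transitionTo(next) {
    if (!this.canTransitionTo(next)) {
      throw new Error(`Invalid transition: ${this.state} -> ${next}`);
    }
    const previous = this.state;
    this.state = next;
    // Publish the change so other services can react without being called directly.
    this.events.emit('order.state_changed', {
      orderId: this.orderId,
      from: previous,
      to: next,
    });
  }
}

const events = new EventEmitter();
const log = [];
events.on('order.state_changed', (change) => log.push(`${change.from}->${change.to}`));

const order = new OrderStateMachine('ord_1', events);
order.transitionTo('confirmed');
order.transitionTo('processing');
try {
  order.transitionTo('delivered'); // skips "shipped", so it is rejected
} catch (err) {
  log.push(err.message);
}
console.log(log);
```

Keeping the transition table as plain data makes the "prevents invalid state changes" benefit easy to test: every disallowed pair is simply absent from the table.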

Set up Claude to maintain documentation:

> Create a CLAUDE.md file that captures the essential knowledge
> about this codebase for future developers

Claude generates a comprehensive CLAUDE.md:

# E-Commerce Platform Knowledge Base
## Quick Start
- `npm install` - Install dependencies
- `docker-compose up` - Start all services
- `npm run dev` - Start development server
- `npm test` - Run test suite
## Architecture Overview
Monolithic Node.js application with:
- Express.js API
- React frontend
- PostgreSQL database
- Redis caching
- Docker deployment
## Key Business Logic
### Order Processing
Orders follow a state machine pattern:
- Customer adds items to cart
- Checkout validates inventory
- Payment processed via Stripe/PayPal
- Order confirmed and inventory reserved
- Fulfillment triggered
- Shipping updates via webhooks
### Pricing Engine
- Base prices in products table
- Discounts applied via rules engine
- Tax calculated based on shipping address
- Currency conversion via exchange rate service
## Common Tasks
### Adding a Payment Provider
1. Implement provider interface in `/src/payments/providers/`
2. Add configuration to `/config/payments.js`
3. Update PaymentService to include new provider
4. Add provider-specific webhook handler
5. Update frontend payment options
### Debugging Order Issues
1. Check order state in database
2. Review order_events table for history
3. Check payment provider dashboard
4. Review logs: `docker logs ecommerce-api`
## Known Issues
- Memory leak in report generation (workaround: restart worker daily)
- Race condition in inventory reservation (use database locks)
- Slow product search (needs Elasticsearch integration)
## Testing
- Unit tests: `npm run test:unit`
- Integration tests: `npm run test:integration`
- E2E tests: `npm run test:e2e`
- Load tests: `npm run test:load`
## Deployment
- Staging: Automatic on merge to develop
- Production: Manual approval required
- Rollback: `npm run deploy:rollback`
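The "Adding a Payment Provider" steps in the CLAUDE.md above could translate into something like the following sketch. Every name here (`PaymentProvider`, `AcmePayProvider`, the `charge` signature, the `register` method) is hypothetical. The real provider interface lives in the codebase's `/src/payments/providers/` directory and should be read from there before writing any code.

```javascript
// Hypothetical provider contract: each provider has a name and a charge() method.
class PaymentProvider {
  constructor(name) {
    this.name = name;
  }

  async charge(amountCents, currency) {
    throw new Error(`${this.name} does not implement charge()`);
  }
}

// Step 1: implement the provider interface for the new vendor.
class AcmePayProvider extends PaymentProvider {
  constructor(config) {
    super('acmepay');
    // Step 2: configuration is injected rather than hard-coded.
    this.apiKey = config.apiKey;
  }

  async charge(amountCents, currency) {
    // A real provider would call the vendor's API here.
    return { provider: this.name, status: 'succeeded', amountCents, currency };
  }
}

// Step 3: the service looks providers up by name, so adding one is a registration.
class PaymentService {
  constructor() {
    this.providers = new Map();
  }

  register(provider) {
    this.providers.set(provider.name, provider);
  }

  async charge(providerName, amountCents, currency) {
    const provider = this.providers.get(providerName);
    if (!provider) {
      throw new Error(`Unknown payment provider: ${providerName}`);
    }
    return provider.charge(amountCents, currency);
  }
}

const paymentService = new PaymentService();
paymentService.register(new AcmePayProvider({ apiKey: 'test-key' }));
paymentService.charge('acmepay', 2599, 'USD').then((receipt) => console.log(receipt));
```

Webhook handling and frontend changes (steps 4 and 5) depend too heavily on the existing code to sketch generically.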

> Perform a security audit of this codebase. Look for:
> - SQL injection vulnerabilities
> - XSS possibilities
> - Authentication bypasses
> - Sensitive data exposure
> - Outdated dependencies with known vulnerabilities

In its security analysis, Claude provides the following for each issue:

  • Exact location in code
  • Severity assessment
  • Proof of concept
  • Remediation steps
  • Prevention strategies
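As an illustration of the first item in the audit prompt, the sketch below contrasts a query built by string concatenation with a parameterized one. It assumes the same `db.query(sql, params)` shape and `?` placeholders used in the earlier snippets. `recordingDb` is a hypothetical stub that only records what it is asked to run, so nothing here touches a real database.

```javascript
// Hypothetical stub: records each query instead of executing it.
const recordingDb = {
  calls: [],
  async query(sql, params = []) {
    this.calls.push({ sql, params });
    return [];
  },
};

// VULNERABLE: user input becomes part of the SQL text.
function findUserUnsafe(db, email) {
  return db.query(`SELECT id, name FROM users WHERE email = '${email}'`);
}

// SAFER: the SQL text is constant; the input travels separately as a parameter.
function findUserSafe(db, email) {
  return db.query('SELECT id, name FROM users WHERE email = ?', [email]);
}

const malicious = "x' OR '1'='1";
findUserUnsafe(recordingDb, malicious);
findUserSafe(recordingDb, malicious);

console.log(recordingDb.calls[0].sql);
console.log(recordingDb.calls[1].sql, recordingDb.calls[1].params);
```

In the unsafe version the attacker's quote characters change the structure of the `WHERE` clause. In the safe version the same string is only ever treated as a value.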

Instead of diving into code randomly:

> What are the most important parts of this codebase to understand first?

Begin broad, then narrow:

  1. Overall architecture
  2. Key subsystems
  3. Specific modules
  4. Individual functions

> Based on my analysis, here's how I think the payment flow works: [description].
> Is this correct? What am I missing?

> Create a diagram showing how these components interact

> What coding patterns and conventions does this team follow?

Scenario: Your company’s system spans 12 repositories.

# Give Claude access to additional working directories
claude --add-dir ../user-service ../payment-service ../notification-service

Then:

> How do these three services communicate? Trace a user registration
> flow across all three repositories

Claude provides:

  • Service interaction diagrams
  • API contracts between services
  • Shared data models
  • Message queue flows
  • Common libraries and dependencies

You’ve learned how to navigate and understand complex codebases with Claude Code. This skill is fundamental, whether you’re joining a new team, inheriting a project, or trying to optimize existing systems.

Remember: Claude Code isn’t just a search tool. It’s your intelligent guide through the labyrinth of code, helping you understand not just what the code does, but why it was written that way and how it all fits together. Use these techniques to become productive in new codebases in hours instead of weeks.