Skip to content

CLI Debugging Workflow

Debugging is where developers spend half their lives. That cryptic error message at 2 AM. The bug that only happens in production. The “it works on my machine” nightmare. Claude Code transforms debugging from a frustrating treasure hunt into a systematic investigation with an AI detective by your side.

Scenario: Your production API is throwing intermittent 500 errors. Users are complaining, your monitoring is going crazy, and the error logs show a cryptic “TypeError: Cannot read property ‘id’ of undefined” somewhere deep in your authentication middleware. The stack trace spans 15 files. Where do you even start?

Terminal window
# Hours of manual investigation
grep -r "Cannot read property 'id'" .
# 47 results...
# Add console.logs everywhere
console.log('HERE 1');
console.log('user:', user);
console.log('HERE 2');
# Deploy to staging, test, repeat
# Eventually find the issue after 3 hours
  1. Configure verbose logging
    Terminal window
    # Enable Claude's debug mode
    claude --debug
    # Or set environment variable
    export ANTHROPIC_LOG=debug
2. **Set up error capture**
```
> Create a debug helper that captures and formats errors
> for analysis. Include stack traces, environment info,
> and recent log entries
```
3. **Install debugging MCP servers**
```bash
# For log analysis
claude mcp add logs-analyzer
# For system monitoring
claude mcp add system-monitor
```
4. **Create debugging commands**
Save in `.claude/commands/debug-issue.md`:
```markdown
Analyze this error systematically:
1. Parse the error message and stack trace
2. Identify the root cause location
3. Trace through the execution path
4. Find related code and dependencies
5. Suggest specific fixes
6. Create a test to prevent regression
Error details: $ARGUMENTS
```
## The Systematic Debugging Workflow
### Phase 1: Information Gathering
<CardGrid>
<Card title="Error Context">
- Full error message - Complete stack trace - Environment details - Recent code changes - User
actions that trigger it
</Card>
<Card title="System State">
- Memory usage - CPU load - Database connections - External service status - Recent deployments
</Card>
<Card title="Historical Data">
- When it first appeared - Frequency patterns - Affected users - Related errors - Previous fixes
attempted
</Card>
<Card title="Code Context">
- Recent commits - Dependencies updated - Configuration changes - Feature flags modified -
Related pull requests
</Card>
</CardGrid>
### Phase 2: AI-Powered Analysis
```
> Analyze this production error. Stack trace attached.
> It happens when users with expired sessions try to
> access protected routes. Started after yesterday's deploy.
```
Claude's systematic approach:
1. **Parse the Error**
```javascript
// Claude identifies: TypeError in auth.middleware.js:47
// Attempting to access user.id when user is null
```
2. **Trace Execution Path**
```
Request → AuthMiddleware → SessionValidator → UserLoader
Session expired
Returns null
Middleware continues
Accesses null.id → ERROR
```
3. **Identify Root Cause**
```javascript
// Found: Race condition between session expiry and user loading
// Session expires AFTER validation but BEFORE user fetch completes
```
4. **Provide Fix**
```javascript
// Before (buggy):
const session = await validateSession(token);
const user = await loadUser(session.userId);
// Race condition window here!
req.user = user;
// After (fixed):
const session = await validateSession(token);
if (!session || session.isExpired()) {
return res.status(401).json({ error: 'Session expired' });
}
const user = await loadUser(session.userId);
if (!user) {
return res.status(401).json({ error: 'User not found' });
}
req.user = user;
```
## Real-World Debugging Scenarios
### Scenario 1: The Heisenbug
_The bug that disappears when you try to observe it._
```
> We have a bug that only happens in production, never in dev.
> Users report "undefined is not a function" but I can't reproduce it.
> Here are the Sentry logs and user reports.
```
<Tabs>
<TabItem label="Claude's Investigation">
```javascript
// Claude analyzes differences between environments
Production: NODE_ENV=production, Minified code, CDN assets
Development: NODE_ENV=development, Source maps, Local assets
// Discovers: Minification renamed a function that's called by string
// In dev: window['calculateTotal']() works
// In prod: window['a3f']() fails (minified name)
// Solution: Configure minifier to preserve function names
terserOptions: {
keep_fnames: true,
mangle: {
reserved: ['calculateTotal', 'validateInput']
}
}
```
</TabItem>
<TabItem label="Prevention Strategy">
```javascript
// Claude suggests defensive coding
// Instead of dynamic function calls:
const functionName = getUserFunction();
window[functionName](); // Brittle!
// Use a function map:
const functions = {
calculateTotal,
validateInput
};
const fn = functions[getUserFunction()];
if (fn) fn();
else console.error('Unknown function');
```
</TabItem>
</Tabs>
### Scenario 2: The Performance Degradation
_Everything is slow, but nothing is obviously broken._
```
> Our API response times increased from 200ms to 2s over the
> past week. No obvious errors, just everything is slower.
> CPU and memory look normal. Help me find the bottleneck.
```
Claude's performance debugging:
1. **Analyze Recent Changes**
```
> Show me all code changes in the past week that could
> affect performance. Focus on database queries, loops,
> and external API calls.
```
2. **Profile Critical Paths**
```javascript
// Claude adds timing instrumentation
const trace = async (name, fn) => {
const start = performance.now();
try {
const result = await fn();
console.log(`${name}: ${performance.now() - start}ms`);
return result;
} catch (error) {
console.error(`${name} failed:`, error);
throw error;
}
};
// Wraps key operations
const user = await trace('loadUser', () => loadUser(id));
const perms = await trace('checkPerms', () => checkPermissions(user));
```
3. **Identify Bottleneck**
```sql
-- Claude finds: New permission check runs this query
SELECT * FROM permissions p
JOIN roles r ON p.role_id = r.id
JOIN user_roles ur ON r.id = ur.role_id
WHERE ur.user_id = ? AND p.resource = ?
-- Missing index on (user_id, resource)!
```
4. **Optimize**
```sql
-- Add composite index
CREATE INDEX idx_user_resource
ON permissions(user_id, resource);
-- Response time: 2000ms → 180ms
```
### Scenario 3: The Memory Leak
_Your Node.js service crashes every 6 hours with "JavaScript heap out of memory"._
```
> Production keeps crashing with OOM errors. Memory usage
> grows slowly over hours. Here's a heap snapshot diff.
> Help me find what's leaking.
```
<Aside type="tip">
When debugging memory leaks, provide Claude with: - Heap snapshot diffs - Memory usage graphs -
Recent code changes - External service integrations
</Aside>
Claude's memory leak investigation:
```javascript
// Claude analyzes heap snapshot, finds growing array
// The culprit: Event listeners not cleaned up
class OrderMonitor {
constructor() {
// Listeners added on each request!
eventBus.on('order.created', this.handleOrder);
}
handleOrder = (order) => {
this.orders.push(order); // Never cleared!
};
}
// Fixed version:
class OrderMonitor {
constructor() {
this.handleOrder = this.handleOrder.bind(this);
eventBus.on('order.created', this.handleOrder);
}
cleanup() {
eventBus.off('order.created', this.handleOrder);
this.orders = [];
}
handleOrder(order) {
this.orders.push(order);
// Process and clear old orders
if (this.orders.length > 100) {
this.processOrders(this.orders.splice(0, 50));
}
}
}
```
## Advanced Debugging Techniques
### Distributed System Debugging
_When the bug spans multiple services._
```
> Users report orders failing randomly. The error appears in the
> order service, but might involve payment service, inventory,
> or notification service. How do we trace this?
```
Claude implements distributed tracing:
```javascript
// Add correlation IDs
const correlationId = req.headers['x-correlation-id'] || uuid();
// Pass through all service calls
const payment = await paymentService.charge({
...paymentData,
headers: { 'x-correlation-id': correlationId }
});
// Aggregate logs
> Search all service logs for correlation ID abc-123-def
> Show me the complete request flow and where it failed
```
### Time-Travel Debugging
```
> The bug happened 3 hours ago in production. User ID 12345
> couldn't complete checkout. Can we reconstruct what happened?
```
Claude creates a debugging time machine:
```javascript
// Reconstruct state at specific time
const timeTravel = async (userId, timestamp) => {
const logs = await getLogs({ userId, time: timestamp });
const dbState = await getDbSnapshot(timestamp);
const events = await getEvents({ userId, before: timestamp });
return {
userState: dbState.users.find(u => u.id === userId),
recentActions: events.slice(-10),
systemLoad: logs.filter(l => l.type === 'performance'),
errors: logs.filter(l => l.level === 'error')
};
};
// Claude analyzes the reconstructed state
> At 14:23:45, user 12345 had an expired payment method.
> The payment service returned error but order service
> didn't handle it properly, leaving order in limbo state.
```
## Debugging Automation
### Self-Healing Systems
```
> Create a system that automatically detects and fixes
> common errors without human intervention
```
Claude implements:
```javascript
import { exec } from 'child_process';
import { promisify } from 'util';
const execAsync = promisify(exec);
const selfHeal = {
// Pattern matching for known issues
patterns: [
{
error: /Cannot read property .* of undefined/,
action: async (context) => {
// Add null checks to the offending line
const file = context.stack.getFileName();
const line = context.stack.getLineNumber();
// Use Claude Code CLI to fix the issue
const prompt = `Add null check before property access at ${file}:${line}`;
await execAsync(`claude -p "${prompt}" --cwd .`);
},
},
{
error: /ECONNREFUSED.*redis/,
action: async () => {
// Restart Redis connection
await redis.disconnect();
await redis.connect();
},
},
],
monitor: async (error) => {
const pattern = this.patterns.find((p) => p.error.test(error.message));
if (pattern) {
console.log('Self-healing:', error.message);
await pattern.action(error);
}
},
};
```
### Predictive Debugging
```
> Analyze our error patterns and predict what might break next
```
Claude provides predictive analysis:
```javascript
// Analyzes error trends
const predictions = await analyzeTrends({
errorLogs: last30Days,
deployments: recentDeploys,
codeChanges: recentCommits
});
// Output:
{
highRisk: [
{
component: "PaymentService",
reason: "Error rate increasing 15% daily",
prediction: "Likely to fail within 48 hours",
preventiveAction: "Scale payment service, check API limits"
}
],
patterns: [
{
correlation: "Errors spike every Monday 9 AM",
cause: "Weekly report generation overloads database",
solution: "Implement query caching or move to off-hours"
}
]
}
```
## Debugging Best Practices
### 1. Structured Logging
```javascript
// Claude-friendly log format
logger.error({
message: 'Payment processing failed',
error: error.stack,
context: {
userId: user.id,
orderId: order.id,
amount: payment.amount,
provider: payment.provider,
},
metadata: {
correlationId,
timestamp: new Date().toISOString(),
environment: process.env.NODE_ENV,
},
});
```
### 2. Error Boundaries
```javascript
// Graceful error handling
const debugWrapper =
(fn) =>
async (...args) => {
const start = Date.now();
const context = { fn: fn.name, args, start };
try {
const result = await fn(...args);
logger.debug({ ...context, duration: Date.now() - start });
return result;
} catch (error) {
logger.error({ ...context, error: error.stack });
// Let Claude analyze immediately
if (process.env.AUTO_DEBUG) {
// Use Claude Code CLI for automatic debugging
const debugPrompt = `Debug this error: ${error.message}\nContext: ${JSON.stringify(context)}`;
await execAsync(`claude -p "${debugPrompt}" --output-format json`);
}
throw error;
}
};
```
### 3. Debugging Artifacts
Always save debugging context:
```
> After debugging this issue, create:
> 1. A runbook for similar errors
> 2. A test case to prevent regression
> 3. Monitoring alert for early detection
> 4. Documentation of the root cause
```
## Tools and Integrations
### Log Analysis with Claude
```bash
# Pipe logs directly to Claude
tail -f app.log | claude -p "Watch for anomalies and alert me"
# Analyze historical logs
cat errors.log | claude -p "Find patterns in these errors"
# Real-time debugging
journalctl -f -u myapp | claude -p "Debug any errors in real-time"
```
### Integration with Monitoring
```javascript
// Sentry integration
Sentry.init({
beforeSend(event) {
// Send complex errors to Claude
if (event.level === 'error' && event.complexity === 'high') {
claude.analyze({
error: event,
context: Sentry.getContext(),
});
}
return event;
},
});
```
## Related Lessons
<LinkCard
title="Testing Strategies"
description="Prevent bugs before they happen with comprehensive testing"
href="/en/claude-code/lessons/testing"
/>
<LinkCard
title="Performance Analysis"
description="Deep dive into performance debugging and optimization"
href="/en/claude-code/lessons/performance"
/>
<LinkCard
title="Error Recovery"
description="Build resilient systems that handle failures gracefully"
href="/en/claude-code/lessons/error-recovery"
/>
## Next Steps
You've learned how to transform debugging from a time sink into a systematic process. With Claude Code as your debugging partner, you can tackle the most complex issues with confidence.
Remember: Great debugging isn't about being clever - it's about being systematic. Let Claude handle the pattern matching and code analysis while you focus on understanding the problem and designing the solution. Together, you'll squash bugs faster than ever before.