Error Recovery & Resilience

Learn to handle failures gracefully, recover from errors, and build resilient workflows with Claude Code. This guide covers everything from simple retries to complex recovery strategies, ensuring your AI-assisted development stays productive even when things go wrong.

Understanding Claude Code Errors

Common Error Categories

Claude Code errors typically fall into these categories:

Permission Errors - Tool execution blocked
Context Errors - Token limits or memory issues
Execution Errors - Command failures or syntax issues
Connection Errors - Network or MCP server issues
Model Errors - AI confusion or incorrect outputs

Error Indicators

Claude provides clear visual feedback when errors occur:

Red error messages - Direct command failures
Yellow warnings - Non-critical issues
Tool execution failures - Clear error output
Retry suggestions - Claude often suggests fixes

Permission-Based Recovery

The Permission Problem

By default, Claude asks permission for every potentially risky operation:

# This gets interrupted constantly:
claude "Refactor the authentication system"
# → "Can I edit auth/login.js?" [y/n]
# → "Can I run git add?" [y/n]
# → "Can I edit auth/session.js?" [y/n]

Solution 1: Skip Permissions (Recommended for Development)

YOLO Mode

# Skip all permission prompts for uninterrupted workflow
claude --dangerously-skip-permissions

This mode disables every permission check, so limit it to:

Disposable development environments
Containerized workspaces
Projects with good backups

Solution 2: Configure Allowed Tools

Use the /permissions command to manage allowed tools
Add specific tools to the allowlist: Edit, Read, Bash(git:*)
Save preferences in .claude/settings.json:

{
  "permissions": {
    "allow": [
      "Edit",
      "Read",
      "Bash(git:*)",
      "Bash(npm:*)",
      "mcp__git__*"
    ]
  }
}

Solution 3: Session-Specific Permissions

# Allow specific tools for this session only
claude --allowedTools "Edit,Read,Bash(git:*)"

Execution Error Recovery

Command Failures

When commands fail, Claude typically:

Shows the error - Full stderr output
Analyzes the cause - Identifies what went wrong
Suggests fixes - Proposes solutions
Retries the command - Once you approve the fix
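
For commands you run yourself outside Claude, the same show, analyze, retry loop can be approximated with a small wrapper. The following is a minimal sketch, not part of Claude Code: the function name retry and its arguments are assumptions made for illustration. It reruns a command with a doubling delay and gives up after a fixed number of attempts.

```shell
# retry MAX_ATTEMPTS INITIAL_DELAY COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the delay (in seconds) after each failure.
retry() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1 status=0

  while :; do
    status=0
    "$@" || status=$?   # capture the exit code without tripping `set -e`
    if [ "$status" -eq 0 ]; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after $attempt attempts (exit $status)" >&2
      return "$status"
    fi
    echo "retry: attempt $attempt failed (exit $status); retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, retry 3 2 npm test would run the test suite up to three times, waiting 2 and then 4 seconds between attempts. Retrying only helps with transient failures; a deterministic failure still needs the analysis step above.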

Manual Recovery Techniques

# Rewind to an earlier checkpoint (recent Claude Code versions)
/rewind

# Or ask Claude to revert
"Please undo the last file edit"

# Git-based recovery
"git checkout -- filename.js"

# If an edit goes wrong
"The last change broke the syntax, please fix it"

# Claude will re-read the file and correct errors

# Fork from a previous point
# Press Esc twice, then select a previous message

# Or start fresh
/clear

Context and Memory Errors

Token Limit Recovery

When hitting context limits:

Shrink the context: summarize it with /compact, or discard it with /clear
Start a new session with focused context
Export important state to CLAUDE.md
Use modular memory files

Context Management Pattern

# In CLAUDE.md
## Current Task State
- Completed: Authentication refactor
- In Progress: Session management
- Next: Testing implementation

## Key Decisions
- Using JWT for sessions
- Redis for session storage
- 24-hour expiration

Memory Overflow Prevention

# Before context gets too large
claude "Save our current progress to CLAUDE.md"

# Then clear and continue
/clear
claude "Continue from CLAUDE.md progress"
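
If you prefer to record progress yourself rather than asking Claude to do it, a short helper can append the same kind of task-state section shown in the pattern above. This is a sketch under assumptions: save_progress is a hypothetical function, not a Claude Code feature, and the section layout simply mirrors the CLAUDE.md example.

```shell
# save_progress FILE COMPLETED IN_PROGRESS NEXT
# Hypothetical helper: appends a dated task-state section to FILE.
save_progress() {
  local file=$1 completed=$2 in_progress=$3 next_step=$4
  {
    printf '\n## Current Task State (%s)\n' "$(date +%Y-%m-%d)"
    printf '%s\n' "- Completed: $completed"
    printf '%s\n' "- In Progress: $in_progress"
    printf '%s\n' "- Next: $next_step"
  } >> "$file"
}
```

A call such as save_progress CLAUDE.md "Authentication refactor" "Session management" "Testing implementation" leaves a snapshot that a fresh session can pick up after /clear.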

Connection and MCP Errors

MCP Server Recovery

# Check MCP server status
claude mcp list

# Restart specific server
# (Restart the application providing the server)

# Debug MCP issues
claude --mcp-debug

# Increase the MCP server startup timeout (milliseconds)
export MCP_TIMEOUT=30000  # 30 seconds

# Or set the same variable in settings
{
  "env": {
    "MCP_TIMEOUT": "30000"
  }
}

# Validate MCP configuration (jq exits non-zero on invalid JSON)
jq . .mcp.json

# Test individual server
claude "Test the GitHub MCP connection"
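
Before debugging a server, it can help to confirm which servers the project file actually declares. The sketch below assumes jq is installed and that servers are listed under a top-level mcpServers key in .mcp.json; list_mcp_servers is a hypothetical helper name.

```shell
# list_mcp_servers [FILE]
# Hypothetical helper: prints the server names declared under "mcpServers"
# in FILE (default: .mcp.json). Returns non-zero if the file is missing
# or is not valid JSON.
list_mcp_servers() {
  local file=${1:-.mcp.json}
  if [ ! -f "$file" ]; then
    echo "list_mcp_servers: $file not found" >&2
    return 1
  fi
  # `// {}` makes a file without the key print nothing instead of failing
  jq -r '(.mcpServers // {}) | keys[]' "$file"
}
```

Comparing this output with claude mcp list shows whether a server is missing from the configuration or merely failing to start.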

Model Confusion Recovery

When Claude Gets Confused

Signs of model confusion:

Repeating the same errors
Misunderstanding requirements
Going in circles
Suggesting incorrect solutions

Recovery Strategies

Clear and restart with better context
Break down the problem into smaller parts
Provide explicit examples of desired outcome
Switch models if available
Use structured prompts with clear steps

Confusion Recovery Template

Let's restart. Here's what we need:

1. Current state: [describe what exists]
2. Desired outcome: [describe what you want]
3. Constraints: [any limitations]
4. Example: [provide a concrete example]

Please approach this step-by-step.

Advanced Recovery Patterns

Checkpoint-Based Recovery

Create recovery checkpoints during complex operations:

# Before risky changes
git add -A && git commit -m "Checkpoint: before auth refactor"

# Or use Claude
claude "Create a git checkpoint before we proceed"
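
The checkpoint idea can be wrapped in two small functions so that creating and returning to a checkpoint are one command each. This is a sketch only: checkpoint and rollback_to are hypothetical names, and the rollback is destructive for tracked changes made after the checkpoint.

```shell
# checkpoint LABEL
# Hypothetical helper: commits the whole working tree as "Checkpoint: LABEL"
# and prints the short commit hash.
checkpoint() {
  local label=$1
  git add -A || return 1
  # --allow-empty keeps the checkpoint usable even when nothing has changed
  git commit --quiet --allow-empty -m "Checkpoint: $label" || return 1
  git rev-parse --short HEAD
}

# rollback_to CHECKPOINT_HASH
# Hypothetical helper: resets tracked files to the checkpoint (destructive).
# Files created after the checkpoint and never added stay on disk.
rollback_to() {
  git reset --hard --quiet "$1"
}
```

A typical use is hash=$(checkpoint "before auth refactor") before the risky change and rollback_to "$hash" if it goes wrong. Running this on a throwaway branch keeps checkpoint commits out of the main history.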

Structured Error Handling

Implement error-aware workflows:

# Tell Claude to be defensive
claude "Implement the API client with comprehensive error handling"

# Claude will add try-catch blocks, validation, and fallbacks

Recovery Scripts

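Create recovery commands as Markdown files in .claude/commands/ (for example, .claude/commands/recover.md); $ARGUMENTS is replaced with whatever you type after the command: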

When recovering from errors:
1. Run git status to check changes
2. Run tests to verify functionality
3. Check logs for recent errors
4. Restore from last known good state if needed

Execute: $ARGUMENTS
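
To share this command with a team, the file can be created by a script rather than by hand. The sketch below writes the text above to a command file; the function name install_recover_command and the file name recover.md are illustrative assumptions.

```shell
# install_recover_command [PROJECT_DIR]
# Hypothetical helper: writes .claude/commands/recover.md under PROJECT_DIR
# (default: current directory).
install_recover_command() {
  local dir=${1:-.}/.claude/commands
  mkdir -p "$dir" || return 1
  # The quoted 'EOF' keeps $ARGUMENTS literal, so the shell does not expand it
  cat > "$dir/recover.md" <<'EOF'
When recovering from errors:
1. Run git status to check changes
2. Run tests to verify functionality
3. Check logs for recent errors
4. Restore from last known good state if needed

Execute: $ARGUMENTS
EOF
}
```

Committing the resulting file to the repository makes the same recovery checklist available to everyone who opens the project.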

Troubleshooting Guide

Common Issues and Solutions

NPM Permission Errors:

# Fix npm prefix
npm config set prefix ~/.npm-global
export PATH=$PATH:~/.npm-global/bin

# Or migrate to local install
claude migrate-installer

Missing Commands:

# Verify installation
which claude
claude doctor

# Reinstall if needed
npm install -g @anthropic-ai/claude-code

Login Problems:

# Clear auth and restart
rm -rf ~/.config/claude-code/auth.json
claude

# Or use /logout command
/logout

API Key Issues:

# Check environment
echo $ANTHROPIC_API_KEY

# Set if missing
export ANTHROPIC_API_KEY="your-key"
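
Echoing the key prints a secret to the terminal and possibly to logs or screen shares. As a safer variant, the sketch below reports only whether the variable is set and how long it is; check_api_key is a hypothetical helper name.

```shell
# check_api_key
# Hypothetical helper: reports whether ANTHROPIC_API_KEY is set without
# printing the secret itself. Returns non-zero when the key is missing.
check_api_key() {
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is missing"
    return 1
  fi
  echo "ANTHROPIC_API_KEY is set (${#ANTHROPIC_API_KEY} characters)"
}
```

The length is often enough to spot a truncated paste or a stray empty value.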

High CPU/Memory:

# Check Claude processes
ps aux | grep claude

# Restart if needed (pkill -f matches every process whose command line contains "claude")
pkill -f claude
claude

Slow Responses:

Clear context with /clear
Reduce file includes
Use focused prompts
Check network connectivity

Building Resilient Workflows

Error-Tolerant Patterns

Resilient Development Flow

Always work in branches - Easy rollback
Commit frequently - More recovery points
Test incrementally - Catch errors early
Use defensive prompts - Ask for error handling
Maintain backups - External to git

Automated Recovery

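Configure hooks to run checks automatically after Claude edits files. Hooks match on tool names, not on error text:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "git status && npm test"
          }
        ]
      }
    ]
  }
}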

Team Error Patterns

Document common errors in CLAUDE.md:

## Common Errors and Fixes

### Module Not Found
- Cause: Missing npm install
- Fix: Run `npm install` or check package.json

### Type Errors
- Cause: TypeScript strict mode
- Fix: Add proper type annotations

### Permission Denied
- Cause: File permissions
- Fix: Check ownership with `ls -la`

Emergency Recovery

When everything goes wrong:

Stop Claude: Press Esc or Ctrl+C
Assess damage: git status and git diff
Revert if needed: git checkout .
Check backups: Time Machine, snapshots, etc.
Start fresh: New terminal, new session
Report issues: If Claude bug, report to Anthropic
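
Because git checkout . throws away uncommitted work for good, a gentler first step is to move the changes into a stash. The following sketch combines the assess and revert steps above; emergency_stash is a hypothetical helper name, and it assumes a Git version that supports git stash push.

```shell
# emergency_stash
# Hypothetical helper: shows what changed, then moves every uncommitted
# change (including untracked files) into a labelled stash so the working
# tree is clean but nothing is lost.
emergency_stash() {
  git status --short
  git diff --stat
  git stash push --include-untracked -m "emergency-$(date +%Y%m%d-%H%M%S)"
}
```

After running it, git stash list shows the saved entry and git stash pop brings the changes back if part of the work turns out to be worth keeping.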

Error Prevention Tips

Proactive Error Prevention

Use containers for isolated environments
Set up CI/CD to catch errors early
Configure linters and ask Claude to run them after edits
Document gotchas in CLAUDE.md
Test in staging before production
Use version control religiously

What’s Next?

You’ve completed the Claude Code quick-start guide! Here are your next steps:

Advanced Techniques - Master expert-level Claude Code features

MCP Ecosystem - Explore advanced MCP server configurations

Team Workflows - Scale Claude Code across your organization

Remember: errors are learning opportunities. Claude does not carry lessons from one session to the next on its own, so record each fix in CLAUDE.md; that way both you and Claude handle similar situations better in the future. Embrace the iterative process and build increasingly resilient workflows.