Deep Reasoning: /think and Extended Thinking

You are debugging a race condition that only appears under load. The error logs show intermittent database connection timeouts, but only when two specific API endpoints are called simultaneously. You describe the problem to Claude and get a surface-level answer about adding retry logic. What you actually need is for Claude to reason through the connection pool lifecycle, transaction isolation levels, and request concurrency model — the kind of deep analysis that requires more than a quick response.

Extended thinking gives Claude the space to work through complex problems before answering, which tends to produce better results on hard technical questions.

What You’ll Walk Away With

Understanding of when extended thinking actually helps (and when it does not)
The practical difference between standard, /think, and deep thinking modes
Prompts optimized for extended thinking on architecture and debugging tasks
Configuration for controlling thinking depth and token budget

How Extended Thinking Works

By default, Claude Code uses extended thinking on every turn — Claude reasons through the problem internally before producing a visible response. You can influence how deep this reasoning goes.

Standard mode: Claude uses its default reasoning depth. This is good for most tasks.

Extended thinking: Claude spends more tokens on internal reasoning before responding. Activate it per session with the /think command, or toggle it via /config. On Claude Opus 4.6, thinking depth is controlled by the effort level.

The thinking happens inside Claude’s context window but is not shown to you, although you may see a “Thinking…” indicator. As a result, Claude’s visible response is better reasoned, especially for problems that require multi-step analysis.

When to Use Extended Thinking

Extended thinking shines on problems with these characteristics:

Multiple interacting systems — Authentication flows, distributed transactions, event-driven architectures
Debugging without clear reproduction — Intermittent failures, race conditions, memory leaks
Architecture decisions with trade-offs — Choosing between approaches where the right answer depends on constraints
Security analysis — Finding vulnerabilities that require understanding data flow across components
Performance optimization — Identifying bottlenecks that span multiple layers

Extended thinking does NOT help much for:

Simple code generation (“write a function that sorts an array”)
Straightforward refactoring (“rename this variable”)
Questions with obvious answers (“what does this error message mean”)

The cost is higher token usage and slightly longer response times. Use it selectively.

Activating Extended Thinking

Per-Session Toggle

Inside a Claude Code session:

/think

This enables extended thinking for the rest of the session. Run /think again to turn it off.

Always-On Configuration

In settings.json:

{
  "alwaysThinkingEnabled": true
}

Effort Level (Opus 4.6)

For Claude Opus 4.6, the depth of thinking is controlled by effort level rather than a token budget:

# Values: low, medium, high (default)
export CLAUDE_CODE_EFFORT_LEVEL=high

Effort Level	Behavior	Best For
`low`	Minimal reasoning, fast responses	Simple tasks, quick questions
`medium`	Moderate reasoning	Everyday development
`high` (default)	Deep reasoning	Complex architecture, debugging
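
The table above can be turned into a small shell helper that picks an effort level per task and scopes it to a single Claude Code process, rather than exporting it for the whole shell. This is a minimal sketch: the function names and the task categories (`quick`, `simple`, `everyday`, `feature`) are hypothetical and not part of Claude Code. Only the `CLAUDE_CODE_EFFORT_LEVEL` variable and its three values come from the section above.

```shell
# Hypothetical helper: map a task category to an effort level,
# following the table above. The category names are assumptions.
effort_for_task() {
  case "${1:-}" in
    quick|simple)     echo "low" ;;
    everyday|feature) echo "medium" ;;
    *)                echo "high" ;;  # anything else falls back to the documented default
  esac
}

# Start Claude Code with the effort level set for this one process only.
# The variable is not exported into the surrounding shell.
claude_with_effort() {
  CLAUDE_CODE_EFFORT_LEVEL="$(effort_for_task "${1:-}")" claude
}
```

For example, `claude_with_effort quick` would start a session at `low` effort, while `claude_with_effort debugging` falls through to `high`.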

Token Budget (Other Models)

For models other than Opus 4.6, you can control the thinking budget directly:

# Default is 31,999 tokens
export MAX_THINKING_TOKENS=10000

# Disable thinking entirely
export MAX_THINKING_TOKENS=0
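
Because `MAX_THINKING_TOKENS` is a plain environment variable, a typo such as `10k` is easy to make. The sketch below validates the value before exporting it. The function name `set_thinking_budget` is hypothetical, and how Claude Code itself treats a malformed value is not covered here.

```shell
# Hypothetical helper: export MAX_THINKING_TOKENS only if the argument
# is a non-negative integer (0 disables thinking, per the section above).
set_thinking_budget() {
  case "${1:-}" in
    ''|*[!0-9]*)
      echo "thinking budget must be a non-negative integer, got: '${1:-}'" >&2
      return 1
      ;;
  esac
  export MAX_THINKING_TOKENS="$1"
}
```

Calling `set_thinking_budget 10000` exports the budget for the current shell. `set_thinking_budget 10k` prints an error and leaves any existing value untouched.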

Prompting for Deep Analysis

The way you phrase your prompt significantly affects thinking quality. Give Claude the context and constraints that require deep reasoning.

Copy-paste prompt for architecture decisions:

I need to decide between two approaches for [specific problem].

Option A: [describe approach]
Option B: [describe approach]

Consider these constraints:
- [constraint 1, e.g., "must handle 10k concurrent users"]
- [constraint 2, e.g., "runs on Cloudflare Workers with 128MB memory"]
- [constraint 3, e.g., "needs to be backwards-compatible with v1 API"]

Analyze the trade-offs for each approach. Think through edge cases,
failure modes, and long-term maintenance implications. Recommend
one option with detailed reasoning.

Copy-paste prompt for debugging complex issues:

I have an intermittent bug that I cannot reproduce consistently.

Symptoms:
- [describe what happens]
- [frequency: "happens about 1 in 20 requests"]
- [conditions: "only under concurrent load"]

Relevant code: [reference files with @]
Error logs: [paste or reference relevant logs]

Think through every possible cause systematically. Consider:
- Race conditions and concurrency issues
- Resource exhaustion (connections, memory, file handles)
- Timing-dependent behavior
- State corruption across requests

For each potential cause, explain how to verify or rule it out.
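
If you reuse the debugging template often, it can be generated from a few arguments instead of edited by hand. The following is a minimal sketch: `debug_prompt` is a hypothetical function name, and it leaves out the file references and error logs, which you would still add yourself.

```shell
# Hypothetical helper: fill in the debugging template from three arguments.
# Usage: debug_prompt "<symptom>" "<frequency>" "<conditions>"
debug_prompt() {
  cat <<EOF
I have an intermittent bug that I cannot reproduce consistently.

Symptoms:
- ${1:-}
- Frequency: ${2:-}
- Conditions: ${3:-}

Think through every possible cause systematically. Consider:
- Race conditions and concurrency issues
- Resource exhaustion (connections, memory, file handles)
- Timing-dependent behavior
- State corruption across requests

For each potential cause, explain how to verify or rule it out.
EOF
}
```

One way to use it, assuming the `claude` CLI's print mode (`-p`) is available in your version, is `claude -p "$(debug_prompt "504 on /api/orders" "about 3% of requests" "peak hours only")"`. You can also paste the output into an interactive session.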

Copy-paste prompt for security review:

Review the authentication and authorization flow in this project.

Think deeply about:
1. Every place where user input reaches a database query
2. Token/session lifecycle and expiration handling
3. CSRF, XSS, and injection attack surfaces
4. Privilege escalation paths
5. Information leakage in error messages

For each vulnerability found, rate severity (critical/high/medium/low)
and provide a specific fix, not just a description.

Combining Extended Thinking with Plan Mode

For complex features, a particularly effective workflow combines plan mode with extended thinking:

Enable extended thinking: /think
Switch to plan mode (via /config or the VS Code mode selector)
Describe your feature or problem
Claude reasons deeply about the approach, then presents a plan
You review the plan and approve or refine

This forces Claude to spend its thinking budget on planning rather than rushing to implementation.
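
The first two steps can also be done at launch instead of inside the session. The sketch below makes two assumptions: that your Claude Code version accepts `--permission-mode plan` on the command line, and that you are on a model where `CLAUDE_CODE_EFFORT_LEVEL` applies (on other models, `MAX_THINKING_TOKENS` would be the equivalent knob). The function name `plan_deep` is hypothetical.

```shell
# Sketch: start Claude Code in plan mode with deep reasoning for this process only.
# Assumption: --permission-mode plan is supported by your CLI version.
plan_deep() {
  CLAUDE_CODE_EFFORT_LEVEL=high claude --permission-mode plan "$@"
}
```

Running `plan_deep` then drops you into a session where you describe the feature and review the proposed plan as in the steps above.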

Extended Thinking in Practice

Here is what the experience looks like for a real debugging session:

/think

I'm seeing intermittent 504 Gateway Timeouts on our /api/orders
endpoint. It only happens during peak hours (2-4pm EST) and
affects about 3% of requests. Our monitoring shows:

- Database query time is normal (< 50ms)
- The timeout happens after the query completes
- Memory usage on the server stays flat
- The issue started after we deployed the new payment
  integration last Tuesday

Read @src/pages/api/orders.ts and @src/lib/payments.ts
and think through what could be causing this.

With extended thinking enabled, Claude is more likely to:

Notice that the payment integration makes a synchronous HTTP call to an external API
Realize that the external API has variable response times during peak hours
Identify that the 504 comes from the gateway timeout, not the database
Suggest moving the payment verification to an async background job

Without extended thinking, Claude might give a more superficial answer about database connection pooling or caching.

When This Breaks

Thinking takes too long and you get impatient — Lower the effort level to medium or reduce MAX_THINKING_TOKENS. Not every task needs deep reasoning. Reserve it for genuinely complex problems.

Claude’s thinking seems to go in circles — This can happen with extremely ambiguous problems. Provide more constraints or narrow the question: “Focus specifically on the database connection pooling behavior, not the entire request lifecycle.”

Token costs spike with extended thinking — Extended thinking uses significantly more tokens. If you are on API billing, use /cost to monitor session spend. Consider switching to Claude Sonnet 4.5 for simpler tasks and reserving Opus 4.6 with thinking for the hard problems.

Thinking is enabled but responses are not noticeably better — The problem might not benefit from extended thinking. Simple, well-defined tasks produce similar results with or without it. Save thinking for genuinely ambiguous, multi-factor problems.

What’s Next

With deep reasoning in your toolkit, connect external tools via MCP to give Claude access to your databases, issue trackers, and monitoring systems.

MCP Setup: Connect external tools via the Model Context Protocol

Development Workflow: Build your first feature end to end