AI Model Comparison Guide
You open the model picker and see several options. Each has different strengths, context windows, and price points. This guide tells you which model to use for which task, when to switch, and how much it costs.
What You Will Walk Away With
- A clear default model recommendation for each tool
- Decision criteria for when to switch models
- Pricing breakdowns per request type
- A model routing strategy you can use immediately
Quick Selection Guide
| Task | Recommended Model | Why |
|---|---|---|
| Hardest refactors, app builds, long-running tasks | Claude Fable 5 | New tier above Opus — exceeds any generally available model |
| Complex coding (default) | Claude Opus 4.8 | Top SWE-Bench scores, excellent agentic performance |
| Everyday coding (budget) | Claude Sonnet 4.6 | Excellent quality at a fraction of the cost |
| Cheap parallel / bulk work | Claude Haiku 4.5 | Drives subagents and codemods at ~1/3 of Sonnet's token price (~1/5 of Opus 4.8's) |
| All Codex tasks | GPT-5.5 | Default model across all Codex and ChatGPT surfaces |
| Fast iteration (Cursor) | Cursor Composer 2.5 | In-house frontier-speed coding model |
| Large codebase (>200K tokens) | Opus 4.8, GPT-5.5, Sonnet 4.6, or Gemini 3.1 Pro | 1M token context windows |
| Multimodal (images, video) | Gemini 3.1 Pro | Best image/video analysis |
| Architecture and design | Claude Opus 4.8 | Deep reasoning capabilities |

| Budget | Primary Model | Alternative |
|---|---|---|
| Premium (best quality) | Claude Fable 5 | Claude Opus 4.8 |
| Standard | Claude Sonnet 4.6 | Cursor Composer 2.5 |
| Speed-focused (Cursor) | Cursor Composer 2.5 | Sonnet 4.6 |
| Cost-sensitive | Claude Haiku 4.5 | Cursor Composer 2.5 |
| Enterprise/Multimodal | Gemini 3.1 Pro | Sonnet 4.6 |
Model Specifications
Full Comparison Table
| Model | Provider | Context | Output Limit | SWE-Bench | Input $/1M | Output $/1M | Speed |
|---|---|---|---|---|---|---|---|
| Claude Fable 5 | Anthropic | 1M | 128K | — | $10 | $50 | Standard |
| Claude Opus 4.8 | Anthropic | 1M | 128K | Best | $5 | $25 | Standard |
| Claude Sonnet 4.6 | Anthropic | 1M | 64K | Strong | $3 | $15 | Standard |
| Claude Haiku 4.5 | Anthropic | 200K | 64K | Good | $1 | $5 | Fast |
| GPT-5.5 | OpenAI | 1M | 128K | Strong | $5 | $30 | Standard |
| Cursor Composer 2.5 | Cursor | 200K | — | Fast-frontier | $0.50 | $2.50 | Fast |
| Gemini 3.1 Pro | Google | 1M | — | Good | $2 | $12 | Standard |
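
If you want to query the comparison table from a script, it can be held as plain data. The sketch below is one possible encoding, not an official schema: the `ModelSpec` class and `models_that_fit` helper are hypothetical names, "1M" is assumed to mean 1,000,000 tokens, and the prices are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    provider: str
    context_tokens: int     # assumes "1M" means 1,000,000 tokens
    input_per_mtok: float   # USD per 1M input tokens
    output_per_mtok: float  # USD per 1M output tokens


# Values copied from the comparison table in this guide.
MODELS = [
    ModelSpec("Claude Fable 5", "Anthropic", 1_000_000, 10.00, 50.00),
    ModelSpec("Claude Opus 4.8", "Anthropic", 1_000_000, 5.00, 25.00),
    ModelSpec("Claude Sonnet 4.6", "Anthropic", 1_000_000, 3.00, 15.00),
    ModelSpec("Claude Haiku 4.5", "Anthropic", 200_000, 1.00, 5.00),
    ModelSpec("GPT-5.5", "OpenAI", 1_000_000, 5.00, 30.00),
    ModelSpec("Cursor Composer 2.5", "Cursor", 200_000, 0.50, 2.50),
    ModelSpec("Gemini 3.1 Pro", "Google", 1_000_000, 2.00, 12.00),
]


def models_that_fit(prompt_tokens: int) -> list[ModelSpec]:
    """Return models whose context window holds the prompt, cheapest input price first."""
    fitting = [m for m in MODELS if m.context_tokens >= prompt_tokens]
    # sorted() is stable, so equal-priced models keep their table order.
    return sorted(fitting, key=lambda m: m.input_per_mtok)


if __name__ == "__main__":
    for spec in models_that_fit(300_000):
        print(f"{spec.name}: ${spec.input_per_mtok:.2f} per 1M input tokens")
```

Sorting by input price alone is a simplification; output-heavy workloads may rank the models differently.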
Claude Fable 5 (Anthropic)
The new top tier above Opus for the hardest work.
- Released: June 9, 2026
- Context window: 1M tokens with a 128K output limit
- Key strength: Much better than Opus 4.8 at complex multi-file refactorings, bug-fixing, building applications from scratch, and long-running tasks demanding peak intelligence. On Cognition’s FrontierCode benchmark it posts the highest score among frontier models at medium effort.
- Available in: Claude Code v2.1.170+ (`/model fable`), Cursor (model picker), Claude API (`claude-fable-5`)
When to use: When budget matters less than velocity and quality, set Fable 5 as your default model — subagents still auto-run on Opus, Sonnet, and Haiku, so cost stays contained while the main loop gets maximum intelligence. When budget matters, use Fable 5 for planning (Plan mode), Opus 4.8 or Sonnet 4.6 for implementation, then Fable 5 again for the final verification pass.
Pricing: $10 / $50 per 1M tokens (input/output) — exactly 2x Opus 4.8. Effort levels run low, medium, high, xhigh, and max; thinking is adaptive only.
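
To see what the plan / implement / verify split saves, a rough estimate like the one below can help. It is a sketch under stated assumptions: the token counts in `PHASES` are hypothetical placeholders, the dictionary keys are labels for this example rather than API model IDs, and the prices are the per-1M-token figures quoted in this guide.

```python
# USD per 1M tokens (input, output), from the pricing lines in this guide.
# The keys are labels for this example, not API model identifiers.
PRICES = {
    "fable-5": (10.00, 50.00),
    "opus-4.8": (5.00, 25.00),
    "sonnet-4.6": (3.00, 15.00),
}

# Hypothetical token counts for one feature: (phase, input tokens, output tokens).
# Replace them with numbers measured in your own sessions.
PHASES = [
    ("plan", 40_000, 5_000),
    ("implement", 200_000, 60_000),
    ("verify", 80_000, 5_000),
]


def phase_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one phase on one model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def pipeline_cost(routing: dict[str, str]) -> float:
    """Total cost in USD when each phase runs on the model named in `routing`."""
    return sum(phase_cost(routing[phase], i, o) for phase, i, o in PHASES)


all_fable = pipeline_cost({"plan": "fable-5", "implement": "fable-5", "verify": "fable-5"})
split = pipeline_cost({"plan": "fable-5", "implement": "sonnet-4.6", "verify": "fable-5"})

if __name__ == "__main__":
    print(f"Fable 5 for everything:       ${all_fable:.2f}")
    print(f"Fable 5 plan/verify + Sonnet: ${split:.2f}")
```

With these placeholder counts the split costs a little under half of the all-Fable run, because implementation dominates the token volume. The ratio depends entirely on your own token mix.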
During early testing, Stripe reported that Fable 5 compressed months of engineering into days. In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.
Fable 5 is the generally-available, safety-tuned member of the Mythos class — in Anthropic’s words, “a Mythos-class model that we’ve made safe for general use.” Its sibling, Claude Mythos 5, is the same underlying model with safeguards lifted in some areas; initial access is restricted to Project Glasswing cyber defenders and critical-infrastructure providers.
Claude Opus 4.8 (Anthropic)
The default recommendation for complex coding tasks.
- Released: May 28, 2026
- Context window: 1M tokens with a 128K output limit
- Key strength: Around four times less likely than Opus 4.7 to leave flaws in its own code unflagged; beats GPT-5.5 on coding benchmarks
- Available in: Claude Code (default), Cursor (model picker), Anthropic API, Bedrock, Vertex AI
When to use: Architecture decisions, complex debugging, multi-step autonomous tasks, security audits, system design. This is the Opus-tier flagship and Claude Code’s default model — one tier below Fable 5. Start here and only switch when you have a specific reason. Tune the speed/reasoning trade-off with the effort level (low, medium, high) via /model or /effort, and lean on its automatic dynamic workflows for long multi-step tasks.
Pricing: $5 / $25 per 1M tokens (input/output) — unchanged from Opus 4.7. Fast mode runs at 2× the standard rate for 2.5× the speed.
Claude Sonnet 4.6 (Anthropic)
The budget-conscious workhorse with a massive context window.
- Released: early 2026
- Context window: 1M tokens
- Key strength: Excellent coding at a fraction of Opus cost. Best value per token for everyday work.
- Available in: Claude Code, Cursor, Anthropic API
When to use: Everyday coding tasks, when budget matters, when you need more than 200K tokens of context (large codebase analysis), or when your Opus quota is running low.
Pricing: $3 / $15 per 1M tokens (input/output).
Claude Haiku 4.5 (Anthropic)
The cheap, fast tier that powers parallel work.
- Released: October 2025
- Context window: 200K tokens
- Key strength: Fast and inexpensive enough to drive subagents, codemods, and bulk file edits at roughly one-third of Sonnet 4.6's per-token price (one-fifth of Opus 4.8's)
- Available in: Claude Code (subagents and `/model`), Anthropic API
When to use: Read-only exploration, bulk scans, fan-out subagents, and simple formatting where you don’t need frontier reasoning. The Tier 1 model in a model-routing strategy.
Pricing: $1 / $5 per 1M tokens (input/output).
GPT-5.5 (OpenAI)
The default model across all Codex and ChatGPT surfaces.
- Released: April 2026
- Context window: Up to 1M tokens with a 128K output limit
- Key strength: OpenAI’s newest frontier model for complex coding, computer use, and research. Leads Terminal-Bench 2.0 and is competitive on SWE-bench Verified.
- Available in: Codex App, Codex CLI, Codex IDE, Codex Cloud, ChatGPT, API
When to use: All Codex workflows — this is the recommended default. Also strong for computer-use tasks and knowledge work. A GPT-5.5 Pro variant is available for maximum performance.
Pricing: $5 / $30 per 1M tokens (input/output); GPT-5.5 Pro is $30 / $180; Batch requests run at 50% of standard. Prompts above 272K input tokens are billed at 2× input / 1.5× output for the session. Also available via Codex subscription plans.
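
The long-context surcharge and batch discount are easy to misjudge, so a small estimator may help. This is a sketch, not OpenAI's billing logic: it treats one request as the whole session, and it assumes the batch discount multiplies with the long-context rates, which the pricing line above does not state. The function name `gpt55_cost` is hypothetical.

```python
GPT55_INPUT = 5.00     # USD per 1M input tokens (standard rate)
GPT55_OUTPUT = 30.00   # USD per 1M output tokens (standard rate)
LONG_CONTEXT_THRESHOLD = 272_000  # input tokens; above this the higher rates apply


def gpt55_cost(input_tokens: int, output_tokens: int, batch: bool = False) -> float:
    """Estimate the USD cost of one GPT-5.5 request from the rates quoted in this guide."""
    in_rate, out_rate = GPT55_INPUT, GPT55_OUTPUT
    if input_tokens > LONG_CONTEXT_THRESHOLD:
        in_rate *= 2.0   # 2x input rate for long prompts
        out_rate *= 1.5  # 1.5x output rate for long prompts
    cost = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    # Assumption: the 50% batch rate stacks multiplicatively with the surcharge.
    return cost * 0.5 if batch else cost


if __name__ == "__main__":
    print(f"200K in / 10K out:        ${gpt55_cost(200_000, 10_000):.2f}")
    print(f"300K in / 10K out:        ${gpt55_cost(300_000, 10_000):.2f}")
    print(f"200K in / 10K out, batch: ${gpt55_cost(200_000, 10_000, batch=True):.2f}")
```

Under these assumptions, a prompt that crosses the threshold is priced at the higher rates for all of its tokens, so trimming context to stay below 272K can matter more than the raw token difference suggests.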
Gemini 3.1 Pro (Google)
Best multimodal model with extreme context.
- Released: February 2026
- Context window: 1M tokens
- Key strength: Best image, audio, and video analysis. Deep Think mode for complex reasoning.
- Available in: Cursor (model picker), direct API
When to use: Tasks requiring more than 200K tokens of context, multimodal analysis (diagrams, screenshots, video walkthroughs), or when you need Deep Think reasoning mode.
Pricing: $2 / $12 per 1M tokens (input/output).
Cursor Composer 2.5 (Cursor)
Frontier coding model built in-house by Cursor.
- Released: May 18, 2026
- Architecture: Mixture-of-Experts, enhanced with Cursor’s own continued pretraining and reinforcement learning
- Context window: 200K tokens
- Key strength: A substantial step up over Composer 2 — better at sustained work on long-running tasks and more reliable at following complex instructions
- Available in: Cursor only
When to use: Fast local iteration in Cursor. Optimized for multi-file edits, code generation, refactoring, and long task chains across hundreds of actions.
Pricing: $0.50 / $2.50 per 1M tokens (standard); $3.00 / $15.00 (fast variant, the default).
Model Routing Strategy
Use this decision tree for day-to-day work:
- Start with your tool’s default: Opus 4.8 for Claude Code, GPT-5.5 for Codex
- Velocity and quality outweigh budget? Set Fable 5 as your default model — subagents still auto-run on Opus/Sonnet/Haiku, so cost stays contained while the main loop gets maximum intelligence. On a budget, route Fable 5 to Plan mode and the final verification pass only, with Opus or Sonnet doing the implementation
- Need speed in Cursor? Switch to Composer 2.5
- Need budget savings? Switch to Sonnet 4.6, or Haiku 4.5 for bulk/parallel work
- Context exceeds 200K? Use Opus 4.8, GPT-5.5, Sonnet 4.6, or Gemini 3.1 Pro (1M context)
- Multimodal analysis? Gemini 3.1 Pro
- Everything else? Stay with the default
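
The decision tree above can be sketched as a small function. This is one possible reading, not a definitive router: the `Task` fields and `pick_model` are hypothetical names, the priority order among the rules is an assumption (the list does not rank them), and tool availability is ignored except for Codex. Cursor falls back to Opus 4.8 here only as a stand-in for "best available".

```python
from dataclasses import dataclass

SMALL_CONTEXT_LIMIT = 200_000  # tokens; Haiku 4.5 and Composer 2.5 stop here


@dataclass
class Task:
    tool: str                          # "claude-code", "codex", or "cursor"
    context_tokens: int = 0
    multimodal: bool = False
    needs_speed: bool = False
    budget_sensitive: bool = False
    bulk: bool = False                 # fan-out subagents, codemods, scans
    quality_over_budget: bool = False


def pick_model(task: Task) -> str:
    """Map a task to a model name, following the routing list in this guide."""
    if task.tool == "codex":
        return "GPT-5.5"  # the guide recommends it for all Codex work
    if task.multimodal:
        return "Gemini 3.1 Pro"
    long_context = task.context_tokens > SMALL_CONTEXT_LIMIT
    if task.quality_over_budget:
        return "Claude Fable 5"
    if task.needs_speed and task.tool == "cursor" and not long_context:
        return "Cursor Composer 2.5"
    if task.budget_sensitive:
        if task.bulk and not long_context:
            return "Claude Haiku 4.5"
        return "Claude Sonnet 4.6"
    return "Claude Opus 4.8"  # default when no rule above applies


if __name__ == "__main__":
    print(pick_model(Task(tool="cursor", needs_speed=True)))
    print(pick_model(Task(tool="claude-code", budget_sensitive=True, bulk=True)))
```

Capability checks (multimodal input, context size) run before preference checks, because a model that cannot hold the prompt is wrong at any price.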
Cost Analysis
Average Cost Per Request
| Request Type | Opus 4.8 | Sonnet 4.6 | GPT-5.5 | Composer 2.5 |
|---|---|---|---|---|
| Simple completion (1K tokens) | ~$0.03 | ~$0.02 | ~$0.03 | ~$0.003 |
| Standard refactor (10K tokens) | ~$0.30 | ~$0.18 | ~$0.35 | ~$0.03 |
| Large analysis (50K tokens) | ~$1.50 | ~$0.90 | ~$1.75 | ~$0.15 |
| Complex architecture (100K tokens) | ~$3.00 | ~$1.80 | ~$3.50 | ~$0.30 |
A Claude Fable 5 request costs exactly 2x the Opus 4.8 column — $10 / $50 per 1M tokens versus $5 / $25.
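
The table does not say how each token count splits between input and output. Its figures are consistent with counting the stated number once as input and once as output, so the sketch below uses that reading. Treat it as an inference rather than the table's documented method. `request_cost` is a hypothetical helper; the prices are the per-1M-token rates listed in this guide.

```python
# USD per 1M tokens (input, output), as listed in this guide.
PRICES = {
    "Opus 4.8": (5.00, 25.00),
    "Sonnet 4.6": (3.00, 15.00),
    "GPT-5.5": (5.00, 30.00),
    "Composer 2.5": (0.50, 2.50),
    "Fable 5": (10.00, 50.00),
}

REQUEST_TYPES = [
    ("Simple completion", 1_000),
    ("Standard refactor", 10_000),
    ("Large analysis", 50_000),
    ("Complex architecture", 100_000),
]


def request_cost(model: str, tokens: int) -> float:
    """Cost in USD, assuming `tokens` input tokens plus `tokens` output tokens."""
    in_price, out_price = PRICES[model]
    return tokens * (in_price + out_price) / 1_000_000


if __name__ == "__main__":
    for label, tokens in REQUEST_TYPES:
        row = ", ".join(f"{m}: ${request_cost(m, tokens):.3f}" for m in PRICES)
        print(f"{label} ({tokens:,} tokens): {row}")
```

Real requests are rarely symmetric. Agentic sessions usually read far more tokens than they write, so pass separate input and output counts if you adapt this for budgeting.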
Subscription Context
Cursor:

| Plan | Price | Models Included | Best For |
|---|---|---|---|
| Pro | $20/month | All models, ~500 fast requests | Everyday development |
| Ultra | $200/month | All models, ~10K requests | Power users |
Model switching is free within your plan. You pay per request, not per model choice.

Claude Code:

| Plan | Price | Primary Model | Usage |
|---|---|---|---|
| Pro | $20/month | Sonnet 4.6 (Opus limited) | Lowest limits |
| Max 5x | $100/month | Full Opus 4.8 | Higher Opus limits |
| Max 20x | $200/month | Full Opus 4.8 | Highest Opus limits |
To use Opus 4.8 extensively, Max 5x or higher is recommended. Limits are rate-based and change frequently, so treat the relative ordering as the takeaway and verify current allowances on Anthropic’s pricing page. From June 9 through June 22, 2026, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost; on June 23, 2026 it is removed from those plans and further use requires usage credits.

Codex:

| Plan | Price | Model | Access |
|---|---|---|---|
| Plus | $20/month | GPT-5.5 | Basic Codex access |
| Pro | $200/month | GPT-5.5 | Full Codex with Cloud |
Codex uses GPT-5.5 as the default across all surfaces.
Performance Benchmarks
| Category | Fable 5 | Opus 4.8 | Sonnet 4.6 | Haiku 4.5 | GPT-5.5 | Gemini 3.1 Pro | Composer 2.5 |
|---|---|---|---|---|---|---|---|
| SWE-Bench | — | Best | Strong | Good | Strong | Good | Strong |
| Code generation | Best | Excellent | Very good | Good | Very good | Good | Very good |
| Bug detection | Best | Excellent | Very good | Good | Very good | Good | Good |
| Architecture | Best | Excellent | Very good | Fair | Very good | Good | Good |
| Computer use | — | Yes | No | No | Yes | No | No |
| Context window | 1M | 1M | 1M | 200K | 1M | 1M | 200K |
| Cost efficiency | $10/$50 | Premium | Best value | Cheapest (Claude) | Premium | Good value | Cheapest |
Model Selection Checklist
- Identify your primary tool: Cursor, Claude Code, or Codex
- Start with the default model: Opus 4.8 (Claude Code), GPT-5.5 (Codex), or best available (Cursor)
- Evaluate task complexity: Simple tasks do not need the most expensive model
- Check context requirements: Files exceeding 200K tokens need Opus 4.8, Sonnet 4.6, GPT-5.5, or Gemini 3.1 Pro
- Consider budget: Track with `/cost` (Claude Code), Settings > Usage (Cursor), or Codex dashboard
- Adjust as needed: Switch models based on task, not habit
Best Practices
- Default to the best model for tasks that matter: architecture, security review, complex debugging
- Downgrade for routine work — simple fixes, boilerplate, and formatting do not need Opus 4.8
- Use speed models for iteration — Composer 2.5 in Cursor for rapid trial-and-error cycles
- Route bulk work to Haiku 4.5 — subagents, codemods, and fan-out scans cost a fraction of Opus
- Monitor costs weekly — track which models provide the best ROI for your workflow
- Stay updated — model capabilities and pricing change frequently. Check the Updates page