Model Selection Strategy

You are mid-feature, the agent just produced a solid implementation plan, and now you need it to write the actual code. Do you leave Auto on or choose a model manually? Opus 5 is expensive, Sonnet 5 may be enough, Gemini 3.1 Pro offers large context, and Cursor now has two first-party choices in different weight classes: Grok 4.5 and Composer 2.5. This article gives you a decision framework so model selection becomes a two-second choice, not a five-minute debate.

What You’ll Walk Away With

A clear default model recommendation (and why it is the default)
A decision tree for when to switch models based on task type, not guesswork
Relative cost profiles so you can budget usage without treating a volatile monthly allowance as fixed
Practical prompting adjustments for each model’s strengths
How to switch modes from the keyboard and models from the agent panel without leaving your flow

The Quick Reference

Model	Input / Output Cost (USD per 1M tokens)	Normal Cursor Context	Max / Model Limit	Best For
Composer 2.5	$3 / $15 (Fast) or $0.50 / $2.50 (Standard)	200k	-	Speed-critical coding; Fast is the default Composer variant when selected
Claude Fable 5	$10 / $50	200k	1M	Highest-capability tier: tasks that outlast a single sitting
Claude Opus 5	$5 / $25	200k	1M	The default: agentic coding, complex reasoning, computer use, full low-to-max effort range
Claude Opus 4.8	$5 / $25	200k	1M	Previous Opus; superseded by Opus 5 at the same price
Claude Sonnet 5	$2 / $10 introductory; $3 / $15 after Aug 31	200k	1M	Budget-conscious daily work, extended thinking
Claude Haiku 4.5	$1 / $5	200k	200k	Fastest Claude, near-frontier quality on focused tasks
Gemini 3.1 Pro	$2 / $12 up to 200k input; $4 / $18 above	200k	1M	Extreme context, multimodal (image/diagram analysis)
GPT-5.6 Sol	$5 / $30	200k	~1.05M	OpenAI’s latest frontier: strong agentic coding, computer use, research
Grok 4.5	$2 / $6 standard; $4 / $18 fast	200k	500k API	Long-running coding and broad computer work across Cursor surfaces
Grok Code	$0.20 / $1.50	200k	256k	Very budget-friendly simple tasks
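
To make the rates in the table concrete, here is a minimal Python sketch that converts token counts into dollars. The rates are copied from the table above; the `RATES` dictionary keys, the `request_cost` helper, and the sample token counts are illustrative assumptions, not part of any Cursor or provider API, and the sketch ignores caching and plan-level allowances.

```python
# Per-1M-token rates (input, output) in USD, copied from the table above.
# The dictionary keys are illustrative labels, not official model identifiers.
RATES = {
    "composer-2.5-standard": (0.50, 2.50),
    "composer-2.5-fast": (3.00, 15.00),
    "haiku-4.5": (1.00, 5.00),
    "sonnet-5-intro": (2.00, 10.00),
    "opus-5": (5.00, 25.00),
    "fable-5": (10.00, 50.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 40k tokens of context in, 4k tokens of code out.
    for name in RATES:
        print(f"{name:24s} ${request_cost(name, 40_000, 4_000):.4f}")
```

Under these assumptions the same hypothetical request costs about $0.30 on Opus 5 and $0.03 on Composer 2.5 Standard, a tenfold gap that only matters if the cheaper model gets the task right.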

Composer 2.5: Cursor’s Speed-Focused Specialist

Composer 2.5 was released May 18, 2026 as Cursor’s coding specialist. It builds on a Kimi K2.5 checkpoint with Cursor’s continued pretraining and reinforcement learning and is tuned for fast, low-cost iteration. Fast is the default variant when Composer is selected; current Cursor documentation does not identify Composer as Auto’s fixed default.

In the current Artificial Analysis Coding Agent Index v1.1, Composer 2.5 scores 52 in Cursor CLI (about 16% DeepSWE, 67% Terminal-Bench v2, and 72% SWE-Atlas-QnA). The older score of 62 came from the May v1.0 index and is not directly comparable because the benchmark basket changed.

When to Stick with Composer 2.5

Rapid iteration and style tweaks where you’ll refine 5-10 times anyway
Well-scoped refactors inside a single file or module
“Quick question” inline edits while working on a larger plan
Running multiple agents in parallel in the Agents Window — Composer 2.5’s speed keeps all panes responsive

When to Switch Away

Task touches cross-module architecture or security-sensitive code — Opus 5
You need the model to reason about a 500k-line codebase — Gemini 3.1 Pro or Opus 5 in 1M mode
The current model keeps getting a detail wrong — a fresh model often breaks the loop

Grok 4.5: Long-Running Computer Work

Cursor and SpaceXAI released Grok 4.5 on July 8, 2026 after jointly training the mixture-of-experts model on Cursor interaction data plus a broader STEM and knowledge-work mix. It targets long-running coding and wider computer work and is available in the desktop app, web, iOS, CLI, and SDK. Standard mode costs $2/$6 per million input/output tokens; fast mode costs $4/$18. Cursor excludes CursorBench from the launch comparison because Grok 4.5 training accidentally included an older Cursor code snapshot.

Grok 4.5 is not Composer 2.5’s product successor. Cursor says they occupy different weight classes, Composer will remain available, and future models of Composer’s size will continue. Use Grok for broader, harder, longer work and Composer for cheaper, faster coding loops. Cursor’s launch results report 64.7 vs 54.0 on SWE-Bench Pro, 83.3 vs 73.0 on Terminal-Bench 2.1, and 62 vs 18 on DeepSWE 1.0 for Grok 4.5 and Composer 2.5 respectively. Artificial Analysis measured Grok 4.5 + Grok Build at 76 on its current coding-agent index versus 52 for Composer 2.5 + Cursor CLI. Treat both comparisons as system-level snapshots with stated benchmark versions and different agent harnesses, not isolated measurements of model weights.

On June 16, SpaceX signed an agreement to acquire Anysphere/Cursor at an implied $60B equity value. This was not a completed xAI acquisition as of July 11: the deal remained subject to closing conditions and regulatory approval, with closing expected in Q3 2026. xAI had previously joined SpaceX, and the combined AI operation is branded SpaceXAI.

Claude Fable 5: The Highest-Capability Tier

Released June 9, 2026 and available in Cursor’s model picker, Claude Fable 5 (claude-fable-5) is Anthropic’s highest-capability tier — described as “a Mythos-class model that we’ve made safe for general use”. Anthropic states its lead over the rest of the family grows with task length and complexity, which is the honest way to read it: Fable is for work that outlasts a single sitting, not for routine throughput. Opus 5 now leads it on most published measurements, including the Artificial Analysis Intelligence Index. It has a 1M context window and exposes the full effort range: low, medium, high, xhigh, and max, with thinking always on.

The catch is cost: at $10 / $50 per 1M tokens it is exactly 2x Opus 5, so it burns through your usage budget twice as fast. Before paying that, raise Opus 5’s effort level — low and medium on the current generation often beat xhigh on prior models, which makes an effort sweep cheaper than a tier upgrade. On Claude plans, Fable 5 has been permanently included on Max and Team Premium since July 20, 2026, capped at 50% of weekly usage limits; Pro and Team Standard reach it through usage credits — see the model comparison appendix for details.

Claude Opus 5: The Default for Serious Work

Claude Opus 5 (claude-opus-5) is Anthropic’s current Opus, released July 24, 2026 at the same $5 / $25 price as Opus 4.8. It offers state-of-the-art agentic performance — Anthropic reports SOTA results on Frontier-Bench v0.1 and GDPval-AA v2, a win over Fable 5 on OSWorld 2.0 at roughly a third of the cost, and a CursorBench 3.2 score within 0.5 points of Fable 5’s peak at max effort. It supports high-resolution image input (up to 2576px on the long edge / 3.75MP) for screenshot/artifact analysis, and its May 2026 knowledge cutoff is the freshest of any Claude model. In Cursor, select it explicitly in the model picker; use Max Mode only when the larger context is worth the added latency and cost.

When to Use Opus 5

Complex multi-file refactoring
Architecture design and system planning
Security audits and code review
Test generation for nuanced business logic
Computer-use / screenshot workflows (2576px resolution with 1:1 pixel mapping)
Any task where getting it right the first time saves more money than the model costs

Prompting Tips for Opus 5

Opus 5 exposes the full low/medium/high/xhigh/max effort range and defaults to high; xhigh suits most agentic coding. Adjust three habits carried over from earlier models. First, it verifies its own work without being told, so delete “double-check your answer” scaffolding rather than rewriting it; those instructions now cause over-verification. Second, it writes longer responses by default, and effort is not the lever that shortens them, so ask for concision explicitly. Third, it can quietly widen a task’s scope, so state the boundary when you want exactly what you asked for and nothing adjacent.

Analyze the authentication system across all files in src/auth/ and src/middleware/.
Identify security vulnerabilities and architectural issues.
Propose a refactoring plan that addresses each issue.
Before implementing, explain your reasoning and ask if I want to adjust the approach.

Claude Sonnet 5: The Budget Workhorse

At its introductory $2/$10 rate, Sonnet 5 costs 60% less per input and output token than Opus 5 at $5/$25; after August 31, its $3/$15 rate will be 40% lower. It keeps a full 1M context window and is often a cost-effective choice for straightforward tasks such as utilities, form fields, and standard CRUD endpoints — verify on your own workload rather than assuming identical quality.
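
The percentages above follow directly from the per-token rates. A short sketch of the arithmetic, where `percent_saving` is a hypothetical helper name introduced only for illustration:

```python
def percent_saving(cheaper_rate: float, reference_rate: float) -> float:
    """Percentage by which cheaper_rate undercuts reference_rate."""
    return (1 - cheaper_rate / reference_rate) * 100


# (input, output) rates in USD per 1M tokens, as quoted in this article.
OPUS_5 = (5.0, 25.0)
SONNET_5_INTRO = (2.0, 10.0)  # introductory rate
SONNET_5_LATER = (3.0, 15.0)  # rate after August 31

intro = [round(percent_saving(s, o)) for s, o in zip(SONNET_5_INTRO, OPUS_5)]
later = [round(percent_saving(s, o)) for s, o in zip(SONNET_5_LATER, OPUS_5)]
print(intro)  # [60, 60]
print(later)  # [40, 40]
```

Because input and output rates scale by the same factor, the saving is identical on both sides of the request regardless of the input-to-output ratio.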

When to Switch to Sonnet 5

Routine coding where the pattern is well-established
Tasks where you will review and iterate regardless
When your monthly usage budget is running low
Long conversations where you need 1M context but want to manage cost

The Practical Test

If you can describe the task in one sentence and the expected output is predictable, Sonnet 5 will handle it. If the task requires weighing tradeoffs or understanding subtle architectural implications, stick with Opus 5.

Create a TypeScript utility function that debounces async functions.
It should support cancellation and return the result of the last invocation.
Include comprehensive tests using Vitest.

Claude Haiku 4.5: Fast, Cheap, Near-Frontier

Haiku 4.5 (claude-haiku-4-5) is the fastest Claude at $1/$5 per MTok with 200k context. It’s meaningfully better than anything in its tier for focused tasks: short refactors, code explanation, lint-style feedback. In the Agents Window, drop it into one pane as a “fast reviewer” that inspects Opus 5’s output while Opus keeps iterating.

Gemini 3.1 Pro: The Context Specialist

Gemini 3.1 Pro’s headline feature is its 1M token context window (accessible via Max mode). When you need the AI to understand your entire codebase at once — not just the files you manually reference — Gemini 3.1 Pro is the model to reach for.
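
Filling that larger window also changes the price: the Quick Reference lists $2 / $12 per 1M tokens up to 200k input tokens and $4 / $18 above. The sketch below assumes the higher rate applies to the whole request once the input crosses the threshold, rather than only to the tokens beyond it; that billing rule is an assumption to confirm against current provider pricing, and `gemini_request_cost` is a hypothetical helper.

```python
TIER_THRESHOLD = 200_000  # input tokens; boundary taken from the Quick Reference


def gemini_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost, assuming the whole request bills at a single tier."""
    if input_tokens <= TIER_THRESHOLD:
        input_rate, output_rate = 2.0, 12.0
    else:
        input_rate, output_rate = 4.0, 18.0
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


print(gemini_request_cost(150_000, 5_000))  # 0.36
print(gemini_request_cost(600_000, 5_000))  # 2.49
```

Under this assumption, a request that quadruples the input costs roughly seven times as much, which is one reason to reserve the 1M window for tasks that need it.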

When to Switch to Gemini 3.1 Pro

Analyzing large codebases (50k+ lines) where cross-module understanding matters
Working with images — paste a screenshot of a UI bug or a Figma design directly into chat
Reviewing architectural diagrams or documentation that includes visual elements
Tasks where context volume matters more than reasoning depth
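
A rough way to judge whether a codebase needs the larger window is to estimate its token count. The sketch below assumes about 4 characters per token and 40 characters per line; both figures are loose heuristics rather than properties of any particular tokenizer, and the function names are hypothetical.

```python
CHARS_PER_TOKEN = 4  # rough heuristic; real tokenizers vary by model and language


def estimate_tokens(line_count: int, avg_chars_per_line: int = 40) -> int:
    """Rough token estimate for a codebase of the given size."""
    return line_count * avg_chars_per_line // CHARS_PER_TOKEN


def context_needed(tokens: int) -> str:
    """Map a token estimate onto the context sizes listed in this article."""
    if tokens <= 200_000:
        return "normal 200k context"
    if tokens <= 1_000_000:
        return "Max Mode (1M context)"
    return "exceeds 1M: narrow the scope with @ references"


tokens = estimate_tokens(50_000)
print(tokens, "->", context_needed(tokens))  # 500000 -> Max Mode (1M context)
```

With these heuristics a 50k-line codebase lands around 500k tokens, beyond the normal 200k context but inside a 1M window; measure your own repository before relying on the estimate.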

Multimodal Advantage

Gemini 3.1 Pro handles images natively. Drag a screenshot into the Cursor chat and ask it to reproduce the layout, identify the visual bug, or implement a design from a mockup.

I've attached a screenshot of our current dashboard. The spacing between the
metric cards is inconsistent and the chart legend overlaps on mobile.
Fix the responsive layout in src/components/Dashboard.tsx to match the
spacing from our design system (8px base grid).

GPT-5.6 Sol and Gemini 3.1 Pro: The Alternative Frontier

GPT-5.6 Sol reached general availability on July 9, 2026 and is OpenAI’s flagship model, available in Cursor’s model picker. It excels at agentic coding, computer use, knowledge work, and research workflows. At $5/$30 per MTok with roughly 1.05M context, it competes directly with premium frontier models. Use it when you want a different “perspective”: switching model families sometimes unsticks a problem that one family keeps getting wrong. The /best-of-n command runs a task across several parallel worktrees (agents) so you can compare diffs and merge the best one. Point each worktree at a different model (say Composer 2.5, Opus 5, and GPT-5.6 Sol) to compare families head-to-head.

Gemini 3.1 Pro remains the headline choice for multimodal work. Drag a screenshot into the Agents Window or Design Mode and ask it to reproduce the layout, identify a visual bug, or implement from a mockup.

The Decision Tree

When a new task comes in, run through this:

Do you want Cursor to route dynamically?
Yes: Leave Auto on. If predictable low cost and fast iteration matter more than dynamic routing, select Composer 2.5 manually.

Is this a complex, multi-file task or architectural decision?
Yes: Use Claude Opus 5 (raise effort to xhigh or max for the hardest reasoning). For the very hardest of these, such as codebase-wide migrations or building apps from scratch, use Claude Fable 5 if budget allows. Or try /best-of-n to compare against GPT-5.6 Sol and Composer 2.5.

Is this a simple, well-defined task with predictable output?
Yes: Use Claude Sonnet 5 or stick with Composer 2.5 (if speed matters). Claude Haiku 4.5 is faster and cheaper still for focused tasks.

Do you need to analyze more than 200k tokens of context?
Yes: Enable Max Mode and choose Opus 5, Sonnet 5, GPT-5.6 Sol, or Gemini 3.1 Pro, subject to the limit Cursor exposes for that model.

Are you working with images, screenshots, or diagrams?
Yes: Opus 5 (high-resolution 2576px support), Gemini 3.1 Pro (native multimodal), or GPT-5.6 Sol (computer use).

Are you stuck because the current model keeps making the same mistake?
Yes: Switch model family. /best-of-n is the fastest way to try three options at once.
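
The questions above can be sketched as a small function. This is one possible encoding, not a Cursor feature: the `Task` fields and the `recommend` helper are hypothetical, the function returns the first match in the article's order, and the final fallback is an assumption because the tree names none.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Illustrative task description; every field name here is hypothetical."""
    want_dynamic_routing: bool = False
    complex_multi_file: bool = False
    simple_and_predictable: bool = False
    context_tokens: int = 0
    has_images: bool = False
    stuck_on_same_mistake: bool = False


def recommend(task: Task) -> str:
    """Return the suggestion for the first question answered 'yes'."""
    if task.want_dynamic_routing:
        return "Auto"
    if task.complex_multi_file:
        return "Claude Opus 5 (Fable 5 for the very hardest work, if budget allows)"
    if task.simple_and_predictable:
        return "Claude Sonnet 5 or Composer 2.5"
    if task.context_tokens > 200_000:
        return "Max Mode with Opus 5, Sonnet 5, GPT-5.6 Sol, or Gemini 3.1 Pro"
    if task.has_images:
        return "Opus 5, Gemini 3.1 Pro, or GPT-5.6 Sol"
    if task.stuck_on_same_mistake:
        return "Switch model family; try /best-of-n"
    # Assumption: the article names no fallback, so this sketch keeps the status quo.
    return "Keep the current model"


print(recommend(Task(complex_multi_file=True, has_images=True)))
```

Returning the first match keeps the sketch short, but real tasks often answer yes to several questions at once; a complex task with screenshots, for example, resolves to Opus 5 here because that question comes first.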

Switching Models in Cursor

The model picker is in the agent panel, right next to the mode selector. You can switch models mid-conversation — the new model picks up the existing context. In the Cursor 3.0 Agents Window you can run different models per agent tab simultaneously.

Keyboard shortcut: Press Cmd+. (macOS) or Ctrl+. (Windows/Linux) to quickly cycle through modes. For model selection, click the model name in the agent panel.

Auto Mode (Dynamic Routing)

Auto mode in Cursor 3.x selects a premium model that it considers reliable for the immediate task and can switch when a provider’s output degrades. Cursor does not publicly guarantee a fixed per-request model or document Composer 2.5 as the permanent Auto default. If you need a particular model, latency profile, or predictable per-token rate, select it manually and check the usage dashboard afterward.

For beginners, Auto is a reasonable starting point. Once you develop a feel for which model suits which task, manual selection gives you more control and often better results.

Cost Optimization in Practice

Relative Usage Patterns

Developer Style	Primary Model	Relative Cost Profile
All Opus 5 / `xhigh`	Claude Opus 5	Highest; premium tokens and higher effort compound usage
Mixed (recommended)	Composer 2.5 selected manually for routine, Opus 5 for complex	Moderate; spend premium tokens only where they change the result
Budget-conscious	Composer 2.5 Standard + Sonnet 5 + Haiku 4.5	Lowest of these patterns; prefer lower-rate models and shorter context

Cost-Saving Strategies

Select Composer 2.5 manually for cheap iteration; use Auto when dynamic reliability matters more than a fixed model
Start conversations fresh — long conversations accumulate context that costs money on every message
Use @ references instead of pasting large blocks of code — Cursor handles file references more efficiently
Reserve Opus 5 xhigh for genuinely hard problems — the higher effort level spends materially more tokens
Enable Max mode only when needed — do not leave it on permanently
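
To see why starting fresh saves money, consider a simplified model in which every turn resends the full conversation history as input. The sketch assumes a fixed number of tokens added per turn and no caching or summarization, which real clients may apply; `cumulative_input_tokens` is a hypothetical helper.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens billed if every turn resends the full history."""
    # Turn k sends k * tokens_per_turn tokens (history plus the new message),
    # so the total is tokens_per_turn * (1 + 2 + ... + turns).
    return tokens_per_turn * turns * (turns + 1) // 2


one_long_chat = cumulative_input_tokens(turns=40, tokens_per_turn=2_000)
four_fresh_chats = 4 * cumulative_input_tokens(turns=10, tokens_per_turn=2_000)
print(one_long_chat)     # 1640000
print(four_fresh_chats)  # 440000
```

In this model the same 40 turns cost almost four times as many input tokens in one long chat as in four fresh ones, because the total grows with the square of the conversation length. The trade-off is that each fresh chat needs its context re-established, for example with @ references.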

When This Breaks

Model seems to have gotten worse: API performance, host configuration, and aliases can change. Check Cursor’s status and changelog, retry a controlled prompt, and switch models temporarily if the result is consistently worse.

Switching models mid-conversation loses context: Rare but can happen with very long conversations. If you notice degraded quality after a switch, start a new chat with the new model and @-reference specific files.

Auto mode keeps selecting a model you dislike: Disable Auto and select manually. The two seconds it takes to choose a model are worth the consistency.

Usage runs out before month end: Check your usage in Settings. If you are burning through Opus 5 tokens on tasks that Sonnet 5 or Composer 2.5 could handle, shift your default for routine work.

Opus 5 responses feel too terse: Ask explicitly for the level of detail you need. As noted in the prompting tips, effort is not the lever for response length, so raise it to xhigh only when the task needs harder reasoning. If the response style still does not fit the task, try a different model family in a fresh, controlled comparison.

Verification Sources

Cursor: current model and Auto documentation, Grok 4.5 launch, and Composer 2.5 launch
Independent results: Artificial Analysis coding-agent leaderboard, Grok 4.5 analysis, and v1.1 methodology
Transaction status: SpaceX Form 8-K filed with the SEC

What’s Next

Project Rules: Set up .cursor/rules/ so the AI generates code that matches your team's conventions from the very first prompt.