Model Selection

Learn to select the right AI model for each task. This 10-minute guide will help you balance speed, cost, and capability to maximize productivity while controlling expenses.

| Model | Speed | Cost | Context | Best Use Case |
|---|---|---|---|---|
| Claude 4 Sonnet | ⚡⚡⚡ | $ | 128k | Daily coding (80% of tasks) |
| Claude 4 Opus | ⚡⚡ | $$$$$ | 200k | Architecture & planning |
| OpenAI o3 | | $$$ | 200k | Complex debugging |
| Gemini 2.5 Pro | ⚡⚡⚡ | $$ | 1M | Large codebase work |
| GPT 4.1 | ⚡⚡ | $$ | 1M | General purpose backup |

Monthly Budget Impact

Typical Developer Usage

  • Sonnet 4 only: ~$50/month
  • Mixed strategy: ~$100/month
  • Heavy Opus 4: ~$300/month
  • Unoptimized: ~$500+/month

Token Economics

Per Million Tokens

  • Sonnet 4: $3
  • Gemini 2.5: $7
  • Opus 4: $15 (5x Sonnet)
  • o3: Variable pricing
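
To make the per-million-token prices concrete, the sketch below estimates what a session might cost on each model. The prices are the ones listed above; the function name `estimate_cost` and the 200,000-token session are hypothetical, and the sketch assumes one flat rate per model, ignoring any difference between input and output pricing.

```python
# Prices in USD per million tokens, taken from the list above.
# o3 is left out because its pricing is listed as variable.
PRICE_PER_MILLION = {
    "sonnet-4": 3.0,
    "gemini-2.5": 7.0,
    "opus-4": 15.0,
}


def estimate_cost(model: str, tokens: int) -> float:
    """Return the estimated cost in USD for `tokens` tokens on `model`."""
    if model not in PRICE_PER_MILLION:
        raise ValueError(f"No fixed price listed for {model!r}")
    return PRICE_PER_MILLION[model] * tokens / 1_000_000


if __name__ == "__main__":
    # A hypothetical 200,000-token session on each model.
    for name in PRICE_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 200_000):.2f}")
```

Under these assumptions, the same session costs five times as much on Opus 4 as on Sonnet 4, which matches the "5x Sonnet" note in the price list.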

Sonnet 4 covers everyday coding tasks:

  • Feature implementation
  • Bug fixes
  • Code reviews
  • Refactoring
  • Test writing
  • Documentation
  • API integration
  • Database queries

Prompting tips:

  1. Clear, specific prompts yield better results than vague requests
  2. Include examples in your prompts for consistent output
  3. Break complex tasks into smaller, focused requests
  4. Use project rules to maintain consistency

// GOOD: Specific request for Sonnet 4
"Create a TypeScript function that validates email addresses
using regex, returns a Result<string, ValidationError> type,
and includes unit tests"

// POOR: Vague request that might need a stronger model
"Make the authentication system better"

Decision Framework

Use Opus 4 when you need:

  1. System Design: Architecture decisions affecting multiple modules
  2. Complex Refactoring: Changes touching 50+ files
  3. Algorithm Design: Non-trivial algorithms requiring deep reasoning
  4. Technical Planning: Breaking down complex features into tasks
  5. Code Generation: Creating entire modules from specifications

Example prompts for Opus 4:

  1. Start with Planning

    "Analyze our current authentication system and propose
    a migration plan to OAuth 2.0 with backward compatibility"
  2. Generate Architecture

    "Design a scalable event-driven architecture for our
    notification system supporting email, SMS, and push"
  3. Complex Problem Solving

    "Optimize this graph traversal algorithm for finding
    shortest paths in a weighted directed graph with
    negative edges"

o3 excels at tasks requiring extended reasoning:

Debugging

Complex race conditions, memory leaks, performance bottlenecks

Algorithms

Dynamic programming, graph algorithms, optimization problems

Logic Puzzles

Business rule engines, constraint satisfaction, state machines

Mathematics

Statistical analysis, ML algorithms, cryptographic implementations

  1. Provide extensive context - o3 thrives on information
  2. Ask for reasoning - “Explain your thought process”
  3. Iterate on solutions - o3 improves with feedback
  4. Verify outputs - Complex reasoning can have edge cases

PROMPT: "Our application has a memory leak that occurs after
approximately 1000 API calls. The leak seems related to our
caching layer. Here's the relevant code and memory profiler
output. Please analyze and provide a fix."

WHY o3: This requires deep analysis of code execution patterns,
memory management, and identifying subtle issues that simpler
models might miss.

Gemini 2.5 Pro's 1M-token context suits large-codebase work:

  • Analyzing entire repositories
  • Cross-module refactoring
  • Large-scale code reviews
  • Documentation generation from code
  • Understanding legacy systems
  • Dependency analysis

  1. Load Large Context

    @codebase "Analyze our entire authentication module
    including all services, controllers, and tests"
  2. Cross-Reference Analysis

    "Find all places where UserRole enum is used and
    suggest a migration to a more flexible permission system"
  3. Generate Documentation

    "Create comprehensive API documentation for all
    endpoints in the /api/v2 directory"

graph TD
    Start[New Task] --> Size{Codebase Size?}
    Size -->|< 10 files| Simple{Task Complexity?}
    Size -->|10-50 files| Medium{Need Reasoning?}
    Size -->|> 50 files| Large[Gemini 2.5 Pro]
    Simple -->|Basic| Sonnet[Claude 4 Sonnet]
    Simple -->|Complex| Reason{Type of Complexity?}
    Medium -->|No| Sonnet
    Medium -->|Yes| Opus[Claude 4 Opus]
    Reason -->|Architecture| Opus
    Reason -->|Debugging| o3[OpenAI o3]
    Reason -->|Algorithm| o3
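
The decision flowchart above can also be read as a small function. The following is a minimal sketch, not part of any tool: the name `choose_model` and its parameters are hypothetical, and it assumes that a "Complex" task in the small-codebase branch is the same thing as one that needs reasoning.

```python
def choose_model(file_count: int, needs_reasoning: bool = False,
                 complexity_type: str = "") -> str:
    """Pick a model by following the branches of the flowchart.

    complexity_type is only consulted for complex tasks touching fewer
    than 10 files: "architecture", "debugging", or "algorithm".
    """
    if file_count > 50:
        return "Gemini 2.5 Pro"
    if not needs_reasoning:
        # Basic tasks (< 10 files) and 10-50 file tasks without reasoning.
        return "Claude 4 Sonnet"
    if file_count >= 10:
        # 10-50 files and reasoning is needed.
        return "Claude 4 Opus"
    # Fewer than 10 files, complex task: branch on the type of complexity.
    if complexity_type == "architecture":
        return "Claude 4 Opus"
    if complexity_type in ("debugging", "algorithm"):
        return "OpenAI o3"
    raise ValueError(f"Unknown complexity type: {complexity_type!r}")


if __name__ == "__main__":
    print(choose_model(3))
    print(choose_model(5, needs_reasoning=True, complexity_type="debugging"))
    print(choose_model(120))
```

The flowchart does not say what to do with a complex small task of an unlisted type, so the sketch raises an error there rather than guessing.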

  1. Start with Sonnet 4 for initial implementation
  2. Switch to Opus 4 only for planning/architecture
  3. Use o3 only for specific complex problems
  4. Enable Gemini only for large-scale analysis

Reduce Context

  • Clear chat regularly
  • Use focused @mentions
  • Exclude irrelevant files
  • Summarize long discussions

Reuse Context

  • Save useful prompts
  • Create project rules
  • Build prompt templates
  • Use memory feature

# EXPENSIVE: Multiple Opus 4 calls
"Refactor the auth service" (Opus 4)
"Now refactor the user service" (Opus 4)
"Now refactor the profile service" (Opus 4)

# EFFICIENT: Single comprehensive call
"Refactor auth, user, and profile services to follow
our new architecture pattern. Provide implementation
plan first, then execute." (Opus 4 once)
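
If you batch requests like this often, a small helper can assemble the combined prompt. This is only a sketch: `build_batched_prompt` is a hypothetical name, and the wording it produces simply mirrors the efficient example above.

```python
def build_batched_prompt(services: list[str], goal: str) -> str:
    """Combine several per-service refactor requests into one prompt."""
    if not services:
        raise ValueError("At least one service is required")
    if len(services) == 1:
        names = services[0]
    elif len(services) == 2:
        names = " and ".join(services)
    else:
        names = ", ".join(services[:-1]) + ", and " + services[-1]
    noun = "service" if len(services) == 1 else "services"
    return (
        f"Refactor {names} {noun} to {goal}. "
        "Provide implementation plan first, then execute."
    )


if __name__ == "__main__":
    print(build_batched_prompt(
        ["auth", "user", "profile"],
        "follow our new architecture pattern",
    ))
```

One combined request means the model sees all three services together and the shared architecture instructions are sent once rather than three times.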

  1. Exploration Phase (Sonnet 4)

    • Understand the problem
    • Gather context
    • Initial attempts
  2. Planning Phase (Opus 4)

    • Architecture design
    • Break down complex tasks
    • Create implementation plan
  3. Implementation Phase (Sonnet 4)

    • Execute the plan
    • Write code
    • Create tests
  4. Debug Phase (o3 if needed)

    • Solve complex issues
    • Optimize algorithms
    • Fix edge cases
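
The four phases above map onto models in a fixed way, which can be written down as a lookup. In this sketch the names `PHASE_MODELS` and `model_for_phase` are hypothetical, and falling back to Sonnet 4 when a debug task does not need o3 is an assumption; the workflow itself only says "o3 if needed".

```python
# Phase-to-model mapping taken from the workflow above.
PHASE_MODELS = {
    "exploration": "Sonnet 4",
    "planning": "Opus 4",
    "implementation": "Sonnet 4",
}


def model_for_phase(phase: str, hard_problem: bool = False) -> str:
    """Return the model suggested for a workflow phase."""
    if phase == "debug":
        # Assumption: stay on Sonnet 4 unless the issue is hard enough for o3.
        return "o3" if hard_problem else "Sonnet 4"
    if phase not in PHASE_MODELS:
        raise ValueError(f"Unknown phase: {phase!r}")
    return PHASE_MODELS[phase]


if __name__ == "__main__":
    for phase in ("exploration", "planning", "implementation", "debug"):
        print(phase, "->", model_for_phase(phase, hard_problem=True))
```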

Task: Add user notifications
Models Used:
- Sonnet 4: Initial implementation (90%)
- Opus 4: System design (10%)
Total Cost: ~$5
Time Saved: 4 hours

Task: Fix memory leak in production
Models Used:
- Sonnet 4: Initial investigation (20%)
- o3: Deep analysis and fix (80%)
Total Cost: ~$15
Time Saved: 8 hours debugging

Task: Migrate to new framework
Models Used:
- Opus 4: Planning (20%)
- Gemini 2.5: Analysis (30%)
- Sonnet 4: Implementation (50%)
Total Cost: ~$40
Time Saved: 20 hours
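
One way to compare the three examples above is cost per hour saved. That metric is not part of the original figures; it is just the approximate total cost divided by the time saved, and the names `CASE_STUDIES` and `cost_per_hour_saved` are hypothetical.

```python
# (task, approximate total cost in USD, hours saved) from the examples above.
CASE_STUDIES = [
    ("Add user notifications", 5.0, 4.0),
    ("Fix memory leak in production", 15.0, 8.0),
    ("Migrate to new framework", 40.0, 20.0),
]


def cost_per_hour_saved(cost_usd: float, hours_saved: float) -> float:
    """Return USD spent on models per hour of developer time saved."""
    if hours_saved <= 0:
        raise ValueError("hours_saved must be positive")
    return cost_usd / hours_saved


if __name__ == "__main__":
    for task, cost, hours in CASE_STUDIES:
        print(f"{task}: ${cost_per_hour_saved(cost, hours)} per hour saved")
```

With these figures the three tasks come to $1.25, $1.875, and $2.00 per hour saved, so even the most expensive mix stays at a couple of dollars per hour of developer time.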

Opus 4: "Create a detailed plan for implementing OAuth"
↓ (Save plan to file)
Sonnet 4: "Implement step 1 from oauth-plan.md"
Sonnet 4: "Implement step 2 from oauth-plan.md"
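
The per-step Sonnet 4 prompts in this handoff can be generated from the saved plan. The sketch below assumes the plan file uses numbered lines such as `1. ...`; the function name `step_prompts` and the sample plan text are hypothetical.

```python
import re


def step_prompts(plan_text: str, plan_file: str) -> list[str]:
    """Build one implementation prompt per numbered step in the plan."""
    prompts = []
    for line in plan_text.splitlines():
        # Match lines that start with a step number, e.g. "1. Do something".
        match = re.match(r"\s*(\d+)\.\s+\S", line)
        if match:
            prompts.append(f"Implement step {match.group(1)} from {plan_file}")
    return prompts


if __name__ == "__main__":
    # Hypothetical plan contents, standing in for the Opus 4 output.
    sample_plan = "1. Register the OAuth client\n2. Add the callback route\n"
    for prompt in step_prompts(sample_plan, "oauth-plan.md"):
        print(prompt)
```

Each generated prompt refers to the plan file rather than repeating the plan, so the cheaper model receives a short, focused request.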

Open multiple Cursor instances:

  • Instance 1: Opus 4 for architecture
  • Instance 2: Sonnet 4 for implementation
  • Instance 3: o3 for testing edge cases

Prompting style:

- Direct and specific
- Include code examples
- Reference file paths
- Clear success criteria

Monitor your usage:

  1. Check usage: Settings → Usage
  2. Set budget alerts
  3. Review weekly patterns
  4. Optimize based on data

Patterns to watch for:

  • High Opus 4 usage: Consider better planning
  • Repeated similar tasks: Create rules/templates
  • Long conversations: Clear context more often
  • Failed attempts: Switch models earlier
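
A budget check can be as simple as projecting the month-end total from spend so far. This sketch assumes you read the spend figure manually from Settings → Usage and that spending is roughly even across the month; the function names and the sample numbers are hypothetical, and it does not call any Cursor API.

```python
def projected_monthly_spend(spent_so_far: float, day_of_month: int,
                            days_in_month: int = 30) -> float:
    """Linearly project the month-end spend from spend to date."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must be between 1 and days_in_month")
    return spent_so_far / day_of_month * days_in_month


def over_budget(spent_so_far: float, day_of_month: int,
                monthly_budget: float, days_in_month: int = 30) -> bool:
    """Return True if the projected spend exceeds the monthly budget."""
    projected = projected_monthly_spend(spent_so_far, day_of_month, days_in_month)
    return projected > monthly_budget


if __name__ == "__main__":
    # Hypothetical: $60 spent by day 15 against a $100 monthly budget.
    print(projected_monthly_spend(60.0, 15))
    print(over_budget(60.0, 15, 100.0))
```

In the hypothetical case shown, $60 by mid-month projects to $120, above the ~$100/month mixed-strategy figure, which would be a cue to review the patterns listed above.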

Before starting a task, ask:

  • Can Sonnet 4 handle this? (Start here)
  • Do I need deep reasoning? (Consider Opus 4/o3)
  • Is context size the issue? (Consider Gemini)
  • Am I using the right prompting style?
  • Can I break this into smaller tasks?

Continue to Project Rules

Now let’s set up project rules to ensure consistent AI behavior across all models.
