When to Use Agent vs Ask Mode

You need to understand how your payment processing module works before refactoring it. You open the AI in its default agent mode, ask “how does payment processing work in this codebase?” and the AI immediately starts reading files, running grep commands, and consuming 40,000 tokens of context before giving you a three-sentence answer. You could have gotten the same answer with a fraction of the context usage if you had used the right mode.

All three tools (Cursor, Claude Code, and Codex) provide modes that control how much autonomy the AI has. The spectrum ranges from “read and analyze only” to “execute everything autonomously.” Understanding this spectrum, and when to use each setting, is fundamental to working efficiently.

What You’ll Walk Away With

A clear mapping of modes across all three tools
Decision criteria for choosing the right mode for each task type
Prompts optimized for each mode
Strategies for transitioning between modes during a single workflow

The Mode Spectrum

All three tools share the same fundamental spectrum, even though they use different terminology:

Capability	Cursor	Claude Code	Codex
Read-only analysis	Ask mode (or Plan mode to draft a plan)	Plan Mode (Shift+Tab)	`--ask-for-approval untrusted` + `--sandbox read-only`
Guided execution	Agent mode (default)	Normal Mode (default)	`--ask-for-approval on-request` (or the `--full-auto` preset)
Full autonomy	Auto-Run “Run Everything” / Cloud Agent	`--dangerously-skip-permissions` / Sandbox	`--ask-for-approval never`
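
Read as data, the table is a lookup from tool and autonomy level to a setting. The sketch below is one way to keep that mapping handy in a shell profile. `mode_setting` is a hypothetical helper, and its strings are copied from the table rather than queried from the tools.

```shell
# Hypothetical helper: look up the table's setting for a tool and autonomy level.
# The strings are copied from the table above; nothing here talks to the tools.
mode_setting() {
  case "$1:$2" in
    cursor:read-only)  echo "Ask mode" ;;
    cursor:guided)     echo "Agent mode" ;;
    cursor:autonomous) echo "Auto-Run: Run Everything" ;;
    claude:read-only)  echo "Plan Mode" ;;
    claude:guided)     echo "Normal Mode" ;;
    claude:autonomous) echo "--dangerously-skip-permissions" ;;
    codex:read-only)   echo "--ask-for-approval untrusted --sandbox read-only" ;;
    codex:guided)      echo "--ask-for-approval on-request" ;;
    codex:autonomous)  echo "--ask-for-approval never" ;;
    *) echo "unknown tool or level: $1 $2" >&2; return 1 ;;
  esac
}

# Prints the Codex flags from the read-only row.
mode_setting codex read-only
```

An unknown tool or level returns a non-zero status, so a calling script can stop instead of guessing a mode.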

Read-Only / Analysis Mode

Use this when you want to understand code without modifying it. The AI reads files and answers questions but cannot make changes.

In Cursor, switch to Ask mode in the mode selector. The AI analyzes code and answers questions without making changes. If the investigation should end in a written plan rather than an answer, use Plan mode instead. A typical Ask-mode prompt:

How does the authentication flow work? Trace the request from
the login endpoint through middleware to the session store.
Show me the key files and functions involved.

Ask mode is token-efficient because the AI focuses on answering your question rather than exploring broadly. It uses the codebase index for retrieval rather than reading files one by one.

In Claude Code, toggle Plan Mode with Shift+Tab. Claude reads files and explores the codebase but cannot write to any file or run destructive commands:

Explain the authentication flow. Start from the login endpoint
and trace through middleware, session management, and token
refresh. Identify the key files and any potential issues.

Plan Mode is particularly useful for code reviews and architecture analysis. Claude can read as many files as needed without risk of accidental modification.

In the Codex CLI, pair --ask-for-approval untrusted (or on-failure) with --sandbox read-only so that Codex can browse and reason but cannot touch files:

Explain the authentication flow in this codebase. Trace the
request lifecycle from login to session creation. Identify
key files and potential security concerns.

In the Codex IDE extension, switch the approval mode to Chat, the read-only equivalent: Codex reads, reasons, and plans without making any changes.
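
For the CLI pairing described above, a small wrapper saves retyping both flags. This is a minimal sketch: `codex_readonly` is a hypothetical function name, and it assumes the `codex` CLI is on your PATH and accepts the prompt as a trailing argument.

```shell
# Hypothetical wrapper: start Codex with the read-only pairing from this section.
# Assumes the `codex` CLI is installed and takes the prompt as a trailing argument.
codex_readonly() {
  codex --ask-for-approval untrusted --sandbox read-only "$@"
}

# Usage (not executed here):
#   codex_readonly "Explain the authentication flow in this codebase."
```

Because the flags live in one place, loosening the sandbox for a session becomes a deliberate edit to the wrapper rather than a habit.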

Best for: Code review, architecture analysis, onboarding to a new codebase, understanding unfamiliar code, investigating bugs before fixing them.

Guided Execution Mode

The default mode for all three tools. The AI can read files, make changes, and run commands, but asks for permission at key points. This is the workhorse mode for most development tasks.

Agent mode is the default. Cursor’s agent reads files, proposes edits, and runs terminal commands. You can review changes in the diff view before accepting:

Implement the rate limiter middleware following the pattern in
@src/middleware/auth.ts. Write tests in @src/middleware/__tests__/.
Run the tests after implementation.

Configure which tools the agent can use in Cursor Settings. You can allow file edits but require approval for terminal commands, or vice versa.

Normal Mode is the default. Claude asks permission for file writes and potentially destructive commands, but reads files freely:

Implement the rate limiter middleware following the pattern in
src/middleware/auth.ts. Write tests and run them. Fix any
failures before finishing.

Tune the permission level with /permissions. Allow specific safe commands (like npm test) to reduce interruptions while keeping approval for destructive operations.
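
The same allowlist can be kept in the repository so that it survives across sessions. The sketch below assumes Claude Code reads project settings from .claude/settings.json with permissions.allow and permissions.deny rule lists. Treat the file path, the `Bash(git push:*)` deny rule, and the helper name `write_claude_permissions` as assumptions to check against the current documentation.

```shell
# Hypothetical helper: write a project-level allowlist so safe commands stop prompting.
# Assumes Claude Code reads .claude/settings.json with permissions.allow / permissions.deny.
# Overwrites any existing file; merge by hand if you already have settings there.
write_claude_permissions() (
  project_dir="${1:-.}"
  mkdir -p "$project_dir/.claude"
  cat > "$project_dir/.claude/settings.json" <<'EOF'
{
  "permissions": {
    "allow": ["Bash(npm test)", "Bash(npm run lint)"],
    "deny": ["Bash(git push:*)"]
  }
}
EOF
)

# Usage (not executed here):
#   write_claude_permissions /path/to/project
```

The allow list covers the verification commands the agent runs repeatedly, while anything that is not listed still goes through the normal approval prompt.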

--ask-for-approval on-request lets Codex run read-only commands freely but pause for your approval before writing files or running anything outside the sandbox. The --full-auto preset bundles this with a workspace-write sandbox for low-friction local work:

Implement the rate limiter middleware following existing patterns.
Write tests and run them. Fix any failures.

In the IDE extension, Codex shows inline diffs for each proposed change. Accept or reject individual hunks rather than whole files.

Best for: Feature implementation, bug fixes, refactoring, test writing, most day-to-day development work.

Full Autonomy Mode

The AI runs without interruption. Powerful for well-defined, low-risk tasks. Dangerous for anything touching sensitive code.

In Cursor, setting Auto-Run to Run Everything (informally called YOLO mode) auto-accepts all changes and commands. Cloud Agent instead runs tasks asynchronously in an isolated cloud environment:

Fix all ESLint warnings in src/components/. Run npm run lint
after each fix to verify. Commit each fix separately with a
descriptive message.

Cloud Agent is the safer option for autonomous work. It runs on a clone of your repo in an isolated cloud environment, so your local working directory stays untouched until you review and merge the changes.

In Claude Code, use --dangerously-skip-permissions for batch operations, or enable sandbox mode (/sandbox) for safer autonomy:

claude --dangerously-skip-permissions -p \
  "Fix all ESLint warnings in src/components/. Commit each fix separately."

Sandbox mode is preferred because it provides autonomy within defined boundaries (filesystem and network restrictions) rather than bypassing all safety checks.

--ask-for-approval never gives Codex full autonomy. Cloud threads run in isolated VMs for maximum safety:

Fix all ESLint warnings in src/components/. Commit each fix
with a descriptive message. Create a PR when done.

Cloud threads are the safest way to use full autonomy because Codex works on a clone of your repo in an isolated environment. Nothing touches your local code until you merge the resulting PR.

Best for: Lint fixes, formatting, bulk renames, documentation generation, test boilerplate, migration scripts across many files.

Decision Framework

Use these questions when choosing a mode:

Are you trying to understand code, not change it? Use read-only/analysis mode.
Is the task well-defined with clear verification? Use guided execution mode and let the AI work through the task with your periodic review.
Is the task mechanical and low-risk? Consider full autonomy mode with appropriate isolation (cloud agent, sandbox, cloud thread).
Is the task touching sensitive code? Use guided execution with per-file approval.
Are you unsure what mode to use? Start with guided execution. You can always loosen permissions mid-session.
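
The questions above can be sketched as a small function. `choose_mode` and its yes/no arguments are hypothetical. The sketch checks sensitivity before the mechanical question, which is an assumption about how the criteria should be prioritized, so that sensitive code never lands in full autonomy.

```shell
# Hypothetical helper: encode the questions above as yes/no answers.
# Arguments: understand_only sensitive mechanical_low_risk (each "yes" or "no").
choose_mode() {
  if [ "$1" = "yes" ]; then
    echo "read-only"
  elif [ "$2" = "yes" ]; then
    echo "guided (per-file approval)"
  elif [ "$3" = "yes" ]; then
    echo "full autonomy (isolated)"
  else
    # Unsure, or a well-defined task that needs periodic review.
    echo "guided"
  fi
}

choose_mode no no yes   # mechanical, low-risk lint fix
choose_mode no yes yes  # mechanical, but touches sensitive code
```

The fallback branch mirrors the last question in the list: when nothing else applies, start with guided execution.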

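Larger tasks often move through several modes in sequence. A refactoring workflow might run in three phases:

Phase 1 (Analysis - use ask/plan mode):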
Read the payment processing module in src/payments/. Explain the
current architecture and identify the three riskiest areas for
our planned refactoring.

Phase 2 (Planning - stay in ask/plan mode):
Based on your analysis, draft a step-by-step refactoring plan with a
risk assessment for each step. I will save it to
docs/payment-refactor-plan.md, since this mode cannot write files.

Phase 3 (Implementation - switch to agent/normal mode):
Implement step 1 from the plan. Write tests first, then implement.
Run tests after each change.
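
For Claude Code, the phase boundary can be made explicit with two wrappers. This is a sketch under assumptions: `phase_analyze` and `phase_implement` are hypothetical names, and `--permission-mode plan` is assumed to start a run in Plan Mode, so confirm it with `claude --help` before relying on it.

```shell
# Hypothetical wrappers for the phases above, using the Claude Code CLI.
# Assumption: `--permission-mode plan` starts a headless run in Plan Mode.
phase_analyze() {
  claude --permission-mode plan -p "$1"
}

# Implementation runs interactively in the default Normal Mode, so approvals still apply.
phase_implement() {
  claude "$1"
}

# Usage (not executed here):
#   phase_analyze "Read the payment processing module in src/payments/ and explain it."
#   phase_implement "Implement step 1 from the plan. Write tests first."
```

Keeping analysis and implementation in separate invocations also keeps them in separate prompts, which avoids asking for both in one request.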

Parallel Mode Usage

Advanced teams use different modes simultaneously across multiple sessions:

In Cursor, run a Cloud Agent on a lint-fix task (autonomous) while you use the main Agent for feature development (guided). Review the Cloud Agent’s changes when they are ready, without interrupting your feature work.

Run a headless Claude session to fix lint warnings in one terminal tab while working interactively on a feature in another. Use --continue to resume the most recent session later, or --resume to pick a specific one:

# Terminal 1: Autonomous lint fixes (git commands are allowed so the commits can succeed)
claude -p "Fix all ESLint warnings in src/. Commit each fix." \
  --allowedTools "Edit,Bash(npm run lint),Bash(git add:*),Bash(git commit:*)"

# Terminal 2: Interactive feature development
claude

Launch a cloud thread for the mechanical task and work locally on the feature. Codex runs both in parallel:

Cloud thread: Fix all TypeScript strict mode errors in src/utils/.
Commit each fix with a "fix(types)" prefix in the message.

Meanwhile, work on your feature locally in the IDE extension or CLI.

A reusable prompt template for autonomous batch fixes:

Fix all [ISSUE TYPE] in [DIRECTORY]. For each fix:

1. Make the minimal change needed
2. Run [VERIFICATION COMMAND] to confirm the fix
3. Commit with message: [COMMIT FORMAT]
4. Move to the next instance

Only modify files in [DIRECTORY]. Do not change test files.
Do not install new dependencies. Do not modify configuration files.
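
The template above can be filled in from the command line. `batch_fix_prompt` is a hypothetical helper that substitutes the four bracketed placeholders. The argument values in the example call are illustrative only.

```shell
# Hypothetical helper: fill the bracketed placeholders in the template above.
# Arguments: issue type, directory, verification command, commit format.
batch_fix_prompt() {
  cat <<EOF
Fix all $1 in $2. For each fix:

1. Make the minimal change needed
2. Run $3 to confirm the fix
3. Commit with message: $4
4. Move to the next instance

Only modify files in $2. Do not change test files.
Do not install new dependencies. Do not modify configuration files.
EOF
}

# Prints the filled-in prompt for an ESLint cleanup.
batch_fix_prompt "ESLint warnings" "src/components/" "npm run lint" "fix(lint): <rule name>"
```

The output can be passed to a headless run, for example with claude -p "$(batch_fix_prompt ...)", so every batch task carries the same scope restrictions.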

When This Breaks

You stay in analysis mode too long. If you spend 30 minutes asking the AI about the codebase before writing any code, you have consumed context that could have been used for implementation. Set a time box for analysis (5-10 minutes), then switch to implementation.

You use full autonomy on complex tasks. Autonomous mode works for mechanical tasks with clear verification. For tasks requiring judgment (API design, error handling strategy, performance optimization), guided mode with human review produces better results.

You mix modes within a single prompt. Asking the AI to “analyze the auth module, then refactor it” in a single prompt forces it to switch modes internally, which often leads to it skipping the analysis and jumping straight to refactoring. Separate analysis and implementation into distinct prompts.

You forget to switch back. After using analysis mode for investigation, some developers forget to switch back to execution mode and wonder why the AI is not making changes. Check your current mode if the AI seems unresponsive to implementation requests.

What’s Next

Human in the Loop: Deeper patterns for supervision and review across all modes.

Context Windows: How mode choice affects context consumption and session length.

Cost per Context: The cost implications of different modes and autonomy levels.

Grill Me & Grill With Docs: Ask mode taken to its logical end, where the agent interviews you one question at a time until you align.