Your First Codex Task

You have installed everything, authenticated, configured your approval mode, written an AGENTS.md, and connected MCP servers. Now it is time to actually use Codex for real work. This guide walks through the same task on all four surfaces so you can feel the difference and pick your preferred workflow.

  • A completed end-to-end task: prompt, execution, review of changes
  • Experience running the same task on the App, CLI, IDE extension, and Cloud
  • Understanding of how approval modes, worktrees, and sandboxing affect the workflow
  • A set of reusable prompt patterns for common Codex tasks
  • Confidence to start delegating real work to Codex

We will ask Codex to do something practical that every project needs: find a bug or missing edge case in error handling, fix it, and add a test. This exercises file reading, code analysis, code generation, command execution (running tests), and Git awareness.

The exact prompt we will use:

Find an error handling gap in this project's API routes -- a place where an
exception could crash the server or return a misleading response. Fix it with
proper error handling and add a test that proves the fix works. Run the tests
to confirm they pass.

This is deliberately open-ended. Codex has to explore your codebase, identify a real issue, implement a fix, and validate it. If your project does not have API routes, adjust the prompt to target your codebase.
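To make "error handling gap" concrete, here is a minimal sketch of the kind of issue Codex might find and the kind of fix it might propose. It is only an illustration: the `ApiResponse` type and the `createUserUnsafe` and `createUserSafe` functions are hypothetical, framework-free stand-ins for a route handler, not code from any particular project or from Codex itself.

```typescript
// Hypothetical response shape standing in for a real framework's response object.
interface ApiResponse {
  status: number;
  body: { error?: string; name?: string };
}

// Before: JSON.parse throws on malformed input, so the exception escapes the
// handler and can crash the server or surface as a misleading generic error.
function createUserUnsafe(rawBody: string): ApiResponse {
  const payload = JSON.parse(rawBody);
  return { status: 201, body: { name: payload.name } };
}

// After: malformed JSON and missing fields become explicit 400 responses.
function createUserSafe(rawBody: string): ApiResponse {
  let payload: unknown;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, body: { error: "Request body is not valid JSON" } };
  }
  if (
    typeof payload !== "object" ||
    payload === null ||
    typeof (payload as { name?: unknown }).name !== "string"
  ) {
    return { status: 400, body: { error: "Field 'name' is required" } };
  }
  return { status: 201, body: { name: (payload as { name: string }).name } };
}
```

A test that "proves the fix works" would then assert that `createUserSafe("{not json")` returns a 400 response instead of throwing, and that a valid body still returns 201.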

  1. Open the Codex App and select your project.

  2. Click New Thread. Choose Local mode to work directly in your project directory, or Worktree to isolate changes in a Git worktree.

  3. Paste the prompt into the composer and hit send.

  4. Watch Codex work. The App shows real-time progress: files being read, analysis, code edits, and command execution. Depending on your approval policy, Codex may write code autonomously but pause for confirmation before running commands like npm test.

  5. When Codex finishes, switch to the Review pane to see a full diff of all changes.

  6. Use the built-in terminal (Cmd + J) to manually verify the changes if you want: run tests, check the server, inspect the files.

App-specific features during a task:

  • Multiple threads can run in parallel. Start this task, then open a new thread for a different task while you wait.
  • If you chose Worktree mode, changes are isolated. Your main branch stays clean.
  • Voice dictation works: hold Ctrl + M to speak your prompt instead of typing.
  • Notifications appear when a background task finishes.

Regardless of which surface you use, Codex follows the same general pattern:

  1. Exploration: Codex reads your project structure, identifies relevant files, and understands the codebase layout.

  2. Analysis: It searches for the specific pattern you asked about (error handling gaps in this case), evaluating multiple candidates.

  3. Implementation: Codex edits the files, adding proper error handling, try-catch blocks, validation, or whatever the fix requires.

  4. Testing: It creates or updates test files, then runs the test suite to verify the fix works.

  5. Summary: Codex reports what it found, what it changed, and the test results.

The quality of the result depends heavily on your AGENTS.md. If you specified your test framework (vitest, jest, pytest), test runner command, and code conventions, Codex follows them. If you did not, it guesses, and sometimes it guesses wrong.
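As a rough illustration, an AGENTS.md fragment covering those three points might look like the sketch below. The section names and the conventions line are assumptions made for this example, since AGENTS.md is free-form. The vitest command and the route paths are sample values; replace them with whatever your project actually uses.

```markdown
## Testing
- Test framework: vitest
- Run tests with: pnpm vitest (not npm test)

## Architecture
- API routes are in src/routes/, NOT in src/api/.

## Conventions (hypothetical example)
- Route handlers return explicit error responses instead of letting exceptions propagate.
```

Short, specific statements like these tend to be more useful than general descriptions, because they leave less for Codex to guess.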

Here are prompts that work well for common first tasks. Use these as templates.

After running the same task across all four surfaces, you will notice each one has strengths:

| Task type | Best surface | Why |
| --- | --- | --- |
| Quick bug fix in one file | CLI or IDE Extension | Fast, minimal context switching |
| Multi-file refactor | App (Worktree mode) | Isolated changes, parallel threads |
| Long-running migration | Cloud | Runs in background, does not block your machine |
| PR-driven workflow | Cloud + GitHub | Creates PRs directly, integrates with reviews |
| Exploratory debugging | CLI (interactive TUI) | Quick iteration, /undo for rollback |
| Visual changes (UI work) | IDE Extension | See changes in editor, use Playwright MCP for screenshots |

Codex edits the wrong files: Your AGENTS.md may be missing architecture context. Add explicit paths: “API routes are in src/routes/, NOT in src/api/.”

Tests fail after Codex’s fix: Codex may have used the wrong test framework or runner. Check that your AGENTS.md specifies the exact test command (for example, pnpm vitest rather than npm test).

Codex gets stuck in a loop: It might be retrying a failing command. Press Esc (CLI) or stop the thread (App) and rephrase your prompt with more constraints.

Cloud task takes forever: Check your environment setup. If npm install runs on every task because node_modules is not cached, tasks take much longer. Optimize your setup commands.

Approval prompts interrupt the flow: Switch to auto-edit or never mode. See the configuration guide for details.