Your First Codex Task

You have installed everything, authenticated, configured your approval mode, written an AGENTS.md, and connected MCP servers. Now it is time to actually use Codex for real work. This guide walks through the same task on all four surfaces so you can feel the difference and pick your preferred workflow.

What You’ll Walk Away With

A completed end-to-end task: prompt, execution, review of changes
Experience running the same task on the App, CLI, IDE extension, and Cloud
Understanding of how approval modes, worktrees, and sandboxing affect the workflow
A set of reusable prompt patterns for common Codex tasks
Confidence to start delegating real work to Codex

The Task

We will ask Codex to do something practical that every project needs: find a bug or missing edge case in error handling, fix it, and add a test. This exercises file reading, code analysis, code generation, command execution (running tests), and Git awareness.

The exact prompt we will use:

Find an error handling gap in this project's API routes -- a place where an
exception could crash the server or return a misleading response. Fix it with
proper error handling and add a test that proves the fix works. Run the tests
to confirm they pass.

This is deliberately open-ended. Codex has to explore your codebase, identify a real issue, implement a fix, and validate it. If your project does not have API routes, adjust the prompt to target your codebase.
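
To make "error handling gap" concrete, here is a minimal sketch of the kind of issue and fix the prompt is aiming at. Everything in it is hypothetical and framework-free: `ApiRequest`, `ApiResponse`, `createUserUnsafe`, and `createUserSafe` are illustrative names, not part of Codex or any real project, and what Codex finds in your codebase will differ.

```typescript
// Hypothetical stand-ins for a framework's request and response objects.
interface ApiRequest {
  body: string; // raw request body as received
}

interface ApiResponse {
  status: number;
  body: { [key: string]: unknown };
}

// BEFORE: JSON.parse throws on malformed input, and a null payload makes
// the property access throw. Nothing here catches either exception.
function createUserUnsafe(req: ApiRequest): ApiResponse {
  const payload = JSON.parse(req.body);
  return { status: 201, body: { email: payload.email } };
}

// AFTER: malformed JSON and missing fields become explicit 400 responses
// instead of uncaught exceptions.
function createUserSafe(req: ApiRequest): ApiResponse {
  let payload: unknown;
  try {
    payload = JSON.parse(req.body);
  } catch (err) {
    return { status: 400, body: { error: "Request body is not valid JSON" } };
  }
  if (typeof payload !== "object" || payload === null) {
    return { status: 400, body: { error: "Request body must be a JSON object" } };
  }
  const email = (payload as { email?: unknown }).email;
  if (typeof email !== "string" || email.length === 0) {
    return { status: 400, body: { error: "Field 'email' is required" } };
  }
  return { status: 201, body: { email: email } };
}

// The regression test Codex adds would assert on a case like this one.
const malformedResult = createUserSafe({ body: "{not json" });
// malformedResult.status is 400; createUserUnsafe would have thrown instead.
```

A test that "proves the fix works" feeds the handler the malformed input and asserts on the 400 response, which is the behavior the unsafe version cannot provide.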

Running the Task in Each Surface

The App Workflow

Open the Codex App and select your project.
Click New Thread. Choose Local mode to work directly in your project directory, or Worktree to isolate changes in a Git worktree.
Paste the prompt into the composer and hit send.
Watch Codex work. The App shows real-time progress: files being read, analysis, code edits, command execution. If your approval policy is on-failure, Codex writes code autonomously but pauses before running commands like npm test.
When Codex finishes, switch to the Review pane to see a full diff of all changes.
Use the built-in terminal (Cmd + J) to manually verify the changes if you want: run tests, check the server, inspect the files.

App-specific features during a task:

Multiple threads can run in parallel. Start this task, then open a new thread for a different task while you wait.
If you chose Worktree mode, changes are isolated. Your main branch stays clean.
Voice dictation works: hold Ctrl + M to speak your prompt instead of typing.
Notifications appear when a background task finishes.

Prompt for a parallel second task in the App:

While the first task runs, open a new thread and try:

Review the project's README and update it to reflect the current API endpoints, dependencies, and setup instructions

Both tasks run simultaneously without interfering with each other (especially in Worktree mode).

The CLI Workflow

Navigate to your project directory in the terminal.
Start the Codex TUI:
```
codex
```
Paste the prompt and press Enter.
Codex works in the terminal: you see file reads, analysis, code edits streaming in, and command execution requests. With auto-edit approval mode, edits happen automatically but commands prompt for approval.
After Codex finishes, use /review to run a local code review of the changes.
Check the diff with Git:
```
git diff
```

CLI-specific features during a task:

Use /model to switch models mid-session if you want to try a different approach.
Use /undo to roll back the last turn’s changes.
Run the task non-interactively, without the TUI: codex exec "your prompt here".
Image input: drag and drop screenshots into the terminal if you need to show Codex a visual bug.

Prompt for non-interactive mode (scripting):

Run the same task without the TUI:

codex exec "Find an error handling gap in this project's API routes. Fix it and add a test."

This is useful for CI/CD pipelines, cron jobs, or scripting Codex into larger workflows.

The IDE Extension Workflow

Open your project in VS Code (or Cursor/Windsurf).
Open the Codex sidebar panel.
Ensure Agent mode is selected (not Chat mode). Agent mode lets Codex read files, run commands, and write changes.
Paste the prompt and send.
Codex works within your IDE: file edits appear in your editor, diffs are shown inline, and commands run in the integrated terminal.
Review the changes in VS Code’s built-in Git diff view.

IDE Extension-specific features during a task:

@file references: type @src/routes/auth.ts to point Codex at a specific file for context.
Auto Context: if the App is also open, the IDE extension syncs which files you are viewing.
Cloud delegation: click the Cloud icon to offload the task to Codex Cloud instead of running locally.
Approval mode maps to IDE-specific names: Chat (ask everything), Agent (auto-edit), Agent (Full Access) (never ask).

Prompt using file references:

For more targeted work, reference files directly:

Look at @src/routes/users.ts and @src/routes/orders.ts -- find error handling gaps, fix them, and add tests

The @file syntax gives Codex immediate context without searching.

The Cloud Workflow

Open chatgpt.com/codex.
Select your configured environment (the repository you connected earlier).
Click New Task and paste the prompt.
Codex launches a cloud environment, clones your repo, installs dependencies (using your environment setup commands), and starts working.
Monitor progress in real time via the logs view, or close the browser and let it run in the background.
When the task completes, review the proposed changes in the diff view.
Click Create PR to turn the changes into a pull request, or Check out locally to test the branch on your machine:
```
git fetch
git checkout <branch-name>
```

Cloud-specific features during a task:

Tasks run in the background. Close your laptop and the task keeps going.
The diff view shows all changes before you commit to anything.
You can iterate: send follow-up messages in the same thread to refine the result.
PRs created from Cloud include a clean diff and description.

Prompt to trigger a cloud task from GitHub instead:

Instead of using chatgpt.com/codex, go to a GitHub issue and comment:

@codex find and fix error handling gaps in the API routes, add tests, and open a PR

Codex runs the same workflow but creates the PR automatically.

What to Expect

Regardless of which surface you use, Codex follows the same general pattern:

Exploration: Codex reads your project structure, identifies relevant files, and understands the codebase layout.
Analysis: It searches for the specific pattern you asked about (error handling gaps in this case), evaluating multiple candidates.
Implementation: Codex edits the files, adding proper error handling, try-catch blocks, validation, or whatever the fix requires.
Testing: It creates or updates test files, then runs the test suite to verify the fix works.
Summary: Codex reports what it found, what it changed, and the test results.

The quality of the result depends heavily on your AGENTS.md. If you specified your test framework (vitest, jest, pytest), test runner command, and code conventions, Codex follows them. If you did not, it guesses, and sometimes guesses wrong.

Reusable Prompt Patterns

Here are prompts that work well for common first tasks. Use these as templates.

Bug hunting prompt:

Scan this codebase for potential null pointer exceptions, unhandled promise rejections,
or missing error boundaries. Focus on the most critical user-facing code paths. Fix
the top 3 issues you find and add tests for each fix.

Refactoring prompt:

The authentication middleware in src/middleware/auth.ts has grown to 200+ lines.
Refactor it into smaller, testable functions. Each function should have a single
responsibility. Update existing tests to match the new structure and add tests
for any previously untested paths.
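
As a rough picture of the outcome this prompt asks for, the sketch below splits an authentication check into small functions with one responsibility each. It is an illustration under assumptions, not the contents of any real src/middleware/auth.ts: the names `extractBearerToken`, `TokenLookup`, and `authenticate` are hypothetical, and header names are assumed to be lowercased already.

```typescript
// Hypothetical request shape; header names are assumed to be lowercased.
interface AuthRequest {
  headers: { [name: string]: string | undefined };
}

type AuthResult =
  | { ok: true; userId: string }
  | { ok: false; status: number; error: string };

// Responsibility 1: pull the bearer token out of the Authorization header.
function extractBearerToken(req: AuthRequest): string | null {
  const header = req.headers["authorization"];
  if (header === undefined) return null;
  const prefix = "Bearer ";
  if (header.indexOf(prefix) !== 0) return null;
  const token = header.slice(prefix.length).trim();
  return token.length > 0 ? token : null;
}

// Responsibility 2: map a token to a user id. Injected as a function so
// tests can stub it without a database or key store.
type TokenLookup = (token: string) => string | null;

// Responsibility 3: compose the steps into a single decision.
function authenticate(req: AuthRequest, lookup: TokenLookup): AuthResult {
  const token = extractBearerToken(req);
  if (token === null) {
    return { ok: false, status: 401, error: "Missing bearer token" };
  }
  const userId = lookup(token);
  if (userId === null) {
    return { ok: false, status: 401, error: "Invalid token" };
  }
  return { ok: true, userId: userId };
}

// Usage with a stubbed lookup, the way a unit test might call it.
const stubLookup: TokenLookup = (token) => (token === "valid-token" ? "user-1" : null);
const authDemo = authenticate({ headers: { authorization: "Bearer valid-token" } }, stubLookup);
// authDemo is { ok: true, userId: "user-1" }
```

Because each step takes plain inputs and returns plain values, every function can be tested on its own, which is the property the prompt's "smaller, testable functions" wording is after.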

Feature addition prompt:

Add rate limiting to the POST /api/login endpoint. Use a sliding window algorithm
with a limit of 5 attempts per minute per IP address. Return a 429 response with
a Retry-After header when the limit is exceeded. Add integration tests that verify
both the happy path and the rate-limited path.
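
For reference when reviewing what Codex produces for this prompt, here is a minimal sketch of a sliding window limiter with the stated parameters (5 attempts per minute per IP). It assumes an in-memory, single-process store and a "sliding window log" variant in which rejected attempts are not counted; `SlidingWindowLimiter` and `RateLimitDecision` are hypothetical names, and a production version would likely use a shared store.

```typescript
interface RateLimitDecision {
  allowed: boolean;
  status: number; // 200 when allowed, 429 when rate limited
  retryAfterSeconds: number; // value for the Retry-After header; 0 when allowed
}

class SlidingWindowLimiter {
  private limit: number;
  private windowMs: number;
  // Timestamps (ms) of accepted attempts, keyed by IP address.
  private attempts: { [ip: string]: number[] } = {};

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // nowMs is passed in so tests can control the clock.
  check(ip: string, nowMs: number): RateLimitDecision {
    const windowStart = nowMs - this.windowMs;
    // Keep only the attempts that are still inside the sliding window.
    const recent = (this.attempts[ip] || []).filter((t) => t > windowStart);

    if (recent.length >= this.limit) {
      this.attempts[ip] = recent;
      // The client may retry once the oldest attempt leaves the window.
      const retryMs = recent[0] + this.windowMs - nowMs;
      return { allowed: false, status: 429, retryAfterSeconds: Math.ceil(retryMs / 1000) };
    }

    recent.push(nowMs);
    this.attempts[ip] = recent;
    return { allowed: true, status: 200, retryAfterSeconds: 0 };
  }
}

// 5 attempts per 60 seconds, matching the prompt.
const loginLimiter = new SlidingWindowLimiter(5, 60000);
```

Passing the clock in as `nowMs` keeps the happy path and the rate-limited path testable without sleeping, which is a useful property to look for in the integration tests Codex writes.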

Choosing the Right Surface for the Task

After running the same task across all four surfaces, you will notice each one has strengths:

| Task type | Best surface | Why |
| --- | --- | --- |
| Quick bug fix in one file | CLI or IDE Extension | Fast, minimal context switching |
| Multi-file refactor | App (Worktree mode) | Isolated changes, parallel threads |
| Long-running migration | Cloud | Runs in background, does not block your machine |
| PR-driven workflow | Cloud + GitHub | Creates PRs directly, integrates with reviews |
| Exploratory debugging | CLI (interactive TUI) | Quick iteration, `/undo` for rollback |
| Visual changes (UI work) | IDE Extension | See changes in editor, use Playwright MCP for screenshots |

When This Breaks

Codex edits the wrong files: Your AGENTS.md may be missing architecture context. Add explicit paths: “API routes are in src/routes/, NOT in src/api/.”

Tests fail after Codex’s fix: Codex may have used the wrong test framework or runner. Check that your AGENTS.md states the exact test command your project uses (for example, pnpm vitest rather than npm test).

Codex gets stuck in a loop: It might be retrying a failing command. Press Esc (CLI) or stop the thread (App) and rephrase your prompt with more constraints.

Cloud task takes forever: Check your environment setup. If npm install runs on every task because node_modules is not cached, tasks take much longer. Optimize your setup commands.

Approval prompts interrupt the flow: Switch to auto-edit or never mode. See the configuration guide for details.

What’s Next

Review Agent Work: Learn to review, refine, and merge Codex's changes using the review pane.