Debugging with Codex Across Surfaces

The on-call pager fires at 2 PM. Users report that saving their profile settings “works” — they see the success toast — but the changes do not persist after a page refresh. The logs show 200 responses. The database shows stale data. Somewhere between the frontend optimistic update and the backend commit, something is silently failing. You need to find it, fix it, and verify the fix without breaking anything else. Codex gives you multiple surfaces to attack this from simultaneously.

  • A reproduction-first debugging workflow that works across all Codex surfaces
  • Prompts for CLI debugging with tight feedback loops, App debugging with persistent context, and GitHub delegation for CI failures
  • Techniques for using cloud tasks to reproduce environment-specific bugs
  • The @codex GitHub pattern for delegating bug fixes directly from issues and PRs

Different bugs call for different surfaces:

The CLI is best when you can reproduce the bug locally and want a fast investigate-fix-verify cycle. It lets you pipe error output directly into Codex, run commands inline, and iterate rapidly.

codex

The single most important debugging step is getting a reliable reproduction. Give Codex explicit reproduction steps, not just a bug description.

In the CLI, Codex will run the dev server, attempt the reproduction, read the relevant source files, and trace the issue. The tight feedback loop — command, output, reasoning, next command — is where the CLI shines for debugging.
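Explicit reproduction steps can also be captured as a small script that Codex (or you) can rerun after every change. The following is a minimal sketch, not the project's actual code: the `/api/profile` endpoint, the `displayName` field, and the helper names are assumptions made for illustration. The HTTP client is passed in so the same function can run against a real server or a stub.

```typescript
// Minimal HTTP client shape so the sketch does not depend on a specific runtime.
type ReproResponse = { status: number; json(): Promise<unknown> };
type ReproHttp = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<ReproResponse>;

type ReproResult = { putStatus: number; persisted: boolean };

// Hypothetical endpoint and field name; adjust both to the real API.
async function reproduceProfileBug(
  http: ReproHttp,
  baseUrl: string,
  displayName: string,
): Promise<ReproResult> {
  // Step 1: save the profile, as the settings page does.
  const put = await http(`${baseUrl}/api/profile`, {
    method: "PUT",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ displayName }),
  });

  // Step 2: read the profile back, simulating the page refresh.
  const get = await http(`${baseUrl}/api/profile`);
  const profile = (await get.json()) as { displayName?: string };

  // The bug is present when the PUT reports success but the value did not persist.
  return { putStatus: put.status, persisted: profile.displayName === displayName };
}
```

A result of `{ putStatus: 200, persisted: false }` matches the symptom described at the top of this page. On Node 18 or later, the global `fetch` should fit the `ReproHttp` shape.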

Once you have a reproduction, the investigation phase benefits from the Codex App’s persistent context. Open the project in the App, create a Local thread, and describe what you have found so far.

The App keeps the full conversation, so you can ask follow-up questions without re-explaining context. If Codex identifies the issue in the handler, you can leave an inline comment on the specific line in the review pane, then ask it to fix just that part.

Never fix bugs directly in your working directory if you have other uncommitted work. Switch to Worktree mode and base it on the branch with the bug:

Fix the profile settings persistence bug. The root cause is that the Drizzle update call in src/routes/profile.ts uses .set(), but its .where() clause does not match on the user ID, so the update matches zero rows and returns silently.
Fix the query, add a check that the update affected exactly one row (throw 404 if zero), and write a regression test that:
1. Creates a user
2. Updates their profile via PUT
3. Reads the profile via GET
4. Asserts the updated values are returned
Run the full test suite after the fix.
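The shape of the fix requested in this prompt can be sketched as follows. This is an illustration under stated assumptions, not the project's code and not Drizzle's API: `ProfileStore` is a hypothetical in-memory stand-in for the database, and `updateProfile` and `NotFoundError` are made-up names. The point is the pattern: have the update report which rows it changed, and fail loudly when that is not exactly one.

```typescript
type Profile = { userId: string; displayName: string };

// Error carrying the HTTP status the handler should return.
class NotFoundError extends Error {
  readonly status = 404;
}

// Hypothetical in-memory stand-in for the profiles table.
class ProfileStore {
  private rows: Profile[] = [];

  insert(row: Profile): void {
    this.rows.push({ ...row });
  }

  // Behaves like UPDATE ... WHERE user_id = ? RETURNING *:
  // returns a copy of every row that was changed.
  update(userId: string, patch: { displayName: string }): Profile[] {
    const changed: Profile[] = [];
    for (const row of this.rows) {
      if (row.userId === userId) {
        row.displayName = patch.displayName;
        changed.push({ ...row });
      }
    }
    return changed;
  }

  get(userId: string): Profile | undefined {
    return this.rows.find((row) => row.userId === userId);
  }
}

// Handler logic: never report success for an update that touched no rows.
function updateProfile(store: ProfileStore, userId: string, displayName: string): Profile {
  const rows = store.update(userId, { displayName });
  if (rows.length > 1) {
    throw new Error(`expected 1 updated row for user ${userId}, got ${rows.length}`);
  }
  const [row] = rows;
  if (!row) {
    throw new NotFoundError(`no profile row matched user ${userId}`);
  }
  return row;
}
```

A regression test then follows the four steps in the prompt: insert a user, call `updateProfile`, read the profile back with `get`, and assert that the new value is returned. A second case with an unknown user ID should raise the 404 error instead of succeeding silently.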

When a CI pipeline fails on a pull request, you do not need to check out the branch locally. Comment directly on the PR:

@codex fix the CI failures

Codex creates a cloud task, reads the PR diff and CI logs, identifies the failure, and proposes a fix. It posts the results back on the PR as a comment with a link to the task. If the fix involves code changes, you can open a PR from the cloud task.

For more targeted investigation, be specific:

@codex The integration tests for the payment webhook handler are failing with "connection refused" on the Redis mock. Investigate and fix.

Step 5: Debug Environment-Specific Issues in Cloud

Some bugs only reproduce in specific environments. Cloud tasks run in the codex-universal container, where you can pin Node.js versions, install system dependencies, and configure environment variables.

From the CLI:

codex cloud exec --env production-mirror "The cron job that processes expired subscriptions is silently skipping records when run with Node 20. Reproduce the issue using the test data in tests/fixtures/expired-subscriptions.json, identify the root cause, and propose a fix."

Use --attempts 3 to request three best-of-N attempts when the bug is intermittent:

codex cloud exec --env production-mirror --attempts 3 "Reproduce the race condition in the WebSocket reconnection handler. It happens approximately 1 in 5 times when the server restarts during an active connection."
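An intermittent bug like the 1-in-5 case above is easier to hand off, and easier to verify as fixed, when the failure rate is measured rather than estimated. The helper below is a minimal sketch of that idea. Its name, the outcome type, and the choice to count a thrown error as a failure are all assumptions for illustration; the `attempt` callback stands in for whatever triggers the reconnection scenario in your test setup.

```typescript
type AttemptOutcome = "pass" | "fail";

type FailureRate = { runs: number; failures: number; rate: number };

// Runs the reproduction repeatedly and reports how often it fails.
// A thrown error is counted as a failure rather than aborting the loop.
async function measureFailureRate(
  attempt: (run: number) => Promise<AttemptOutcome>,
  runs: number,
): Promise<FailureRate> {
  if (!Number.isInteger(runs) || runs <= 0) {
    throw new RangeError("runs must be a positive integer");
  }
  let failures = 0;
  for (let run = 0; run < runs; run++) {
    try {
      if ((await attempt(run)) === "fail") {
        failures += 1;
      }
    } catch {
      failures += 1;
    }
  }
  return { runs, failures, rate: failures / runs };
}
```

Running the same measurement before and after a fix gives a concrete check: with a true failure rate near 20 percent, a handful of clean runs proves little, so use enough runs that zero failures is meaningful.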

Codex cannot reproduce the bug. If the reproduction requires state that does not exist in the dev environment (specific user data, third-party API responses, production traffic patterns), the bug will not reproduce locally. Provide Codex with the exact error messages, stack traces, and relevant log lines instead of reproduction steps. Use cloud environments with seeded test data for closer-to-production reproduction.

The fix breaks other tests. This is why you always include “Run the full test suite after the fix” in your prompt. If Codex runs only the test it wrote and not the existing suite, you will discover regressions after merging. Be explicit: “Run npm test (the full suite), not just the new test.”

@codex on GitHub does not respond. Codex must be enabled for code review and cloud tasks on your repository. Check your Codex settings at chatgpt.com/codex/settings. Also ensure the comment uses @codex (lowercase) — casing matters.

Cloud task produces a fix that works in the container but fails locally. Environment differences between the universal container and your local machine can cause this. Pin versions in your cloud environment settings to match your local setup, or add the specific versions to your setup script.
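One way to catch version drift early is to make both environments fail fast when the runtime does not match the pinned version. The guard below is a sketch under assumptions: the function name is hypothetical, and where you call it (a setup script, a test bootstrap file) depends on your project.

```typescript
// Throws when the runtime's major version differs from the pinned one.
// Accepts version strings with or without a leading "v", e.g. "20.11.1" or "v20.11.1".
function assertNodeMajor(actualVersion: string, expectedMajor: number): void {
  const match = /^v?(\d+)\./.exec(actualVersion);
  if (!match) {
    throw new Error(`cannot parse Node version: ${actualVersion}`);
  }
  const major = Number(match[1]);
  if (major !== expectedMajor) {
    throw new Error(`expected Node ${expectedMajor}.x, found ${actualVersion}`);
  }
}

// Example call in a Node script (left as a comment so the sketch stays runtime-agnostic):
// assertNodeMajor(process.versions.node, 20);
```

If the cloud container and your machine both run this check against the same pinned major version, a mismatch surfaces as an immediate error instead of a fix that passes in one place and fails in the other.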