
Efficient Agent Work Review and Approval

You have five Codex threads that just finished, and each one produced a multi-file diff. Reviewing every line of every diff defeats the purpose of delegation: you would be faster writing the code yourself. Blindly merging agent output, on the other hand, is irresponsible. The sweet spot is a structured review workflow that catches real issues quickly while trusting the agent for routine correctness.

This guide covers:

  • A tiered review framework that adjusts depth based on risk level
  • The built-in /review workflow for local code review before committing
  • Techniques for inline commenting in the Codex App diff pane
  • Approval mode strategies that balance safety with velocity

Tier 1 (low risk): For tasks where the agent ran tests and they passed:

  • Scan the diff summary (files changed, lines added/removed)
  • Verify the test suite passed
  • Spot-check one or two representative changes
  • Merge

Examples: Documentation updates, test additions, lint fixes, straightforward refactors.
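The first step of the quick review above, scanning the diff summary, can be scripted. The following is a minimal sketch that totals the output of `git diff --numstat`; the `DiffSummary` type and `summarize_numstat` function are hypothetical names introduced for illustration and are not part of Codex.

```python
from dataclasses import dataclass


@dataclass
class DiffSummary:
    files_changed: int
    lines_added: int
    lines_removed: int


def summarize_numstat(numstat_output: str) -> DiffSummary:
    """Total the tab-separated `added<TAB>removed<TAB>path` lines from `git diff --numstat`."""
    files = added = removed = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        add_str, del_str, _path = parts
        files += 1
        # Git reports binary files as "-" for both counts; count the file only.
        if add_str.isdigit():
            added += int(add_str)
        if del_str.isdigit():
            removed += int(del_str)
    return DiffSummary(files, added, removed)


# Sample input written by hand for illustration, not taken from a real repository.
sample = "12\t3\tsrc/api.py\n0\t7\tREADME.md\n-\t-\tassets/logo.png\n"
print(summarize_numstat(sample))
```

A summary that is much larger than the task description implies is a reasonable signal to move the thread to a deeper review.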

Tier 2 (medium risk): For tasks involving business logic or API changes:

  • Read the full diff file by file
  • Check error handling and edge cases
  • Verify that the tests cover the new behavior
  • Use inline comments in the App diff pane to request changes
  • Let Codex address comments in a follow-up turn

Examples: New endpoints, database schema changes, authentication logic.

Tier 3 (high risk): For tasks touching security, payments, or data migration:

  • Read the diff and the surrounding context
  • Manually test the changes in a local environment
  • Run security scanning tools
  • Pair review with another human
  • Consider using Codex’s code review feature for an independent second opinion

Examples: Payment processing, user data handling, security middleware, database migrations.
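The three review depths described above can be encoded as a small routing function so that each finished thread gets a suggested depth before you open its diff. This is only a sketch: the path patterns are hypothetical and would need to match your own repository layout, and the numbering (1 for the quickest review, 3 for the deepest) is a convention chosen for this example.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for illustration; adapt them to your repository.
HIGH_RISK_PATTERNS = ("*payment*", "*security*", "*migrations/*")
MEDIUM_RISK_PATTERNS = ("*api/*", "*auth*", "*schema*")


def _matches(path: str, patterns: tuple) -> bool:
    lowered = path.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


def review_tier(changed_paths: list, tests_passed: bool) -> int:
    """Suggest a review depth from 1 (quick scan) to 3 (deep review)."""
    if any(_matches(p, HIGH_RISK_PATTERNS) for p in changed_paths):
        return 3
    if any(_matches(p, MEDIUM_RISK_PATTERNS) for p in changed_paths):
        return 2
    # The quick scan only applies when the agent's tests passed.
    return 1 if tests_passed else 2


print(review_tier(["docs/README.md"], tests_passed=True))
print(review_tier(["src/api/users.py"], tests_passed=True))
print(review_tier(["db/migrations/0042_add_col.sql"], tests_passed=True))
```

The function deliberately takes the highest level matched by any file, since a single risky file in an otherwise routine diff still warrants the deeper review.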

The /review command launches a dedicated reviewer that reads diffs and reports prioritized findings without touching your working tree:

  1. Type /review in the CLI
  2. Choose a review mode:
    • Review against a base branch: Finds the merge base and diffs your work
    • Review uncommitted changes: Inspects staged, unstaged, and untracked files
    • Review a commit: Picks a specific SHA
    • Custom review instructions: Your own prompt (e.g., “Focus on accessibility regressions”)
  3. Read the findings, which are prioritized by severity
  4. Address issues and re-run /review to verify fixes
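To get a feel for what each review mode covers before running it, you can preview roughly the same scope with plain git. The sketch below maps three of the modes to approximate git commands. It is an assumption about equivalent scope, not a description of how /review is implemented, and `preview_command` and its mode names are hypothetical.

```python
from typing import List, Optional


def preview_command(mode: str, target: Optional[str] = None) -> List[str]:
    """Return a git command that shows approximately what a review mode would inspect."""
    if mode == "base-branch":
        if not target:
            raise ValueError("base-branch mode needs a branch name")
        # Three-dot syntax diffs HEAD against its merge base with the branch.
        return ["git", "diff", f"{target}...HEAD"]
    if mode == "uncommitted":
        # Porcelain status lists staged, unstaged, and untracked files.
        return ["git", "status", "--porcelain"]
    if mode == "commit":
        if not target:
            raise ValueError("commit mode needs a SHA")
        return ["git", "show", target]
    raise ValueError(f"unknown mode: {mode}")


print(" ".join(preview_command("base-branch", "main")))
print(" ".join(preview_command("uncommitted")))
```

The returned list can be passed to `subprocess.run` inside a repository if you want to execute the preview rather than print it.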

The Codex App’s diff pane supports inline comments. When you see something that needs attention:

  1. Click the line number in the diff view
  2. Add your comment (e.g., “This needs null checking” or “Use the centralized error handler instead”)
  3. Codex addresses your comments in the next turn

This is faster than writing a follow-up prompt because the agent sees exactly which line you are referring to.

Match your approval mode to the task risk:

| Mode | Description | Best For |
|------|-------------|----------|
| Auto (default) | Codex reads, edits, and runs commands within the workspace. Asks before going outside scope. | Most development work |
| Read-only | Codex browses files but cannot make changes or run commands until you approve. | Exploratory analysis, understanding unfamiliar code |
| Full Access | No approval prompts. Codex works across the machine, including the network. | Trusted repos in isolated environments only |

Switch modes mid-session with /permissions in the CLI.

For automations, use approval_policy = "never" only when your sandbox mode is workspace-write or stricter. Never combine approval_policy = "never" with danger-full-access unless the machine is fully isolated.
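The rule above can be turned into a pre-flight check for automation configs. The sketch below assumes the sandbox setting is stored under a key named `sandbox_mode` and that `read-only` is the value stricter than `workspace-write`; both are assumptions made for illustration, as is the `is_safe_for_automation` helper itself.

```python
# Lower numbers are stricter. The "read-only" value is an assumption for this sketch.
SANDBOX_STRICTNESS = {
    "read-only": 0,
    "workspace-write": 1,
    "danger-full-access": 2,
}


def is_safe_for_automation(
    approval_policy: str,
    sandbox_mode: str,
    machine_isolated: bool = False,
) -> bool:
    """Apply the rule: "never" is acceptable only with workspace-write or stricter,
    unless the machine is fully isolated."""
    if sandbox_mode not in SANDBOX_STRICTNESS:
        raise ValueError(f"unknown sandbox mode: {sandbox_mode}")
    if approval_policy != "never":
        return True  # the rule only constrains approval_policy = "never"
    if SANDBOX_STRICTNESS[sandbox_mode] <= SANDBOX_STRICTNESS["workspace-write"]:
        return True
    return machine_isolated


print(is_safe_for_automation("never", "workspace-write"))
print(is_safe_for_automation("never", "danger-full-access"))
```

Running a check like this in CI before an automation starts makes the unsafe combination fail loudly instead of relying on someone noticing it in a config review.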

For GitHub-hosted PRs, Codex can review automatically:

  • Automatic reviews: Triggered when you open a PR for review
  • Reactive reviews: Mention @Codex in a PR comment to ask for specific feedback

Configure this under Settings > Code review. Reviews run in cloud environments and count toward your code review limits.

  • Review fatigue with too many threads: Batch your reviews. Let threads accumulate for an hour, then review them all in one session using the Triage inbox.
  • Agent keeps making the same mistake: The prompt or AGENTS.md is missing a constraint. Add it explicitly and re-run.
  • Inline comments not picked up: Ensure the comment is on a line that was actually changed. Comments on unchanged context lines may be ignored.
  • False sense of security from passing tests: Tests only catch what they cover. For Tier 3 reviews, manually test edge cases that the agent’s tests may not cover.
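For the inline-comment issue above, it can help to know exactly which lines of the new file a diff actually changed. The following is a minimal sketch that extracts the added line numbers from a git-format unified diff of a single file; `added_line_numbers` is a hypothetical helper, and the sample diff is written by hand for illustration.

```python
import re

# Matches hunk headers such as "@@ -1,4 +1,5 @@" and captures the new-file start line.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_line_numbers(unified_diff: str) -> set:
    """Return the new-file line numbers of lines added in a unified diff."""
    added = set()
    new_line = None  # None until the first hunk header is seen
    for line in unified_diff.splitlines():
        if line.startswith("diff --git"):
            new_line = None  # a new file section starts; wait for its first hunk
            continue
        match = HUNK_HEADER.match(line)
        if match:
            new_line = int(match.group(1))
            continue
        if new_line is None:
            continue  # file headers ("---", "+++") before the first hunk
        if line.startswith("+"):
            added.add(new_line)
            new_line += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers are not in the new file
        else:
            new_line += 1  # unchanged context line
    return added


sample_diff = "\n".join([
    "diff --git a/app.py b/app.py",
    "--- a/app.py",
    "+++ b/app.py",
    "@@ -1,4 +1,5 @@",
    " import os",
    "-def load(path):",
    "+def load(path=None):",
    "+    path = path or os.getcwd()",
    "     return open(path)",
    " ",
])
print(sorted(added_line_numbers(sample_diff)))
```

In this sample, comments belong on lines 2 and 3 of the new file; lines 1 and 4 are unchanged context.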