
Credits, Usage Limits, and Cost Optimization

You are three days into a sprint and your team’s Codex usage dashboard shows you have burned through 80% of your monthly credits. Two developers are running expensive cloud tasks for exploratory work that could have run locally. One automation is running hourly when daily would suffice. Without visibility and discipline, Codex costs can surprise you. This article gives you the controls and strategies to keep spending predictable. By the end, you will have:

  • A clear understanding of Codex pricing tiers, credit costs, and usage limits
  • Concrete strategies that reduce credit consumption by 30-50% without sacrificing productivity
  • Monitoring and alerting patterns using the usage dashboard and Analytics API
  • Decision frameworks for when to use local vs cloud, GPT-5.3-Codex vs GPT-5.1-Codex-Mini

| Plan | Price | Local Messages / 5h | Cloud Tasks / 5h | Code Reviews / week |
| --- | --- | --- | --- | --- |
| Plus | $20/mo | 45-225 | 10-60 | 10-25 |
| Pro | $200/mo | 300-1500 | 50-400 | 100-250 |
| Business | $30/user/mo | 45-225 | 10-60 | 10-25 |
| Enterprise | Contact sales | Credit-based (no fixed limits) | Credit-based | Credit-based |
| API Key | Usage-based | Per-token pricing | N/A | N/A |

Local and cloud limits share a five-hour rolling window. Additional weekly limits may apply.
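
To see what a five-hour rolling window means in practice, here is a minimal sketch. The message limit and enforcement details are placeholders (the upper end of the Plus local range), not OpenAI's actual implementation, which runs server-side:

// Illustrative only: a toy rolling-window counter.
const WINDOW_MS = 5 * 60 * 60 * 1000; // five hours
const MESSAGE_LIMIT = 225;            // placeholder limit

const sentAt: number[] = []; // timestamps of messages already counted

function canSendMessage(now: number = Date.now()): boolean {
  // Only messages sent within the last five hours count against the limit.
  const recent = sentAt.filter((t) => now - t < WINDOW_MS);
  return recent.length < MESSAGE_LIMIT;
}

function recordMessage(now: number = Date.now()): void {
  sentAt.push(now);
}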

Credits extend your usage after hitting included limits. Cost per message varies by task complexity:

| Surface | Unit | GPT-5.3-Codex | GPT-5.1-Codex-Mini |
| --- | --- | --- | --- |
| Local Tasks | 1 message | ~5 credits | ~1 credit |
| Cloud Tasks | 1 message | ~25 credits | N/A |
| Code Review | 1 PR | ~25 credits | N/A |

GPT-5.1-Codex-Mini provides roughly 4x more usage per credit for local tasks. Cloud tasks and code reviews are not available with Mini.
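
These per-message figures make back-of-the-envelope budgeting straightforward. A sketch using the approximate costs from the table above, with an invented weekly task mix for one developer:

// Rough weekly estimate; the task counts are hypothetical and real costs vary with complexity.
const CREDITS = {
  localFull: 5,   // GPT-5.3-Codex local message
  localMini: 1,   // GPT-5.1-Codex-Mini local message
  cloudTask: 25,  // cloud task message
  codeReview: 25, // code review per PR
};

const weeklyCredits =
  40 * CREDITS.localFull +  // 40 complex local messages   -> 200
  60 * CREDITS.localMini +  // 60 routine messages on Mini ->  60
  5 * CREDITS.cloudTask +   // 5 cloud tasks               -> 125
  10 * CREDITS.codeReview;  // 10 PR reviews               -> 250

console.log(`Estimated weekly credits: ${weeklyCredits}`); // 635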

The biggest cost driver is task complexity: every message Codex processes carries your prompt, AGENTS.md, MCP tool definitions, and accumulated context, and each of these grows with the size of the task.

Every Codex message includes your AGENTS.md content. For large projects, use nested AGENTS.md files:

# Root AGENTS.md (100 lines - loaded for all tasks)
AGENTS.md
# Service-specific (50 lines - loaded only when working in payments/)
services/payments/AGENTS.md
# Frontend-specific (50 lines - loaded only when working in frontend/)
packages/frontend/AGENTS.md

This way, a task in services/payments/ loads 150 lines of guidance instead of a monolithic 500-line file.
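
If it helps to see the effect concretely, here is an illustrative sketch (not Codex's actual loader) that walks from the repo root down to the task's directory and sums the AGENTS.md guidance such a task would pick up:

// Illustrative only: model how nested AGENTS.md files bound per-task guidance.
import { existsSync, readFileSync } from "node:fs";
import { join, relative, sep } from "node:path";

// Collect AGENTS.md files from the repo root down to the task's working directory.
function collectAgentsFiles(repoRoot: string, taskDir: string): string[] {
  const found: string[] = [];
  const parts = relative(repoRoot, taskDir).split(sep).filter(Boolean);
  let current = repoRoot;
  for (const dir of ["", ...parts]) {
    if (dir) current = join(current, dir);
    const candidate = join(current, "AGENTS.md");
    if (existsSync(candidate)) found.push(candidate);
  }
  return found;
}

// Example: how many lines of guidance would a task in services/payments/ load?
const files = collectAgentsFiles(".", "services/payments");
const totalLines = files
  .map((file) => readFileSync(file, "utf8").split("\n").length)
  .reduce((a, b) => a + b, 0);
console.log(`${files.length} AGENTS.md file(s), ~${totalLines} lines loaded for this task`);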

Every configured MCP server adds tool definitions to your context. Disable MCP servers you are not actively using:

~/.codex/config.toml
[mcp_servers.sentry]
enabled = false # Re-enable when debugging production issues

[mcp_servers.linear]
enabled = true # Always useful for issue context

Strategy 4: Use GPT-5.1-Codex-Mini for Simple Tasks

Reserve GPT-5.3-Codex for complex reasoning. Switch to Mini for:

  • Simple refactors and renames
  • Straightforward test writing
  • Documentation updates
  • Linting and formatting fixes

In the CLI: codex --model gpt-5.1-codex-mini "add docstrings to all public functions in src/utils/"

In the App: switch models in the thread composer dropdown.
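
If you want the switch to happen by habit rather than memory, a small wrapper can default routine prompts to Mini. This is a hypothetical helper: the keyword heuristic and the gpt-5.3-codex identifier are assumptions for illustration; only the --model flag and gpt-5.1-codex-mini come from the CLI example above:

// Hypothetical wrapper: route routine-sounding prompts to the cheaper model.
import { execFileSync } from "node:child_process";

const ROUTINE_HINTS = ["rename", "docstring", "format", "lint", "typo", "comment"];

function pickModel(prompt: string): string {
  const routine = ROUTINE_HINTS.some((hint) => prompt.toLowerCase().includes(hint));
  // "gpt-5.3-codex" is an assumed identifier; check your CLI's model list.
  return routine ? "gpt-5.1-codex-mini" : "gpt-5.3-codex";
}

function runCodex(prompt: string): void {
  // Shells out to: codex --model <model> "<prompt>"
  execFileSync("codex", ["--model", pickModel(prompt), prompt], { stdio: "inherit" });
}

runCodex("add docstrings to all public functions in src/utils/");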

Local tasks cost ~5 credits vs ~25 credits for cloud tasks. Use cloud only when you need:

  • Remote execution (delegating from Slack, mobile, or another timezone)
  • Complete environment isolation
  • Best-of-N parallel attempts
  • Work on a branch you have not pushed yet (use codex cloud to delegate from CLI)
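
A rough decision helper captures the same logic; the TaskNeeds shape and its field names are illustrative, not part of any Codex API:

// Illustrative decision helper based on the criteria above.
interface TaskNeeds {
  remoteDelegation: boolean; // kicked off from Slack, mobile, or another timezone
  fullIsolation: boolean;    // needs a completely isolated environment
  bestOfN: boolean;          // wants several parallel attempts
  unpushedBranch: boolean;   // delegating work on a branch not pushed yet (codex cloud)
}

function chooseSurface(needs: TaskNeeds): "cloud" | "local" {
  const needsCloud =
    needs.remoteDelegation || needs.fullIsolation || needs.bestOfN || needs.unpushedBranch;
  // Local runs cost ~5 credits per message vs ~25 for cloud, so local is the default.
  return needsCloud ? "cloud" : "local";
}

chooseSurface({ remoteDelegation: false, fullIsolation: false, bestOfN: true, unpushedBranch: false }); // "cloud"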

The Codex usage dashboard shows:

  • Current usage against your limits
  • Credit consumption over time
  • Breakdown by surface (local, cloud, code review)

For enterprise teams, build automated alerts:

// Check daily credit burn rate. adminToken, DAILY_BUDGET, and sendSlackAlert are
// assumed to be defined elsewhere in your alerting script.
const response = await fetch("https://chatgpt.com/codex/api/analytics/daily", {
  headers: { Authorization: `Bearer ${adminToken}` },
});
const data = await response.json();
const dailyCredits = data.total_credits_used;

if (dailyCredits > DAILY_BUDGET * 0.8) {
  await sendSlackAlert(
    `Codex credit usage at ${dailyCredits}/${DAILY_BUDGET} (80% of daily budget)`
  );
}
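
Run a check like this on a fixed schedule, for example a daily cron job or a scheduled CI workflow, so the alert lands before the budget is gone rather than after.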

Automations are a hidden cost driver because they run unattended:

  • Review cadence: Does your automation need to run hourly? Daily is often sufficient; see the rough arithmetic after this list.
  • Scope prompts tightly: A broad “scan the entire codebase” automation costs much more than “scan files changed in the last 24 hours.”
  • Use read-only mode: Reporting automations do not need write access, and read-only mode prevents unnecessary tool calls.
  • Archive completed runs: Old automation worktrees consume disk space, and their creation consumed credits. Archive what you have reviewed.
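
The cadence point is worth quantifying. A minimal sketch, assuming each run costs about one cloud-task message (~25 credits, the approximate figure above):

// Rough cadence arithmetic; the per-run cost is an assumption, not a measured value.
const CREDITS_PER_RUN = 25;

const hourlyCadence = 24 * CREDITS_PER_RUN; // ~600 credits per day
const dailyCadence = 1 * CREDITS_PER_RUN;   //  ~25 credits per day

console.log(`Hourly: ~${hourlyCadence} credits/day, daily: ~${dailyCadence} credits/day`);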

When costs do spike, the usual culprits and fixes are:

  • Hitting limits mid-sprint: Purchase additional credits through the usage dashboard, or switch to GPT-5.1-Codex-Mini to stretch your remaining limits roughly 4x.
  • Unexpected cloud task costs: Review which integrations (Slack, Linear) are creating cloud tasks. Consider restricting cloud access to specific user groups via RBAC.
  • Automation credit drain: Check the Automations section in the sidebar for runs that are firing too frequently or producing low-value results. Adjust cadence or disable.
  • API key usage surprises: API key usage is billed at standard API rates per token. Set spending limits in the OpenAI platform dashboard.