Cost Optimization with AI

Your AWS bill jumped 38% last month. Finance is asking why, the new staging cluster nobody tagged is buried somewhere in EC2-Other, and the Cost Explorer console takes twenty clicks to answer a question you’ll have to re-answer next week. You don’t need a FinOps platform. You need your coding agent to read the billing data, rank the waste, and hand you the exact commands to fix it.

That’s what this guide does: it wires the AWS Billing & Cost Management MCP into Cursor, Claude Code, and Codex, then walks the real loop — find the cost drivers, rightsize the worst offenders, forecast the trend, and keep a human in the loop before anything destructive runs.

What You’ll Walk Away With

A working AWS Billing & Cost Management MCP setup, configured identically across Cursor, Claude Code, and Codex
A copy-paste prompt that returns your top cost drivers grouped by service, ranked rightsizing actions, and a risk note per action
A short script that pipes Cost Explorer JSON into Claude Opus 5 for ranked recommendations, for when you’d rather script it than chat
A forecasting prompt that uses your real 90-day history instead of a wishful linear projection
A “When This Breaks” checklist for the failures you’ll actually hit: MCP auth, stale data, and an agent that wants to terminate prod

Step 1: Connect the AWS Billing & Cost Management MCP

The work starts and ends with one MCP server. AWS Labs publishes the official Billing and Cost Management MCP server (the successor to the older cost-analysis server). It exposes Cost Explorer, budgets, Compute Optimizer right-sizing, Cost Optimization Hub, Savings Plans recommendations, and month-over-month comparisons — all through your existing AWS credentials.

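It runs via uvx, so install uv first and make sure your default credentials from aws configure (or the profile named in AWS_PROFILE) resolve to a role with read-only access: ce:Get*, compute-optimizer:Get*, cost-optimization-hub:Get*, and cost-optimization-hub:List*. Avoid cost-optimization-hub:*, which also grants write actions such as changing enrollment status and preferences.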
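
As a starting point, the read-only role can be written down as a policy document. The sketch below is an assumption, not the server’s official policy: the function names and the Sid are hypothetical, and the Cost Optimization Hub wildcard is narrowed to Get and List actions so the role cannot change anything. The server’s own documentation may list further read actions (Budgets or Savings Plans, for example) that you would need to add.

```typescript
// Hypothetical read-only policy for the role behind AWS_PROFILE.
// Assumption: these four action patterns cover the permissions named above.
interface PolicyStatement {
  Sid: string;
  Effect: "Allow";
  Action: string[];
  Resource: "*";
}

interface PolicyDocument {
  Version: "2012-10-17";
  Statement: PolicyStatement[];
}

function buildCostReadOnlyPolicy(): PolicyDocument {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Sid: "CostAnalysisReadOnly",
        Effect: "Allow",
        Action: [
          "ce:Get*",
          "compute-optimizer:Get*",
          "cost-optimization-hub:Get*",
          "cost-optimization-hub:List*",
        ],
        Resource: "*",
      },
    ],
  };
}

// True when any action is a bare service wildcard such as "ce:*",
// which would grant write actions along with the reads.
function hasServiceWildcard(policy: PolicyDocument): boolean {
  return policy.Statement.some((s) => s.Action.some((a) => a.endsWith(":*")));
}

// JSON text you could paste into the IAM console or an IaC template.
const costPolicyJson = JSON.stringify(buildCostReadOnlyPolicy(), null, 2);
```

Running hasServiceWildcard over any policy you attach to this role is a cheap way to catch a write-capable wildcard before the agent ever sees the credentials.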

The server config itself is identical across all three tools — same command, same args, same env. Only the way you register it differs.

Add the server to .cursor/mcp.json (project) or your global Cursor MCP settings, then enable it in Settings → MCP. Cursor surfaces the tools to agent mode automatically:

{
  "mcpServers": {
    "awslabs.billing-cost-management-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.billing-cost-management-mcp-server@latest"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

Open the agent panel (Cmd/Ctrl+I), switch to Agent mode, and confirm the billing tools appear in the MCP tool list before prompting.

In Claude Code, register the same server from a terminal with claude mcp add:

claude mcp add --transport stdio \
  --env FASTMCP_LOG_LEVEL=ERROR \
  --env AWS_PROFILE=your-aws-profile \
  --env AWS_REGION=us-east-1 \
  aws-cost \
  -- uvx awslabs.billing-cost-management-mcp-server@latest

Use --scope project to share it with the repo via .mcp.json, or leave the default local scope for a personal, credential-bearing setup. Verify with claude mcp list.

Add the server to ~/.codex/config.toml under an [mcp_servers.<id>] table:

[mcp_servers.aws-cost]
command = "uvx"
args = ["awslabs.billing-cost-management-mcp-server@latest"]

[mcp_servers.aws-cost.env]
FASTMCP_LOG_LEVEL = "ERROR"
AWS_PROFILE = "your-aws-profile"
AWS_REGION = "us-east-1"

Codex reads config.toml on startup across App, CLI, and IDE surfaces. Confirm the tools loaded by running /mcp in the TUI and checking the listed tools.
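
Because the command, args, and env are the same everywhere, you can keep one definition and render it for each tool. The helper below is a hypothetical sketch (none of these function names come from the tools themselves), and it assumes a single server id, aws-cost, for all three, plus env values with no spaces or shell metacharacters.

```typescript
// Hypothetical helper: one MCP server definition rendered for each tool.
interface McpServerDef {
  id: string; // name the tool shows for the server
  command: string;
  args: string[];
  env: Record<string, string>;
}

const awsCostServer: McpServerDef = {
  id: "aws-cost",
  command: "uvx",
  args: ["awslabs.billing-cost-management-mcp-server@latest"],
  env: {
    FASTMCP_LOG_LEVEL: "ERROR",
    AWS_PROFILE: "your-aws-profile",
    AWS_REGION: "us-east-1",
  },
};

// Body for .cursor/mcp.json
function toCursorJson(def: McpServerDef): string {
  const body = {
    mcpServers: {
      [def.id]: { command: def.command, args: def.args, env: def.env },
    },
  };
  return JSON.stringify(body, null, 2);
}

// Single-line "claude mcp add" command
function toClaudeCommand(def: McpServerDef): string {
  const envFlags = Object.keys(def.env).map((k) => `--env ${k}=${def.env[k]}`);
  return ["claude mcp add --transport stdio", ...envFlags, def.id, "--", def.command, ...def.args].join(" ");
}

// Tables for ~/.codex/config.toml
function toCodexToml(def: McpServerDef): string {
  // JSON string quoting doubles as TOML basic-string quoting for plain values like these.
  const quote = (s: string) => JSON.stringify(s);
  return [
    `[mcp_servers.${def.id}]`,
    `command = ${quote(def.command)}`,
    `args = [${def.args.map(quote).join(", ")}]`,
    "",
    `[mcp_servers.${def.id}.env]`,
    ...Object.keys(def.env).map((k) => `${k} = ${quote(def.env[k])}`),
  ].join("\n");
}
```

Changing the profile or region in awsCostServer and re-rendering keeps the three configs from drifting apart.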

Step 2: Find the Cost Drivers

With the MCP connected, the first job is visibility: which services moved, by how much, and what’s behind the change. This is where a vague prompt (“analyze my costs”) wastes a turn and a sharp one gets you a ranked action list.

The prompt below is deliberately opinionated — it names the grouping, the metric, the window, and the output shape. Paste it as-is into any of the three tools.

Copy-paste prompt — rank the cost drivers and the fixes:

Use the AWS Billing & Cost Management MCP. Pull UnblendedCost from Cost Explorer for the last 30 days at DAILY granularity, grouped by SERVICE. Then:

List the top 10 services by spend with their dollar amount and percent of total.

For any service whose daily cost rose more than 20% versus the prior 30 days, call out the increase and the likely driver (new resources, region, usage type).

Pull Compute Optimizer and Cost Optimization Hub recommendations. Output a table of rightsizing actions ranked by estimated monthly savings, each with: resource, current → recommended, estimated $/month saved, and a risk note (Low/Medium/High) explaining what could break.

Do not run any modifications. Give me the AWS CLI command for each action so I can review and apply them myself.

The “Do not run any modifications” line matters. Without it, an agent in auto-approve mode may try to apply a rightsizing through a write tool. We want the analysis and the commands; the decision to apply stays with a human.
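
If you want to check the agent’s arithmetic, the first two items in the prompt reduce to a few lines over the Cost Explorer response. This is a sketch under assumptions: the interfaces are a trimmed subset of the GetCostAndUsage response shape, the function names are hypothetical, and the comparison uses period totals rather than daily averages.

```typescript
// Trimmed subset of the Cost Explorer GetCostAndUsage response shape.
interface CeGroup {
  Keys?: string[];
  Metrics?: Record<string, { Amount?: string; Unit?: string }>;
}
interface CeResultByTime {
  Groups?: CeGroup[];
}

// Sum one metric per service across every day in the window.
function totalsByService(results: CeResultByTime[], metric = "UnblendedCost"): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const day of results) {
    for (const group of day.Groups ?? []) {
      const service = group.Keys?.[0] ?? "Unknown";
      const amount = Number(group.Metrics?.[metric]?.Amount ?? "0");
      totals[service] = (totals[service] ?? 0) + amount;
    }
  }
  return totals;
}

interface DriverRow {
  service: string;
  cost: number;
  percentOfTotal: number;
}

// Rank services by spend and attach each one's share of the total.
function topDrivers(totals: Record<string, number>, limit = 10): DriverRow[] {
  const services = Object.keys(totals);
  const grand = services.reduce((sum, s) => sum + totals[s], 0);
  return services
    .map((service) => ({
      service,
      cost: totals[service],
      percentOfTotal: grand > 0 ? (totals[service] / grand) * 100 : 0,
    }))
    .sort((a, b) => b.cost - a.cost)
    .slice(0, limit);
}

interface Riser {
  service: string;
  change: number | "new"; // fractional increase, or "new" when absent from the prior window
}

// Services whose spend rose more than `threshold` (0.2 = 20%) versus the prior window.
function risers(current: Record<string, number>, prior: Record<string, number>, threshold = 0.2): Riser[] {
  const out: Riser[] = [];
  for (const service of Object.keys(current)) {
    const before = prior[service] ?? 0;
    const now = current[service];
    if (before === 0) {
      if (now > 0) out.push({ service, change: "new" });
    } else if ((now - before) / before > threshold) {
      out.push({ service, change: (now - before) / before });
    }
  }
  return out;
}
```

Feed totalsByService the current and prior 30-day responses, then compare the output of topDrivers and risers with the table the agent produced.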

Once you have the ranked list, the follow-up that earns its keep is the untagged-spend hunt, because untagged resources are the single most common reason a bill is “unexplained”.
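
One way to size the untagged problem is to group the same Cost Explorer query by a tag key instead of by service and total the group with no value. The sketch below assumes you grouped by a tag such as Environment and that Cost Explorer returns each group key as key$value, with nothing after the $ when the tag is absent or empty. The function and type names are hypothetical.

```typescript
// Trimmed subset of a GetCostAndUsage response grouped by a tag key.
interface TagCostGroup {
  Keys?: string[];
  Metrics?: Record<string, { Amount?: string }>;
}
interface TagCostDay {
  Groups?: TagCostGroup[];
}

interface UntaggedReport {
  tagged: number;
  untagged: number;
  untaggedPercent: number;
}

// Split spend into tagged and untagged for one tag key.
function untaggedSpend(results: TagCostDay[], tagKey: string, metric = "UnblendedCost"): UntaggedReport {
  // Assumption: a key of "<tagKey>$" with nothing after the "$" means no tag value.
  const untaggedKey = `${tagKey}$`;
  let tagged = 0;
  let untagged = 0;
  for (const day of results) {
    for (const group of day.Groups ?? []) {
      const amount = Number(group.Metrics?.[metric]?.Amount ?? "0");
      if (group.Keys?.[0] === untaggedKey) {
        untagged += amount;
      } else {
        tagged += amount;
      }
    }
  }
  const total = tagged + untagged;
  return { tagged, untagged, untaggedPercent: total > 0 ? (untagged / total) * 100 : 0 };
}
```

A high untaggedPercent tells you to fix attribution first; rightsizing numbers mean little while a large share of the bill has no owner.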

Step 3: Rightsize a Module (Three Tools, One Workflow)

Visibility tells you what to fix. The next step is having the agent turn a recommendation into a reviewed change in your IaC. The recommendation is identical across tools — “downsize this over-provisioned m5.2xlarge to an m5.large” — but how you drive each tool to edit the Terraform differs.

Open the Terraform module in the editor. In Agent mode, reference the file and the MCP finding so Cursor edits inline and shows you a diff to accept or reject:

@main.tf The AWS cost MCP flagged aws_instance.api (m5.2xlarge) at 9% average CPU over 30 days. Change it to the Compute Optimizer recommended size, add a comment with the date and the % savings, and show me a terraform plan summary of what changes. Do not apply.

Use Cursor’s checkpoint before accepting so you can roll the edit back in one click if the plan looks wrong.

In Claude Code, start from the repo root and let the CLI read the module and the live recommendation in one pass:

claude "Read infra/main.tf. Using the aws-cost MCP, get the Compute Optimizer \
recommendation for the instance behind aws_instance.api, apply the recommended \
instance_type in the file, and run 'terraform plan' to show the diff. \
Stop before apply and summarize the plan."

For a recurring sweep, wire it into a script or a pre-deploy hook so every release checks for over-provisioned resources flagged since the last run.

Run Codex in the repo with on-request approval so it pauses before touching files or running terraform:

codex --ask-for-approval on-request \
  "Use the aws-cost MCP to get the rightsizing recommendation for the instance \
  defined as aws_instance.api in infra/main.tf, update the instance_type, then \
  run terraform plan and show me the diff. Do not apply."

Because Codex spans App, CLI, IDE, and Cloud, you can hand the same task to a Cloud task for a long-running multi-module sweep and review the resulting PR.
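
Whichever tool produces the plan, a script or pre-deploy hook can check the summary line before a human looks at the diff. The sketch below assumes plain-text output from terraform plan -no-color and the usual “N to add, N to change, N to destroy” summary. The function names and the in-place-only policy are hypothetical choices, not something Terraform prescribes.

```typescript
interface PlanSummary {
  add: number;
  change: number;
  destroy: number;
}

// Pull the counts out of the plan summary line.
// Returns null when the line is absent, for example when there are no changes.
function parsePlanSummary(planOutput: string): PlanSummary | null {
  const match = /(\d+) to add, (\d+) to change, (\d+) to destroy/.exec(planOutput);
  if (!match) return null;
  return { add: Number(match[1]), change: Number(match[2]), destroy: Number(match[3]) };
}

// Hypothetical policy: a rightsizing edit should only update resources in place.
// Anything that adds or destroys a resource gets escalated for closer review.
function isInPlaceOnly(summary: PlanSummary | null): boolean {
  return summary !== null && summary.add === 0 && summary.destroy === 0 && summary.change > 0;
}
```

An instance_type change normally shows up as a change, so a plan that reports anything to destroy is a sign the edit touched more than the recommendation asked for.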

Step 4: Forecast Instead of Guess

A linear “we spent X last month so we’ll spend X again” projection is wrong the moment usage has any seasonality. When you want a real forecast, the agent has two honest options: ask the MCP for Cost Explorer’s own forecast, or pull the raw 90-day history and fit it.
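
The second option, fitting the raw history yourself, can be as small as a least-squares trend plus an average weekly offset. To be clear about what this is: a deliberately simple stand-in, not the method Cost Explorer uses for its own forecast. It assumes one total per day in chronological order, and the function names are hypothetical.

```typescript
interface Trend {
  slope: number;
  intercept: number;
}

// Fit cost = intercept + slope * dayIndex by ordinary least squares.
function fitLinearTrend(daily: number[]): Trend {
  const n = daily.length;
  if (n < 2) throw new Error("need at least two days of history");
  const meanX = (n - 1) / 2;
  const meanY = daily.reduce((a, b) => a + b, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (i - meanX) * (daily[i] - meanY);
    sxx += (i - meanX) * (i - meanX);
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Average residual for each position in a 7-day cycle (index 0 = first day of history).
function weeklyOffsets(daily: number[], trend: Trend): number[] {
  const sums = new Array<number>(7).fill(0);
  const counts = new Array<number>(7).fill(0);
  daily.forEach((cost, i) => {
    sums[i % 7] += cost - (trend.intercept + trend.slope * i);
    counts[i % 7] += 1;
  });
  return sums.map((sum, k) => (counts[k] > 0 ? sum / counts[k] : 0));
}

// Projected total spend for the next `horizonDays`, never below zero per day.
function forecastTotal(daily: number[], horizonDays: number): number {
  const trend = fitLinearTrend(daily);
  const offsets = weeklyOffsets(daily, trend);
  let total = 0;
  for (let d = 0; d < horizonDays; d++) {
    const i = daily.length + d;
    total += Math.max(0, trend.intercept + trend.slope * i + offsets[i % 7]);
  }
  return total;
}
```

With 90 days of history this captures growth and a weekday/weekend pattern, but not monthly or quarterly cycles, so treat the output as a cross-check on the built-in forecast, not a replacement for it.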

For most teams the MCP’s built-in forecast is enough, and you can ask for it directly.

When you’d rather own the model (for example, to feed it into a dashboard), skip the chat and pipe the data into Claude directly. This short script pulls real Cost Explorer history and asks Claude Opus 5 for a ranked set of recommendations. It runs with npx tsx, assuming AWS credentials and ANTHROPIC_API_KEY are available in the environment:

// rank-cost-drivers.ts — run: npx tsx rank-cost-drivers.ts
import { CostExplorer } from '@aws-sdk/client-cost-explorer';
import Anthropic from '@anthropic-ai/sdk';

const ce = new CostExplorer({ region: 'us-east-1' });
const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY

const end = new Date().toISOString().slice(0, 10);
const start = new Date(Date.now() - 30 * 864e5).toISOString().slice(0, 10);

const { ResultsByTime } = await ce.getCostAndUsage({
  TimePeriod: { Start: start, End: end },
  Granularity: 'DAILY',
  Metrics: ['UnblendedCost'],
  GroupBy: [{ Type: 'DIMENSION', Key: 'SERVICE' }],
});

const msg = await anthropic.messages.create({
  model: 'claude-opus-5',
  max_tokens: 1500,
  messages: [{
    role: 'user',
    content: `Here is 30 days of AWS daily cost grouped by service as JSON. Return the top
5 cost drivers, any service trending up more than 20% week-over-week, and one concrete
rightsizing or scheduling action per driver with an estimated monthly saving.
\n\n${JSON.stringify(ResultsByTime)}`,
  }],
});

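// msg.content is an array of content blocks; print only the text ones
for (const block of msg.content) {
  if (block.type === 'text') console.log(block.text);
}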
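
The script above sends JSON.stringify(ResultsByTime) verbatim, which repeats the same field names for every service on every day. If the payload gets large, one option is to flatten it to CSV before building the prompt. This is a sketch under assumptions: the interfaces are a trimmed subset of the response shape, and toCompactCsv and its minCost cutoff are hypothetical.

```typescript
// Trimmed subset of the Cost Explorer GetCostAndUsage response shape.
interface CostGroup {
  Keys?: string[];
  Metrics?: Record<string, { Amount?: string }>;
}
interface CostDay {
  TimePeriod?: { Start?: string };
  Groups?: CostGroup[];
}

// Flatten to "date,service,cost" rows, dropping entries smaller than minCost.
// Credits and refunds (negative amounts) are kept when their size exceeds the cutoff.
function toCompactCsv(results: CostDay[], minCost = 0.01, metric = "UnblendedCost"): string {
  const rows = ["date,service,cost"];
  for (const day of results) {
    const date = day.TimePeriod?.Start ?? "";
    for (const group of day.Groups ?? []) {
      const cost = Number(group.Metrics?.[metric]?.Amount ?? "0");
      if (Math.abs(cost) < minCost) continue;
      // Commas in a service name would break the row, so swap them for spaces.
      const service = (group.Keys?.[0] ?? "Unknown").replace(/,/g, " ");
      rows.push(`${date},${service},${cost.toFixed(2)}`);
    }
  }
  return rows.join("\n");
}
```

To use it, replace ${JSON.stringify(ResultsByTime)} in the prompt with ${toCompactCsv(ResultsByTime ?? [])} and change “as JSON” to “as CSV” in the instruction text.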

For high-volume, simple classification passes — for example labelling thousands of resources by environment from their names — drop to the cheapest model instead of the flagship:

const msg = await anthropic.messages.create({
  model: 'claude-haiku-4-5', // cheapest tier, ~$1/$5 per Mtok — fine for bulk tagging
  max_tokens: 256,
  messages: [{ role: 'user', content: tagInferencePrompt }],
});
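
The snippet above leaves tagInferencePrompt undefined. A minimal way to build it, assuming you classify resource names in fixed-size batches and ask for one name=label line per resource, might look like the following. The label set, the reply format, and every function name here are hypothetical choices for illustration.

```typescript
// Hypothetical label set; adjust to your own environments.
const ENVIRONMENTS = ["prod", "staging", "dev", "unknown"] as const;

// Split a long list into batches so each request stays small.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Build the prompt for one batch of resource names.
function buildTagInferencePrompt(resourceNames: string[]): string {
  return [
    `Classify each AWS resource name below by environment. Allowed labels: ${ENVIRONMENTS.join(", ")}.`,
    "Reply with one line per resource in the form <name>=<label>, in the same order, and nothing else.",
    "",
    ...resourceNames,
  ].join("\n");
}

// Parse "<name>=<label>" lines; anything outside the label set becomes "unknown".
function parseLabels(reply: string): Record<string, string> {
  const labels: Record<string, string> = {};
  for (const line of reply.split("\n")) {
    const cut = line.lastIndexOf("=");
    if (cut < 1) continue; // skip blank or malformed lines
    const resource = line.slice(0, cut).trim();
    const label = line.slice(cut + 1).trim().toLowerCase();
    labels[resource] = (ENVIRONMENTS as readonly string[]).includes(label) ? label : "unknown";
  }
  return labels;
}
```

Loop over chunk(allNames, 200), pass buildTagInferencePrompt(batch) as tagInferencePrompt, and merge the parseLabels results. The batch size of 200 is a guess to tune against max_tokens, since each output line costs tokens too.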

Step 5: Kubernetes and Multi-Cloud

If your spend lives in Kubernetes rather than raw EC2, swap the AWS MCP for a cluster MCP. The community Kubernetes MCP server (mcp-server-kubernetes on npm) lets the agent read resource requests versus actual usage — the data behind almost every pod-level waste finding.

Add it the same way (claude mcp add --transport stdio k8s -- npx -y mcp-server-kubernetes, or the matching .cursor/mcp.json / config.toml block), then ask the agent to compare resource requests against actual usage for each workload.
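
The requests-versus-usage comparison itself is simple arithmetic once the Kubernetes quantity strings are parsed. The sketch below assumes you have request and usage values per pod as strings such as 500m or 512Mi. The PodSample shape, the function names, and the 30% threshold are hypothetical, and the parsers cover only the common suffixes.

```typescript
// Parse a CPU quantity such as "2", "0.5", "250m", or "1200000n" into cores.
function parseCpuCores(quantity: string): number {
  const match = /^(\d+(?:\.\d+)?)([num]?)$/.exec(quantity.trim());
  if (!match) throw new Error(`unrecognized CPU quantity: ${quantity}`);
  const divisor: Record<string, number> = { "": 1, m: 1e3, u: 1e6, n: 1e9 };
  return Number(match[1]) / divisor[match[2]];
}

// Parse a memory quantity such as "512Mi", "1Gi", "500M", or plain bytes.
function parseMemoryBytes(quantity: string): number {
  const match = /^(\d+(?:\.\d+)?)(Ki|Mi|Gi|Ti|k|M|G|T)?$/.exec(quantity.trim());
  if (!match) throw new Error(`unrecognized memory quantity: ${quantity}`);
  const multiplier: Record<string, number> = {
    Ki: 1024, Mi: 1024 ** 2, Gi: 1024 ** 3, Ti: 1024 ** 4,
    k: 1e3, M: 1e6, G: 1e9, T: 1e12,
  };
  return Number(match[1]) * (match[2] ? multiplier[match[2]] : 1);
}

interface PodSample {
  pod: string;
  cpuRequest: string;
  cpuUsage: string;
  memRequest: string;
  memUsage: string;
}

interface WasteFinding {
  pod: string;
  cpuUtilization: number; // usage divided by request
  memUtilization: number;
}

// Pods using less than maxUtilization of BOTH their CPU and memory requests.
function overProvisioned(samples: PodSample[], maxUtilization = 0.3): WasteFinding[] {
  const findings: WasteFinding[] = [];
  for (const s of samples) {
    const cpuReq = parseCpuCores(s.cpuRequest);
    const memReq = parseMemoryBytes(s.memRequest);
    if (cpuReq === 0 || memReq === 0) continue; // no request set, nothing to compare
    const cpuUtilization = parseCpuCores(s.cpuUsage) / cpuReq;
    const memUtilization = parseMemoryBytes(s.memUsage) / memReq;
    if (cpuUtilization < maxUtilization && memUtilization < maxUtilization) {
      findings.push({ pod: s.pod, cpuUtilization, memUtilization });
    }
  }
  return findings.sort((a, b) => a.cpuUtilization - b.cpuUtilization);
}
```

A single usage sample is a snapshot, so run this over usage collected across a representative window before lowering any request.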

For genuine multi-cloud cost comparison, be honest about tooling: there is no single trustworthy “arbitrage” MCP. Use each provider’s own MCP (AWS billing MCP above; GCP and Azure have their own billing exports) and have the agent normalize the numbers — and treat cross-cloud migration savings as a model, not a guarantee, because egress and re-architecture costs routinely erase the headline difference.

When This Breaks

The MCP tools never appear / every call returns an auth error. The server starts but Cost Explorer 403s. Almost always AWS_PROFILE resolves to the wrong account or a role without ce:Get*. Run aws sts get-caller-identity with that profile, confirm the account, and check the IAM policy includes Cost Explorer, Compute Optimizer, and Cost Optimization Hub read actions. Cost Explorer also has to be enabled in the billing console before any API returns data.
uvx isn’t found or the server fails to launch. The config is correct but uv isn’t installed or isn’t on the PATH the tool sees. Install uv, restart the tool so it picks up the new PATH, and test the raw command in a terminal: uvx awslabs.billing-cost-management-mcp-server@latest should start and wait on stdio.
The numbers look a day or two stale. Cost Explorer data lags 24–48 hours and updates up to three times a day — it is not real-time. If the agent reports yesterday’s spike as missing, that’s expected. For anything closer to live, you need CUR exports or CloudWatch billing metrics, not Cost Explorer.
The agent wants to apply a destructive change. It proposes terminating an instance or downsizing prod and, in auto-approve mode, tries to run it. This is why every prompt above tells the agent to stop before applying, and why the role behind the MCP profile should be read-only. Keep approvals on (--ask-for-approval on-request in Codex, accept/reject diffs in Cursor, review before running in Claude Code) for anything that mutates infrastructure.
A rightsizing recommendation is based on a quiet window. Compute Optimizer says “downsize” because the lookback period missed your monthly peak. Always make the agent state the lookback window and reconcile it against known seasonality before applying.
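
For the first failure on this list, the account check is easy to script. The sketch below assumes you captured the JSON printed by aws sts get-caller-identity for the profile in question and know which account ID you expect. The function name and the return shape are hypothetical.

```typescript
interface IdentityCheck {
  ok: boolean;
  reason: string;
}

// Compare the account in `aws sts get-caller-identity` output with the one you expect.
function checkCallerIdentity(stsJson: string, expectedAccountId: string): IdentityCheck {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stsJson);
  } catch {
    return { ok: false, reason: "output is not JSON; the CLI call itself probably failed" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ok: false, reason: "output is not a JSON object" };
  }
  const identity = parsed as { Account?: string; Arn?: string };
  if (identity.Account !== expectedAccountId) {
    return {
      ok: false,
      reason: `profile resolves to account ${identity.Account ?? "unknown"}, expected ${expectedAccountId}`,
    };
  }
  return { ok: true, reason: `account ${identity.Account} via ${identity.Arn ?? "unknown ARN"}` };
}
```

Run it before launching the agent and you find out about a wrong profile from your own script instead of from a failed Cost Explorer call mid-conversation.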

What’s Next

Security Operations with AI — wire security-scanning MCPs into the same three-tool loop
Compliance Automation — use tagging and policy data to keep cost attribution and audit trails in sync
Deployment & Operations overview — the rest of the production operations workflows