Cloud Cost Management & FinOps

Your cloud bill jumped 40% this month. Finance wants an explanation by Friday, engineering swears nothing changed, and the only tool you have is a Cost Explorer dashboard with 200 line items and no story. You could spend two days exporting CSVs and pivoting in a spreadsheet — or you could point an AI assistant at a cost MCP server and have it tell you which five services moved, by how much, and why.

This guide shows FinOps practitioners, DevOps engineers, and platform teams how to wire Cursor, Claude Code, and Codex to real cost-management MCP servers and turn raw billing data into right-sizing plans, anomaly alerts, and forecasts you can actually defend in a budget review.

What You’ll Walk Away With

A working MCP setup for Vantage (multi-cloud) and the AWS Labs Cost Explorer server, using the same commands and environment variables across Cursor, Claude Code, and Codex
A copy-paste prompt that surfaces your top 5 cost-savings opportunities for the current month
A right-sizing prompt that returns a phased plan tagged with risk levels, not a flat list of instances
An anomaly-detection prompt that distinguishes expected growth from genuine bill shock
A clear sense of when this workflow breaks — API charges, rate limits, stale tags — and how to recover

Set Up the Cost MCP Servers

These servers expose billing and usage APIs as tools your AI assistant can call directly. The server definitions (command, arguments, environment variables) are the same across Cursor, Claude Code, and Codex; only the file and syntax differ: JSON in .cursor/mcp.json (or Cursor Settings) for Cursor, JSON in .mcp.json for Claude Code, and TOML in ~/.codex/config.toml for Codex. Set up these two and you cover most multi-cloud cases.

Vantage (multi-cloud cost platform)

Vantage aggregates AWS, Azure, GCP, Kubernetes, and SaaS spend behind one API. The official MCP server runs via npx and authenticates with a read-only bearer token.

{
  "mcpServers": {
    "vantage": {
      "command": "npx",
      "args": ["-y", "vantage-mcp-server"],
      "env": {
        "VANTAGE_TOKEN": "your-read-only-vantage-token"
      }
    }
  }
}

Generate the token from your Vantage account under API access and scope it read-only — the AI never needs write access to analyze spend.

AWS Cost Explorer (AWS Labs MCP server)

The AWS Labs Cost Explorer server is a Python package distributed on PyPI and run with uvx, not a bare npm binary. It reads your existing AWS credentials via a named profile.

{
  "mcpServers": {
    "aws-cost-explorer": {
      "command": "uvx",
      "args": ["awslabs.cost-explorer-mcp-server@latest"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_REGION": "us-east-1",
        "AWS_PROFILE": "default"
      }
    }
  }
}
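Before relying on the MCP server, it can help to confirm that the named profile can actually reach Cost Explorer. The following is a minimal Python sketch, not part of the AWS Labs server: the helper names monthly_request and fetch_monthly_spend are hypothetical, and it assumes boto3 is installed and the profile is allowed to call ce:GetCostAndUsage. It makes a single monthly-granularity request and ignores pagination for brevity.

```python
from datetime import date


def monthly_request(start: date, end: date) -> dict:
    """Build kwargs for Cost Explorer's GetCostAndUsage, grouped by service.

    `end` is exclusive, matching the API's TimePeriod semantics.
    """
    return {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "MONTHLY",
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


def fetch_monthly_spend(profile: str, start: date, end: date) -> dict:
    """Return {service: cost} for the period. Each API request is billed."""
    import boto3  # imported lazily so the request builder works without boto3

    # Cost Explorer is served from us-east-1 regardless of where resources run.
    client = boto3.Session(profile_name=profile).client("ce", region_name="us-east-1")
    # Sketch only: reads the first page and does not follow NextPageToken.
    response = client.get_cost_and_usage(**monthly_request(start, end))
    totals: dict[str, float] = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return totals
```

If fetch_monthly_spend("default", start, end) raises an access-denied error or returns an empty dict for a period you know had spend, the problem is likely credentials or permissions rather than the MCP configuration.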

For Azure and GCP, the equivalents are @azure/mcp (npx -y @azure/mcp@latest server start) and @google-cloud/gcloud-mcp (npx -y @google-cloud/gcloud-mcp), each of which reads its own native credential chain. Add them only when you actually operate in those clouds.

Codex reads the same config in TOML

Codex stores MCP servers in ~/.codex/config.toml rather than JSON. The same two servers look like this:

[mcp_servers.vantage]
command = "npx"
args = ["-y", "vantage-mcp-server"]
env = { VANTAGE_TOKEN = "your-read-only-vantage-token" }

[mcp_servers.aws-cost-explorer]
command = "uvx"
args = ["awslabs.cost-explorer-mcp-server@latest"]
env = { FASTMCP_LOG_LEVEL = "ERROR", AWS_REGION = "us-east-1", AWS_PROFILE = "default" }
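If you maintain the JSON file as the source of truth, the TOML can be generated rather than hand-copied. The sketch below is one way to do that in Python; mcp_json_to_codex_toml is a hypothetical helper, and it assumes server names are valid bare TOML keys (letters, digits, hyphens, underscores) and that each server only uses command, args, and env.

```python
import json


def _toml_value(value) -> str:
    """Render a string, a list, or a flat dict as a TOML value."""
    if isinstance(value, str):
        # JSON string escaping is also valid for TOML basic strings.
        return json.dumps(value)
    if isinstance(value, list):
        return "[" + ", ".join(_toml_value(v) for v in value) + "]"
    if isinstance(value, dict):
        pairs = ", ".join(f"{k} = {_toml_value(v)}" for k, v in value.items())
        return "{ " + pairs + " }"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def mcp_json_to_codex_toml(config_text: str) -> str:
    """Convert an mcpServers JSON document into [mcp_servers.*] TOML tables."""
    servers = json.loads(config_text)["mcpServers"]
    blocks = []
    for name, spec in servers.items():
        lines = [f"[mcp_servers.{name}]"]
        for key in ("command", "args", "env"):
            if key in spec:
                lines.append(f"{key} = {_toml_value(spec[key])}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"
```

Generating one file from the other keeps environment variables from drifting between tools; review the output before writing it into ~/.codex/config.toml, since that file may hold other settings.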

The Workflow: From Bill Shock to a Defensible Plan

The pattern is the same regardless of tool: connect a cost MCP server, ask a focused question, then verify the recommendation against the actual resource before you act. The AI is fast at finding candidates; you own the decision to change production.

In Cursor, open the agent panel and reference the servers by name. Keeping the analysis in your editor lets you drop findings straight into a runbook or Terraform change.

@vantage @aws-cost-explorer Pull last month's spend grouped by
service and linked account. For the five services that grew the
most versus the prior month, tell me the dollar delta, the likely
driver (usage vs. price vs. new resources), and whether the growth
looks expected for a product scaling its user base. Output a table,
then a short prioritized list of what to investigate first.

Cursor returns a table you can iterate on inline — ask follow-ups like “drill into RDS” without restating context.

In Claude Code, run the prompt from the terminal so the analysis lives next to your infra repo and can feed a script or a PR.

claude "Using the vantage and aws-cost-explorer MCP servers, find my
top 5 cost-growth drivers for last month vs. the prior month. For each,
give the dollar delta, the likely cause, and an expected-vs-anomalous
verdict. End with a prioritized investigation list."

Claude Code coordinates both servers, aggregates the data, and writes a Markdown summary you can commit to your ops docs.

Codex reads the same servers from ~/.codex/config.toml and works across CLI, IDE, and Cloud.

codex "Analyze our multi-cloud spend via the vantage and
aws-cost-explorer MCP servers and produce a phased cost-reduction
plan: top 5 growth drivers, dollar deltas, likely causes, and an
expected-vs-anomalous call for each."

Run it in a Codex Cloud task to keep the analysis off your laptop, or in the IDE extension to fold results into a change you’re already drafting.

A typical response groups spend, ranks the movers, and flags which growth is benign. For example, it might report that RDS grew because three read replicas were added (expected for a traffic ramp) while a 45% jump in inter-region transfer has no matching deploy and warrants investigation. Treat the dollar figures as a starting point — confirm them against the provider console before you brief finance, because tag coverage and account boundaries shape what the API returns.
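One way to confirm the ranking is to recompute it from two monthly totals exported from the console. This is a minimal Python sketch; top_movers is a hypothetical helper and the inputs are assumed to be plain {service: dollars} dictionaries that you supply.

```python
def top_movers(previous: dict[str, float], current: dict[str, float], n: int = 5):
    """Rank services by month-over-month dollar increase, largest first.

    Returns (service, dollar_delta, percent_change) tuples. percent_change is
    None when the service had no spend in the prior month.
    """
    rows = []
    for service in set(previous) | set(current):
        before = previous.get(service, 0.0)
        after = current.get(service, 0.0)
        delta = after - before
        pct = (delta / before * 100.0) if before > 0 else None
        rows.append((service, delta, pct))
    # Sort by largest increase; break ties alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows[:n]
```

A service that appears only in the current month gets a percent change of None instead of a misleading figure, which makes new resources easy to tell apart from growth in existing ones.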

Copy-Paste Prompts

These are the reusable recipes. They name real services and ask for opinionated output, so they work with minimal editing — swap the provider or threshold and run.

Find this month’s top 5 savings opportunities:

Using @vantage and @aws-cost-explorer, analyze the current month's
spend and return my top 5 cost-savings opportunities ranked by
monthly dollar impact. For each: the resource or service, why it's
wasteful (idle, over-provisioned, wrong tier, no commitment), the
estimated monthly saving, and the effort to capture it (low/medium/high).
Exclude anything that would risk an SLA. Give me a table I can paste
into a ticket.

Generate a phased right-sizing plan with risk levels:

Using @aws-cost-explorer, analyze actual CPU, memory, and IOPS
utilization vs. provisioned capacity for our EC2 and RDS fleet over
the last 30 days. Produce a right-sizing plan in three phases:
Phase 1 = zero-risk (remove orphaned volumes, idle load balancers),
Phase 2 = low-risk downsizing (instances under 30% sustained
utilization), Phase 3 = commitment changes (Savings Plans / Reserved
Instances). For each recommendation include the target size, the
monthly saving, and a risk level with the specific failure mode to
watch (e.g. throttling under burst load).
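The 30% rule in Phase 2 is worth checking by hand before you act on it, because an average can hide bursts. The Python sketch below treats "sustained" as the 95th percentile of CPU samples staying under the threshold. That interpretation is an assumption, as are the helper names; the samples would come from your own monitoring export.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def is_downsize_candidate(cpu_samples, threshold=30.0, pct=95.0):
    """True only if the chosen percentile, not just the mean, is under the threshold."""
    if not cpu_samples:
        return False  # missing data is not evidence of idleness
    return percentile(cpu_samples, pct) < threshold
```

An instance that idles at 5% but spikes to 90% once per window has a low mean and still fails this check, which is the failure mode the prompt asks the assistant to call out.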

Set up anomaly alerts that ignore expected growth:

Using @vantage, design a cost anomaly-detection setup that minimizes
false positives. Define which dimensions to monitor (service, linked
account, region), how to set thresholds relative to a trailing baseline
rather than a fixed dollar amount, and rules to suppress alerts that
correlate with known events (a deploy, a marketing campaign, a planned
backfill). Output the alert conditions and an example of a true
positive vs. a benign spike I should not be paged for.
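To make "relative to a trailing baseline" concrete, here is a minimal Python sketch of one possible rule. It is not how Vantage implements anomaly detection; the function name, the three-sigma band, and the 20% minimum jump are illustrative assumptions you would tune.

```python
from statistics import mean, pstdev


def is_anomalous(history, today, sigmas=3.0, min_relative_jump=0.2, known_event=False):
    """Flag today's spend against a trailing baseline.

    history: non-empty list of recent daily spend values for one dimension.
    known_event: True when a deploy, campaign, or backfill explains the day.
    """
    if known_event:
        return False  # suppress alerts that correlate with planned activity
    baseline = mean(history)
    spread = pstdev(history)
    above_band = today > baseline + sigmas * spread
    big_enough = today > baseline * (1.0 + min_relative_jump)
    # Require both conditions so very steady series do not page on tiny moves.
    return above_band and big_enough
```

Requiring both a statistical margin and a minimum relative jump is a design choice to cut false positives: a very stable service would otherwise alert on a change of a few dollars.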

Cost Allocation and Forecasting

Two follow-on jobs round out a FinOps practice: making spend traceable to teams, and turning history into a forward budget.

Design a tagging and allocation strategy

Ask the assistant to translate your org structure into an enforceable tag schema and a method for splitting shared resources (databases, load balancers, NAT gateways) that no single team owns.

Using @vantage, design a cost-allocation strategy for an org with
product teams, shared platform teams, and dev/staging/production
environments. Define a required tag set, a fallback for untagged
spend, and a defensible method to split shared-resource costs
(by request volume, by CPU/memory share). Flag where allocation
will be approximate so I can set expectations with finance.
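Splitting a shared resource by usage share is simple arithmetic, sketched below in Python. split_shared_cost is a hypothetical helper; the usage figures could be request counts or CPU-hours, whichever driver you agree on with finance.

```python
def split_shared_cost(shared_cost: float, usage_by_team: dict[str, float]) -> dict[str, float]:
    """Allocate a shared cost in proportion to each team's usage."""
    if not usage_by_team:
        raise ValueError("at least one team is required")
    total = sum(usage_by_team.values())
    if total <= 0:
        # No usage signal at all: fall back to an even split.
        share = shared_cost / len(usage_by_team)
        return {team: round(share, 2) for team in usage_by_team}
    return {
        team: round(shared_cost * usage / total, 2)
        for team, usage in usage_by_team.items()
    }
```

Because each share is rounded to cents, the parts can differ from the total by a cent or two. That is one of the places where allocation is approximate and worth flagging up front.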

Roll history forward into next year’s budget

Feed the trailing four quarters and your growth assumptions; ask for a forecast with explicit buffers rather than a single number.

Using @vantage and @aws-cost-explorer, take our last four quarters
of actual spend and build a next-year monthly forecast. Inputs:
expected user growth, planned feature launches, and one new region.
Output a baseline projection, a growth allowance per team, a buffer
for unplanned cost, and the savings target needed to stay flat.
Show the assumptions so I can challenge them.
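The arithmetic behind such a forecast can be sketched in a few lines of Python so the assumptions are visible. This is an illustrative model, not what the assistant or either server computes: it starts from the latest quarter's monthly run rate, compounds an assumed annual growth rate monthly, and adds a flat percentage buffer. All names and parameters are hypothetical.

```python
def forecast_next_year(last_four_quarters, annual_growth, buffer_rate):
    """Project 12 months from the latest quarter's run rate.

    annual_growth and buffer_rate are fractions (0.2 means 20%).
    """
    if len(last_four_quarters) != 4:
        raise ValueError("expected exactly four quarterly totals")
    run_rate = last_four_quarters[-1] / 3.0  # latest monthly run rate
    monthly_factor = (1.0 + annual_growth) ** (1.0 / 12.0)
    months = []
    for month in range(1, 13):
        baseline = run_rate * monthly_factor ** month
        buffer = baseline * buffer_rate
        months.append({
            "month": month,
            "baseline": round(baseline, 2),
            "buffer": round(buffer, 2),
            "total": round(baseline + buffer, 2),
        })
    return months


def savings_target_to_stay_flat(last_four_quarters, months):
    """Dollars to cut so next year's total matches the trailing year."""
    projected = sum(m["total"] for m in months)
    return round(max(0.0, projected - sum(last_four_quarters)), 2)
```

Keeping growth and buffer as separate inputs makes each one easy to challenge on its own in a budget review.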

Verify before you commit

Spot-check the AI’s allocation against one team’s real invoice and confirm the forecast’s growth multiplier matches your product plan. Anchoring on relative history (“last four quarters → next year”) keeps the analysis evergreen instead of pinned to a calendar year that will go stale.

When This Breaks

Cost analysis fails in predictable ways. Recognize these early:

MCP auth or permission errors. The most common failure is a missing or wrong-scoped token. Vantage needs VANTAGE_TOKEN (read-only is enough); the AWS Labs server needs a valid AWS_PROFILE with ce:Get* IAM permissions. If the server starts but every query returns empty, you almost certainly have a credentials or permissions gap, not a code problem.
Cost Explorer API charges and rate limits. Each API request costs $0.01, and the API throttles under bursty load. An agent that fans out hundreds of daily-granularity calls can run up charges and start failing with throttling errors. Start with monthly granularity and drill into daily data only where it matters.
Stale or missing cost-allocation tags. Allocation and chargeback are only as accurate as your tags. A large “untagged” bucket silently distorts every per-team figure — treat tag coverage as a prerequisite, not a nice-to-have.
Right-sizing that causes throttling. Downsizing an instance that looks idle on average can starve it during traffic bursts. Always check peak (not just mean) utilization, change one tier at a time, and watch latency and error rates after each change.
Hallucinated server names. If a prompt references a server you never configured (a bare aws-cost-mcp binary, a @kubernetes/mcp-server scope), the tool call silently does nothing and the AI may invent plausible numbers. Use the exact server keys from your config, and for Kubernetes the real package is the unscoped kubernetes-mcp-server (npx -y kubernetes-mcp-server@latest).
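For the API-charge failure mode above, one guard is a hard cap on billable requests per session. The Python sketch below is a hypothetical wrapper, not a feature of any MCP server; it uses the $0.01 per request figure quoted in this guide and assumes you call charge() once for every Cost Explorer request your own scripts make.

```python
class RequestBudget:
    """Stop issuing billable API requests once a dollar cap is reached."""

    def __init__(self, max_usd: float, cost_per_request: float = 0.01):
        self.cost_per_request = cost_per_request
        self.max_requests = int(round(max_usd / cost_per_request))
        self.used = 0

    def charge(self, requests: int = 1) -> None:
        """Record requests, raising before the cap would be exceeded."""
        if self.used + requests > self.max_requests:
            raise RuntimeError(
                f"request budget exhausted: {self.used}/{self.max_requests} used"
            )
        self.used += requests

    @property
    def spent_usd(self) -> float:
        return round(self.used * self.cost_per_request, 2)
```

A cap of a few dollars is enough for interactive analysis and makes a runaway loop fail fast instead of showing up on next month's bill.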

What’s Next

Infrastructure as Code with AI — turn a right-sizing plan into reviewed Terraform changes
Monitoring & Observability — correlate cost spikes with the deploys and traffic that caused them
CI/CD Pipelines — add cost-estimate gates to pull requests so spend is reviewed before it ships

Key Takeaways

Wire up real MCP servers. Run Vantage via npx and the AWS Labs Cost Explorer server via uvx; the server definitions are the same across Cursor, Claude Code, and Codex, with Codex using TOML instead of JSON.
Ask focused questions, not “optimize everything.” The top-5, phased right-sizing, and anomaly prompts above return decisions you can act on.
Always verify before you change production. The AI finds candidates fast; you own the call. Check peak utilization, confirm dollar figures in the console, and watch the system after each change.
Mind the meter. Cost Explorer requests cost $0.01 each and the API rate-limits — scope queries and keep the server out of autonomous loops.
Tags gate everything. Allocation and forecasting are only as trustworthy as your tag coverage, so fix that first.