Cloud Cost Management & FinOps
Your cloud bill jumped 40% this month. Finance wants an explanation by Friday, engineering swears nothing changed, and the only tool you have is a Cost Explorer dashboard with 200 line items and no story. You could spend two days exporting CSVs and pivoting in a spreadsheet — or you could point an AI assistant at a cost MCP server and have it tell you which five services moved, by how much, and why.
This guide shows FinOps practitioners, DevOps engineers, and platform teams how to wire Cursor, Claude Code, and Codex to real cost-management MCP servers and turn raw billing data into right-sizing plans, anomaly alerts, and forecasts you can actually defend in a budget review.
What You’ll Walk Away With
- A working MCP setup for Vantage (multi-cloud) and the AWS Labs Cost Explorer server, configured identically across Cursor, Claude Code, and Codex
- A copy-paste prompt that surfaces your top 5 cost-savings opportunities for the current month
- A right-sizing prompt that returns a phased plan tagged with risk levels, not a flat list of instances
- An anomaly-detection prompt that distinguishes expected growth from genuine bill shock
- A clear sense of when this workflow breaks — API charges, rate limits, stale tags — and how to recover
Set Up the Cost MCP Servers
These servers expose billing and usage APIs as tools your AI assistant can call directly. The server definitions are the same across Cursor, Claude Code, and Codex; only the file each tool reads differs (.cursor/mcp.json or Cursor Settings, .mcp.json for Claude Code, and ~/.codex/config.toml for Codex, which uses TOML rather than JSON). Set up these two and you cover most multi-cloud cases.
Vantage (multi-cloud cost platform)
Vantage aggregates AWS, Azure, GCP, Kubernetes, and SaaS spend behind one API. The official MCP server runs via npx and authenticates with a read-only bearer token.

{
  "mcpServers": {
    "vantage": {
      "command": "npx",
      "args": ["-y", "vantage-mcp-server"],
      "env": {
        "VANTAGE_TOKEN": "your-read-only-vantage-token"
      }
    }
  }
}

Generate the token from your Vantage account under API access and scope it read-only; the AI never needs write access to analyze spend.
AWS Cost Explorer (AWS Labs MCP server)
The AWS Labs Cost Explorer server is a Python package distributed on PyPI and run with uvx, not a bare npm binary. It reads your existing AWS credentials via a named profile.

{
  "mcpServers": {
    "aws-cost-explorer": {
      "command": "uvx",
      "args": ["awslabs.cost-explorer-mcp-server@latest"],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_REGION": "us-east-1",
        "AWS_PROFILE": "default"
      }
    }
  }
}

For Azure and GCP, the equivalents are @azure/mcp (npx -y @azure/mcp@latest server start) and @google-cloud/gcloud-mcp (npx -y @google-cloud/gcloud-mcp), each reading their native credential chain. Add them only when you actually operate in those clouds.
Codex reads the same config in TOML
Codex stores MCP servers in ~/.codex/config.toml rather than JSON. The same two servers look like this:

[mcp_servers.vantage]
command = "npx"
args = ["-y", "vantage-mcp-server"]
env = { VANTAGE_TOKEN = "your-read-only-vantage-token" }

[mcp_servers.aws-cost-explorer]
command = "uvx"
args = ["awslabs.cost-explorer-mcp-server@latest"]
env = { AWS_REGION = "us-east-1", AWS_PROFILE = "default" }

The Workflow: From Bill Shock to a Defensible Plan
The pattern is the same regardless of tool: connect a cost MCP server, ask a focused question, then verify the recommendation against the actual resource before you act. The AI is fast at finding candidates; you own the decision to change production.
Open the agent panel and reference the servers by name. Cursor keeps the analysis in your editor so you can drop findings straight into a runbook or Terraform change.

@vantage @aws-cost-explorer Pull last month's spend grouped by
service and linked account. For the five services that grew the
most versus the prior month, tell me the dollar delta, the likely
driver (usage vs. price vs. new resources), and whether the growth
looks expected for a product scaling its user base. Output a table,
then a short prioritized list of what to investigate first.

Cursor returns a table you can iterate on inline; ask follow-ups like “drill into RDS” without restating context.
Run it from the terminal so the analysis lives next to your infra repo and can feed a script or a PR.

claude "Using the vantage and aws-cost-explorer MCP servers, find my
top 5 cost-growth drivers for last month vs. the prior month. For each,
give the dollar delta, the likely cause, and an expected-vs-anomalous
verdict. End with a prioritized investigation list."

Claude Code coordinates both servers, aggregates the data, and writes a Markdown summary you can commit to your ops docs.
Codex reads the same servers from ~/.codex/config.toml and works across CLI, IDE, and Cloud.

codex "Analyze our multi-cloud spend via the vantage and
aws-cost-explorer MCP servers and produce a phased cost-reduction
plan: top 5 growth drivers, dollar deltas, likely causes, and an
expected-vs-anomalous call for each."

Run it in a Codex Cloud task to keep the analysis off your laptop, or in the IDE extension to fold results into a change you’re already drafting.
A typical response groups spend, ranks the movers, and flags which growth is benign. For example, it might report that RDS grew because three read replicas were added (expected for a traffic ramp) while a 45% jump in inter-region transfer has no matching deploy and warrants investigation. Treat the dollar figures as a starting point — confirm them against the provider console before you brief finance, because tag coverage and account boundaries shape what the API returns.
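
One way to double-check the assistant's ranking before briefing finance is to recompute the movers yourself from two exported monthly totals. The following is a minimal Python sketch and not part of either MCP server: the `Mover` class, the `top_movers` function, and the dollar figures are all hypothetical, and it only illustrates the delta-and-ranking arithmetic described above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mover:
    """Month-over-month spend for one service (hypothetical helper type)."""
    service: str
    prior: float
    current: float

    @property
    def delta(self) -> float:
        # Dollar change versus the prior month.
        return self.current - self.prior

    @property
    def pct_change(self) -> Optional[float]:
        # None when there was no prior spend, i.e. the service is new.
        if self.prior == 0:
            return None
        return self.delta / self.prior * 100.0


def top_movers(prior: dict[str, float], current: dict[str, float], n: int = 5) -> list[Mover]:
    """Return the n services with the largest dollar growth."""
    services = set(prior) | set(current)
    movers = [Mover(s, prior.get(s, 0.0), current.get(s, 0.0)) for s in services]
    # Largest growth first; break ties on the service name for stable output.
    movers.sort(key=lambda m: (-m.delta, m.service))
    return movers[:n]


# Hypothetical monthly totals in USD, for illustration only.
PRIOR = {"RDS": 4200.0, "EC2": 9100.0, "InterRegionTransfer": 800.0}
CURRENT = {"RDS": 5600.0, "EC2": 9000.0, "InterRegionTransfer": 1160.0}

if __name__ == "__main__":
    for m in top_movers(PRIOR, CURRENT, n=2):
        pct = "new" if m.pct_change is None else f"{m.pct_change:+.0f}%"
        print(f"{m.service:<20} {m.delta:+10.2f} USD  {pct}")
```

Ranking by dollar delta rather than percentage keeps small services with large relative swings from crowding out the line items that actually moved the bill.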
Copy-Paste Prompts
These are the reusable recipes. They name real services and ask for opinionated output, so they work with minimal editing: swap the provider or threshold and run.
Cost Allocation and Forecasting
Two follow-on jobs round out a FinOps practice: making spend traceable to teams, and turning history into a forward budget.

- Design a tagging and allocation strategy

  Ask the assistant to translate your org structure into an enforceable tag schema and a method for splitting shared resources (databases, load balancers, NAT gateways) that no single team owns.

  Using @vantage, design a cost-allocation strategy for an org with
  product teams, shared platform teams, and dev/staging/production
  environments. Define a required tag set, a fallback for untagged
  spend, and a defensible method to split shared-resource costs
  (by request volume, by CPU/memory share). Flag where allocation
  will be approximate so I can set expectations with finance.

- Roll history forward into next year’s budget

  Feed the trailing four quarters and your growth assumptions; ask for a forecast with explicit buffers rather than a single number.

  Using @vantage and @aws-cost-explorer, take our last four quarters
  of actual spend and build a next-year monthly forecast. Inputs:
  expected user growth, planned feature launches, and one new region.
  Output a baseline projection, a growth allowance per team, a buffer
  for unplanned cost, and the savings target needed to stay flat.
  Show the assumptions so I can challenge them.

- Verify before you commit

  Spot-check the AI’s allocation against one team’s real invoice and confirm the forecast’s growth multiplier matches your product plan. Anchoring on relative history (“last four quarters → next year”) keeps the analysis evergreen instead of pinned to a calendar year that will go stale.
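
For the spot-check, it may help to reproduce the shared-cost split by hand. The sketch below assumes a proportional split by request volume, one of the methods named in the allocation prompt. The function names, team names, and figures are hypothetical, and your own allocation rule may differ.

```python
from __future__ import annotations


def split_shared_cost(shared_cost: float, usage_by_team: dict[str, float]) -> dict[str, float]:
    """Split a shared cost across teams in proportion to a usage metric."""
    total_usage = sum(usage_by_team.values())
    if total_usage <= 0:
        raise ValueError("usage must sum to a positive number")
    return {
        team: shared_cost * usage / total_usage
        for team, usage in usage_by_team.items()
    }


def tag_coverage(tagged_spend: float, untagged_spend: float) -> float:
    """Fraction of total spend that carries allocation tags (0.0 to 1.0)."""
    total = tagged_spend + untagged_spend
    if total <= 0:
        raise ValueError("total spend must be positive")
    return tagged_spend / total


# Hypothetical inputs: one month's NAT gateway bill and per-team request counts.
NAT_GATEWAY_COST = 1200.0
REQUESTS_BY_TEAM = {"checkout": 600.0, "search": 300.0, "platform": 100.0}

if __name__ == "__main__":
    for team, share in split_shared_cost(NAT_GATEWAY_COST, REQUESTS_BY_TEAM).items():
        print(f"{team:<10} {share:8.2f} USD")
    print(f"tag coverage: {tag_coverage(9000.0, 1000.0):.0%}")
```

If the per-team numbers from the assistant differ noticeably from a split like this, the usual suspects are a different usage metric or a large untagged bucket.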
When This Breaks
Cost analysis fails in predictable ways. Recognize these early:
- MCP auth or permission errors. The most common failure is a missing or wrong-scoped token. Vantage needs VANTAGE_TOKEN (read-only is enough); the AWS Labs server needs a valid AWS_PROFILE with ce:Get* IAM permissions. If the server starts but every query returns empty, you almost certainly have a credentials or permissions gap, not a code problem.
- Cost Explorer API charges and rate limits. Each request is $0.01 and the API throttles under bursty load. An agent that fans out hundreds of daily-granularity calls can rack up cost and start failing with throttling errors. Scope queries to monthly granularity first and drill into daily only where it matters.
- Stale or missing cost-allocation tags. Allocation and chargeback are only as accurate as your tags. A large “untagged” bucket silently distorts every per-team figure — treat tag coverage as a prerequisite, not a nice-to-have.
- Right-sizing that causes throttling. Downsizing an instance that looks idle on average can starve it during traffic bursts. Always check peak (not just mean) utilization, change one tier at a time, and watch latency and error rates after each change.
- Hallucinated server names. If a prompt references a server you never configured (a bare aws-cost-mcp binary, a @kubernetes/mcp-server scope), the tool call silently does nothing and the AI may invent plausible numbers. Use the exact server keys from your config, and for Kubernetes the real package is the unscoped kubernetes-mcp-server (npx -y kubernetes-mcp-server@latest).
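
To make the API-charge failure mode concrete, here is a small sketch of a request budget you could wrap around your own scripted Cost Explorer calls. It is an illustration only: `RequestBudget` and `BudgetExceeded` are hypothetical names, the MCP servers do not expose such a guard, and the per-request price is the $0.01 figure quoted above.

```python
from dataclasses import dataclass

# Per-request price quoted in this guide; confirm against current AWS pricing.
PRICE_PER_REQUEST_USD = 0.01


class BudgetExceeded(RuntimeError):
    """Raised when a call would push usage past the configured cap."""


@dataclass
class RequestBudget:
    """Counts billable requests and refuses to go past max_requests."""
    max_requests: int
    used: int = 0

    def charge(self, requests: int = 1) -> None:
        # Call this before each billable API request.
        if requests < 1:
            raise ValueError("requests must be at least 1")
        if self.used + requests > self.max_requests:
            raise BudgetExceeded(
                f"{self.used + requests} requests would exceed the cap of {self.max_requests}"
            )
        self.used += requests

    @property
    def spent_usd(self) -> float:
        # Estimated charge so far, rounded to whole cents.
        return round(self.used * PRICE_PER_REQUEST_USD, 2)


if __name__ == "__main__":
    budget = RequestBudget(max_requests=200)  # hypothetical cap for one analysis run
    budget.charge(12)  # e.g. twelve monthly-granularity queries
    print(f"used {budget.used} requests, about ${budget.spent_usd:.2f}")
```

A hard cap like this fails loudly instead of letting a fan-out loop quietly accumulate charges, which is the behavior you want when an agent is driving the queries.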
What’s Next
- Infrastructure as Code with AI: turn a right-sizing plan into reviewed Terraform changes
- Monitoring & Observability: correlate cost spikes with the deploys and traffic that caused them
- CI/CD Pipelines: add cost-estimate gates to pull requests so spend is reviewed before it ships
Key Takeaways
- Wire up real MCP servers. Vantage via npx and the AWS Labs Cost Explorer server via uvx; the config is identical across Cursor, Claude Code, and Codex.
- Ask focused questions, not “optimize everything.” The top-5, phased right-sizing, and anomaly prompts above return decisions you can act on.
- Always verify before you change production. The AI finds candidates fast; you own the call. Check peak utilization, confirm dollar figures in the console, and watch the system after each change.
- Mind the meter. Cost Explorer requests cost $0.01 each and the API rate-limits — scope queries and keep the server out of autonomous loops.
- Tags gate everything. Allocation and forecasting are only as trustworthy as your tag coverage, so fix that first.