Security Standards and Compliance

Your SOC 2 Type II audit is in three weeks. The auditor wants evidence that AI-generated code is reviewed before it ships, that no customer PHI leaks to a model provider, and that the MCP servers your team installed last sprint aren’t quietly exfiltrating files. Meanwhile your developers are merging Cursor- and Claude-Code-authored PRs at three times last year’s pace. The controls have to keep up without becoming the bottleneck everyone routes around.

This article shows the concrete pieces that make AI-assisted development auditable: turning on the audit/telemetry surfaces these tools actually ship, scanning MCP servers for the supply-chain risks that plague them, and enforcing review and policy gates in CI so compliance is a passing check, not a quarterly fire drill.

What You’ll Walk Away With

The real audit and telemetry configuration for Cursor, Claude Code, and Codex — no invented settings keys
A working MCP supply-chain scan using Snyk Agent Scan (formerly Invariant MCP-Scan) wired into all three tools
A copy-paste prompt that produces a runnable Semgrep ruleset for OWASP findings on a concrete stack
An Open Policy Agent (Rego) gate that blocks unreviewed AI-generated database migrations in CI
A failure-modes section covering the gotchas auditors and red teams actually find

Where AI Tooling Touches Your Compliance Boundary

Three surfaces matter for an audit:

Data egress. Your prompts contain source code and sometimes secrets. For SOC 2 confidentiality and HIPAA, you need a documented zero-retention posture with each vendor. Cursor’s Privacy Mode and Enterprise plan guarantee zero data retention (see trust.cursor.com); Anthropic and OpenAI offer zero-retention arrangements for their API and enterprise tiers, typically by agreement rather than by default, so confirm the terms in your own contract. Document which mode each tool runs in: that document is the control evidence.
Audit trail. You need to show who ran what. Each tool exposes a different mechanism (below). None of them has a magic audit_logging: true flag — the real surfaces are OpenTelemetry, shell-command wrapping, and platform admin logs.
Supply chain. MCP servers run with your tools’ privileges. In Equixly’s 2025 assessment of popular open-source MCP server implementations, 43% contained command-injection flaws, 30% allowed unrestricted URL fetches, and 22% leaked files outside their intended directories. Treat every MCP server like an unvetted dependency.

Turning On Audit Evidence

This is the part most teams get wrong by pasting in config keys that don’t exist. Here is what each tool actually supports.

Claude Code has no audit_logging setting. Observability is OpenTelemetry, and command-level auditing is a shell prefix. Enable both in ~/.claude/settings.json (managed settings on a controlled host for enterprise):

{
  "env": {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://otel-collector.internal:4317",
    "CLAUDE_CODE_SHELL_PREFIX": "/usr/local/bin/audit-logger.sh"
  }
}

CLAUDE_CODE_ENABLE_TELEMETRY=1 streams metrics and logs to your collector (then to your SIEM); CLAUDE_CODE_SHELL_PREFIX wraps every Bash command so audit-logger.sh <command> records it. That command log is your SOC 2 access/activity evidence.
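
The audit-logger.sh wrapper named above is not shown in this article. As a rough sketch of what such a wrapper does, here is a Python equivalent. Everything in it is illustrative: the log path, the AUDIT_LOG_PATH variable, the record fields, and the assumption that the command reaches the prefix script as its arguments (either one string or several words).

```python
#!/usr/bin/env python3
"""Illustrative audit wrapper: append one JSON line per command, then run it."""
import getpass
import json
import os
import shlex
import sys
from datetime import datetime, timezone

# Hypothetical log location; use a path your log shipper already tails.
LOG_PATH = os.environ.get("AUDIT_LOG_PATH", "/var/log/ai-audit/commands.jsonl")


def to_command(argv):
    """Rebuild one shell command string from the wrapper's arguments."""
    # Assumption: the command arrives either as a single string or as
    # separate words; shlex.join re-quotes the words in the second case.
    return argv[0] if len(argv) == 1 else shlex.join(argv)


def build_record(command, user, cwd, now):
    """Serialize who ran what, where, and when as a single JSON line."""
    return json.dumps(
        {"ts": now.isoformat(), "user": user, "cwd": cwd, "command": command},
        sort_keys=True,
    )


def main(argv):
    command = to_command(argv)
    record = build_record(command, getpass.getuser(), os.getcwd(),
                          datetime.now(timezone.utc))
    with open(LOG_PATH, "a", encoding="utf-8") as log:
        log.write(record + "\n")
    # Replace this process with the shell so the command's exit code
    # and output reach the caller unchanged.
    os.execvp("/bin/sh", ["/bin/sh", "-c", command])


# Run only when invoked with a command to wrap.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

Before relying on the log as evidence, confirm on a test host how your Claude Code version actually passes the command to the prefix, and make sure the log file is writable by developers but not editable after the fact (for example, by shipping it off-host immediately).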

Cursor exposes no per-user cursor.audit.* or cursor.compliance.* keys — compliance is platform-level. Your evidence comes from the Enterprise plan and Trust Center, not a JSON file:

Privacy Mode / zero data retention — enforce org-wide so no code is stored or trained on. This is your data-handling control.
SSO + SCIM provisioning and admin analytics/audit in the team dashboard — your access-control and activity evidence.
SOC 2 Type II + GDPR attestations and subprocessor list at trust.cursor.com — attach these to your vendor-management file.

Document “Cursor Enterprise, Privacy Mode enforced, SSO via Okta” as the control; the Trust Center report is the third-party evidence.

Codex is configured in ~/.codex/config.toml (managed centrally for teams). There is no audit-logging key: you constrain behavior through the sandbox and approval policy, and record it with your own shell wrapper:

# Block network egress and writes outside the workspace by default
sandbox_mode = "workspace-write"
approval_policy = "on-request"

[sandbox_workspace_write]
network_access = false

Pair this with Codex Cloud’s per-task run history (every Cloud task is logged with its diff and command output) for the activity trail. For a hard egress boundary, run the CLI inside a container whose outbound traffic is logged at the network layer.

Scanning MCP Servers Before You Trust Them

Before any MCP server reaches a developer’s machine, scan it. The tool to use is Snyk Agent Scan (the former Invariant Labs MCP-Scan, now maintained by Snyk). It runs through uvx — it’s a Python tool, so do not npm install it.

Scan a candidate server’s config (or your whole mcp.json) for tool-poisoning, prompt-injection, and toxic-flow patterns:
```
uvx snyk-agent-scan@latest ~/.cursor/mcp.json
```
The legacy entrypoint uvx mcp-scan@latest still works — it’s now a redirect that installs snyk-agent-scan and forwards the CLI.
Read the findings. A flagged server might show an unrestricted fetch tool that accepts arbitrary URLs, or a tool description containing hidden instructions (a tool-poisoning attack). Don’t install it until that’s resolved.
Register the scanner itself as an MCP server so the agent can re-scan on demand. In Claude Code:

claude mcp add security-analyzer -- uvx snyk-agent-scan
claude mcp list

In Cursor, add it in Settings → MCP → Add Server, or edit ~/.cursor/mcp.json directly so the config is reviewable in version control:

{
  "mcpServers": {
    "security-analyzer": {
      "command": "uvx",
      "args": ["snyk-agent-scan"],
      "env": { "SNYK_TOKEN": "${SNYK_TOKEN}" }
    }
  }
}

Only SNYK_TOKEN (for authenticated scans) is consumed — don’t invent SECURITY_SCAN_MODE or SEMGREP_APP_TOKEN here. Run Semgrep as its own step (below) rather than as env on the scanner.

In Codex:

codex mcp add security-analyzer -- uvx snyk-agent-scan

Or declare it in ~/.codex/config.toml:

[mcp_servers.security-analyzer]
command = "uvx"
args = ["snyk-agent-scan"]
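
A scan only helps if unscanned servers can't appear later. One possible complement, sketched here under assumptions, is an allowlist check over the mcpServers object shown in the Cursor config above. The APPROVED mapping and the function name are hypothetical; the check compares each server's launch command with the one that was actually scanned and approved.

```python
import json

# Hypothetical allowlist: server name -> the exact launch command that was scanned.
APPROVED = {
    "security-analyzer": ["uvx", "snyk-agent-scan"],
}


def unapproved_servers(mcp_json_text, approved=APPROVED):
    """List servers that are unknown or whose launch command has drifted."""
    config = json.loads(mcp_json_text)
    flagged = []
    for name, spec in config.get("mcpServers", {}).items():
        launch = [spec.get("command", ""), *spec.get("args", [])]
        if approved.get(name) != launch:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    sample = json.dumps({
        "mcpServers": {
            "security-analyzer": {"command": "uvx", "args": ["snyk-agent-scan"]},
            "web-tools": {"command": "npx", "args": ["web-tools-mcp"]},
        }
    })
    print(unapproved_servers(sample))
```

This sketch only understands command-based entries of the shape shown in this article; anything else, including a server defined without a command, is flagged rather than silently accepted.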

Copy-paste prompt to triage a flagged MCP server before approving it for the org:

I ran `uvx snyk-agent-scan` against our mcp.json and it flagged the
`web-tools` server for an unrestricted URL-fetch tool and a tool
description containing instructions to read ~/.aws/credentials.

1. Explain the concrete attack: how a malicious page or tool description
   could turn this into data exfiltration through our agent.
2. Tell me whether this server can be safely sandboxed (allowlisted
   domains, no filesystem scope) or must be rejected outright.
3. If salvageable, give me the exact mcp.json entry that pins it to an
   allowlist and drops filesystem access. If not, say so plainly.
Do not soften the risk to make the server usable.

Generating Real Scanning Rules, Not Documents

The point of AI here is shippable artifacts — a ruleset, a fix, a policy file — not a Word doc describing one. Anchor prompts to a concrete stack so the output is runnable.

Copy-paste prompt for an OWASP-focused Semgrep ruleset on an Express + Postgres API:

This is a Node.js 22 / Express 5 API using the `pg` library with raw SQL
in src/db/*.ts and JWT auth middleware in src/middleware/auth.ts.

Write a runnable Semgrep rule file (semgrep --config ./owasp.yaml .) that
flags, for THIS codebase specifically:
- string-concatenated SQL passed to pg.query (SQL injection)
- routes that read req.body fields straight into a DB write without a
  zod/validator schema (mass assignment)
- res.json that includes a raw Error object or stack (info disclosure)
- jwt.verify calls that don't pass an explicit `algorithms` allowlist
  (RS256 or HS256 as this service requires, never both)

For each rule give id, severity, message, and a fix suggestion. Then show
one example finding from a plausible src/db/users.ts and the corrected
parameterized query.

Copy-paste prompt for a security-focused review of a single PR (paste in the diff or run inside the tool on a branch):

Do a security review of this PR for our Express + Postgres API. For each
changed file, check: input validation on every new req param, parameterized
queries (flag any string-built SQL), authorization on new routes (not just
authentication), secrets/PII in logs or error responses, and any new
dependency's known CVEs.

Output a table: file, line, OWASP category, severity (CVSS), and a concrete
code fix. End with a one-line verdict: SHIP, SHIP WITH FIXES, or BLOCK, and
which of our SOC 2 change-management controls this PR touches. Be specific;
no generic 'validate inputs' advice.

Run the same prompt across tools — the workflow is identical; only the entry point differs. In Cursor, open the diff and invoke Agent mode on it. In Claude Code, run claude -p "<prompt>" on the branch so it reads the working tree, or wire it into a hook. With Codex, hand it the PR via the GitHub integration or codex exec in CI. Pin the model: use Claude Fable 5 for the highest-stakes reviews, Opus 5 for lower-cost premium reasoning, or the appropriate GPT-5.6 Sol/Terra/Luna tier in ChatGPT Codex. See model comparison for the full tier breakdown.

Enforcing Policy in CI

Audit evidence is strongest when it’s automatic. The highest-leverage gate for AI-assisted teams is blocking unreviewed schema changes — the migrations an agent generates are exactly where a confused-deputy mistake becomes a data-exposure incident.

Copy-paste prompt for an OPA/Rego policy that blocks unreviewed AI-generated migrations:

Write an Open Policy Agent (Rego) policy, package ci.migrations, evaluated
by `conftest test --namespace ci.migrations pr-input.json` in GitHub
Actions. pr-input.json is built in an earlier step from the GitHub event
and API, and holds the PR's labels, reviews, body, changed file paths, and
the text of each changed migration. DENY the PR when a file under
migrations/ is added or changed AND either:
- the PR lacks the label "db-reviewed", or
- it has fewer than 1 human approval from the @data-platform team.

Also deny any migration containing DROP TABLE or DROP COLUMN unless the PR
body has a line "MIGRATION-RISK: accepted by <name>". Give me the .rego
file, a passing and a failing input fixture, the step that builds
pr-input.json, and the exact GitHub Actions step that runs conftest and
fails the build on a deny.
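
To sanity-check whatever Rego the model returns, it can help to have the intended deny conditions written down independently. The sketch below mirrors the conditions in the prompt above in plain Python so you can compute the expected result for each fixture. It is not a substitute for the Rego policy, and the input shape (labels, data_platform_approvals, body, changed_files mapping path to file text) is a hypothetical one chosen for this example.

```python
import re

DESTRUCTIVE = re.compile(r"\bDROP\s+(TABLE|COLUMN)\b", re.IGNORECASE)
RISK_LINE = re.compile(r"^MIGRATION-RISK: accepted by \S.*$", re.MULTILINE)


def deny_reasons(pr):
    """Return the reasons a PR should be denied; an empty list means allow."""
    migrations = {
        path: sql
        for path, sql in pr["changed_files"].items()
        if path.startswith("migrations/")
    }
    if not migrations:
        return []  # the gate only applies to PRs that touch migrations/

    reasons = []
    if "db-reviewed" not in pr["labels"]:
        reasons.append("missing db-reviewed label")
    if not pr["data_platform_approvals"]:
        reasons.append("no approval from data-platform")
    # Destructive statements need an explicit, named risk acceptance.
    if not RISK_LINE.search(pr["body"]):
        for path, sql in sorted(migrations.items()):
            if DESTRUCTIVE.search(sql):
                reasons.append(
                    f"{path}: destructive statement without MIGRATION-RISK line"
                )
    return reasons
```

Feeding the same fixtures to this function and to conftest, then comparing allow/deny outcomes, catches the most common failure: a policy that reads "label missing OR approval missing" as applying to every PR instead of only those that touch migrations/.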

Wire the result into .github/workflows/ so the migration gate and the Semgrep scan run on every PR. The job’s pass/fail history becomes your continuous-compliance evidence — point the auditor at the Actions log instead of assembling a spreadsheet by hand. For dependency CVEs and license checks, add a Snyk or npm audit --audit-level=high step in the same workflow rather than relying on a once-a-quarter manual scan.

When This Breaks

“Telemetry is on but the SIEM is empty.” The OTEL exporter variables are ignored unless CLAUDE_CODE_ENABLE_TELEMETRY=1 is also set, and the collector endpoint must be reachable from the developer’s host. Test with a local collector first; a firewall that blocks egress drops the data silently.
uvx snyk-agent-scan reports nothing on a server you suspect. Static scans miss runtime behavior. Pair the scan with the sandbox controls above (no network, no filesystem scope) and watch the server’s actual tool calls — a clean scan is necessary, not sufficient.
The Rego gate blocks legitimate hotfixes. Build the escape hatch into the policy itself (the MIGRATION-RISK: accepted by line), not an admin override. An override that bypasses the control is an audit finding; a documented, labeled exception is the control working.
Cursor “audit settings” don’t appear. They don’t exist at the user level. If you need per-developer audit, you need the Enterprise plan’s admin surface plus host-level command logging — there is no client-side JSON for it.
An agent leaks a secret into a prompt. This is a data-egress incident, not a code bug. Prevent it at the source with deny rules (e.g. Claude Code’s deny permission for Read(./.env*)) and secret scanning; .gitignore keeps a file out of the repository, not necessarily out of an agent’s reach. Treat any breach as reportable under your incident-response plan.
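
For the first failure mode above (telemetry on, SIEM empty), a quick first check is whether the collector port accepts TCP connections from the developer's host at all. The following is a minimal sketch with hypothetical helper names; it falls back to 4317, the conventional OTLP/gRPC port, when the endpoint URL has no port. A successful connect only proves the network path is open, not that TLS or the OTLP handshake will succeed.

```python
import socket
from urllib.parse import urlparse

DEFAULT_OTLP_GRPC_PORT = 4317  # conventional OTLP/gRPC port


def endpoint_host_port(endpoint):
    """Split an OTLP endpoint URL into (host, port)."""
    parsed = urlparse(endpoint)
    return parsed.hostname, parsed.port or DEFAULT_OTLP_GRPC_PORT


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False
```

Call is_reachable(*endpoint_host_port(...)) with the value of OTEL_EXPORTER_OTLP_ENDPOINT from the affected machine. If it returns False, the problem is DNS, routing, or a firewall rather than the Claude Code settings.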

What’s Next

Security Scanning and Vulnerability Testing — the testing-side companion: building and running the scans this article gates on
MCP Optimization and Troubleshooting — hardening, scoping, and debugging the MCP servers you’ve vetted
AI-Powered Code Quality Gates — extending the CI gate pattern beyond security to quality and review enforcement