Compliance Automation with AI

The auditor wants evidence that every production deploy runs through code review, that no GPL-licensed dependency shipped, and that you can produce a data-flow diagram showing where EU personal data lives. You have a SOC 2 Type II window closing in three weeks and a codebase nobody documented for compliance. Screenshotting GitHub settings by hand is not going to scale.

This is exactly the kind of repetitive, evidence-heavy work an AI coding agent is good at: it can read the repo, draft the scripts that collect evidence, wire the CI gates that enforce a control, and turn code into the narrative an auditor expects. You stay the reviewer; the agent does the typing.

What You’ll Walk Away With

A prompt that audits a repo against a specific SOC 2 control (CC6.1, CC8.1) and returns a gap table with file:line evidence
A working GitHub Actions job that fails the build on disallowed dependency licenses
A pre-commit secret-scanning gate the agent wires for you
A GDPR data-flow document the agent drafts by tracing personal-data fields through the codebase
The same workflow shown in Cursor, Claude Code, and Codex, plus the MCP servers that make evidence collection one step instead of ten

The Workflow

Compliance automation with an agent is four moves: map the control to your code, generate the evidence collector, enforce the control in CI, then write the human-readable narrative. Treat the agent’s output as a first draft you review, never as the auditor’s word.

Map a control to your codebase

Start narrow. Pick one control and ask the agent to find where you do and don’t satisfy it. The trick is forcing file:line citations so you can verify every claim instead of trusting a confident summary.

Audit a repo against a SOC 2 control with evidence:

You are a SOC 2 auditor reviewing this repository against control CC6.1
(logical access controls). Inspect the code, CI config, and IaC. Produce a
markdown table with columns: Requirement | Status (Met / Partial / Gap) |
Evidence (file:line or config path) | Remediation. Only mark "Met" when you
can cite a concrete file and line. Cover authentication, authorization,
secret handling, and least-privilege IAM. Do not invent controls we don't have.

The “do not invent” clause matters. Compliance prompts are where models are most tempted to hallucinate a tidy, fully-compliant answer. Demanding citations turns “trust me” into something you can spot-check in 30 seconds.
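
The spot-check itself can be partly mechanical. The sketch below is one possible lint, not part of any tool: it scans the agent's markdown table and returns every requirement marked "Met" whose Evidence cell lacks a file:line citation. The function name `findUncitedMet`, the citation regex, and the assumption that the table keeps the four-column layout from the prompt are all illustrative.

```javascript
// Matches citations such as src/auth/login.ts:42 (assumed format).
const CITATION = /[\w./-]+\.\w+:\d+/;

// Returns the Requirement text of every row whose Status is "Met"
// but whose Evidence cell contains no file:line citation.
function findUncitedMet(markdownTable) {
  return markdownTable
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.startsWith('|'))
    // Drop the outer pipes, then split the row into trimmed cells.
    .map((line) =>
      line
        .slice(1, line.endsWith('|') ? -1 : undefined)
        .split('|')
        .map((cell) => cell.trim())
    )
    .filter((cells) => cells.length >= 4)
    // Header and separator rows never equal "met", so they fall out here.
    .filter(([, status]) => status.toLowerCase() === 'met')
    .filter(([, , evidence]) => !CITATION.test(evidence))
    .map(([requirement]) => requirement);
}
```

An empty result does not prove the citations are correct, only that they exist; you still open a few of them by hand.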

The mechanics of pointing the agent at your repo differ per tool:

In Cursor, open the repo and switch the Agent to a planning-grade model (Fable 5 or Opus 5 for thorough audits, Sonnet 5 for everyday passes). Paste the prompt in Agent mode and add @Codebase so it searches the whole project rather than just open files. Cursor renders the gap table inline; click each file:line citation to jump straight to the evidence and confirm it.

From the repo root, run claude and paste the prompt. Claude Code reads files with Read, Glob, and Grep as it builds the table, so it cites real paths. To capture the result as an artifact for your audit folder, run it headless:

claude -p "Audit this repo against SOC 2 CC6.1 and output a markdown gap table with file:line evidence. Only mark Met with a citation." \
  --output-format json > soc2-cc6.1-gap.json

Run codex in the repo and paste the prompt. Keep approvals strict for a read-only audit so it never edits files:

codex --ask-for-approval untrusted --sandbox read-only

The read-only sandbox lets the agent inspect everything but keeps it from modifying the working tree while it builds the evidence table.

Generate the evidence collector

Auditors want repeatable evidence, not a one-off chat. Have the agent write a script that pulls the proof on demand — branch-protection settings, the list of who can merge, deploy approvals — so you can re-run it the morning of the audit.

Generate a SOC 2 evidence-collection script:

Write a Node.js script (evidence-collect.mjs) that gathers SOC 2 change-management
evidence for this GitHub repo using the gh CLI and Octokit. For the default branch,
collect: required-reviews count, required status checks, whether force-push and
deletions are blocked, and the last 20 merged PRs with reviewer logins and merge
timestamps. Write the result to evidence/change-management-$(date +%F).json and
exit non-zero if branch protection is disabled. Include a top comment mapping each
field to SOC 2 control CC8.1.

Review what it produces. Confirm it calls real gh api / Octokit endpoints (not invented ones), reads the token from the environment rather than hardcoding it, and that the control mapping in the header comment is honest. Then run it once and eyeball the JSON against what you see in the GitHub UI.
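
When you review the generated script, it helps to know roughly what the core of it should look like. The sketch below is a hypothetical helper, not the script the agent will write: it flattens a branch-protection payload into an evidence record. The field names follow GitHub's "Get branch protection" REST response as I understand it; the function name and the pass/fail thresholds (at least one review, at least one status check) are assumptions you should set to match your own policy.

```javascript
// Flattens a branch-protection payload into a CC8.1 evidence record.
// Pass null when the API reports that the branch has no protection.
function summarizeBranchProtection(protection) {
  if (!protection) {
    return { protected: false, pass: false, reasons: ['branch protection is disabled'] };
  }

  const requiredReviews =
    protection.required_pull_request_reviews?.required_approving_review_count ?? 0;
  const statusChecks = protection.required_status_checks?.contexts ?? [];
  // Treat a missing setting as "not blocked" so gaps are never hidden.
  const forcePushBlocked = protection.allow_force_pushes?.enabled === false;
  const deletionBlocked = protection.allow_deletions?.enabled === false;

  const reasons = [];
  if (requiredReviews < 1) reasons.push('no required approving reviews');
  if (statusChecks.length === 0) reasons.push('no required status checks');
  if (!forcePushBlocked) reasons.push('force-push is allowed');
  if (!deletionBlocked) reasons.push('branch deletion is allowed');

  return {
    protected: true,
    requiredReviews,
    statusChecks,
    forcePushBlocked,
    deletionBlocked,
    pass: reasons.length === 0,
    reasons,
  };
}
```

A collector built this way would set a non-zero exit status whenever `pass` is false and write the whole record to the dated JSON file, so the reasons travel with the evidence.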

Enforce the control in CI

Evidence proves a control existed in the past. A CI gate keeps it true going forward. Two of the highest-leverage gates: a dependency-license check and a secret scan. Have the agent write both.

Add a license-policy gate to CI:

Add a GitHub Actions job called "license-check" to .github/workflows/ci.yml that
fails the build if any production dependency uses a copyleft license (GPL-2.0,
GPL-3.0, AGPL-3.0, or LGPL with static linking). Use the license-checker-rseidelsohn
npm package, allow our existing MIT/Apache-2.0/BSD/ISC stack, and print the
offending package and license on failure. Run it on pull_request only.
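
For reference when reviewing the job the agent writes, the policy decision at its heart is small. The sketch below is an illustration under assumptions: the input mirrors the JSON shape license-checker emits (`{ "name@version": { licenses: "MIT" } }`), the deny-list is taken from the prompt, and `findDisallowedLicenses` is a hypothetical name. It is deliberately conservative, flagging a package if a denied identifier appears anywhere in the declared license, even inside an `OR` expression. It does not attempt the LGPL static-linking case, which package metadata cannot answer.

```javascript
// Deny-list from the prompt; adjust to your own policy.
const DENIED = ['GPL-2.0', 'GPL-3.0', 'AGPL-3.0'];

// Returns [{ pkg, license }] for every package whose declared license
// mentions a denied SPDX identifier (exact, "+", or "-only"/"-or-later").
function findDisallowedLicenses(report, denied = DENIED) {
  const violations = [];
  for (const [pkg, info] of Object.entries(report)) {
    // licenses may be a string or an array; normalise to an array.
    const declared = [].concat(info.licenses ?? 'UNKNOWN');
    const ids = declared.flatMap((entry) =>
      String(entry)
        .split(/[\s()]+/) // break SPDX expressions into tokens
        .filter(Boolean)
        .map((token) => token.replace(/\*$/, '')) // drop a trailing "*" marker
    );
    const hit = ids.find((id) =>
      denied.some((d) => id === d || id === `${d}+` || id.startsWith(`${d}-`))
    );
    if (hit) violations.push({ pkg, license: hit });
  }
  return violations;
}
```

Because the match is conservative, expect some false positives on dual-licensed packages; those are the ones worth a manual look rather than an automatic allow.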

For secret scanning, prefer wiring an established tool over asking the agent to invent regexes — it should configure gitleaks, not write its own scanner:

In Cursor, ask the Agent: “Add a pre-commit hook using gitleaks that blocks commits containing secrets, plus a CI job that runs gitleaks detect on every PR.” Cursor edits .pre-commit-config.yaml and the workflow file. Review the diff in the Source Control panel, then stage a fake AWS_SECRET_ACCESS_KEY=... line locally to confirm the hook actually blocks it before you trust it.

Claude Code shines at multi-file plumbing like this. After it writes the hook and workflow, lock the behavior in with a hook of your own in .claude/settings.json so secret scanning runs before every commit it makes:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash(git commit:*)",
        "hooks": [{ "type": "command", "command": "gitleaks protect --staged --redact" }]
      }
    ]
  }
}

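Now the agent itself is gated: a staged secret should abort its commit. Test it the same way, with a fake key staged, before relying on it; whether the commit is actually blocked depends on the matcher syntax and exit code your Claude Code version expects from a PreToolUse hook.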

Codex can do this end-to-end on a worktree. Let it edit files but keep command approval on so you see the gitleaks install:

codex --ask-for-approval on-request --sandbox workspace-write

Ask it to add the gitleaks GitHub Action and a .pre-commit-config.yaml entry, then have it run gitleaks detect --no-git once to prove the config parses.
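
Whichever tool wires the gate, the scan output can double as audit evidence if you strip the sensitive parts first. The sketch below assumes the findings array from a gitleaks JSON report, with `RuleID`, `File`, `StartLine`, and `Commit` fields as I understand that format; `toSecretScanEvidence` and the shape of the returned record are hypothetical. It deliberately never copies the `Secret` or `Match` fields.

```javascript
// Converts gitleaks findings into an evidence record that is safe to
// store: locations and rule counts only, never the matched value.
function toSecretScanEvidence(findings, scannedAt) {
  const byRule = {};
  const locations = findings.map((finding) => {
    byRule[finding.RuleID] = (byRule[finding.RuleID] ?? 0) + 1;
    return {
      rule: finding.RuleID,
      file: finding.File,
      line: finding.StartLine,
      commit: finding.Commit || null, // empty when scanning without git history
    };
  });
  return {
    scannedAt,
    clean: findings.length === 0,
    total: findings.length,
    byRule,
    locations,
  };
}
```

A clean run is evidence too: a dated record with `clean: true` shows the control operated on that day.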

Write the human-readable narrative

The last mile of most audits is prose: a data-flow description an auditor or DPO can read. The agent can trace personal-data fields through the codebase far faster than you can grep for them, as long as you make it cite sources and flag uncertainty.

Draft a GDPR data-flow doc from the codebase:

Act as a privacy engineer. Trace every field that could be personal data under
GDPR (names, emails, IPs, device IDs, location) through this codebase. For each:
where it enters (endpoint/file:line), where it's stored (table/collection), where
it leaves the system (third-party SDK, webhook, analytics), and its retention.
Output a markdown data-flow doc with a Mermaid diagram and a table of
third-party processors. Mark any flow you're unsure about as "NEEDS REVIEW"
rather than guessing.

The Mermaid diagram renders directly in most docs tools and gives your DPO a picture instead of a wall of text. The “NEEDS REVIEW” tag is the safety valve — it surfaces the fields the agent couldn’t fully trace so a human closes the gap.
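
If you would rather have the agent emit structured flow records and build the diagram yourself, the conversion is short. This is a minimal sketch under assumptions: the record shape (`field`, `source`, `store`, `sinks`, `needsReview`) is hypothetical, as is the `toMermaid` name, and the output uses basic Mermaid flowchart syntax with quoted labels.

```javascript
// Builds a Mermaid flowchart from traced personal-data flows.
// Uncertain flows keep their NEEDS REVIEW tag on the edge label.
function toMermaid(flows) {
  // Mermaid node ids cannot contain spaces or slashes, so sanitise them.
  const nodeId = (name) => name.replace(/[^A-Za-z0-9]/g, '_');
  const lines = ['flowchart LR'];
  for (const flow of flows) {
    const label = flow.needsReview ? `${flow.field} (NEEDS REVIEW)` : flow.field;
    lines.push(
      `  ${nodeId(flow.source)}["${flow.source}"] -->|"${label}"| ${nodeId(flow.store)}["${flow.store}"]`
    );
    for (const sink of flow.sinks ?? []) {
      lines.push(`  ${nodeId(flow.store)} -->|"${label}"| ${nodeId(sink)}["${sink}"]`);
    }
  }
  return lines.join('\n');
}
```

Keeping the records as data also means the table of third-party processors and the diagram come from the same source, so they cannot drift apart.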

MCP Servers and Skills That Help

Evidence collection gets dramatically shorter when the agent can query your systems directly instead of shelling out to CLIs. The relevant connections here are all real, first-party MCP servers:

GitHub MCP — the agent reads branch protection, PR reviews, and Actions runs natively. Use the hosted server at https://api.githubcopilot.com/mcp/ (the legacy local server @modelcontextprotocol/server-github is deprecated but still installable). Add it to Claude Code with:
```
claude mcp add --transport http github https://api.githubcopilot.com/mcp/
```
Sentry MCP (https://mcp.sentry.dev/mcp) — pull incident and error history as evidence for availability and incident-response controls.
Filesystem MCP — for scoping the agent to a specific evidence directory when generating reports.

If you genuinely need to script an MCP client (rather than letting the agent drive one), the SDK construction is specific — the package exports Client, not a root MCPClient, and it connects over a transport, never a bare server-name string:

// Connect a programmatic MCP client to the GitHub server
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';

const client = new Client({ name: 'compliance-evidence', version: '1.0.0' });
await client.connect(
  new StreamableHTTPClientTransport(new URL('https://api.githubcopilot.com/mcp/'))
);

On the Skills side, a single-purpose code review skill from the open skills marketplace (browse skills.sh and install with npx skills add <owner/repo>) is a lighter alternative to a full MCP server when all you want is a consistent compliance-flavored review on each PR. Reach for a skill when you need repeatable behavior; reach for an MCP server when the agent needs a live connection to a system of record.

When This Breaks

The agent marks a control “Met” with no real evidence. This is the failure mode that gets you a finding. Always require file:line citations, then spot-check three of them. If a citation points at a file that doesn’t contain the claimed control, discard the whole table and re-run with a stricter prompt.
It invents an MCP server or npm package. Do not paste regulatory-mcp-server, @compliance/*, or similar — none exist. Verify any suggested package with npm view <pkg> version before wiring it in, and stick to the GitHub/Sentry/Filesystem MCPs above.
Generated regexes miss real secrets. Don’t let the agent hand-roll a secret scanner. Use gitleaks or GitHub’s built-in secret scanning; their rule sets are maintained and tested. Use the agent to wire the tool, not replace it.
License data is wrong for transitive deps. license-checker reads declared licenses, which are sometimes mislabeled upstream. For anything you’re about to ship under audit, confirm flagged copyleft packages manually before failing or unblocking a build.
The data-flow doc leaks a real secret or PII sample. Tell the agent to redact values and reference field names only. Review the diff before committing anything to evidence/.

What’s Next

Security Operations with AI — the threat-detection and secret-rotation side of the same DevSecOps story
Cost Optimization — keep the agent and CI spend in check while you automate