Continuous Delivery Best Practices
Your PR has sat for two days waiting on a reviewer. The deploy needs three manual approvals across two Slack channels, the release notes are still a TODO, and the one time CI went red overnight nobody triaged it until standup. Continuous Delivery promises to shrink that gap to minutes, but the glue work (reviews, YAML, gates, changelogs, failure triage) is exactly the tedium nobody wants to own.
That glue work is where an AI assistant earns its keep. Not “AI writes your app,” but AI as a tireless reviewer, YAML generator, and first responder wired directly into your pipeline.
What you’ll walk away with
- A real GitHub Actions step that runs Claude Code headless on every PR diff and captures a review ready to post as a comment
- Copy-paste prompts to generate pipeline YAML, gate a deploy, and draft release notes, one per tool
- The Cursor / Claude Code / Codex split for where each tool fits in CD
- A “when this breaks” checklist for the failure modes AI-generated pipelines hit in production
Where AI fits in the pipeline
The highest-leverage place to start is automated PR review. It is low-risk (comments only, no deploys) and pays off on day one. From there you move outward: generating the workflow files, gating the deploy, and triaging red builds.
The three tools occupy different surfaces of the pipeline. Pick based on where your team already lives.
Cursor’s BugBot reviews PRs automatically once enabled on the repo and posts inline comments on likely bugs. Re-trigger a review on demand by commenting bugbot run on the PR. When it flags something, Autofix (GA since February 2026) can spawn a background Cloud Agent that opens a follow-up PR with the proposed fix, so a reviewer approves a diff instead of writing one. As of May 2026 BugBot bills per review (roughly $1.20 for a default-effort pass, more for large diffs) on Teams and Individual plans instead of the old flat per-seat fee.
Use Cursor when your team reviews in the GitHub UI and wants fixes proposed as PRs they can eyeball.
Claude Code shines in headless CI. Run claude -p inside a GitHub Action to review a diff, gate a deploy, or draft a changelog: fully scripted, with no TUI. Pair it with a PreToolUse hook locally so a risky command (a raw kubectl apply, a force-push) pauses for confirmation before the agent runs it.
Use Claude Code when CD lives in your .github/workflows and you want the agent invoked from a script with explicit allowed tools.
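A minimal sketch of that local PreToolUse hook, assuming the hook contract as documented for Claude Code at the time of writing: the hook receives a JSON event on stdin with tool_name and tool_input, and can reply with a permissionDecision of "ask" to pause for confirmation. The risky-command patterns are illustrative assumptions, and the field names are worth checking against your installed version.

```python
#!/usr/bin/env python3
"""PreToolUse hook sketch: ask for confirmation before risky shell commands."""
from __future__ import annotations

import json
import re
import sys

# Hypothetical patterns; tune these to your own definition of "risky".
RISKY_PATTERNS = [
    re.compile(r"\bkubectl\s+apply\b"),
    # Any forced push, including --force-with-lease and the short -f flag.
    re.compile(r"\bgit\s+push\b.*(\s--force|\s-f\b)"),
]


def decide(event: dict) -> dict | None:
    """Return a hook response that asks for confirmation, or None to stay silent."""
    if event.get("tool_name") != "Bash":
        return None
    command = event.get("tool_input", {}).get("command", "")
    for pattern in RISKY_PATTERNS:
        if pattern.search(command):
            return {
                "hookSpecificOutput": {
                    "hookEventName": "PreToolUse",
                    # Assumed contract: "ask" hands the decision to the user.
                    "permissionDecision": "ask",
                    "permissionDecisionReason": f"Risky command, confirm first: {command}",
                }
            }
    return None


if __name__ == "__main__":
    # The agent pipes the event in on stdin; do nothing when run interactively.
    raw = "" if sys.stdin.isatty() else sys.stdin.read()
    if raw.strip():
        response = decide(json.loads(raw))
        if response is not None:
            print(json.dumps(response))
```

Register the script as a PreToolUse command hook with a Bash matcher in your Claude Code settings. Keeping the decision in a pure function (decide) makes the patterns easy to unit-test before you trust them.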
Codex spans App, CLI, IDE, and Cloud. Codex Cloud runs tasks in the background on a dedicated worktree; the GitHub integration lets it open and review PRs, and the Slack integration lets a teammate kick off or approve a deploy task from a channel. Automations run a recurring prompt (e.g. a nightly dependency audit) and drop findings into your inbox.
Use Codex when you want async, cloud-side tasks and chat-driven approvals rather than a local terminal loop.
Run a reviewer on every PR
Here is a real, minimal GitHub Actions step that runs Claude Code headless against the PR diff and captures the review as JSON, ready to post as PR comments. The flags are the load-bearing part: --allowedTools (not --allow-tools) restricts what the agent may touch, and --output-format json (not --json) makes the result parseable downstream.

    name: AI PR Review
    on: pull_request
    permissions:
      contents: read
      pull-requests: write
    jobs:
      review:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v6
            with:
              fetch-depth: 0  # full history so the base branch is available to diff against
          - name: Run Claude Code review
            env:
              ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
            run: |
              # Write the PR diff to a file the agent can read
              git diff origin/${{ github.base_ref }}...HEAD > /tmp/pr.diff
              # Headless run: restricted tools, structured output captured to review.json
              npx -y @anthropic-ai/claude-code -p \
                "Review the diff in /tmp/pr.diff for security issues, logic bugs, and missing error handling. Be specific and cite file:line. Skip style nits." \
                --allowedTools "Read,Grep,Bash(git diff:*)" \
                --output-format json > review.json

The point is the inversion: you don’t paste a diff into a chat window; the pipeline feeds the diff to the agent and captures structured output you can post as a comment or fail the job on.
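To make "fail the job on" concrete, here is a rough sketch of a follow-up script that reads review.json and turns it into a comment body plus an exit code. Two assumptions to verify before relying on it: that the JSON output carries a top-level result string and an is_error flag, and a hypothetical convention (not in the prompt above) where you extend the prompt to end with a line reading exactly VERDICT: BLOCK or VERDICT: PASS.

```python
from __future__ import annotations

import json
from pathlib import Path

# Hypothetical convention: the review prompt asks the model to finish with
# a line reading exactly "VERDICT: BLOCK" or "VERDICT: PASS".
BLOCK_MARKER = "VERDICT: BLOCK"


def review_verdict(payload: dict) -> tuple[bool, str]:
    """Return (should_fail, comment_body) for one parsed review.json payload."""
    # Assumed shape: a top-level "result" string and an "is_error" flag.
    text = str(payload.get("result") or "").strip()
    if payload.get("is_error") or not text:
        # Fail closed: an empty or errored review must not look like a clean pass.
        return True, "AI review did not complete; see the job log."
    blocked = any(line.strip() == BLOCK_MARKER for line in text.splitlines())
    return blocked, text


def main(path: str = "review.json") -> int:
    """Print the comment body and return the exit code for the workflow step."""
    should_fail, body = review_verdict(json.loads(Path(path).read_text()))
    print(body)
    return 1 if should_fail else 0
```

Call sys.exit(main()) from a later workflow step. Failing closed on an errored review is a design choice: if you want the reviewer to stay purely advisory, return 0 in that branch instead.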
Generate the pipeline, don’t hand-write it
Nobody should write CI YAML from a blank file. Describe the pipeline in plain English and let the AI emit it, then review the result against your runner and secret names.
In Cursor’s agent mode, open the repo and prompt the agent to create the workflow file. Because it can read your package.json and existing .github/workflows, it will match your real scripts and Node version instead of guessing.
From the terminal, let Claude Code read the project and write the file in one shot, then diff it before committing.
Run Codex with workspace write so it can create the file, but keep approvals on so a surprising command pauses. The correct flags are --sandbox workspace-write and --ask-for-approval on-request (not --approval-mode).
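Whichever tool emits the YAML, part of the review against your runner and secret names can be mechanized. The sketch below is one way to do it, under stated assumptions: it uses plain regular expressions rather than a YAML parser, handles only simple scalar runs-on values, and the function and key names are hypothetical.

```python
from __future__ import annotations

import re

# Any `secrets.NAME` reference in the workflow text.
SECRET_REF = re.compile(r"\bsecrets\.([A-Za-z_][A-Za-z0-9_]*)")
# Simple scalar `runs-on: label` lines only; list and expression forms are skipped or mangled.
RUNS_ON = re.compile(r"^[ \t]*runs-on:[ \t]*([^\s#]+)", re.MULTILINE)


def audit_workflow(
    text: str, known_secrets: set[str], known_runners: set[str]
) -> dict[str, set[str]]:
    """Report secret names and runner labels the generated workflow uses but you don't have."""
    return {
        "unknown_secrets": set(SECRET_REF.findall(text)) - set(known_secrets),
        "unknown_runners": set(RUNS_ON.findall(text)) - set(known_runners),
    }
```

Run it on the generated file before committing and treat any non-empty set as a reason to read that line yourself. Remember to include GITHUB_TOKEN in the known secrets, since GitHub provides it automatically.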
Gate the deploy with a human in the loop
Full auto-deploy is the last thing to adopt, not the first. Start with the AI preparing the deploy and running pre-flight checks, then handing off to a human for the final yes. The approval can live in Slack, in a GitHub environment protection rule, or in a chat with the agent itself.
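The shape of that hand-off can be sketched in a few lines. This is an illustration of the rule, not a deploy tool: CheckResult, gate_summary, and may_deploy are hypothetical names, and how the approver's identity arrives (Slack, a GitHub environment, a chat reply) is left to your setup.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one pre-flight check prepared by the agent."""
    name: str
    passed: bool
    detail: str = ""


def gate_summary(checks: list[CheckResult]) -> str:
    """Render the pre-flight results as the message a human approver reads."""
    return "\n".join(
        f"[{'ok' if c.passed else 'FAIL'}] {c.name}" + (f": {c.detail}" if c.detail else "")
        for c in checks
    )


def may_deploy(checks: list[CheckResult], approved_by: str | None) -> bool:
    """Allow a deploy only if at least one check ran, all passed, and a named human approved."""
    return bool(checks) and all(c.passed for c in checks) and bool(approved_by)
```

The design choice worth keeping is that the two conditions are independent: a human approval never overrides a failed check, and green checks never substitute for the approval.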
For the release-notes step, give the AI the commit range and a format, not a vague “summarize.”
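As one possible way to supply "the commit range and a format", the sketch below collects the commits with git log and builds a prompt around them. The three headings and the wording of the prompt are assumptions for illustration; substitute your own changelog format.

```python
from __future__ import annotations

import subprocess


def commits_in_range(base: str, head: str = "HEAD") -> list[str]:
    """Return 'shorthash subject' lines for non-merge commits in base..head."""
    out = subprocess.run(
        ["git", "log", "--no-merges", "--pretty=format:%h %s", f"{base}..{head}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line.strip()]


def release_notes_prompt(base: str, head: str, commits: list[str]) -> str:
    """Build a prompt that pins down both the commit range and the output format."""
    listing = "\n".join(f"- {c}" for c in commits)
    return (
        f"Draft release notes for commits {base}..{head}.\n"
        "Use exactly three headings: Features, Fixes, Breaking changes.\n"
        "One bullet per user-visible change; cite the short hash in parentheses.\n"
        "Omit a heading if it has no entries. Do not invent changes.\n\n"
        f"Commits:\n{listing}\n"
    )
```

For example, release_notes_prompt("v1.4.0", "HEAD", commits_in_range("v1.4.0")) yields a prompt you can pass to whichever tool drafts the notes; the tag name here is a placeholder.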
When this breaks
AI-assisted CD fails in specific, recognizable ways. Watch for these:
What’s next
- CI/CD Pipelines: deeper patterns for building the pipeline itself
- Incident Response: when a deploy goes wrong and you need the AI on triage
- Test-Driven Development: the tests that make the gate above meaningful