Security Scanning with Codex

A security researcher just filed a responsible disclosure report: your API returns internal error details in production responses, and the password reset flow is vulnerable to enumeration attacks. You have 48 hours before the report goes public. You need to audit the entire codebase for similar issues, fix the reported ones, and set up continuous scanning so this does not happen again. Codex can scan the codebase from multiple angles simultaneously, fix the critical issues, and automate ongoing security monitoring.

  • Prompts for comprehensive security audits across API routes, authentication, and data handling
  • A parallel scanning workflow using worktrees to audit different security domains simultaneously
  • AGENTS.md review guidelines that catch security issues on every PR
  • An automation recipe for daily security sweeps

For the reported issues, use the CLI for a fast, focused investigation:

Step 2: Comprehensive Scan in Parallel Worktrees

While you fix the reported issues, launch parallel security audits in worktrees to catch similar problems across the codebase.

Worktree 1: Input Validation Audit

Audit every API endpoint in src/routes/ for input validation vulnerabilities:
1. SQL injection: any raw SQL or string interpolation in database queries
2. XSS: any user input rendered without sanitization
3. Path traversal: any file operations using user-provided paths
4. Command injection: any shell commands using user input
5. Mass assignment: any endpoints that pass request body directly to database updates
For each finding, report:
- File path and line number
- Severity (Critical, High, Medium, Low)
- Proof of concept showing how it could be exploited
- Recommended fix
Fix all Critical and High severity findings. Add input validation tests for each fix.
Run tests after all fixes to verify nothing broke.
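
To make the mass assignment item concrete, here is a minimal sketch of the kind of fix this audit might produce: copying only allowlisted fields out of the request body before it reaches a database update. The field names (`displayName`, `bio`) and the function `pickUserUpdate` are hypothetical and stand in for whatever your endpoints actually accept.

```typescript
// Hypothetical allowlist; replace with the columns your endpoint may update.
const UPDATABLE_USER_FIELDS = ["displayName", "bio"] as const;

type UpdatableUserField = (typeof UPDATABLE_USER_FIELDS)[number];
type UserUpdate = Partial<Record<UpdatableUserField, string>>;

// Copy only allowlisted string fields from an untrusted request body.
function pickUserUpdate(body: unknown): UserUpdate {
  const update: UserUpdate = {};
  if (typeof body !== "object" || body === null) return update;
  const source = body as Record<string, unknown>;
  for (const field of UPDATABLE_USER_FIELDS) {
    // Own-property check so inherited keys are never read.
    if (Object.prototype.hasOwnProperty.call(source, field)) {
      const value = source[field];
      if (typeof value === "string") update[field] = value;
    }
  }
  return update;
}

// Extra keys such as `role` are dropped instead of reaching the update.
console.log(pickUserUpdate({ displayName: "Ada", role: "admin" }));
```

Passing the result of `pickUserUpdate` to the update call, rather than the raw body, means a new column added later is not writable by clients until someone adds it to the allowlist on purpose.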

Worktree 2: Authentication and Authorization Audit

Audit the authentication and authorization system:
1. Check every protected route has auth middleware applied
2. Verify JWT token validation (algorithm, expiration, issuer)
3. Check for broken access control (user A accessing user B's resources)
4. Verify session management (token rotation, revocation, expiry)
5. Check password handling (hashing algorithm, salt, minimum complexity)
6. Verify rate limiting on auth endpoints (login, register, reset-password)
7. Check for timing attacks on login (constant-time comparison for passwords)
For each finding, report severity and recommended fix.
Fix Critical and High issues. Add tests for each fix.
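
For the timing-attack item, the sketch below shows what a comparison without an early exit can look like in plain TypeScript. The function name `constantTimeEqual` is an assumption for illustration. Its running time depends on the input lengths but not on where the first mismatch occurs. JavaScript engines give no hard timing guarantees, so in Node.js the built-in `crypto.timingSafeEqual` is usually the better choice for secrets such as reset tokens, and passwords should go through the hashing library's own verify function.

```typescript
// Compare two strings without returning early at the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  const length = Math.max(a.length, b.length);
  // Start with the length difference so unequal lengths can never match.
  let diff = a.length ^ b.length;
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

console.log(constantTimeEqual("token-123", "token-123"));
console.log(constantTimeEqual("token-123", "token-124"));
```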

Worktree 3: Dependency and Configuration Audit

Audit dependencies and configuration for security issues:
1. Run npm audit and report all vulnerabilities with severity
2. Check for outdated dependencies with known CVEs
3. Verify no secrets in source code (API keys, passwords, tokens in .ts/.js files)
4. Check .env.example does not contain real values
5. Verify .gitignore covers sensitive files (.env, private keys, credentials)
6. Check HTTP security headers (CORS, CSP, HSTS, X-Frame-Options)
7. Verify TLS configuration if any
For dependencies with Critical or High vulnerabilities, update them if possible. If the update requires breaking changes, report what needs to change.
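
The security-header item lends itself to a small regression test. The sketch below assumes you can obtain a response's headers as a plain object; the helper name `missingSecurityHeaders` and the exact list of required headers are illustrative choices, not a complete policy. CORS is left out because its correct value depends on the endpoint.

```typescript
// Headers this sketch expects on every response (from the audit list above).
const REQUIRED_HEADERS = [
  "content-security-policy",
  "strict-transport-security",
  "x-frame-options",
];

// Return the required headers that are absent, comparing names case-insensitively.
function missingSecurityHeaders(headers: Record<string, string>): string[] {
  const present = Object.keys(headers).map((name) => name.toLowerCase());
  return REQUIRED_HEADERS.filter((name) => present.indexOf(name) === -1);
}

const missing = missingSecurityHeaders({
  "Content-Security-Policy": "default-src 'self'",
  "X-Frame-Options": "DENY",
});
console.log(missing);
```

A check like this only confirms that a header is present. Whether its value is strict enough still needs review.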

Step 3: Add Security Review Guidelines to AGENTS.md

After the audit, codify your security standards so Codex catches issues on every future PR:

## Review guidelines - Security
### P0 (Block merge)
- Raw SQL or string interpolation in database queries (use parameterized queries only)
- User input in shell commands, file paths, or eval() calls
- Missing authentication middleware on non-public endpoints
- Secrets, API keys, or credentials in source code
- Error responses that expose stack traces, file paths, or internal details
### P1 (Require fix before merge)
- Missing input validation on API endpoints
- Missing rate limiting on authentication endpoints
- Broken access control (user accessing another user's data)
- Insecure password handling (weak hashing, missing salt)
- Missing security headers in HTTP responses
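
The P0 rule on error responses maps directly to the disclosure described at the start of this guide. As a rough sketch of what a compliant handler might do, the function below logs full details on the server and returns only a generic message plus a correlation ID. The names `toPublicError`, `PublicError`, and `requestId` are assumptions for illustration, and the `log` callback stands in for your real logger.

```typescript
interface PublicError {
  status: number;
  body: { error: string; requestId: string };
}

// Build the client-facing response; full details go to the server-side log only.
function toPublicError(
  err: unknown,
  requestId: string,
  log: (entry: string) => void
): PublicError {
  const detail = err instanceof Error ? err.stack || err.message : String(err);
  log("[" + requestId + "] " + detail);
  return {
    status: 500,
    body: { error: "Internal server error", requestId: requestId },
  };
}

const serverLog: string[] = [];
const response = toPublicError(
  new Error("connect ECONNREFUSED db.internal:5432"),
  "req-1",
  (entry) => serverLog.push(entry)
);
console.log(response.body);
```

The `requestId` lets support staff find the matching log entry without the client ever seeing hostnames, file paths, or stack traces.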

Set up an automation that runs every day and reports findings to your inbox:

Step 5: Security Review on PRs with @codex

For every PR that touches security-sensitive code, get an automated security review on GitHub:

@codex review for security vulnerabilities

Codex reads the PR diff, applies your AGENTS.md security guidelines, and posts a code review focused specifically on security. This catches issues before they reach the main branch.

To run this review automatically on every PR that touches security-sensitive paths, use the Codex GitHub Action:

name: Security review
on:
  pull_request:
    paths:
      - 'src/routes/**'
      - 'src/middleware/auth*'
      - 'src/lib/db/**'
jobs:
  security-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v5
      - uses: openai/codex-action@v1
        with:
          openai-api-key: ${{ secrets.OPENAI_API_KEY }}
          prompt: |
            Review this PR specifically for security issues:
            - Input validation completeness
            - Authentication and authorization correctness
            - SQL injection or NoSQL injection risks
            - Sensitive data exposure in responses or logs
            - Dependency security (new packages with known vulnerabilities)
            Report only P0 and P1 security findings.
          sandbox: read-only

Codex flags false positives. Security scanning has an inherent false positive rate. If Codex flags parameterized queries as SQL injection risks, it may not understand the ORM’s escaping behavior. Include “Drizzle ORM queries are parameterized by default. Do not flag standard Drizzle .where() calls as SQL injection” in your AGENTS.md.

Automated dependency updates break the build. If Codex updates a dependency to fix a CVE and the new version has breaking changes, the build fails. Always include “run the full test suite after updating dependencies. If tests fail, report the failures instead of committing broken code” in your prompt.

The daily scan generates too much noise. If every scan reports the same 15 medium-severity npm audit findings that you have already triaged, the reports become useless. Add acknowledged issues to an ignore list and tell Codex: “Skip vulnerabilities listed in .security-ignore.json. Only report new findings.”
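
The same filtering can also be done deterministically after the scan. The sketch below assumes a hypothetical shape for `.security-ignore.json` (an `ignored` array of `{ id, reason }` entries) and for scan findings. Neither is a format defined by Codex or npm, so adapt both to whatever your scan actually emits.

```typescript
interface Finding {
  id: string;
  severity: string;
  title: string;
}

// Assumed structure of .security-ignore.json; not a standard format.
interface IgnoreFile {
  ignored: { id: string; reason: string }[];
}

// Keep only findings whose id has not already been triaged.
function newFindings(findings: Finding[], ignoreFile: IgnoreFile): Finding[] {
  const ignoredIds: { [id: string]: true } = {};
  for (const entry of ignoreFile.ignored) ignoredIds[entry.id] = true;
  return findings.filter((finding) => ignoredIds[finding.id] !== true);
}

const report = newFindings(
  [
    { id: "ADV-1", severity: "medium", title: "Already triaged advisory" },
    { id: "ADV-2", severity: "high", title: "New advisory" },
  ],
  { ignored: [{ id: "ADV-1", reason: "Not reachable from our code" }] }
);
console.log(report.map((finding) => finding.id));
```

Requiring a `reason` for each ignored entry keeps the list reviewable, so a suppressed finding can be revisited when the surrounding code changes.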

Cloud environment misses environment-specific security issues. The cloud universal container does not replicate your production infrastructure (load balancer, WAF, network policies). Cloud scans catch code-level issues but not deployment-level security. Use the cloud for code audits and supplement with infrastructure scanning tools.