Four-Stage AI Design Pipeline

The wave of model releases in April 2026 — Claude Design (2026-04-17), GPT Image 2 (2026-04-21), and GPT-5.5 Thinking / Pro (2026-04-23) — changes how a small team can ship UI. No single model is best at every step, but composed in the right order they collapse the gap between “sketch on a napkin” and “production code with a documented design system” from weeks to an afternoon.

This page describes that composition: a four-stage pipeline that hands a single product idea down a chain of specialists, each one producing a verifiable artefact for the next.

Why a multi-model pipeline beats one tool

Each model has a different shape of strength. GPT Image 2 renders pixel-faithful UI mockups including small text, dense layouts, and multilingual labels — the things that broke older diffusion models. GPT-5.5 Pro combines a 1M context window with strong HTML/CSS/JS reasoning, so it can hold the mockup, your reference materials, and a long single-file prototype in mind at once. Claude Design thinks in systems — typography scales, color tokens, component variants, state matrices — and packages the result into a handoff bundle. Claude Code, Codex, and Cursor are repo-aware, so the final translation into your stack respects the codebase that already exists.

Run them in series and each handoff is the best version of itself. Try to use any one of them for the whole job and you get the failure modes you’d expect: an image model that can’t reason about state machines, a text model with no visual taste, a design tool that doesn’t know your build system, or a coding agent that invents a layout from scratch.

The pipeline at a glance

1. Mockup

GPT Image 2 in ChatGPT. Visual ideation from references and a brief. Output: 01-mockup.png (4K).

2. HTML prototype

GPT-5.5 Pro / Thinking in ChatGPT. Single-file index.html with realistic copy and every interactive state. Output: 02-prototype.html.

3. Design system

Claude Design. Extracts tokens, components, and state variants; produces a multi-screen flow. Output: 03-handoff/ bundle.

4. Code + docs

Claude Code, Codex, or Cursor. Translates the handoff bundle into tokens, component skeletons, and DESIGN.md. Output: a PR.
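
The chain of handoffs above can be sketched as typed data. This is a minimal illustration, not part of any tool's API: the `Stage` interface, its field names, and the `brokenHandoffs` helper are all hypothetical, and the artefact names are the ones listed in the stages above.

```typescript
// Hypothetical model of the four stages; field names are illustrative only.
interface Stage {
  id: number;
  name: string;
  tool: string;
  inputs: string[]; // artefacts this stage reads
  output: string;   // artefact this stage produces
}

const pipeline: Stage[] = [
  { id: 1, name: "Mockup", tool: "GPT Image 2", inputs: [], output: "01-mockup.png" },
  { id: 2, name: "HTML prototype", tool: "GPT-5.5 Pro / Thinking", inputs: ["01-mockup.png"], output: "02-prototype.html" },
  { id: 3, name: "Design system", tool: "Claude Design", inputs: ["01-mockup.png", "02-prototype.html"], output: "03-handoff/" },
  { id: 4, name: "Code + docs", tool: "Claude Code, Codex, or Cursor", inputs: ["03-handoff/"], output: "a PR" },
];

// Returns the ids of stages that do not read the previous stage's output.
function brokenHandoffs(stages: Stage[]): number[] {
  const broken: number[] = [];
  stages.forEach((stage, i) => {
    if (i === 0) return; // the first stage starts from the brief and references
    const previous = stages[i - 1];
    if (previous !== undefined && stage.inputs.indexOf(previous.output) === -1) {
      broken.push(stage.id);
    }
  });
  return broken;
}
```

Calling `brokenHandoffs(pipeline)` on the list above returns an empty array, since every stage reads what the previous one produced.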

Stage 1 — GPT Image 2: Visual ideation

GPT Image 2 is good at the things older image models were bad at: legible UI text, structured layouts, and consistent typography across a single composition. That makes it a real first-draft tool for product UI, not just a moodboard generator.

Open a fresh ChatGPT thread. Pro is preferred because Image 2 renders at 4K there, so small UI text stays readable when you zoom in. On Plus with Thinking you’ll get 1K/2K, which is fine for early ideation.

Attach references (up to 16). Competitor screenshots, your existing brand guide, a screenshot of your current product, two or three moodboard images. Image 2 uses these as visual context, not just inspiration.

Prompt with explicit structure. Don’t ask for “a SaaS landing page” — describe the regions, the typography vibe, and the content. Example:

Generate a 4K product UI mockup for a pricing page with three plans
(Free, Pro, Team). Three-column layout, generous whitespace, sans-serif
headings (close to Inter), warm-neutral background. Each card shows:
plan name, price with /mo suffix, 4-line feature list with checkmarks,
one CTA button. Pro card slightly elevated with a subtle accent border.
Header has a logo placeholder on the left and Sign In / Get Started on
the right. Match the visual tone of the attached references.
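
If you write many of these prompts, the same structure (page, layout, typography, background, regions) can be kept in a small template. The sketch below is only an illustration of that idea: `MockupBrief` and `buildMockupPrompt` are hypothetical names, and the result is plain text you paste into ChatGPT, not a call to any API.

```typescript
// Hypothetical brief shape; the fields mirror the parts of the prompt above.
interface MockupBrief {
  page: string;
  layout: string;
  typography: string;
  background: string;
  regions: string[]; // one line per region of the screen
}

// Assembles a structured prompt: one line per decision, regions as a list.
function buildMockupPrompt(brief: MockupBrief): string {
  return [
    `Generate a 4K product UI mockup for ${brief.page}.`,
    `Layout: ${brief.layout}.`,
    `Typography: ${brief.typography}.`,
    `Background: ${brief.background}.`,
    ...brief.regions.map((region) => `- ${region}`),
    "Match the visual tone of the attached references.",
  ].join("\n");
}

const pricingBrief: MockupBrief = {
  page: "a pricing page with three plans (Free, Pro, Team)",
  layout: "three columns, generous whitespace",
  typography: "sans-serif headings (close to Inter)",
  background: "warm-neutral",
  regions: [
    "Header: logo placeholder on the left, Sign In / Get Started on the right",
    "Each card: plan name, price with /mo suffix, 4-line feature list with checkmarks, one CTA button",
    "Pro card: slightly elevated with a subtle accent border",
  ],
};

const pricingPrompt = buildMockupPrompt(pricingBrief);
```

Keeping the regions as separate entries makes it harder to fall back to a vague one-line request.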

Iterate in natural language. “Tighten the spacing in the second card.” “Swap the warm gray for a cooler neutral.” “The Pro CTA needs more contrast.” Image 2 holds the composition stable across edits, so you can refine without starting over.

Save the artefact. Export the final image as design/01-mockup.png in your project. This is the input to Stage 2, so it needs to live somewhere you can attach it from.

Stage 2 — GPT-5.5 Pro / Thinking: Full HTML prototype

This is where the mockup becomes a clickable thing. GPT-5.5 Pro’s 1M context window matters here — it can hold your mockup, the reference materials, the brief, and a long single-file HTML output simultaneously, so it doesn’t drop details halfway through.

Stay in the same thread. Don’t open a new conversation. The model already has the mockup, the references, and the language you used in iteration — that’s the context you want carried forward.

Attach 01-mockup.png and prompt for a single-file prototype. The single-file constraint matters: it forces the model to commit to concrete decisions in one place and gives you a deliverable that opens in any browser without setup.

Build index.html that matches the attached mockup pixel-faithfully.
Constraints:
- Single self-contained HTML file. Tailwind via CDN is the only
  external dependency.
- Semantic HTML (header, main, section, footer; proper heading
  hierarchy).
- Realistic copy — no lorem, no "Feature 1 / Feature 2".
- Every interactive element has hover, focus, active, and disabled
  states. Show disabled with the Free plan CTA grayed out.
- Vanilla JS only — stub interactivity inline. No React, no build step.
- Mobile-first responsive: 1 column on mobile, 3 columns from md up.

Review by clicking, not reading. Save the generated index.html as design/02-prototype.html, open it in your browser, tab through with the keyboard, hover everything, and resize the window. The mistakes you find here are cheap to fix; the same mistakes after you’ve picked a framework and committed to a component library are expensive.

Iterate inside the same thread. “The Pro card’s elevation feels heavy on hover; try a softer shadow.” “Add a focus ring that matches the brand accent.” Each iteration replaces the file; commit each version to a prototype/ folder if you want to A/B compare later.

Save the artefact. The final version lives at design/02-prototype.html. This is what Claude Design will reason over in Stage 3.
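
Before the manual click-through (not instead of it), a rough script can flag buttons that are missing a required state. The sketch below assumes the model expressed states as Tailwind variant prefixes such as `hover:` and `disabled:` in a double-quoted `class` attribute; `findStateGaps` and `REQUIRED_STATES` are hypothetical names, and the regex-based parsing is deliberately naive.

```typescript
// Assumption: states appear as Tailwind variant prefixes in the class attribute.
const REQUIRED_STATES: Record<string, string[]> = {
  hover: ["hover:"],
  focus: ["focus:", "focus-visible:"],
  active: ["active:"],
  disabled: ["disabled:"],
};

interface StateGap {
  tag: string;       // the opening <button ...> tag as found in the file
  missing: string[]; // required states with no matching variant prefix
}

// Scans every opening <button> tag and reports the states it does not style.
function findStateGaps(html: string): StateGap[] {
  const gaps: StateGap[] = [];
  // Naive match: breaks if an attribute value itself contains ">".
  const buttonTags = html.match(/<button\b[^>]*>/g) || [];
  for (const tag of buttonTags) {
    const classMatch = /class="([^"]*)"/.exec(tag);
    const classes = classMatch && classMatch[1] ? classMatch[1] : "";
    const missing = Object.keys(REQUIRED_STATES).filter((state) => {
      const prefixes = REQUIRED_STATES[state] || [];
      return !prefixes.some((prefix) => classes.indexOf(prefix) !== -1);
    });
    if (missing.length > 0) gaps.push({ tag, missing });
  }
  return gaps;
}
```

A button that only carries `hover:` and `focus:` classes would be reported with `missing` equal to `["active", "disabled"]`. States implemented in a `<style>` block or in inline JS would not be detected, so treat an empty report as a hint, not proof.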

Why HTML before any framework? The prototype is a contract about layout, copy, and states. Validating that contract in plain HTML costs nothing. Validating it after you’ve wired up React, picked a routing library, and chosen a component framework costs days. The pipeline pushes every decision as far left as possible.

Stage 3 — Claude Design: Design system + handoff bundle

GPT-5.5 produced one well-designed page. Claude Design’s job is to turn that single page into a system — tokens, scales, components, state matrices — and apply that system across the rest of the product flow.

Open Claude Design and attach both 01-mockup.png and 02-prototype.html. Claude Design accepts both image and HTML as inputs, and using both gives it more signal about what’s intentional vs. accidental in the prototype.

Ask it to extract a system, not just a copy. The prompt is doing the work of distinguishing “the spacing I happened to use” from “the spacing rule the design follows”:

Extract the design system from these two artefacts:
- Typography scale (sizes, weights, line heights, intended use).
- Color tokens (semantic names like "surface", "accent", "text-muted",
  not just hex values).
- Spacing scale (a small set of values used consistently).
- Component variants (cards, buttons, inputs) with every state.
- State matrix per component (default, hover, focus, active, disabled,
  loading, error).
Then apply this system across a 6-screen flow: pricing page (the
original), checkout, account dashboard, billing settings, plan
upgrade modal, and a "subscription paused" empty state.

Iterate inside Claude Design. Use inline edits (“make this card use the secondary surface token instead”) and chat (“the focus ring is inconsistent across screens; pick one and apply it”) until the system is genuinely consistent. This is the stage where the system goes from “almost right” to “actually right”, so don’t skip it.

Export the handoff bundle. Use Save as folder (not Canva or PPTX; those are for stakeholder review, not engineering). Also export a standalone HTML copy as a sanity check you can open offline. The handoff bundle is what Stage 4 reads.

Save the artefact. Store the folder export at design/03-handoff/. The bundle contains the metadata the Stage 4 agent needs to translate cleanly: token names, component lists, state matrices.
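
To make "state matrix" and "semantic token" concrete, here is one way the two could be modelled and checked for gaps. The actual format of the handoff bundle is not described here, so every type, function name, and token value below is a hypothetical illustration; only the state names and the token names "surface", "accent", and "text-muted" come from the prompt above.

```typescript
const STATES = ["default", "hover", "focus", "active", "disabled", "loading", "error"] as const;
type ComponentState = (typeof STATES)[number];

// Maps a style property to a semantic token name, e.g. background -> "accent".
type StateStyle = Record<string, string>;
type StateMatrix = Partial<Record<ComponentState, StateStyle>>;

// States from the required list that the matrix does not define.
function missingStates(matrix: StateMatrix): ComponentState[] {
  return STATES.filter((state) => matrix[state] === undefined);
}

// Token names used in the matrix that are not defined in the token set.
function unknownTokens(matrix: StateMatrix, tokens: Record<string, string>): string[] {
  const unknown: string[] = [];
  for (const state of STATES) {
    const style = matrix[state];
    if (style === undefined) continue;
    for (const property of Object.keys(style)) {
      const token = style[property];
      const known = token !== undefined && Object.prototype.hasOwnProperty.call(tokens, token);
      if (token !== undefined && !known && unknown.indexOf(token) === -1) unknown.push(token);
    }
  }
  return unknown;
}

// Hypothetical values for illustration only.
const exampleTokens: Record<string, string> = {
  surface: "#faf8f5",
  accent: "#c2410c",
  "text-muted": "#78716c",
};

const exampleButton: StateMatrix = {
  default: { background: "accent", color: "surface" },
  hover: { background: "accent-strong", color: "surface" },
  focus: { outline: "accent" },
  active: { background: "accent" },
  disabled: { background: "surface", color: "text-muted" },
};
```

For `exampleButton`, `missingStates` reports `loading` and `error`, and `unknownTokens` reports `accent-strong`. Those are the kinds of gaps worth raising with Claude Design before exporting.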

Stage 4 — Claude Code / Codex / Cursor: Documentation + production code

The handoff bundle goes into your repo and gets translated into actual code. Pick the agent that matches your existing workflow — the prompt is roughly the same across all three.

With Claude Code, run from the repo root:

Read design/03-handoff/ and the existing codebase. Produce three things,
in this order, committing after each:

1. A tokens file at src/styles/tokens.css mapping every token from the
   handoff bundle to a CSS custom property. Use the semantic names
   from the bundle, not the raw values.

2. Component skeletons under src/components/, one per component in the
   handoff. Each skeleton renders the visual contract (every state from
   the state matrix) but stubs out business logic with TODO comments.
   Match the project's existing component conventions for file
   structure, props typing, and exports.

3. A DESIGN.md at the repo root documenting the system: the tokens, the
   spacing scale, the component list with state matrices, and a short
   note on how to add a new component. This is the source of truth that
   outlives the handoff bundle.

Do not implement business logic, routing, or data fetching. Only the
visual contract.

Claude Code is best for this stage when you want repo-context-aware translation — it reads existing components first and matches their style.

With Codex, run inside the Codex TUI and use the same three-step instruction as above. The interactive TUI lets you review each tool call (file write, command run) before it executes, which makes it the natural review surface for a bulk-generation pass.

codex "Read design/03-handoff/ and the existing codebase. Produce ..."

If you want every write gated explicitly, launch with -a untrusted; it’s the only mode that asks before each action. Every other approval mode auto-approves writes and skips the review this prompt expects you to do: on-failure asks only after a failure, on-request asks only when the tool itself escalates, never does not ask at all, and --full-auto auto-approves everything inside the sandbox.

With Cursor, open the project, drag design/03-handoff/ into the Composer context, and use the same three-step instruction as above.

Cursor is the right pick when you want to interleave manual edits with the AI pass — accepting some component skeletons as-is, rewriting others in place, and keeping the loop tight inside the editor.

The output is a PR containing tokens, component skeletons, and DESIGN.md. Business logic comes in a follow-up PR; this one is purely about the visual contract.
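
The token step is mechanical enough to sketch: each semantic name becomes a CSS custom property. The helper below shows roughly what the agent is being asked to produce for src/styles/tokens.css; `toTokensCss` is a hypothetical name and the hex values are made up for illustration.

```typescript
// Turns a map of semantic token names to values into a :root rule.
function toTokensCss(tokens: Record<string, string>): string {
  const lines = Object.keys(tokens).map((name) => `  --${name}: ${tokens[name]};`);
  return [":root {", ...lines, "}"].join("\n");
}

// Hypothetical token values; real ones come from the handoff bundle.
const sampleTokens: Record<string, string> = {
  surface: "#faf8f5",
  accent: "#c2410c",
  "text-muted": "#78716c",
};

const tokensCss = toTokensCss(sampleTokens);
```

Components then reference `var(--accent)` instead of a raw hex value, which is what keeps the semantic names from the bundle alive in the codebase.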

End-to-end example: pricing page

Here’s the same feature traced through every stage with the actual prompts. A pricing page is a useful example because it exercises typography, layout, copy, every interactive state, and component composition in a single screen.

Stage 1 input. Three competitor pricing pages as references; a brand guide PDF; a one-paragraph brief (“Three plans — Free, Pro at $19/mo, Team at $79/mo. Pro is the recommended plan, visually emphasized.”).

Stage 1 prompt. The 4K mockup prompt above, with the recommended-plan note added.

Stage 1 output. design/01-mockup.png: a single 4K image showing the three-card layout with realistic plan names, prices, and feature lists.

Stage 2 input. 01-mockup.png plus the original brief.

Stage 2 prompt. The single-file HTML prompt above, with disabled state explicitly required on the Free plan CTA (so we can see the design’s empty/disabled treatment).

Stage 2 output. design/02-prototype.html: opens in a browser, every button has hover/focus/disabled states, mobile collapses to one column.

Stage 3 input. 01-mockup.png, 02-prototype.html.

Stage 3 prompt. The system-extraction prompt above with the 6-screen application list.

Stage 3 output. design/03-handoff/ containing tokens, components, state matrices, and 6 screens that all use the same system.

Stage 4 input. design/03-handoff/, the existing codebase.

Stage 4 prompt. The three-step instruction above.

Stage 4 output. A PR with src/styles/tokens.css, src/components/{Card,Button,Plan,Modal,EmptyState,...}.tsx, and DESIGN.md. The next PR wires up routing, Stripe, and copy editing — but those are now decoupled problems.
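
As a rough picture of what a "skeleton" means here, the sketch below renders a button in each visual state and leaves behaviour as a TODO. It is framework-free on purpose, because the real skeletons should follow your project's own component conventions; `renderPlanButton`, `BUTTON_STYLES`, and the specific style values are hypothetical, and the custom properties assume token names like those in the Stage 3 prompt.

```typescript
type ButtonState = "default" | "hover" | "focus" | "active" | "disabled";

// Visual contract only: one inline style per state, expressed through tokens.
const BUTTON_STYLES: Record<ButtonState, string> = {
  default: "background: var(--accent); color: var(--surface);",
  hover: "background: var(--accent); color: var(--surface); opacity: 0.9;",
  focus: "background: var(--accent); color: var(--surface); outline: 2px solid var(--accent);",
  active: "background: var(--accent); color: var(--surface); opacity: 0.8;",
  disabled: "background: var(--surface); color: var(--text-muted);",
};

interface PlanButtonProps {
  label: string; // assumed to be trusted text; no HTML escaping is done here
  state?: ButtonState;
}

// Renders the button as an HTML string in the requested state.
function renderPlanButton(props: PlanButtonProps): string {
  const state: ButtonState = props.state || "default";
  const disabledAttr = state === "disabled" ? " disabled" : "";
  // TODO: attach the real click handler (checkout, sign-up) in the follow-up PR.
  return `<button style="${BUTTON_STYLES[state]}"${disabledAttr}>${props.label}</button>`;
}

// Renders every state side by side, useful for a visual review page.
function renderStateGallery(label: string): string {
  const states = Object.keys(BUTTON_STYLES) as ButtonState[];
  return states.map((state) => renderPlanButton({ label, state })).join("\n");
}
```

The point of the shape is the split: the state table is complete and reviewable now, while anything that touches routing or payments waits for the next PR.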

What this pipeline buys you

Faster ideation. Going from brief to clickable prototype is hours, not days.
No “the design is wrong” loops late in implementation. Layout and copy mistakes are caught in Stage 2, when fixing them is a re-prompt, not a refactor.
A consistent design system from day one. The tokens and components in your repo come directly from a system the model thought through — they’re not a backfill exercise.
Documentation that mirrors what was actually built. DESIGN.md is generated from the same handoff bundle that produced the components, so the two start out in sync.
Checkpoints to roll back to. Each stage’s artefact lives in design/. If Stage 4 produces something off, you don’t restart from the brief — you restart from Stage 3.

Quality gate at every handoff

Each handoff in this pipeline should clear a quality bar, rated on a 0 to 10 scale, before moving downstream. A 6/10 mockup compounds into a 6/10 prototype, a 6/10 design system, and 6/10 code. See Rating & Iteration Loop for the self-critique prompt template and the stage-specific rating focus.
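
The compounding argument can be written down as a tiny model. The sketch below assumes, as a simplification, that a stage can never end up better than the artefact it was handed; the function names and the bar of 8 are hypothetical, since the actual threshold is not given here.

```typescript
// Assumption: each stage's effective quality is capped by everything upstream.
function effectiveScores(ownScores: number[]): number[] {
  const effective: number[] = [];
  let ceiling = 10;
  for (const score of ownScores) {
    ceiling = Math.min(ceiling, score);
    effective.push(ceiling);
  }
  return effective;
}

// Returns the 1-based number of the first stage below the bar, or null.
function firstStageBelowBar(ownScores: number[], bar: number): number | null {
  for (let i = 0; i < ownScores.length; i++) {
    const score = ownScores[i];
    if (score !== undefined && score < bar) return i + 1;
  }
  return null;
}

// Hypothetical bar; pick the threshold your team actually uses.
const HYPOTHETICAL_BAR = 8;
```

With own scores of 6, 9, 9, 9 the effective scores come out as 6, 6, 6, 6, which is the case described above. `firstStageBelowBar` tells you which handoff to reopen before going further.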

Design-to-Code: Figma and Design Systems — the Figma-MCP variant of Stage 4 if your design lives in Figma instead of Claude Design.
Frontend UI Implementation with Cursor — deeper coverage of Stage 4 in Cursor specifically.