Christoph Fahlbusch
Native apps, systems, AI workflows, and code
Design Systems and AI
Primer iOS started as a response to a structural problem, not a component exercise. I used an agent-first workflow to turn a visual refresh into real design system infrastructure. The result is a Swift 6 package with shared tokens, SwiftUI and UIKit parity, a demo app, a Figma plugin, and a production path.
Primer iOS: One designer, 10 AI agents, and a design system
Primer iOS started because a visual refresh exposed a deeper problem in GitHub Mobile. The app didn't have one reliable styling system. It had multiple overlapping ways to describe color, spacing, typography, and components.
I could have kept polishing screens, and the app would have looked better for a moment, but the system underneath would have kept recreating the same inconsistency.
So I stopped treating the refresh as the finish line. I used it to build the system underneath: a real design system package, built in code, meant to reduce translation and give the app a more durable path forward.
What exists now
Primer iOS has three layers: shared tokens, SwiftUI components, and UIKit twins. UIKit had to be there because GitHub Mobile for iOS still carries a lot of UIKit. A SwiftUI-only system would have looked cleaner on paper and been much less useful in practice.
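A minimal sketch of how those three layers can share one source of truth. The shape below is illustrative, not the actual package API: the real system exposes hundreds of functional color tokens, and the SwiftUI and UIKit accessors mentioned in the comments would wrap these same values.

```swift
import Foundation

// Layer 1: a shared token -- platform-neutral color components.
// (Token name and fields are illustrative, not the real Primer iOS API.)
struct ColorToken: Equatable {
    let name: String
    let red: Double, green: Double, blue: Double

    // A hex form like "#0969DA", useful for the Figma side as well.
    var hex: String {
        String(format: "#%02X%02X%02X",
               Int((red * 255).rounded()),
               Int((green * 255).rounded()),
               Int((blue * 255).rounded()))
    }
}

let accent = ColorToken(name: "accent.fg", red: 0.035, green: 0.412, blue: 0.855)
// Layer 2 (SwiftUI) would wrap this as Color(red:green:blue:),
// Layer 3 (UIKit) as UIColor(red:green:blue:alpha:) -- same numbers, no drift.
print(accent.hex)  // prints "#0969DA"
```

Because both frameworks resolve from the same stored components, parity is a property of the data, not a discipline anyone has to remember.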
The first version included 597 functional color tokens, 726 bundled Octicons, 18 SwiftUI components, 17 UIKit twins, a demo app, a migration strategy, and a Figma plugin that generates a 1:1 component catalog from the same token definitions.
I built it with Swift 6 strict concurrency and zero third-party dependencies because I wanted the foundation to stay small enough that engineering could trust it.
It wasn't supposed to sit around as a nice demo repo. Engineering had to be able to pick it up and move real work into it.
The agent-first workflow
I wasn't asking Copilot to autocomplete a few files. The repository was built around an agent-first workflow with 10 specialized AI agents, each with a clear job, coordinated by an orchestrator.
The flow was simple on purpose because I wanted it to work with plain language. I described what I needed, the orchestrator classified the request, and an analysis agent challenged overlap with existing patterns. Implementation agents then worked through tokens, SwiftUI, UIKit, Figma, and the demo app. Build checks ran between steps, a reviewer agent checked token usage, accessibility, parity, conventions, and performance, and a commit agent prepared clean history. Nothing was committed without human approval.
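The routing step can be sketched as a small classifier. Everything here is an assumption about shape, not the actual orchestrator: agent names and keyword matching are illustrative stand-ins for the real classification logic.

```swift
import Foundation

// Hedged sketch: the orchestrator turns a plain-language request into an
// ordered plan of agents. Agent names mirror the article; logic is illustrative.
enum Agent: String, CaseIterable {
    case analysis, tokens, swiftUI, uiKit, figma, demo, reviewer, commit
}

func route(_ request: String) -> [Agent] {
    let text = request.lowercased()
    var plan: [Agent] = [.analysis]          // the skeptic always goes first
    if text.contains("color") || text.contains("token") { plan.append(.tokens) }
    if text.contains("component") { plan += [.swiftUI, .uiKit, .figma, .demo] }
    plan += [.reviewer, .commit]             // quality gate, then clean history
    return plan
}

print(route("add a new banner component"))
```

The point of the sketch is the ordering: analysis before implementation, review before commit, and no path that skips the gate.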
AI handled the volume, repetition, and cross-file coordination. I stayed on system boundaries, product judgment, taste, and the call on whether the work deserved to stay.
The agents had real jobs
Each agent owned a narrow part of the system. Analysis acted like a skeptic before new work started. Tokens owned colors, spacing, fonts, durations, haptics, and icons. SwiftUI built modern components, UIKit built parity twins after reading the SwiftUI source, Figma created plugin builders, Demo made sure every component had a reference page, and Reviewer acted as the quality gate.
Generic AI output is usually where consistency goes to die, so the agents were constrained by repeated instructions, scoped responsibilities, and build verification. A new component was not done until it existed in SwiftUI, had a UIKit twin when needed, appeared in the demo app, and had Figma coverage.
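That definition of done is concrete enough to express as a check. A minimal sketch, with an assumed record shape rather than anything from the real repository:

```swift
import Foundation

// The "done" bar from the workflow, as a checkable record (illustrative shape).
struct ComponentStatus {
    let name: String
    var hasSwiftUI = false
    var needsUIKitTwin = true     // some SwiftUI effects have no UIKit equivalent
    var hasUIKitTwin = false
    var inDemoApp = false
    var hasFigmaCoverage = false

    var isDone: Bool {
        hasSwiftUI
            && (!needsUIKitTwin || hasUIKitTwin)
            && inDemoApp
            && hasFigmaCoverage
    }
}

var button = ComponentStatus(name: "PrimerButton")
button.hasSwiftUI = true
button.hasUIKitTwin = true
button.inDemoApp = true
button.hasFigmaCoverage = true
print(button.isDone)  // prints "true"; a component missing any leg reports false
```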
After a while it stopped feeling like code generation. It felt more like running a workflow with clear owners, rules, and checks. The system knew where files lived, which rules applied, how to validate them, and when to send work back.
The instruction layer
The agents are backed by a layered guidance system that contains global project instructions, file-scoped instructions, and agent-specific definitions. In total, the repository had more than 2,600 lines of agent and instruction documentation.
It's a lot, but design systems drift fast when the rules only live in one person's head. "Use PrimerColor tokens, never hardcoded colors" appears in more than one place on purpose. The rule is active during implementation, during review, and when editing matching files.
I repeated those rules on purpose because consistency was the whole point. They had to survive across agents, files, and passes, not live only in my head.
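A rule like "use PrimerColor tokens, never hardcoded colors" is also mechanically checkable, which is part of why repeating it across layers works. A minimal sketch of that kind of check; the banned patterns and function are illustrative, not the real reviewer agent:

```swift
import Foundation

// Hedged sketch of the reviewer rule "use PrimerColor tokens, never hardcoded
// colors": flag source lines that construct raw colors instead of using tokens.
func hardcodedColorLines(in source: String) -> [Int] {
    let banned = ["Color(red:", "UIColor(red:", "#colorLiteral"]
    var flagged: [Int] = []
    let lines = source.components(separatedBy: "\n")
    for (index, line) in lines.enumerated()
        where banned.contains(where: { line.contains($0) }) {
        flagged.append(index + 1)  // report 1-based line numbers
    }
    return flagged
}

let sample = """
let good = PrimerColor.accent
let bad = Color(red: 0.1, green: 0.2, blue: 0.3)
"""
print(hardcodedColorLines(in: sample))  // prints "[2]"
```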
Where AI helped most
AI was most useful where design systems create a lot of necessary surface area. It helped with token generation, component boilerplate, SwiftUI and UIKit parity, demo pages, Figma builders, documentation, and repeated checks that humans can do, but rarely do perfectly at this scale.
The Figma plugin is a good example. It doesn't try to keep Figma and code in sync by asking someone to update both manually. It generates a component catalog from the same token definitions as the Swift package. A designer looking at a PrimerButton in Figma and an engineer looking at PrimerButton in Xcode are looking at the same underlying values.
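One plausible way the "same token definitions" reach the plugin is a generated JSON export that the Figma side reads. This is a sketch under that assumption; the field names and export format are mine, not the actual plugin contract.

```swift
import Foundation

// Illustrative export: Swift-side tokens serialized to JSON for the Figma
// plugin to consume, so both tools resolve from one set of values.
struct TokenExport: Codable {
    let name: String
    let hex: String
}

let tokens = [
    TokenExport(name: "accent.fg", hex: "#0969DA"),
    TokenExport(name: "danger.fg", hex: "#CF222E"),
]

let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
let json = String(data: try! encoder.encode(tokens), encoding: .utf8)!
print(json)
```

With a pipeline like this, updating a token once updates the package, the demo app, and the Figma catalog from the same definition.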
What I wanted was less translation, less drift, and less time spent rediscovering what the system had already decided.
Where human judgment stayed in charge
The agents were fast, but they weren't the product designer, architect, or final reviewer. They made confident mistakes; some were caught by the reviewer, some by the build, and some by me.
UIKit was the hardest part. Some SwiftUI effects don't have reliable UIKit equivalents. Instead of forcing a bad port, the workflow documented the gap and moved on. I kept one rule for the rest of the project. When a problem takes too many loops, stop, write down what happened, and avoid turning stubbornness into architecture.
The workflow also had persistent memory. When the system learned that a certain approach didn't work, that lesson became a note, then a rule, then something the Reviewer could enforce later. Every mistake made the next pass a little better.
Why this went beyond the refresh
Traditional design system work still involves too much translation. A designer creates a Figma component, an engineer interprets it, review catches differences, and the Figma component and code component drift over time.
What I wanted instead was infrastructure that designers and engineers could both build on. A designer can describe a component in product language. The system can plan it, build the code, create the Figma builder, add the demo page, and validate the result. The designer still reviews it, but the handoff gets much smaller.
Once tokens, component rules, demo references, and Figma builders point at the same source, the team spends less time reinterpreting what the system already decided.
AI gets interesting for design when it helps build and verify that structure without pretending it can replace judgment.
Closing
Primer iOS wasn't just a design system project; it was infrastructure for better product work.
Primer iOS gave GitHub Mobile a more durable path out of design drift, and the agent workflow made it possible to build across tokens, components, docs, demos, and Figma without giving up human review.
What I care about is that AI broadens what one designer can build, while the quality bar, the system boundaries, and the final judgment still stay human.