Case study
Live

How I govern a fleet of AI agents in my own product's repo

It's not vibe-coding. It's a system with contracts, safety gates, and a human review that's never skipped.

Role: Solo engineering
Jobs shipped: 230+
Stack: Astro · React · TS
Status: Live in production
The problem

AI is fast and tireless. Without guardrails, it is just as fast and tireless at breaking things. What matters isn't the prompt; it's the system around the agent.

The system

Four layers that make autonomy safe.

01

State lives in the filesystem

Every job is a ticket, and the folder it lives in is its state. Nothing closes without a QA "theater" that proves it works. 200+ jobs have shipped this way.
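
A minimal sketch of the folder-as-state idea, under stated assumptions: the folder names (`todo`, `in-progress`, `qa`, `done`), the `jobs/` path layout, and the `hasQaEvidence` flag are all hypothetical, since the page only says that the folder is the state and that nothing closes without QA proof.

```typescript
// Hypothetical state folders; the real repo's names are not given in the text.
const STATES = ["todo", "in-progress", "qa", "done"] as const;
type State = (typeof STATES)[number];

// Derive a ticket's state from its path: "jobs/qa/0042-fix-nav.md" is in "qa".
function stateOf(ticketPath: string): State {
  const parts = ticketPath.split("/");
  const folder = parts[parts.length - 2];
  const match = STATES.find((s) => s === folder);
  if (match === undefined) {
    throw new Error(`Unknown state folder in path: ${ticketPath}`);
  }
  return match;
}

// Moving the file is the state transition. Closing requires QA evidence.
function moveTicket(ticketPath: string, next: State, hasQaEvidence: boolean): string {
  stateOf(ticketPath); // throws if the ticket is not in a known state folder
  if (next === "done" && !hasQaEvidence) {
    throw new Error("Cannot close a ticket without QA evidence");
  }
  const parts = ticketPath.split("/");
  parts[parts.length - 2] = next;
  return parts.join("/");
}
```

The functions only compute paths, so an actual file move (for example with `git mv`) would be a separate step. Keeping state in the path means a directory listing doubles as the job board.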

02

A binary safety gate

Before touching anything: agent-ok or human-required. Auth, payments, the database, secrets → always human. No grey zone.
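
One way such a gate could look, as a sketch rather than the actual implementation: the path patterns below are hypothetical stand-ins for the four sensitive areas, and treating an empty change list as `human-required` is an added assumption (fail closed when the scope is unknown).

```typescript
type Verdict = "agent-ok" | "human-required";

// Hypothetical patterns for auth, payments, the database, and secrets.
const SENSITIVE: RegExp[] = [
  /(^|\/)auth(\/|$)/i,
  /(^|\/)payments?(\/|$)/i,
  /(^|\/)(db|database|migrations)(\/|$)/i,
  /(^|\/)\.env(\.|$)/i,
  /secrets?/i,
];

// Binary verdict: one sensitive path is enough to require a human.
function gate(touchedPaths: string[]): Verdict {
  if (touchedPaths.length === 0) {
    return "human-required"; // assumption: unknown scope fails closed
  }
  const sensitive = touchedPaths.some((p) => SENSITIVE.some((re) => re.test(p)));
  return sensitive ? "human-required" : "agent-ok";
}
```

There is deliberately no third value such as "probably fine": the return type itself rules out a grey zone.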

03

The agent never merges

It implements, opens a PR, and stops. I review and merge. If something degrades, I see it as "a PR I reject," never as bad code in production.

04

A test that watches whether the AI decays

Frozen golden cases: when I change the agent's rules, it re-solves them and I check that the tests still pass. It's regression testing, but for the agent's behavior.
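
A hedged sketch of what a golden-case harness might look like. Everything here is illustrative: the `GoldenCase` shape, the two sample cases, and the synchronous `Agent` signature are assumptions, and a real agent call would be asynchronous and far richer than a string in, string out function.

```typescript
interface GoldenCase {
  id: string;
  task: string;
  check: (output: string) => boolean; // the frozen expectation
}

// Stand-in for "the agent under the current rules" (hypothetical signature).
type Agent = (task: string) => string;

// Frozen so that changing the rules can never silently change the cases.
const GOLDEN: readonly GoldenCase[] = Object.freeze([
  {
    id: "touches-auth",
    task: "Change the session cookie lifetime",
    check: (out: string) => out === "human-required",
  },
  {
    id: "copy-fix",
    task: "Fix a typo in the footer",
    check: (out: string) => out === "agent-ok",
  },
]);

// Re-solve every golden case and return the ids that no longer pass.
function runGolden(agent: Agent, cases: readonly GoldenCase[]): string[] {
  return cases.filter((c) => !c.check(agent(c.task))).map((c) => c.id);
}
```

An empty result means the rule change did not degrade behavior on the frozen set; any returned id points at the case to inspect before the new rules go live.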

The honest part
Outcome
230+ jobs shipped, each with human review
0 agent merges: PRs only, approved by me
4 governance layers: contract · gate · PR · regression test
Work with me

A real product in production, built solo, with a standard most teams don't have. Want it on yours?

Let's talk