How I govern a fleet of AI agents in my own product's repo

It's not vibe-coding. It's a system with contracts, safety gates, and a human review that's never skipped.

Role: Solo engineering
Jobs shipped: 230+
Stack: Astro · React · TS
Status: Live in production

The problem

AI is fast and tireless. Without guardrails, it is just as fast and tireless at breaking things. What matters isn't the prompt; it's the system around the agent.

The system

Four layers that make autonomy safe.

State lives in the filesystem

Every job is a ticket; the folder it lives in is its state. Nothing closes without a QA "theater" that proves it. 200+ shipped this way.
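As a rough illustration of "the folder is the state", here is a minimal TypeScript sketch. The folder names (`inbox`, `in-progress`, `qa`, `done`), the `jobs/` path layout, and the `hasQaEvidence` flag are all assumptions made for the example; the actual repo layout isn't shown here. It only computes where a ticket may move next, so the real move would be a plain file rename.

```typescript
// Hypothetical state folders, in order. The real names are not given in this write-up.
const STATES = ["inbox", "in-progress", "qa", "done"] as const;
type State = (typeof STATES)[number];

// A ticket's state is the folder that directly contains its file.
function stateOf(ticketPath: string): State {
  const parts = ticketPath.split("/");
  const folder = parts[parts.length - 2];
  const match = STATES.find((s) => s === folder);
  if (match === undefined) {
    throw new Error(`Unknown state folder in path: ${ticketPath}`);
  }
  return match;
}

// Compute the path a ticket moves to next. A ticket advances one step at a
// time and cannot reach "done" without QA evidence (assumed to be a boolean here).
function advance(ticketPath: string, hasQaEvidence: boolean): string {
  const current = stateOf(ticketPath);
  const next = STATES[STATES.indexOf(current) + 1];
  if (next === undefined) {
    throw new Error("Ticket is already done");
  }
  if (next === "done" && !hasQaEvidence) {
    throw new Error("Cannot close a ticket without QA evidence");
  }
  const parts = ticketPath.split("/");
  parts[parts.length - 2] = next;
  return parts.join("/");
}
```

Because the state is just a directory, listing a folder shows everything in that state, and the history of moves is whatever version control records.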

A binary safety gate

Before touching anything: agent-ok or human-required. Auth, payments, the database, secrets → always human. No grey zone.
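A binary gate like this can be sketched as a small pure function. The path patterns below are hypothetical stand-ins for "auth, payments, the database, secrets"; the real rules aren't shown here. Treating an empty file list as human-required is also an assumption, chosen so the gate fails closed.

```typescript
type GateDecision = "agent-ok" | "human-required";

// Hypothetical patterns for sensitive areas. None of these come from the real repo.
const SENSITIVE_PATTERNS: RegExp[] = [
  /(^|\/)auth(\/|\.|$)/i,
  /(^|\/)payments?(\/|\.|$)/i,
  /(^|\/)(db|database|migrations)(\/|\.|$)/i,
  /(^|\/)\.env(\.|$)/i,
  /secrets?/i,
];

// Two outcomes only: if any touched path looks sensitive, a human takes the job.
function gate(touchedPaths: string[]): GateDecision {
  // Assumption: when the scope is unknown (no paths listed), escalate to a human.
  if (touchedPaths.length === 0) {
    return "human-required";
  }
  const sensitive = touchedPaths.some((path) =>
    SENSITIVE_PATTERNS.some((pattern) => pattern.test(path)),
  );
  return sensitive ? "human-required" : "agent-ok";
}
```

One sensitive path is enough to flip the whole job, which is what keeps the decision binary rather than a score.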

The agent never merges

It implements, opens a PR, and stops. I review and merge. If something degrades, I see it as "a PR I reject," never as bad code in production.

A test that watches whether the AI decays

Frozen golden cases: when I change the agent's rules, the agent re-solves them and I check that the tests still pass. It's regression testing, but for the agent's behavior.
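A golden-case runner along these lines might look like the sketch below. The `GoldenCase` shape, the synchronous `Agent` signature, and the report format are assumptions for illustration; a real agent call would be asynchronous and the checks would be the project's actual tests.

```typescript
// A frozen golden case: a fixed task plus a check on what the agent produced.
interface GoldenCase {
  id: string;
  task: string;
  passes: (output: string) => boolean;
}

// Hypothetical stand-in for "the agent under the current rules".
type Agent = (task: string) => string;

interface RegressionReport {
  passed: string[];
  failed: string[];
}

// Re-run every frozen case against the agent and record which ones still pass.
function runGoldenCases(agent: Agent, cases: GoldenCase[]): RegressionReport {
  const report: RegressionReport = { passed: [], failed: [] };
  for (const goldenCase of cases) {
    let ok = false;
    try {
      ok = goldenCase.passes(agent(goldenCase.task));
    } catch {
      // An agent or check that throws counts as a failed case, not a crashed run.
      ok = false;
    }
    (ok ? report.passed : report.failed).push(goldenCase.id);
  }
  return report;
}
```

The cases stay frozen while the rules change, so a case that moves from `passed` to `failed` points at the rule change rather than at the task.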

The honest part

Outcome

230+

jobs shipped, each with human review

0

agent merges: PRs only, approved by me

4

governance layers: contract · gate · PR · regression test

Work with me

A real product in production, built solo, with a standard most teams don't have. Want it on yours?

Let's talk