For every feature I shipped with an AI agent, I shipped more than two fixes.
1,000+ PRs. 84 days. Solo. The throughput was real — but 360 of those PRs were fixes against 150 features. That ratio is the part nobody puts in their recap post.
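The arithmetic behind that ratio, as a quick sketch. It uses only the figures above and treats "1,000+" as a lower bound of 1,000, which is an assumption; the variable names are illustrative.

```python
# Figures quoted in the post. "1,000+" is treated as a lower bound (assumption).
total_prs = 1000
days = 84
fix_prs = 360
feature_prs = 150

fixes_per_feature = fix_prs / feature_prs   # 360 / 150
prs_per_day = total_prs / days              # lower bound, since total is "1,000+"
fix_share = fix_prs / total_prs             # upper bound, for the same reason

print(f"fixes per feature: {fixes_per_feature:.1f}")      # 2.4
print(f"PRs per day (at least): {prs_per_day:.1f}")       # 11.9
print(f"fix share of all PRs (at most): {fix_share:.0%}")  # 36%
```

So "more than two fixes per feature" is 2.4 to be exact, on top of roughly a dozen PRs a day.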
The core problem: the agent is a 20× author, not a 20× reviewer. I had no leverage on verification — just guards built from past failures, useless against anything new.
What I'd change: a second agent used only for adversarial review, and changes kept small enough that being wrong is cheap.
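One way "small enough" could be made concrete is a size gate that a change has to pass before it goes to review. This is a minimal sketch, not something from the 84 days: the `Change` type, `is_small_enough`, and both thresholds are hypothetical and would need tuning per codebase.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Size summary of one proposed change (hypothetical type)."""
    files_changed: int
    lines_changed: int


# Hypothetical limits; the right numbers depend on the codebase.
MAX_FILES = 5
MAX_LINES = 200


def is_small_enough(change: Change,
                    max_files: int = MAX_FILES,
                    max_lines: int = MAX_LINES) -> bool:
    """Return True when the change fits within both size limits."""
    return (change.files_changed <= max_files
            and change.lines_changed <= max_lines)


small = Change(files_changed=2, lines_changed=40)
large = Change(files_changed=12, lines_changed=900)

print(is_small_enough(small))  # True
print(is_small_enough(large))  # False
```

A gate like this does not verify anything by itself. It only caps how much a single wrong change can cost, so a revert or a fix stays cheap.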
Agentic engineering doesn't remove the hard part. It moves it.