The Architect and the Builder

Image: A cycle of Generate, Test, and Fail sends coins falling into a burning hole labeled Waste, surrounded by bills and an empty wallet.

There’s a story making the rounds right now. You’ve probably seen it on YouTube, the headline promising that AI can now build entire applications while you watch. Or sleep. Or do whatever productive people do when their computer does all the work.

The pitch goes like this: tell the AI what you want, hit go, and let it loop. It writes code, runs tests, checks its own work, fixes errors, runs tests again, keeps going until it decides it’s finished. Computer use, autonomous agents, the whole circus. Set it loose, come back when the product ships.

I think this is lazy. I think it burns tokens to avoid thinking.

What the loop actually is

The “AI Loop”, sometimes called iterative test-driven development, sometimes called agentic coding, always called revolutionary, is straightforward. You give a model a goal. It generates code. It runs that code against a test suite or a browser check or some other validation. If something fails, it reads the error, modifies the code, tries again. Round and round until everything passes.
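As a rough sketch, the mechanics described above fit in a few lines of Python. Everything here is hypothetical: `run_loop`, `generate`, and `validate` are illustrative names rather than any real tool's API, and the two stand-in functions replace the model call and the test suite so the example runs on its own.

```python
from typing import Callable, Optional, Tuple


def run_loop(
    goal: str,
    generate: Callable[[str, Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_rounds: int = 10,
) -> Tuple[Optional[str], int]:
    """Generate code, validate it, and retry with the last error until it passes.

    Returns the passing code (or None if no round passed) and the number
    of rounds that were run.
    """
    error: Optional[str] = None
    for round_number in range(1, max_rounds + 1):
        code = generate(goal, error)  # hypothetical model call
        error = validate(code)        # None means every check passed
        if error is None:
            return code, round_number
    return None, max_rounds


# Stand-ins so the sketch runs without a real model or test suite.
def fake_generate(goal: str, error: Optional[str]) -> str:
    # The first attempt is "v1"; any error feedback produces "v2".
    return "v2" if error else "v1"


def fake_validate(code: str) -> Optional[str]:
    # Only "v2" passes; anything else reports a failing test.
    return None if code == "v2" else "test_login failed"


code, rounds = run_loop("build me a login screen", fake_generate, fake_validate)
print(code, rounds)  # v2 2
```

Note what the sketch never does: it never asks whether the goal was the right one. The only input it refines is the error message.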

On paper, it sounds fine. On paper, a lot of things sound fine.

What bothers me isn’t that it fails. It’s the assumption underneath it. The assumption that the valuable part of building software is the typing. That if you can automate the typing, you’ve automated the building. That the human’s job is to state a wish and wait for fulfillment.

I don’t work that way. I don’t want to.

Image: A construction worker studies the flowchart of a building project, running from vision through analysis, sketches, planning, material selection, review, and cost estimation to execution, surrounded by blueprints, a ruler, and a coffee cup.

How I actually work

When I sit down to build something, a feature for my app, a new tool, whatever, I don’t open a chat window and type “build me a login screen.” I start talking. With the AI, yes, but really with myself, out loud, through conversation.

We discuss what this thing needs to do. Not the implementation. The behavior. Who uses it. What they’re trying to accomplish. Where it fits into everything else that already exists. What happens when something goes wrong. We poke at edge cases. We find the spots where my original idea falls apart under scrutiny.

Those conversations get written down. They become documents. PRDs, specs, call them what you want. Then those documents get broken apart into small, specific pieces. Each piece describes one thing that needs to happen, in enough detail that someone who didn’t have the original conversation could execute it without guessing.

Then, and only then, does the AI write code.

At that point, writing the code is mechanical work. Not because it’s easy or unimportant, but because the decisions have already been made. The architecture is settled. The tradeoffs are named. The AI’s job is to implement according to plan, the same way a builder frames a house according to drawings. The builder picks which nail to drive first. The builder doesn’t get to move the load-bearing wall.

Vision is not a prompt

The difference matters because of what the loop throws away.

When you “build X and run tests until done,” you’re outsourcing the thinking that happens between knowing what you want and knowing how to ask for it specifically. That thinking is where the product gets designed. Where you discover your original idea had a flaw you hadn’t noticed. Where you realize two features you planned are actually one feature, or the thing you thought was simple needs three screens, not one.

The loop skips all of that. It takes your initial statement, which is almost certainly underspecified, and starts executing against it immediately. If you’re lucky it produces something that works. If you’re very lucky it produces something that resembles what you wanted. Either way you’ve spent a pile of tokens generating and discarding code nobody thought through, because the thinking step got treated as overhead instead of as the work itself.

I don’t want AI to replace my thinking. I want it to sharpen it.

When I discuss a feature with an AI before anything gets built, the AI pushes back. Asks questions I hadn’t considered. Points out contradictions in what I described. Forces me to articulate the vision clearly enough that someone else, something else, can work from it. That’s what a tool should do. Make me more capable. Don’t make decisions for me.

The tin foil hat section

I’ll say the quiet part out loud: companies selling AI tokens have every incentive to promote workflows that burn lots of them. The loop burns tokens spectacularly. Generate, test, fail, regenerate, retest, refail. Each round sends the context through an API again. Each round shows up on somebody’s invoice.

Maybe that’s cynical. Maybe the people pushing autonomous coding loops genuinely believe this is the future of software development. I’m sure some of them do. But I also know that when your business model charges per inference, you’re drawn to solutions that require many inferences. The loop is that solution dressed up as innovation.

I’d rather pay for ten focused implementations than a thousand iterations of guess-and-check. I’d rather spend my tokens on conversation than on trial and error.
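To put rough numbers on the invoice point, here is a back-of-the-envelope sketch. It assumes that every round resends the full context and that each round's code and error output get appended to it. Real agents may cache, truncate, or summarize, so treat this as an upper-bound illustration. The function name and all figures are made up for the example, and only input tokens are counted.

```python
def loop_tokens(base_context: int, tokens_per_round: int, rounds: int) -> int:
    """Total input tokens when each round resends the whole growing context."""
    total = 0
    context = base_context
    for _ in range(rounds):
        total += context              # the whole context is sent again
        context += tokens_per_round   # code and error output pile up
    return total


# Hypothetical figures: a 2,000-token starting context that grows by
# 1,000 tokens per round.
print(loop_tokens(2000, 1000, 1))   # 2000
print(loop_tokens(2000, 1000, 10))  # 65000
```

Under these assumptions the cost grows roughly with the square of the number of rounds, not linearly: ten rounds cost about thirty times as much as one.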

What I’m arguing for

None of this means I hand-code everything. Far from it. The AI writes plenty of code in my workflow. It just writes it last, after the hard work of figuring out what the code should do.

The division of labor I’m after looks like this:

The human owns the vision, the requirements, the architecture, and the decisions about what the product is.
The AI serves as a discussion partner during that process. Sounding board. Questioner.
The AI implements once the plan is solid, working from granular instructions that reflect real thinking.

This is slower to start. Requires more from you upfront. You can’t pour coffee and watch the agent spin. You have to show up and think through what you actually want.

What you get back is code that reflects your choices. Code where every file exists because someone decided it should, not because the loop generated it and the tests happened to pass. Code you can explain and defend when requirements shift, because you understand why it looks the way it does.

The loop gives you output. It doesn’t give you understanding.

And understanding is the part I’m not willing to outsource.