If you ask me tomorrow, I might change my mind completely, but in March 2026 I firmly believe that LLM agents work best as assistants around a strong human operator, not as replacements.
I read The Mythical Man-Month: Essays on Software Engineering in March 2021, but by then I was already familiar with some of the amazing ideas the book introduced, including:
- The mythical man-month, and the fact that no matter how many mothers you assign to the task of growing a baby, it always takes about 9 months to do so
- No silver bullet, and the fact that there is no single methodology that can promise a tenfold improvement in how things are done
- The second-system effect. This is probably the one I have used the most: the idea that the second time we do something, we can actually do worse, because we have the illusion that we now know what to do, while we still have only part of the experience the project can teach us
- The tendency toward an irreducible number of errors: every fix risks introducing new bugs, so the defect count never quite reaches zero. Entropy, in essence
Those are the greatest hits, but there are many other nice ideas in there. The book is from 1975.
One idea that, to be honest, I found funny at the time was the Surgical Team. Brooks was making the argument that good programmers are generally more productive than mediocre ones and that it’s better to make their lives easier by giving them more freedom and letting them do the most critical work. The less critical parts should be handled by the weaker programmers, giving them, in essence, the role of assistants.
So how would that look?
The Surgeon programmer comes into the room and says:
- “Is the codebase ready?”
- “Yes, sir!” says the Assistant. “The code is checked out cleanly from the main branch, we have the dev environment ready with some test data, the user stories are downloaded, and all ambiguous questions have answers.”
- “OK. So what is the plan for today?” asks the Surgeon.
- “We want to automate this media-processing workflow,” replies another Assistant. “We have to move the files from A to B, and as we do so we need to apply some transformations. The input will be partially taken from the users and partially from the system configuration.”
- “I see,” says the Surgeon. He thinks for a moment and then asks, “How does the transformation handle bad input? What if it came from the user? What if it came from the system configuration?”
The conversation goes back and forth, and then, when the Surgeon is ready, he says, “OK, we are ready. Assemble the code!” And the assistants go off and type in their terminals. And when they are ready, the Surgeon asks, “Is it ready? Does it pass our verification criteria?” And when it does, the feature is ready.
I don’t remember why I found this idea funny. Probably partly because declaring who is better, and by how much, is just a strange exercise. It reminds me of a post I read recently, Nobody Gets Promoted for Simplicity. What does it mean if someone is producing more? Are they always more productive? Maybe they are doing the wrong thing. Maybe they are the ones who start the fire and then the ones who put it out.
But I was also sceptical of this idea because I knew intuitively that the slowest, most difficult part of any task is communication. And it felt like, in all those moments when the Surgeon would have questions, it would be just so easy to get sidetracked into endless debates or discussions about the wrong things. This just could not work, right?
Now, in March 2026, I think this can work.
In fact, I really think this is close to the most effective way to work with LLM agents. The programmer knows the work deeply (ask me how he learned that in this day and age), and he is only given the most important high-level information. He then applies his knowledge and intuition to figure out whether there are any gaps that need addressing, poking and prodding at different elements of the proposed solution. When the solution is ready, he comes back and checks how it behaves, tests how things work together, and decides whether anything still needs addressing.
I really think it can work that way, and my workflows gravitate toward something like this, with agents handling build waits, PR comments, and test runs, and doing a lot of the groundwork.
What does that look like in practice?
I come in and say:
- “We need to implement a new workflow. We will move files from A to B. We will apply transformations along the way. You can find a reference for moving files from A to B in workflow X, which you can find here. Here’s a [link] to a previous execution. You can find documentation for the transformations at [link]. Settings in the system can be found at [link]. Missing settings should come from user input to the workflow. Go and prepare an implementation plan. Use tools A, B, and C to inspect previous executions, read documentation, and work with the settings storage. If things are not clear, ask me questions to fill in the gaps.”
The agent goes off, figures out a plan, and I review it. Then I say:
- “Execute the plan. Make sure you verify that things are working against the test environment. When you are done, open a new pull request. Ask for a code review on Slack and wait for any feedback. Notify me if there are comments, along with your thoughts on how to address them. If any builds fail on the pull request, fix those as well.”
Of course, I don’t have to say all of that explicitly on every piece of work. Some of it is covered by various skills I have built over the months, but you get the gist. I guide the agent, review the code, and wait for it to finish, but the general architecture, the decisions, and the way we verify things all come from me.
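The plan-then-execute loop above can be sketched in a few lines of pseudocode. To be clear, the `Agent` class and all of its methods below are hypothetical, invented purely for illustration; substitute whatever agent harness you actually use. The point is the shape of the loop: references in, plan out, a human review gate, then execution with verification.

```python
# Hypothetical sketch of the two-phase agent workflow described above.
# The Agent class and its methods are invented for illustration only;
# they stand in for a real agent harness.

from dataclasses import dataclass, field


@dataclass
class Agent:
    """Minimal stand-in for an LLM agent session."""
    context: list[str] = field(default_factory=list)

    def plan(self, task: str, references: list[str]) -> str:
        # Phase 1: the agent inspects references (previous executions,
        # docs, settings) and drafts an implementation plan.
        self.context.append(task)
        self.context.extend(references)
        return f"PLAN for: {task} (using {len(references)} references)"

    def execute(self, plan: str, verify_env: str) -> str:
        # Phase 2: the agent implements the approved plan, verifies it
        # against the test environment, and opens a pull request.
        self.context.append(plan)
        return f"PR opened after verifying against {verify_env}"


agent = Agent()
plan = agent.plan(
    task="Move files from A to B with transformations",
    references=["workflow X", "previous execution", "transform docs"],
)
# Human review gate: the operator reads and approves the plan here,
# before any code is written.
result = agent.execute(plan, verify_env="test environment")
print(result)
```

The gate between the two phases is where the operator's architecture, decisions, and verification criteria enter; the agent does the groundwork on either side of it.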
This kind of setup is, sadly, also not the norm. We see people refusing to use LLMs. We see people trying to adopt them slowly, but mostly poorly, through one-shot attempts. And on the other side of the spectrum, we see people using these tools very mechanically, without really knowing what they are good at, insisting the agent should do all of a programmer’s work because some lab that sells AI claimed AI does 90% of its product work (I know, I know).
I personally hope we will land at workflows similar to what Brooks proposed in his Surgical Team essay, because they are just fun to use and make the work much easier, but we’ll see.