The Middle Loop

"We kept asking the same question in every room: if AI handles the code, where does the engineering actually go?"

That's from a ThoughtWorks retreat held in February 2026. Senior engineering practitioners from major technology companies gathered to confront how AI is reshaping software development.

They identified a new category of work forming between the inner loop (writing, testing, and debugging code) and the outer loop (getting code to production and keeping it running) of software engineering. They called it the middle loop — supervisory engineering that didn't exist before AI agents started writing production code.

Outside that room, most of the industry hasn't named this work yet. I've been doing it for the past year as a fractional CTO for founders building with AI. Here's what it looks like from the inside.

What the middle loop is

The ThoughtWorks description is precise:

The middle loop involves directing, evaluating and fixing the output of AI agents. It requires a different skill set than writing code. It demands the ability to decompose problems into agent-sized work packages, calibrate trust in agent output, recognize when agents are producing plausible-looking but incorrect results and maintain architectural coherence across many parallel streams of agent-generated work.

In plain terms: someone needs to direct the AI, check its work, and make sure the pieces fit together.

The practitioners who excel at this share three traits: they think in terms of delegation rather than implementation, they carry strong mental models of system architecture, and they can rapidly assess output quality without reading every line.

That last point deserves attention. Not "reading the code carefully." Not "reviewing every PR." Assessing quality without reading every line. Most engineering career ladders don't recognize this as a skill. It's the skill that matters most when agents produce code faster than any human can review it.

What they found

I read the retreat's paper thoroughly, checking its findings against what I see daily on retainer calls. Three observations stood out:

Test-driven development (TDD) as prompt engineering. The retreat found that test-driven development produces dramatically better results from AI coding agents. When tests exist before the code, agents can't cheat by writing a test that confirms whatever incorrect implementation they produced. This reframes TDD from a quality practice into a prompt engineering practice — the test suite becomes deterministic validation for non-deterministic generation.

In practice, I take this further. The first thing I implemented with one client was a BDD (behavior-driven development) process: system tests (business cases) come first, because a non-technical founder can read them — Ruby reads like prose. I approve the spec. Only then is it frozen, and we move to implementation. The test becomes a contract between the human who decides what's correct and the agent that builds it. (This is how Naveed and I migrated Indee.co from React/Express to Rails.)
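As a sketch of that contract in plain Ruby: the spec exists and is frozen before any implementation, and the implementation is only accepted if it satisfies the spec. (The real process uses Rails system tests; the Screener class and its methods here are hypothetical stand-ins.)

```ruby
# Step 1: the spec, written and approved BEFORE any implementation.
# Once frozen, it is the contract the agent's code must satisfy.
SPEC = lambda do |screener_class|
  screener = screener_class.new
  screener.record_view("buyer@example.com")
  raise "must remember who viewed" unless screener.viewed_by?("buyer@example.com")
  raise "must not invent views"    if screener.viewed_by?("nobody@example.com")
  :spec_passed
end

# Step 2: only now does the agent implement. Any implementation that
# passes the frozen spec is acceptable; this is one of many.
class Screener
  def initialize
    @views = []
  end

  def record_view(email)
    @views << email
  end

  def viewed_by?(email)
    @views.include?(email)
  end
end

puts SPEC.call(Screener)
```

The direction matters: the spec constrains the implementation, never the other way around, so the agent can't quietly redefine "correct."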

The blast radius question. One practitioner from the ThoughtWorks retreat reframed the core engineering discipline: "Instead of asking 'did someone review this code?' organizations need to ask 'what is the blast radius if this code is wrong, and is our verification proportional to that risk?'" Again, this underscores automated testing as the safety net against agent errors that could end your business.
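One way to make "verification proportional to risk" concrete is a simple lookup from blast radius to required checks. A minimal Ruby sketch — the categories and check names are invented for illustration, not from the paper:

```ruby
# Hypothetical policy: the bigger the blast radius, the heavier the
# verification a change must clear before it ships.
VERIFICATION = {
  cosmetic:    [:ci_green],                                  # copy, styling
  internal:    [:ci_green, :agent_review],                   # refactors behind tests
  customer:    [:ci_green, :agent_review, :human_review],    # user-facing logic
  existential: [:ci_green, :human_review, :staged_rollout]   # billing, auth, migrations
}.freeze

def required_checks(blast_radius)
  VERIFICATION.fetch(blast_radius)
end

puts required_checks(:existential).inspect
```

The point of writing the policy down is that it turns an implicit judgment into something agents and humans can both be held to.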

Decision fatigue. "The constraint shifts from production capacity to decision-making capacity." Agents generate code fixes and feature implementations faster than anyone can approve them. I feel this daily: PRs (code changes) come in so fast I can hardly keep up. The bottleneck isn't writing code anymore. It's deciding whether the code is right.

Where the solo founder fits in

The ThoughtWorks retreat brought together practitioners from major technology companies. Their findings are rigorous. Here I apply those enterprise-scale insights to the fastest-growing segment of AI-assisted development: solo founders building without a team.

The middle loop gap. The paper discusses staff engineers becoming "friction killers," code review as a mentorship channel, and organizational design for agent topologies. A solo founder building with AI has none of this. The founder embodies the inner loop as well as the outer loop — and the middle loop is the gap nobody fills.

That's what a fractional CTO is: a rented middle loop for people who don't have a team yet. The retreat observed that developers are now "doing work that used to belong to product managers." The corollary: solo founders are assuming the role of product managers who produce code — deciding what to build and building it, with AI handling the implementation. They need the middle loop more than anyone, and they're the least likely to know it exists.

The spec-driven tension. The retreat observed teams adopting structured specifications to give AI agents enough precision. This is smart. It also echoes a familiar pattern. Heavyweight upfront specification can slow iteration, create bottlenecks, and resist the rapid pivoting that solo founders depend on. The middle loop practitioner has to calibrate how much specification is enough without falling back into waterfall rhythms. The industry hasn't resolved this tension yet.

The oscillation problem. The retreat flagged agent coordination risks — multiple agents making different prioritization calls, creating feedback loops instead of convergence. This plays out daily. One review tool flags a code quality issue. The fix triggers a new offense in another tool. That fix triggers another. Clean code is a moving target. If you don't make a call about actual risk — what matters versus what's cosmetic — you burn time and tokens chasing a hypothetical standard. In practice, it's the grown-up in the room saying "this is good enough to ship."
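One pragmatic way to stop the ping-pong is to record the "good enough" call in the code itself, so no tool or agent relitigates it on the next pass. A Ruby sketch using RuboCop's inline directives; the handler and the choice of cop are illustrative:

```ruby
# Deliberate exception, reviewed and accepted: keeping this handler as
# one method reads better than scattering one business decision across
# three. Recording the call here ends the lint back-and-forth.
# rubocop:disable Metrics/MethodLength
def handle_webhook(event)
  case event[:type]
  when "payment_succeeded" then "record charge"
  when "payment_failed"    then "notify founder"
  when "refund_created"    then "reverse charge"
  else                          "ignore"
  end
end
# rubocop:enable Metrics/MethodLength

puts handle_webhook(type: "payment_failed")
```

The directive isn't the point; the reviewed, written-down reason next to it is.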

The stability signal. ThoughtWorks flagged a concerning regression: AI makes large changesets easy to produce, pushing teams toward waterfall-like patterns with declining stability. For solo founders, the implication is simpler, but also more dangerous: without someone watching how much changes at once, the codebase grows faster than anyone can comprehend it.

What it actually takes

The ThoughtWorks paper describes the capability. Here's my methodology.

Reduced listening. The paper says middle loop practitioners "can rapidly assess output quality without reading every line." I have a name for this skill. It comes from my training in electroacoustic composition.

In electroacoustic music and sound art, reduced listening means stripping a sound down to its essential qualities. Ignoring what produced it or what it reminds you of. Just the structure, the texture, the phenomenology. I apply the same discipline to code. Not "what framework generated this" or "what the developer intended" — but what the code actually does, structurally. Where the weight is. Where the cracks are.

This isn't a metaphor. It's a trained perceptual discipline, and it's what makes middle loop work possible at speed.

Convention as constraint. The paper discusses type systems and formal methods as ways to constrain agent output. Rails offers a pragmatic alternative: convention over configuration. Every Rails convention is a constraint the agent doesn't need to be told about, a decision it doesn't need to make, a misunderstanding eliminated before it starts. The free skills I publish encode these conventions as persistent rules. For the full technical case, read why Ruby wins.
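What a convention buys can be shown in a few lines of plain Ruby. This is a naive stand-in for Rails' real inflector, purely illustrative: the class name alone fixes the table name and foreign key, so the agent has nothing to decide and nothing to get wrong.

```ruby
# Sketch of convention over configuration: derive names from the class
# name instead of configuring them. Naive pluralization (just appends
# "s"); Rails' inflector handles irregular nouns.
class ConventionModel
  def self.table_name
    name.downcase + "s"      # Invoice -> "invoices"
  end

  def self.foreign_key
    name.downcase + "_id"    # Invoice -> "invoice_id"
  end
end

class Invoice < ConventionModel; end

puts Invoice.table_name
puts Invoice.foreign_key
```

Every name the agent never has to choose is one less place its guess can diverge from the rest of the codebase.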

The "grown-up in the room." A phrase from one of my clients — a solo founder who built his entire product with AI and needed someone to tell him if it was actually built right. Not someone who writes better code than the agent. Someone who knows which code matters, which risks compound, and when to stop fixing and start shipping.

The map is being redrawn

The retreat closed with this:

"The retreat didn't produce a roadmap. It produced a shared understanding that the map is being redrawn and that the people best positioned to draw it are the ones willing to admit how much they don't yet know."

The middle loop is real. It's happening now. If you're building with AI and nobody who knows code has looked at what the agent wrote — that's the gap.

The audit tells you where you stand — a one-time read of your codebase, from €2,000. The retainer keeps someone in the room every week, from €4,000/mo.




References

The Middle Loop - Julian Rubisch