Loop Instead of Pipeline

AI is not just speeding up work. It challenges the operating model that knowledge work has been built around.

Plan, design, build, test, review, document, deploy. Seven phases, seven specialist disciplines, seven handoffs. That is how we learned to build software. It is also how we learned to organize many other forms of knowledge work: first the brief, then the analysis, then execution, then review, then approval. A pipeline of people, documents, and handoffs.

And we came to treat what I am calling the pipeline as the work itself. We built roles and identities around the phases, the artifacts, and the need for coordination.

What is becoming visible now is that this pipeline was never the work itself. It was the infrastructure required for people with limited context, limited attention, and limited bandwidth to work together. Once agentic AI systems can handle large parts of that chain autonomously, it is not just the process that collapses. It is the rationale for the process.

Most debates about AI are still stuck on the wrong question. Are models good enough at coding yet? Good enough at analysis? Good enough at producing content? The more interesting question sits one level above that: What operating model does an organization need when execution is no longer the bottleneck?

That is the question this essay is trying to answer.

If the essay works, it works as a map for that shift. Not as a perfect target architecture, but as a way to stop confusing the old pipeline with the work itself. The practical gain is to read loops not as a new buzzword, but as a different answer to the same old coordination problems.

Practical test: The three templates for this essay test one concrete process for handoffs, loop readiness, and valuable human functions. You will not get a reorganization recipe. You will get a sober diagnosis of which parts of your pipeline are real work and which parts are coordination infrastructure.


The Pipeline Was a Workaround

The classic separation between planning, building, testing, and shipping was never an end in itself. It was rational. For three reasons.

First: context is scarce. People can only hold so much in their heads at once. Someone doing product discovery cannot simultaneously hold every infrastructure decision, every edge case, and every testing strategy at full resolution. So we specialize.

Second: execution is expensive. Writing code, formulating test cases, maintaining documentation, building decks, checking data: all of that used to cost many hours. That is why it made sense to split work into sections and roles.

Third: coordination was unavoidable. If seven specialists are working along the same value chain, you need meetings, handoffs, specs, tickets, approvals, and status formats. Not because anybody enjoys them. But because people do not share a common working memory.

Frederick Brooks, one of the early theorists of large software projects, described this problem back in the 1970s: adding more people does not increase a project's capacity in a simple linear way. With every additional person, the need for alignment, knowledge transfer, and coordination rises as well. Part of the expected productivity gain gets eaten up immediately.
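One rough way to make that concrete is to count pairwise communication paths, under the simplifying assumption that every person may need to align with every other person. The function name below is illustrative, and the model is a sketch of the growth pattern, not a measurement of any real team.

```python
def pairwise_channels(people: int) -> int:
    """Count distinct two-person communication paths in a group.

    Assumption: everyone may need to talk to everyone else,
    which gives n * (n - 1) / 2 paths for n people.
    """
    if people < 0:
        raise ValueError("people must be non-negative")
    return people * (people - 1) // 2


# Doubling the team from 7 to 14 more than quadruples the paths.
for team_size in (2, 4, 7, 14):
    print(team_size, pairwise_channels(team_size))
```

Headcount grows linearly while the possible alignment paths grow roughly with its square, which is one reading of why part of the expected gain disappears into coordination.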

So the pipeline was never a law of nature. It was a pragmatic organizational compromise for a world in which complex work had to be distributed across many specialized minds.

That is exactly why it is now vulnerable.

What Is Changing Now

Over the past few months, several different threads have described the same shift from different angles.

OpenAI frames the pattern in its guide for AI-native engineering teams as Delegate, Review, Own. Agents take the first pass across almost every phase of the SDLC. Humans review and hold judgment. Shrivu Shankar describes the organizational counterpart in the transposed organization: no longer vertical handoffs between specialist roles, but a loop held by one person from problem to solution. Nate B. Jones describes the same effect as the collapse of the handoff economy. And StrongDM shows with its software factory how far this can be pushed in practice when specification and evaluation move to the center.

These are not four separate trends. They are the same mechanism viewed from four different directions.

If an agent can hold specification, code, tests, documentation, and deployment context in the same working loop, the main reason for hard phase boundaries starts to disappear.

That does not mean everything suddenly becomes easy. It only means the difficulty moves. Away from execution, toward three other problems:

Judgment. Is the result good enough, correct enough, safe enough?

Specification. Has what should exist been described precisely enough?

Operating model. Is the organization built in a way that lets agents actually work with its context, standards, and quality criteria?

The third point gets the least attention. And it is the reason the same models look absurdly productive in one team and barely useful in the next.

From Phase Model to Loop

Shrivu Shankar's term "loop" captures the shift better than almost any current AI metaphor. A loop is the full chain of decisions between a problem and a delivered solution, held by one person. Not alone. But with end-to-end responsibility.

That is more than a full-stack comeback in new packaging. Full-stack used to mean one person can work across several technical layers. The loop means something else: one person holds problem understanding, specification, delegation to agents, evaluation, and ownership of the decision in a continuous cycle.

That difference matters.

In the pipeline, a problem moves through departments. Product formulates a requirement. Design translates it into UI intent. Engineering translates that into system behavior. QA translates that into validation rules. Operations translates it into production stability. Every handoff costs context. Every handoff creates artifacts that are not the product, but bridges between people.

In the loop, that chain does not disappear entirely. But it gets thinner. The person with the best understanding of the problem can work directly on the artifact with an agent. Not through seven translation steps, but in an ongoing cycle of specifying, checking, rejecting, and sharpening.

The core gain of AI is therefore not just faster execution. The core gain is the removal of translation costs between people.

This is the same mechanism that has already shown up elsewhere in dekodiert. The biggest loss in knowledge work was never only in production. It was in the friction losses between heads.

Why This Is Not Just a Model Story

This is where things get interesting. Because this is exactly where much of the public debate stops. It stops at the sentence: models are getting better, so the phases are starting to blur.

That is true. But it is only half the truth.

The more recent discussion around harness engineering exposes the missing middle layer. If agents are supposed to navigate longer work chains reliably, it is not enough for the model to be "good." The agents also need:

fast build cycles,

clear skills and runbooks,

observable work traces,

reliable tests and holdout evaluations,

persistent progress artifacts,

and above all: team context that exists not only in heads, but in usable artifacts.

Suddenly the bottleneck looks different. No longer: "Can the model code?" But: "Can our organization express its own standards in a way that an agent can actually work with?"

That is a new kind of infrastructure problem. The old engineering infrastructure consisted of repos, CI, monitoring, and deployment pipelines. The new layer is added on top: skill files, specs, progress files, quality metrics, context stores, adaptive "memory," evaluation harnesses, and internal ontologies.

Put differently: the organization has to become legible to agents.
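As a minimal sketch, the prerequisites listed above can be written down as a checklist that a team fills in honestly. The class and field names are hypothetical paraphrases of that list, not an established schema, and a boolean per item is a deliberate oversimplification.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HarnessReadiness:
    """Hypothetical checklist; each field paraphrases one prerequisite."""

    fast_build_cycles: bool = False
    skills_and_runbooks: bool = False
    observable_work_traces: bool = False
    tests_and_holdout_evals: bool = False
    persistent_progress_artifacts: bool = False
    team_context_in_artifacts: bool = False

    def missing(self) -> list[str]:
        """Return the prerequisites not yet in place, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example: a team with fast builds and solid tests, but little else written down.
team = HarnessReadiness(fast_build_cycles=True, tests_and_holdout_evals=True)
print(team.missing())
```

The point of the exercise is less the score than the gaps: every name returned by `missing()` is a piece of team context that still lives only in heads.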

That is the operational layer missing from many management debates. And it is the reason the same tool stack looks like magic in one place and like a mediocre productivity booster in another.

The New Role of Specialists

Once the loop becomes the target model, the wrong follow-up question usually appears immediately: do specialists disappear?

The answer is a clear no.

But their function shifts.

In the pipeline, specialists contribute direct production inside their section of the chain. The designer delivers mockups. The QA engineer delivers test coverage. The staff engineer delivers architectural decisions and reviews. The ops person delivers production stability.

In the loop model, those capabilities do not become less important. On the contrary. They become infrastructure.

The designer no longer just produces screens. They codify design rules, component logic, and quality criteria in a way that can be reused across every loop.

The QA engineer no longer just tests concrete features. They build test systems, scenarios, and validation patterns that agents and loop owners can use.

The staff engineer no longer just reviews pull requests. They build the guardrails, architectural principles, and evaluation thresholds that make good decisions scalable.

That is a different kind of leverage. Less direct execution, more enablement. Less responsibility for a section, more responsibility for the system.

This may sound abstract, but it is very concrete in organizational terms. The value of a specialist no longer lies only in their own execution. It lies in their ability to turn good judgment into a form that other people plus agents can reliably work with.

This is where the argument touches the machine-readable context idea from an earlier issue. Only this time not as a knowledge problem, but as an organizational one. What is not externalized cannot scale.

Why the Bottleneck Shifts to Judgment

As the pipeline gets thinner, a new scarcity appears: good decisions.

You can see it in almost every credible example. StrongDM does not replace the classic chain with chaos, but with behavioral tests and clean specifications. Karpathy's autoresearch works only because success and failure are mechanically recognizable. OpenAI shifts the focus toward observability and quality scores. Everywhere the same logic: once execution gets cheaper, the value of evaluation rises.

The problem is that judgment scales worse than execution.

You can buy a better agent or a faster loop. You cannot buy better judgment. It has to be developed, calibrated, and maintained inside organizations. It lives in good counterquestions, in clearly formulated rejection criteria, in the ability to stop polished-looking nonsense.

This is also where the overlooked pre-stage of planning sits. As tools get better at turning a request into an implementation plan, they are mostly optimizing the how. The harder work sits one layer earlier: why this problem, why now, and why in this exact shape? Which constraints are real and which are just inherited habit? What would count as success, and what should explicitly not be built? The next-generation loop owner therefore does not just write better specs. They do the harder shaping work first: sharpen the problem, set scope, and close the wrong branches early.

That is why the sentence "the human only reviews now" is misleading. Review sounds like residual work. In reality, that is where the core of responsibility is moving.

Any agentic organization still measuring success primarily in throughput is measuring the wrong thing. More artifacts are often not progress at all. Progress means recognizing faster which artifact should never have been built in the first place.

That is an uncomfortable shift. Production was visible. Coordination was billable. Judgment is harder to measure, harder to standardize, and harder to delegate. That is exactly why it becomes the scarce good.

The Five Constraints Are Still Real

Any clean description of the loop model needs a counterforce. Otherwise it turns into Silicon Valley theater.

Shrivu Shankar names five constraints, and all five matter.

Generalist bound. Not everyone can hold multiple domains together with durable judgment. Some loops become too broad.

Cross-domain taste. A strong first pass is worthless if nobody can assess quality across several layers at once.

Decision capacity ceiling. Even with agents, the number of genuinely good decisions per day remains finite. Hold too many loops and judgment collapses into surface-level pattern matching.

Human touchpoint floor. Not everything can be compressed. Trust, political navigation, conflict resolution, and real accountability remain human.

Bus factor risk. If a loop depends on one person, absence becomes more expensive than in a silo model.

These constraints are not an argument against the essay. They are the reason it does not end as a naive automation story. The pipeline is not collapsing because complexity disappears. It is collapsing because complexity is moving from coordination to judgment.

What This Means for DACH Companies

For German companies, the interesting question is not whether the pattern is real. It is: how do you translate this pattern into organizations that are not starting on a greenfield, that operate with co-determination, and that rarely have the freedom to tear structures apart overnight?

The American version often says: hire loop owners.

The German translation is almost always: turn existing people into owners.

That is slower. But it also has one advantage. It forces articulation.

If roles, and above all the people behind them, cannot simply disappear, then you have to name clearly what function they actually fulfill today. Which part of that is coordination overhead. Which part is real judgment. Which part can be translated into infrastructure. And which part has to remain human.

That does not automatically make co-determination an advantage. But it forces exactly the kind of process archaeology many US texts skip. In Germany you have to answer what a process actually exists for. That is tedious. It is also the difference between redesign and theater. And between hype and a durable chance of success.

That leads to four practical consequences for DACH decision-makers.

First: the old team structure stops looking inevitable. As soon as tasks are describable enough and results are checkable enough, the functional split becomes porous. You no longer necessarily need five specialist roles and four handoffs to get to an outcome. One person can hold the entire loop: understand the problem, work with AI, judge the result. Not everywhere. But in far more places than most organizations are currently willing to admit.

Second: judgment moves from the margins to the center. As long as execution was expensive, review looked like rework. The moment execution becomes cheap, that relationship flips. The scarce resource is no longer who can produce something. The scarce resource is who can tell whether what was produced is any good.

Third: expertise becomes less visible and more valuable at the same time. The specialist of the old world proved their value in direct execution. The specialist of the next phase proves it by externalizing standards, quality criteria, and good judgment so that other people and agents can work with them reliably. Expertise does not disappear. It moves up one layer.

Fourth: machine legibility is not a documentation issue. It is a power issue. Whoever can formulate the context, the criteria, and the rules in a way machines can act on is steering the organization. Whoever cannot will remain stuck in meetings, verbal workarounds, and implicit knowledge. The difference looks technical. In reality, it is political.

Honesty Check

There are at least four ways this essay could be overstating the case.

First: not every pipeline is pointless. Some handoffs do not exist only because of context limits, but because power is deliberately shared, risk is deliberately hedged, or quality is deliberately checked from a second perspective. Anyone reading every handoff as waste has missed the point.

Second: regulated contexts remain regulated. In medicine, finance, automotive, or critical infrastructure, gates do not simply disappear. What changes there is the content and role of the gates, not their existence.

Third: machine legibility is not the same as truth. Just because something has been turned into Markdown, tests, or skill files does not make it good judgment. Bad standards scale just as efficiently as good ones.

Fourth: part of good work remains non-externalizable. Political intuition, moral responsibility, taste at the edge of the new, real trust in conflict. Anyone who believes all of that can be fully turned into infrastructure through cleaner formalization is confusing legibility with humanity.

That is exactly why this essay does not end in a full-automation fantasy. It ends in a more sober picture: human value does not shrink to the places machines "still cannot do." It concentrates in the places that require judgment, responsibility, and direction.

A Simple Diagnostic Tool

If you want to know whether a process is still pipeline-shaped or could already become a loop, three questions are enough.

1. Does this step exist primarily to transfer context between people?

If yes, it is a candidate for compression.

2. Does this step exist primarily to compensate for human working-memory limits?

If yes, it is also a candidate.

3. Does this step exist primarily to secure judgment, trust, or accountability?

If yes, caution is warranted. You may not just be removing friction. You may be removing one of the few valuable human functions in the process.

With those three questions, it becomes surprisingly easy to distinguish which meetings, documents, approvals, and handoffs are infrastructure for the old operating model, and which still serve a real function.
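The three questions can be sketched as a small decision function. The names are hypothetical, and one rule is an assumption of this sketch rather than something the questions state: when a step both transfers context and secures judgment, caution wins over compression.

```python
from enum import Enum


class Verdict(Enum):
    COMPRESS = "candidate for compression"
    CAUTION = "caution: may be a valuable human function"
    UNCLEAR = "no clear answer, look closer"


def diagnose(transfers_context: bool,
             compensates_memory: bool,
             secures_judgment: bool) -> Verdict:
    """Apply the three diagnostic questions to one process step.

    Assumption: question 3 takes precedence, so a step that secures
    judgment, trust, or accountability is never flagged for compression.
    """
    if secures_judgment:
        return Verdict.CAUTION
    if transfers_context or compensates_memory:
        return Verdict.COMPRESS
    return Verdict.UNCLEAR


# Illustrative steps; the answers are invented for the example.
steps = {
    "weekly status meeting": (True, False, False),
    "requirements handoff document": (True, True, False),
    "second-person release approval": (False, False, True),
}
for name, answers in steps.items():
    print(f"{name}: {diagnose(*answers).value}")
```

The answers themselves still require an honest look at why a step exists. The code only makes the order of the questions explicit.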

This is not a transformation roadmap. But it is a better starting point than the hundredth workshop on AI use cases.

The Actual Redesign

The most important observation in all of this may be the simplest one: most companies still talk about AI as if they were buying a new tool. Something you bolt onto the existing organization.

That is probably the wrong level of abstraction.

AI is not just changing the productivity of individual roles. It is changing the reason organizations were sliced this way in the first place. The pipeline of specialist roles, handoffs, alignment rituals, and translation artifacts was a workaround for a world in which execution was expensive and shared context was scarce.

Once those conditions change, the same infrastructure does not simply become more efficient. It becomes questionable.

That does not mean every organization will work in loops tomorrow. It means something simpler: anyone trying to understand the next phase of knowledge work should stop looking only at model capabilities. They should look at the operating model those capabilities require.

The real question is not: What can the agent do?

It is: What kind of organization becomes possible once we stop treating human context limits as laws of nature?
