A non-technical leader’s guide to letting agents run real work without getting burned.
By Kristian Kabashi
Here is a number that should change how you think about all of this. By 2026, AI was writing roughly half of all new code in the world. A staggering achievement. And in the same period, churn, the amount of that code thrown away and rewritten shortly afterward, jumped by more than forty percent. Read those two facts together and you have the whole lesson of this piece. Speed without judgment is not progress. It is expensive rework with a confident face.
That lesson is not really about code. It is about what happens the moment you let a machine run real work in your company. The agents are fast, capable and tireless, and they are also, sometimes, confidently wrong. So the question every leader is fumbling toward is the right one to ask. How do you trust a machine without getting burned?
I write about this under a name I gave the idea, Blank Collar, and the answer starts by throwing out the question most people ask first.
Trust is a dial, not a switch
Almost everyone frames this as a yes or no. Can I trust AI or not? That framing will hurt you, because it leads to one of two mistakes: either banning the thing and falling behind, or handing it the keys and getting burned.
Trust is not a switch you flip for AI as a whole. It is a dial you set for each specific action, and the setting depends on two things and only two things: how big the stakes are if it goes wrong, and how reversible it is. A first draft of an internal memo is low stakes and completely reversible, so the dial goes to full autonomy. Let it run. Wiring money to a new supplier is high stakes and hard to undo, so the dial goes to the other end, and a human approves it every single time. Plot stakes against reversibility and everything in your company sits somewhere on that grid. The entire skill is matching the leash to the consequences.
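
For the team that has to implement the dial, here is a minimal sketch of the idea in Python. The three autonomy levels, the "low", "medium" and "high" labels, and the function name set_dial are all illustrative assumptions, not something the approach prescribes. The only part taken from the text is the rule itself: stakes and reversibility decide the setting.

```python
from enum import Enum


class Autonomy(Enum):
    FULL = "run without review"
    SAMPLED = "run, then spot-check a sample"
    APPROVAL = "a human approves every time"


def set_dial(stakes: str, reversible: bool) -> Autonomy:
    """Map one action to an autonomy level from its stakes and reversibility.

    The labels and cut-offs here are hypothetical; each company would
    choose its own.
    """
    if stakes not in {"low", "medium", "high"}:
        raise ValueError(f"unknown stakes level: {stakes!r}")
    # High stakes, or anything that cannot be undone: a human approves first.
    if stakes == "high" or not reversible:
        return Autonomy.APPROVAL
    # Medium stakes but reversible: let it run, then check a sample.
    if stakes == "medium":
        return Autonomy.SAMPLED
    # Low stakes and reversible: full autonomy.
    return Autonomy.FULL


# The two examples from the text:
memo_draft = set_dial("low", reversible=True)         # Autonomy.FULL
supplier_payment = set_dial("high", reversible=False)  # Autonomy.APPROVAL
```

Note that in this sketch an irreversible action goes to human approval even when the stakes look low, which errs on the side of caution.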

What always needs a human
Some actions sit so far into the high-stakes, hard-to-reverse corner that they should never run fully on their own, no matter how good the agent looks. It is worth naming them plainly so your team has a bright line.
Money moving out of the company. Anything touching personal or sensitive data. Changes to policy, legal terms, or anything with regulatory weight. Anything that genuinely cannot be undone. And anything that speaks to the outside world in your company’s name. For all of these, you keep what the field calls a human in the loop, which has a precise meaning worth getting right. It is not a nervous person rubber-stamping things. It is a qualified person, with the real context, the authority to say no, and a defensible reason for the decision, placed at the exact point where the stakes are highest. This is also, increasingly, the law. Regulations like the EU AI Act now require meaningful human oversight for high-risk systems, so this is not just prudent, it is becoming mandatory.

How to actually trust safely
So here is the practical method, the thing you can put in place this quarter, in five moves.
Tier your actions first. Sort the work your agents do into low, medium and high tiers by stakes and reversibility, and decide the autonomy level for each tier up front, so nobody is improvising in the moment.
Keep humans on the gate for the high tier, with a real approval step before anything irreversible happens.
Verify by sampling, not by faith. You do not check every output, because that defeats the point, but you check a sample and you watch a few honest metrics: how often it gets things wrong, how often it escalates, how much of its work gets redone, and how fast you catch and fix mistakes.
Make everything traceable. Log every agent action, so you can always answer the question that matters after something goes wrong: why did it do that?
Calibrate over time. Trust is earned, not granted. As an agent proves itself on a tier, widen its leash. The day it slips, tighten it. That is exactly how you would manage a talented new hire, and it is exactly right here.
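
If it helps to hand your technical team something concrete, the sampling, logging and calibration moves can be sketched in a few lines of Python. Everything named here is a hypothetical illustration: the ActionRecord fields, the 5 percent error threshold, and the rule of doubling or halving the review rate are assumptions chosen for the example, not recommendations. Time to catch and fix mistakes is left out for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class ActionRecord:
    """One logged agent action. All field names are hypothetical."""
    action_id: str
    tier: str                 # "low", "medium" or "high"
    reason: str               # the agent's stated rationale, kept for audits
    escalated: bool = False   # did the agent hand this to a human?
    was_wrong: bool = False   # filled in by the human reviewer
    was_redone: bool = False  # filled in by the human reviewer


def pick_sample(log, rate, seed=None):
    """Choose a random fraction of logged actions for human review."""
    if not log:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(log) * rate))  # always review at least one action
    return rng.sample(log, min(k, len(log)))


def metrics(reviewed):
    """Error, escalation and rework rates over the reviewed records."""
    n = len(reviewed)
    if n == 0:
        return {"error_rate": 0.0, "escalation_rate": 0.0, "rework_rate": 0.0}
    return {
        "error_rate": sum(r.was_wrong for r in reviewed) / n,
        "escalation_rate": sum(r.escalated for r in reviewed) / n,
        "rework_rate": sum(r.was_redone for r in reviewed) / n,
    }


def recalibrate(sample_rate, error_rate, threshold=0.05):
    """Tighten the leash (review more) after slips, widen it otherwise.

    The threshold and the doubling/halving rule are assumptions.
    """
    if error_rate > threshold:
        return min(1.0, sample_rate * 2)
    return max(0.01, sample_rate / 2)
```

In use, a team would run pick_sample over each period's log, have a person mark the sampled records, feed the resulting error rate into recalibrate, and carry the new review rate into the next period. The reason field is what lets someone answer "why did it do that?" after the fact.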
The model is rarely the risk
Here is the part that should reassure you and focus you at the same time. When these deployments fail, it is usually not because the model was stupid. Forecasters such as Gartner are blunt about this. By the end of the decade, something like half of all agent project failures are expected to come not from weak intelligence but from governance gaps and systems that do not talk to each other properly. The intelligence is rarely the weak point. The guardrails are.
Which means trusting a machine well is not really a technical problem you need a PhD to solve. It is a management problem you already know how to solve. You have onboarded a brilliant, fast, slightly overconfident new hire before. You did not hand them the company credit card and the legal seal on day one. You gave them small reversible things, watched how they did, and widened their authority as they earned it. Notice that business leaders already do this by instinct with AI. Surveys show they happily trust agents to analyse data, and grow far more cautious about letting them move money or deal with people unsupervised. That instinct is correct. The work is just to make it a system instead of a gut feeling.

The reframe
So stop asking whether you can trust AI. Start asking how much autonomy a specific action has earned. Set the dial by stakes and reversibility. Keep a real human on the irreversible, expensive, public, regulated decisions. Verify by sampling and watch the numbers. Log everything. And widen the leash only as trust is earned. Do that and you get the speed without the surge in rework, the upside without the disaster.
And once the machines are safely handling the execution, under your watch, on the right leash, you arrive at the most human question of all, the one no amount of automation can answer for you. When anyone can make anything, instantly and for almost nothing, what makes yours the one worth choosing? The answer is the rarest and most valuable thing left, and it is where this whole series has been heading. Taste.
Kristian Kabashi writes Blank Collar, a field guide for executives rethinking how their companies are built. More at kristiankabashi.com.
Sources: Atlan, AI agent risks and guardrails 2026 · Strata, human-in-the-loop AI oversight 2026 · Gartner via Atlan, agent governance predictions



