AI Strategy | June 8, 2026

Why AI Pilots Don't Move EBITDA (and What Does)

Most AI pilots produce spend, not margin. The difference between a pilot and a production system is not the model. It is adoption tied to a specific workflow metric.

A pilot proves a model can work. A production system proves your team will use it. Only the second shows up on the income statement.

If you run a mid-market company or sit on a PE operating team, you have probably seen the pattern. A vendor demo lands well. A small team runs a pilot. The pilot technically succeeds. Six months later the tool is shelfware, the spend is real, and the operating metrics have not moved. The board asks what changed. The honest answer is nothing.

This is not an AI problem. It is an operating problem that AI happens to expose.

A pilot is designed to answer the wrong question

Most pilots are scoped to answer one question: can the model do the task? That is the easy question. With current models, the answer is almost always yes. The model can summarize the document, extract the invoice fields, draft the reply, flag the exception.

The question that moves EBITDA is different: will the people who own this workflow run it this way every day, at volume, after the consultants leave? A pilot is structured to avoid that question. It runs with a motivated subset of the team, on a clean slice of the work, with someone watching. None of those conditions exist in production.

So the pilot succeeds and the rollout fails, and everyone blames the model.

Spend is easy to create. Leverage is not.

There are two ways an AI initiative can affect the P&L. It can add cost, or it can create operating leverage. Pilots are very good at the first and structurally bad at the second.

Operating leverage means the same team handles more volume, or the same work takes fewer hours, or a reporting cycle that took eleven days takes four. It shows up in headcount productivity, in cycle times, and eventually in margin. It is measurable, and it is measurable against a baseline you set before you started.

A pilot rarely sets that baseline. It measures model accuracy, not hours recovered. So even when a pilot works, there is no before-and-after a CFO can defend. You cannot improve what you never measured, and you cannot report what you cannot quantify.
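To make the before-and-after concrete, here is a minimal sketch of the arithmetic in Python. The eleven-day and four-day cycle figures come from the example above; the hours, hourly cost, and every name in the code (`WorkflowSnapshot`, `pct_reduction`, `annual_savings`) are hypothetical placeholders, not figures from any real engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowSnapshot:
    """One measurement of a workflow. Field names are illustrative assumptions."""
    cycle_days: float          # elapsed days per cycle, e.g. a monthly close
    hours_per_cycle: float     # staff hours spent per cycle
    loaded_hourly_cost: float  # assumed fully loaded cost per staff hour


def pct_reduction(before: float, after: float) -> float:
    """Percentage reduction relative to the baseline value."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before * 100.0


def annual_savings(baseline: WorkflowSnapshot,
                   current: WorkflowSnapshot,
                   cycles_per_year: int) -> float:
    """Hours recovered per cycle, priced at the baseline hourly cost."""
    hours_recovered = baseline.hours_per_cycle - current.hours_per_cycle
    return hours_recovered * baseline.loaded_hourly_cost * cycles_per_year


# Cycle days are from the article's example; hours and cost are made up.
baseline = WorkflowSnapshot(cycle_days=11, hours_per_cycle=320, loaded_hourly_cost=85)
current = WorkflowSnapshot(cycle_days=4, hours_per_cycle=200, loaded_hourly_cost=85)

print(f"Cycle time reduction: {pct_reduction(baseline.cycle_days, current.cycle_days):.1f}%")
print(f"Annual savings: ${annual_savings(baseline, current, cycles_per_year=12):,.0f}")
```

None of this works without the `baseline` snapshot. If it was never recorded at kickoff, there is nothing to subtract from.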

What actually moves the number

The initiatives that move operating metrics share a few traits. None of them are about the model.

They start from one workflow with visible cost. Not a platform. Not a strategy. One process where the hours and the dollars are obvious: the monthly close, invoice processing, intake routing, reconciliation, customer-operations follow-up. Pick the one with the clearest payoff and the cleanest baseline.

They set a baseline at kickoff. Before any system is built, you write down what the workflow costs today in hours, days, and headcount. That number is the whole point. It is what you measure against at close, and it is what you put in front of a board.

They treat adoption as the deliverable. The system is not done when it works. It is done when the team is using it at volume and the usage data proves it. That means building around the stack people already work in, documenting the new process, and training the people who run it. Adoption is harder than automation, and it is where most of the value lives.

They build for the operator, not the demo. A production system fits the messy reality of how the business actually runs. That is why it keeps running after the engagement ends, and why the metric keeps holding.
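"The usage data proves it" can be turned into a simple check. The sketch below, in Python, assumes you can count how many work items went through the new system each week versus the total handled. The 80 percent threshold, the four-week window, and all names (`WeekUsage`, `adoption_rate`, `is_adopted`) are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class WeekUsage:
    """Weekly volume for one workflow. Field names are hypothetical."""
    week: str
    items_total: int       # all items the team handled that week
    items_via_system: int  # items handled through the new system


def adoption_rate(usage: WeekUsage) -> float:
    """Share of the week's volume that ran through the system."""
    if usage.items_total == 0:
        return 0.0
    return usage.items_via_system / usage.items_total


def is_adopted(weeks: Sequence[WeekUsage],
               threshold: float = 0.8,
               sustained_weeks: int = 4) -> bool:
    """True only if each of the most recent weeks meets the threshold.

    Both defaults are assumptions; set them per workflow at kickoff.
    """
    if len(weeks) < sustained_weeks:
        return False
    recent = weeks[-sustained_weeks:]
    return all(adoption_rate(w) >= threshold for w in recent)


# Made-up volumes for illustration only.
history = [
    WeekUsage("W1", items_total=100, items_via_system=40),
    WeekUsage("W2", items_total=100, items_via_system=85),
    WeekUsage("W3", items_total=100, items_via_system=90),
    WeekUsage("W4", items_total=100, items_via_system=88),
    WeekUsage("W5", items_total=100, items_via_system=92),
]
print("Adopted:", is_adopted(history))
```

Requiring the rate to hold across consecutive weeks, rather than in a single good week, is the point: a pilot can hit the number once with someone watching, while production has to keep hitting it.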

The question to ask before the next pilot

Before you fund another pilot, ask one thing: if this works, what specific operating number changes, and how will we measure it before and after?

If the team cannot answer that in a sentence, you are about to buy spend, not leverage. If they can, you are no longer running a pilot. You are running a production project with a defined outcome, and that is the only kind of AI work that reaches the income statement.

That shift, from "can the model do it" to "which workflow metric moves and how do we prove it," is the entire difference between an AI initiative that stalls and one a sponsor can point to before exit.
