Enterprise software has run on a simple promise for 20 years: pay a monthly fee, know what you are spending. That model is ending.
Consumption-based pricing for AI tools means the cost of a deployment only becomes visible after the work is done, and a successful rollout can generate a larger invoice than a failed one. The friction is already showing up on the ground. Users of Claude Code have reported unexpected token burn simply by asking the agent a question, only for it to interpret the query as an instruction and start executing.
The meter was running before anyone decided to spend.
The bigger story is structural, and it is one that enterprise finance functions are only beginning to reckon with.
Table of Contents
- The Unpredictability of Agentic Workloads
- The Pitch: Pay for Outcomes, Not Access
- The Answer? Build Agent Spending Caps Into the Architecture
- What the Future of Agent Pricing Might Look Like
The Unpredictability of Agentic Workloads
For two decades, enterprise software budgeting was relatively straightforward. A seat cost a fixed amount, headcount determined spend, and the finance team could plan a year out. AI agents have dismantled that logic.
Alex Bakker, director of primary research and distinguished analyst at ISG, has watched the problem develop across dozens of client engagements, and he identifies two structural breaks. The first is that AI experimentation is closer to R&D than to software licensing, making token consumption behave more like infrastructure spend than seat allocation.
The second is harder to solve. "As the industry is shifting toward agentic workloads, many of which involve recursive iteration, we see the size of the context windows for agentic operations growing unpredictably," he said. "It is difficult to predict ahead of time how many iterations an agent will need to be successful."
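The unpredictability Bakker describes can be made concrete with a small simulation. The sketch below is illustrative only: the per-token rate, tokens per iteration and success probability are all assumptions, not real vendor pricing. It models an agent whose context grows with each recursive iteration and whose iteration count is unknown up front, then shows how widely the cost of a single task can swing.

```python
import random

# Assumed figures -- illustrative only, not real vendor pricing.
PRICE_PER_1K_TOKENS = 0.01    # blended USD rate per 1,000 tokens
TOKENS_PER_ITERATION = 4_000  # average new context + output per loop

def simulate_task_cost(max_iterations=20, success_prob=0.25):
    """Cost of one agentic task whose iteration count is unknown up front."""
    tokens = 0
    for i in range(1, max_iterations + 1):
        # Context grows as prior steps are fed back in (recursive iteration),
        # so each loop is more expensive than the last.
        tokens += TOKENS_PER_ITERATION * i
        if random.random() < success_prob:  # the agent happens to succeed here
            break
    return tokens / 1000 * PRICE_PER_1K_TOKENS

random.seed(7)
costs = [simulate_task_cost() for _ in range(10_000)]
mean = sum(costs) / len(costs)
print(f"mean ${mean:.2f}, min ${min(costs):.2f}, max ${max(costs):.2f}")
```

Even with a fixed per-token price, identical tasks can differ in cost by two orders of magnitude, which is exactly the forecasting problem a per-seat budget never had.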
The mismatch is also visible from the engineering side, said Johnny Halife, CTO at Southworks, a software consultancy that advises enterprises on AI integration. The models, he argues, aren't the problem.
"What breaks first is the financial model around them. Companies are moving from predictable SaaS pricing to consumption models that behave more like cloud infrastructure. You are no longer paying for seats. You are paying for computation, tokens and decisions made by agents," Halife said.
Finance teams are applying forecasting tools designed for a world of fixed costs to a system that charges by the decision, at a volume and speed no human approval process was built to track.
The Pitch: Pay for Outcomes, Not Access
Vendors have reframed the shift in their favor. The pitch: rather than paying for software access, enterprises now pay for outcomes.
Vendors like Intercom, Zendesk and Sierra have led the way, while larger players like Salesforce and Microsoft have moved toward consumption-based models that are closer to — but not quite — true outcome-based pricing.
Microsoft launched a consumption-based Copilot tier in January 2025 that charges organizations by the message. Salesforce introduced flexible Agentforce pricing in May 2025, built around a pay-per-action credit model. The consistent message in both pricing models is that cost now equals value.
The framing doesn't hold up against standard outcome-based pricing logic, said Bakker. "Regardless of the value of the output, without knowing the actual costs, framing it as an outcome versus a license does nothing to increase predictability and forecastability."
In the services industry, outcomes typically mean no charge if the result is not delivered. AI vendors, with a few exceptions, bill regardless of what the compute produced. And even where payment is tied to outcomes, whether a successful outcome has actually been achieved is open to interpretation, according to BCG.
That gap is what concerns Paul Malott, chief executive of Automations24 and a doctoral researcher in digital resource governance. When a user asks a simple question and the agent interprets it as an execution task, compute costs begin compounding before anyone in finance has seen the request.
For Malott, this isn't just a billing quirk but a structural problem: cost is now tied to the behavior of an autonomous system, not to any decision a human made to spend.
The outcomes narrative, in other words, reframes a pricing risk transfer as a strategic upgrade, with enterprises bearing the consumption uncertainty while vendors collect on compute either way.
The Answer? Build Agent Spending Caps Into the Architecture
Most organizations' response has been to reach for better forecasting tools and usage dashboards. Malott said that addresses the wrong layer.
"AI is still being treated as a tool when it actually behaves like an autonomous decision-making system," he said. Malott's answer is governance built directly into the decision layer, with real-time spend caps at every execution point, dynamic routing to the most cost-effective model available, and automated controls that halts a runaway agent before it reaches the balance sheet.
Halife agrees. "The organizations that handle this well treat AI like infrastructure, building observability, guardrails and cost visibility into the architecture from day one."
The problem is that most agent deployments are not being built that way, he continued. AI is being bolted onto existing workflows rather than designed around the consumption model it operates on.
Approaching the governance question from the design phase treats model selection as a core discipline rather than a default, said Jennifer Lendler, founder and managing principal of AI advisory firm Alea Advisors. With around 120 models available for her builds, the selection process is deliberate.
"It can't be a one size fits all, or else you will be wasting tokens and money," Lendler said.
Some of the vendor platforms do have effective guardrails, with controls that pause agent execution before costs escalate, Lendler noted. The gap is that not all enterprise deployments are built with that level of care, and not all vendors make it straightforward to apply.
What the Future of Agent Pricing Might Look Like
Whether consumption pricing is a temporary friction point or a permanent reality may ultimately depend on how quickly enterprise AI reaches the scale that makes variance manageable.
Bakker draws a direct parallel with the early cloud era, noting that in 2010, organizations found infrastructure costs almost impossible to predict because their consumption was lumpy relative to total platform volume. As usage grew, variance smoothed out, reserved instances became viable and forecasting stabilized.
But the analogy has limits. "The challenge with comparing AI tokens to the cloud is that there is an extra layer of uncertainty," Bakker said. With compute, you knew what you bought even if you wasted it. With tokens, there is uncertainty around the volume consumed and whether that consumption produced anything worth the cost.
The optimistic reading is that the industry is passing through an early, uncomfortable phase that will eventually resolve as tools mature and consumption baselines grow. The advisory model that Lendler has built — pricing around value delivered rather than compute consumed — hints at what a more considered market structure might look like.
Malott points to another approach: a hybrid floor-plus-expansion model, in which a predictable baseline covers core access while consumption signals drive expansion conversations rather than surprise invoices.
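The arithmetic of a floor-plus-expansion contract is straightforward, and a toy version makes the appeal clear. Every figure below (the floor, the included credit allowance and the overage rate) is an assumption for illustration, not a description of any vendor's terms.

```python
def hybrid_invoice(usage_credits: int,
                   floor_usd: float = 1_000.0,
                   included_credits: int = 50_000,
                   overage_per_credit: float = 0.03) -> float:
    """Floor-plus-expansion sketch: a fixed baseline covers included usage,
    and only consumption above the allowance is billed as expansion."""
    overage = max(0, usage_credits - included_credits)
    return floor_usd + overage * overage_per_credit

print(hybrid_invoice(30_000))  # under the allowance: just the $1,000 floor
print(hybrid_invoice(80_000))  # 30,000 credits over: floor plus $900 expansion
```

The floor gives finance the fixed number it can plan around, while overage becomes a visible signal that usage is outgrowing the contract, a renewal conversation rather than a surprise invoice.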
The sharpest note of caution comes from Bakker, who resists framing unpredictability as purely a cost problem. "Unmeasured success is at least as common as unforecasted spending," he said, pointing to deployments where AI delivered well beyond expectations and nobody in finance could account for why.
For enterprise teams trying to govern AI spend, that cuts both ways. The invoice may be unpredictable. So, increasingly, is the upside.
Editor's Note: So many unresolved AI questions, so little time:
- Why Agentic AI Projects Fail — The rise of AI agents is reshaping work, but early implementations are running into deep organizational and technical challenges.
- Your AI Workflows Will Outlast the Leaders Who Approved Them — AI agents don't create accountability problems, they inherit them. When autonomous systems outlast the teams that built them, ownership disappears.
- When Copilots Fail: The Risks of Overreliance on AI — AI copilots transform work, but overreliance can lead to mistakes, compliance breaches and lost trust. Discover warning signs and best practices.