The Business Engineer

The CFO’s Guide to the Token Economy

Gennaro Cuofano
Jun 23, 2026

In Tokenomics: The Economics of AI, I explained the dynamics of the whole ecosystem at a fundamental level.

Tokenomics: The Economics of AI (Gennaro Cuofano, May 12)

Every conversation about AI in 2026 — capex, capacity, jobs, geopolitics, valuations, margins — eventually collapses to the same atomic unit. The token. It is simultaneously the unit of cognition the model produces, the unit of compute the data center serves, the unit of price the lab charges, and the unit of value the enterprise extracts.


For two years, AI was a CIO line item and a CEO talking point. In 2026 it became a CFO problem, because the economics finally turned legible — and unforgiving.

The core mechanism is a paradox most finance teams still misread:

The cost of intelligence is collapsing while the cost of deploying intelligence is exploding.

The numbers make the tension concrete:

  • Unit cost is in free-fall. Achieving GPT-4-level reasoning cost roughly $60 per million tokens in early 2024. By the start of 2026, high-efficiency models deliver comparable reasoning for $0.30–$0.75 per million — a decline of over 98% (TokenRing / 2026 Unit Economics Reckoning).

  • Total spend is in free-flight. The average enterprise AI budget grew from ~$1.2M/year in 2024 to ~$7M in 2026, with some Fortune 500 firms reporting monthly inference bills in the tens of millions (Oplexa, AI Inference Cost Crisis 2026).

  • Inference, not training, is now the bill. Inference has grown to roughly 80–90% of total AI compute consumption and ~85% of the enterprise AI budget.
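The arithmetic behind the first two bullets is easy to check. The short Python sketch below uses only the figures quoted above; the variable and function names are illustrative, not taken from the cited sources.

```python
# Figures quoted in the bullets above.
PRICE_EARLY_2024 = 60.00      # USD per million tokens, GPT-4-level reasoning
PRICE_2026_LOW = 0.30         # USD per million tokens, low end of the 2026 range
PRICE_2026_HIGH = 0.75        # USD per million tokens, high end of the 2026 range
BUDGET_2024 = 1.2e6           # USD per year, average enterprise AI budget
BUDGET_2026 = 7.0e6           # USD per year, average enterprise AI budget


def pct_decline(old: float, new: float) -> float:
    """Percentage drop from old to new."""
    return (old - new) / old * 100


# Smallest and largest unit-cost decline implied by the quoted price range.
decline_min = pct_decline(PRICE_EARLY_2024, PRICE_2026_HIGH)
decline_max = pct_decline(PRICE_EARLY_2024, PRICE_2026_LOW)

# How many times larger the average budget became over the same period.
budget_multiple = BUDGET_2026 / BUDGET_2024

print(f"Unit cost decline: {decline_min:.2f}% to {decline_max:.2f}%")
print(f"Budget multiple:   {budget_multiple:.1f}x")
```

Even at the expensive end of the 2026 range the unit cost fell by 98.75%, while the average budget grew almost sixfold.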

The reason cheaper tokens produce bigger bills is structural, and it is the single most important sentence in this guide:

The world moved from a per-query model to a per-workflow model, where one action can trigger dozens of model calls, so token volume grows faster than the price per token falls and total spend rises (Zenskar, Token-Based Pricing: The CFO’s Guide 2026).
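To see how this plays out, here is a minimal sketch of the spend equation: monthly spend equals tokens consumed times price per token, and tokens consumed equals actions times calls per action times tokens per call. Every number in the two scenarios is hypothetical and chosen only for illustration (the $0.50 price simply sits inside the $0.30–$0.75 range quoted earlier); the `Scenario` class is likewise an assumption, not a model from the cited guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    price_per_million_usd: float  # price of one million tokens
    actions_per_month: int        # user or system actions that trigger the model
    calls_per_action: int         # model calls triggered by one action
    tokens_per_call: int          # tokens consumed by one model call

    def monthly_tokens(self) -> int:
        return self.actions_per_month * self.calls_per_action * self.tokens_per_call

    def monthly_spend_usd(self) -> float:
        return self.monthly_tokens() / 1_000_000 * self.price_per_million_usd


# Hypothetical chatbot era: one call per action, expensive tokens.
per_query = Scenario(60.00, 100_000, 1, 2_000)

# Hypothetical agentic era: dozens of calls per action, cheap tokens.
per_workflow = Scenario(0.50, 5_000_000, 40, 4_000)

price_drop_pct = (1 - per_workflow.price_per_million_usd
                  / per_query.price_per_million_usd) * 100
spend_multiple = per_workflow.monthly_spend_usd() / per_query.monthly_spend_usd()

print(f"Per-query spend:    ${per_query.monthly_spend_usd():,.0f}/month")
print(f"Per-workflow spend: ${per_workflow.monthly_spend_usd():,.0f}/month")
print(f"Price fell {price_drop_pct:.1f}%, spend rose {spend_multiple:.1f}x")
```

Under these assumed inputs the price per token falls by about 99% while the monthly bill rises from $12,000 to $400,000, because token volume grows 4,000-fold.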

As companies move from chatbots to thousands of autonomous agentic workflows running 24/7, the line item that used to be a rounding error becomes the budget.

The substrate for everything below: a token is the unit of intelligence consumed; an agent is a process that consumes tokens to produce work. The new job of finance is to manage the ratio between the two — work produced per token burned. That ratio is the gross-margin equation of the agentic era.
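One way a finance team might track that ratio is sketched below. This is an assumption about how to operationalize "work produced per token burned", not a method given in this guide: work is counted as completed tasks, and the agent names, token counts, price, and value per task are all hypothetical.

```python
def tasks_per_million_tokens(tasks_completed: int, tokens_burned: int) -> float:
    """Work produced per token burned, scaled to one million tokens."""
    return tasks_completed / (tokens_burned / 1_000_000)


def cost_per_task_usd(tasks_completed: int, tokens_burned: int,
                      price_per_million_usd: float) -> float:
    """Token cost attributable to one completed task."""
    return tokens_burned / 1_000_000 * price_per_million_usd / tasks_completed


def gross_margin(value_per_task_usd: float, cost_per_task: float) -> float:
    """Share of each task's value left after paying for tokens."""
    return (value_per_task_usd - cost_per_task) / value_per_task_usd


PRICE = 0.50           # hypothetical USD per million tokens
VALUE_PER_TASK = 2.00  # hypothetical USD of value per completed task

# Hypothetical agents: same output, very different token appetite.
agents = {
    "agent_a": {"tasks": 1_200, "tokens": 600_000_000},
    "agent_b": {"tasks": 1_200, "tokens": 2_400_000_000},
}

report = {}
for name, a in agents.items():
    cost = cost_per_task_usd(a["tasks"], a["tokens"], PRICE)
    report[name] = {
        "tasks_per_m_tokens": tasks_per_million_tokens(a["tasks"], a["tokens"]),
        "cost_per_task": cost,
        "gross_margin": gross_margin(VALUE_PER_TASK, cost),
    }
    print(name, report[name])
```

With these inputs both agents complete the same number of tasks, but the second burns four times the tokens, so its token cost per task is four times higher and its margin is lower at the same price and value per task.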

What a CFO Should Start Doing ASAP
