Capital Markets · deep-dive · by Amir

AI Coding Has Entered a Token Subsidy War

Why OpenAI, Anthropic, and Cursor are bundling tokens into subscriptions, leaning on API credits, and betting that efficiency gains can outrun the cost of agentic coding.

The subsidy

On March 5, 2026, Forbes reported that Cursor had internally estimated some high-end Claude Code subscriptions could consume far more in compute than the monthly fee implied.[1] The exact figures in that story rely on unnamed sources, so they should be treated as reported claims rather than audited fact. But the broader point is hard to miss: AI coding has entered a subsidy war.

That war exists because the product changed. A chatbot that answers a short prompt is one thing. An agent that reads a repository, keeps long context in memory, calls tools, runs tests, revises its own code, and stays active for hours is another. OpenAI bills output tokens separately, notes that reasoning tokens count as output, and charges separately for built-in tools. Anthropic similarly prices long-context usage and tools separately, and its Claude Code documentation says cost rises with context size, tool overhead, and the number of active subagents.[5][8][11]
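A rough sketch of why the two products bill so differently. The rate card below is a hypothetical placeholder, not OpenAI's or Anthropic's actual pricing, and the flat per-call tool fee is a simplifying assumption. The only rule taken from the text is that reasoning tokens are billed as output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical per-unit prices; not any vendor's real rate card."""
    input_per_mtok: float   # dollars per million input tokens
    output_per_mtok: float  # dollars per million output tokens
    tool_call: float        # dollars per built-in tool call (assumed flat)


def session_cost(rates, input_tokens, visible_output_tokens,
                 reasoning_tokens, tool_calls):
    """Estimate the metered cost of one session in dollars."""
    # Reasoning tokens count as output, so they are billed at the output rate.
    billed_output = visible_output_tokens + reasoning_tokens
    return (input_tokens / 1e6 * rates.input_per_mtok
            + billed_output / 1e6 * rates.output_per_mtok
            + tool_calls * rates.tool_call)


# Illustrative numbers only.
rates = Rates(input_per_mtok=2.0, output_per_mtok=10.0, tool_call=0.01)

# A short chatbot exchange: small prompt, small answer, no tools.
chat = session_cost(rates, input_tokens=2_000, visible_output_tokens=500,
                    reasoning_tokens=0, tool_calls=0)

# A long agent run: repository context re-read many times, heavy reasoning, many tool calls.
agent = session_cost(rates, input_tokens=3_000_000, visible_output_tokens=100_000,
                     reasoning_tokens=400_000, tool_calls=200)

print(f"chat:  ${chat:.3f}")
print(f"agent: ${agent:.2f}")
```

Under these made-up inputs the agent run costs over a thousand times more than the chat exchange, and most of the billed output is reasoning the user never sees.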

The result is a market in which the headline subscription price often tells you less than the usage pattern underneath it.

The mix

The most important structural question in end-user AI is no longer what a subscription should cost. It is how much usage a company can bundle before metering has to reappear.

OpenAI includes Codex in paid ChatGPT plans, but it keeps API billing separate and lets Business customers add credits as needed.[2][3][4] Anthropic includes Claude Code in Pro and Max, while Team and Enterprise plans can switch to extra usage billed at standard API rates.[9][10] Cursor has made the underlying economics even more explicit: its individual plans include a fixed amount of API-priced usage, and its Teams plan moved from fixed per-request pricing to usage priced off the model providers' API rates plus a platform markup.[14][15][16]

This is not three companies trying unrelated experiments. It is three companies converging on the same answer from different directions. Subscriptions are the adoption layer: they reduce friction, flatten buying decisions, and help users build habits. APIs, credits, and overages are the economic truth layer: they expose the marginal cost of heavy usage. The durable pricing model is looking less like "unlimited AI" and more like software seats with embedded cloud spend.
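The seat-plus-metering shape described here can be sketched in a few lines. Everything numeric is hypothetical: the seat fee, the included allowance, and the markup are placeholders rather than any company's published terms, and treating API-priced usage as a stand-in for the vendor's own cost is an assumption made only to show the direction of the effect.

```python
def monthly_bill(seat_fee, included_usage, api_priced_usage, markup=0.0):
    """Bill for one seat: flat fee plus overage at API rates times (1 + markup).

    All amounts are in dollars of API-priced usage per month.
    """
    overage = max(0.0, api_priced_usage - included_usage)
    return seat_fee + overage * (1.0 + markup)


def vendor_margin(bill, api_priced_usage):
    """Assumption: the vendor's cost roughly equals the API-priced usage."""
    return bill - api_priced_usage


# Hypothetical plan: $20 seat with $20 of included usage.
light_user = monthly_bill(seat_fee=20.0, included_usage=20.0, api_priced_usage=5.0)

# The same heavy user under a flat plan (no overage billed) and a hybrid plan.
heavy_usage = 220.0
flat_heavy = 20.0
hybrid_heavy = monthly_bill(seat_fee=20.0, included_usage=20.0,
                            api_priced_usage=heavy_usage, markup=0.2)

print(light_user)                                # light user never sees a meter
print(vendor_margin(flat_heavy, heavy_usage))    # flat plan: deeply negative
print(vendor_margin(hybrid_heavy, heavy_usage))  # hybrid plan: positive
```

The light user pays the seat fee and nothing else, while the heavy user flips from a large loss under flat pricing to a small gain once the overage is metered.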

The efficiency question

Will models get dramatically more efficient? Yes. They already have.

Stanford HAI reported that the inference cost for GPT-3.5-level performance fell more than 280-fold between November 2022 and October 2024.[17] Epoch found that inference prices at fixed performance levels have been falling between 9x and 900x per year depending on the task, with a median around 50x annually.[18] Alphabet said it reduced Gemini serving unit costs by 78 percent over 2025. OpenAI said GPT-4.1 was 26 percent less expensive than GPT-4o for median queries and later introduced GPT-5.1 with adaptive reasoning and lower-cost caching for simpler tasks.[19][6][7]
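To put the Stanford figure on the same per-year footing as the Epoch range, a multi-month decline can be annualized. This is a back-of-the-envelope sketch that assumes a constant rate of decline and counts November 2022 to October 2024 as 23 months. Since the report says "more than 280-fold," the result is a floor rather than a point estimate.

```python
def annualized_factor(total_factor, months):
    """Convert a total decline factor over `months` into a per-year factor.

    Assumes the decline compounds at a constant rate over the whole period.
    """
    if months <= 0:
        raise ValueError("months must be positive")
    return total_factor ** (12.0 / months)


# 280-fold over an assumed 23-month span (November 2022 to October 2024).
per_year = annualized_factor(280.0, 23)
print(f"roughly {per_year:.1f}x cheaper per year")
```

On those assumptions the 280-fold drop works out to roughly 19x per year, which sits inside Epoch's 9x-to-900x range and below its median of about 50x.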

That is the good news. The harder news is that efficiency gains do not automatically simplify the business model, because cheaper models invite more ambitious use. Longer context windows invite bigger prompts. Better reasoning enables longer chains of work. Better tool use encourages more automation. Anthropic's Sonnet 4.6 added a 1 million-token context window in beta and highlighted token-efficiency features such as compaction and tool filtering.[12] Those features matter precisely because usage is moving toward bigger and more autonomous workloads.
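The tension is simple arithmetic: cost per task is unit price times tokens consumed, so a price cut lowers spend only if workloads grow more slowly than prices fall. The figures below are purely hypothetical and are not drawn from any of the sources.

```python
def cost_per_task_multiplier(price_decline_factor, tokens_per_task_growth):
    """New cost per task as a multiple of the old cost.

    price_decline_factor: how many times cheaper each token became.
    tokens_per_task_growth: how many times more tokens each task now uses.
    """
    if price_decline_factor <= 0:
        raise ValueError("price_decline_factor must be positive")
    return tokens_per_task_growth / price_decline_factor


# Hypothetical: tokens get 10x cheaper in both scenarios.
modest_growth = cost_per_task_multiplier(10.0, 4.0)    # workloads grow 4x
agentic_growth = cost_per_task_multiplier(10.0, 25.0)  # workloads grow 25x

print(modest_growth)   # below 1.0: each task got cheaper
print(agentic_growth)  # above 1.0: each task got more expensive
```

A result below 1.0 means the efficiency gain outran the growth in usage, and a result above 1.0 means the opposite.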

So the answer is not that performance will stop mattering and efficiency will take over. Efficiency is becoming part of performance.

Capex and tokenomics

The usual argument here splits into two camps. Either the economics are difficult because token pricing is structurally unfavorable, or they are manageable because the largest players can spend enough capital to make the problem go away.

It is both.

Massive capital spending is what gives the biggest players room to keep cutting prices, bundling usage, and absorbing periods of ugly unit economics. Alphabet said its 2026 capital expenditures are expected to be between $175 billion and $185 billion. Meta guided to $115 billion to $135 billion. Microsoft said cloud gross margin was pressured by continued investments in infrastructure and growing product usage.[19][20][21]

But the spread of credits, rate limits, premium seats, and usage-based overages is a reminder that raw tokenomics still matter at the heavy-user edge. Anthropic's pricing rises for prompts above 200,000 input tokens. Claude Code's own documentation says team usage averages roughly $100 to $200 per developer per month, with large variance. Cursor says daily agent users typically consume more than its base plan includes, while power users often exceed $200 per month.[8][11][15]

If frontier-agent usage were already cleanly economical, companies would not be rebuilding their products around hybrid billing. They are doing that because the marginal economics are still uneven and spiky.

OpenAI is not the whole story

It is tempting to tell this as a story about OpenAI falling behind. The source base does not really support that.

A more defensible reading is that Anthropic enjoyed a visible stretch of coding momentum, especially after Claude 3.7 Sonnet and Claude Code in February 2025, followed by Sonnet 4.6 in February 2026.[13][12] OpenAI's response was not to abandon performance for efficiency. It was to pursue both at once: GPT-4.1 in April 2025, Codex upgrades in September 2025, GPT-5.1 for developers in November 2025, and the Codex app in February 2026.[6][4][7][3]

So if the question is whether OpenAI started gearing toward efficiency, the answer is clearly yes. The more important point is that everyone else did too. The labs are not choosing between being smart and being cheap. They are trying to improve the performance-to-cost ratio fast enough that subscriptions remain attractive while enterprise and API revenue make the math survivable.

Where pricing goes

The likely future of end-user AI pricing is not pure token billing and not truly unlimited subscriptions.

It is hybrid pricing that hides complexity for average users and restores economic discipline for heavy ones: seats plus included usage, credit top-ups, enterprise pools, premium tiers for higher autonomy, and differentiated pricing for context length, reliability, or background execution. The meter does not disappear. It just becomes less visible until the user becomes expensive enough to meter more precisely.

That is why the subscription/API mix matters so much. Subscriptions buy distribution. APIs discover the real price. Enterprise plans do the political work of making that price easier to tolerate.

The companies that win this market will not just have the best model. They will have the best performance-to-cost-to-distribution loop. Right now, the subsidy war is how that loop gets built.

Sources

  [1] Cursor Goes To War For AI Coding Dominance. Forbes (accessed 2026-03-10)
  [2] ChatGPT pricing. OpenAI (accessed 2026-03-10)
  [3] Introducing the Codex app. OpenAI (accessed 2026-03-10)
  [4] Introducing upgrades to Codex. OpenAI (accessed 2026-03-10)
  [5] API Pricing. OpenAI (accessed 2026-03-10)
  [6] Introducing GPT-4.1 in the API. OpenAI (accessed 2026-03-10)
  [7] Introducing GPT-5.1 for developers. OpenAI (accessed 2026-03-10)
  [8] Pricing. Anthropic (accessed 2026-03-10)
  [9] Claude Code. Anthropic (accessed 2026-03-10)
  [10]
  [11] Manage costs effectively. Anthropic (accessed 2026-03-10)
  [12] Introducing Claude Sonnet 4.6. Anthropic (accessed 2026-03-10)
  [13] Claude 3.7 Sonnet and Claude Code. Anthropic (accessed 2026-03-10)
  [14] Pricing. Cursor (accessed 2026-03-10)
  [15] Rate limits and included usage. Cursor (accessed 2026-03-10)
  [16] Updates to Teams pricing. Cursor (accessed 2026-03-10)
  [17] AI Index Report 2025. Stanford HAI (accessed 2026-03-10)
  [18]
  [19] Q4 2025 Alphabet Earnings Conference Call. Alphabet (accessed 2026-03-10)
  [20]
  [21] FY26 Q2 performance. Microsoft (accessed 2026-03-10)