Turning $100 into $1,569 — A Paper on ‘Token Inflation’ Reveals the Need to Question AI Vendor Pricing

By Kai

Turning $100 into $1,569. You Can’t Verify It.

Anthropic’s valuation has reached $965 billion, surpassing OpenAI and standing at the pinnacle of the AI industry.

However, that’s not the main point of this article.

The real issue lies in the structural flaws of the “token billing” system that supports this enormous amount of money. A recently published paper pointed out that “Token Inflation” by AI providers is technically possible, and that it is nearly impossible for users to verify it.

To put it bluntly, there is no way to confirm whether the API fees your company pays every month are truly accurate.

This is a matter that cannot be overlooked by small and medium-sized enterprises (SMEs).

The Token Billing Mechanism and the ‘Invisible Black Box’

First, let’s clarify the premise. AI services such as ChatGPT, Claude, and Gemini bill API usage in units called “tokens.” One token is approximately four characters in English and about one to two characters in Japanese. Fees are calculated from the number of tokens in both the input and the output.

For example, Claude 3.5 Sonnet costs $3 per million input tokens and $15 per million output tokens. GPT-4o costs $2.50 per million input tokens and $10 per million output tokens. At first glance, this appears to be straightforward pricing.
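To make the pricing concrete, here is a minimal sketch of how a per-request fee follows from these rates. The dictionary keys are illustrative labels rather than official API model identifiers, and the request size in the example is hypothetical.

```python
# Prices in USD per million tokens, as quoted in the article.
# The keys are illustrative labels, not official API model IDs.
PRICES = {
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
    "gpt-4o": {"input": 2.50, "output": 10.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request under per-million-token pricing."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 1,200 input tokens and 400 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 1200, 400):.4f}")
```

Note that both token counts are inputs to the formula. The fee is only as trustworthy as those two numbers.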

However, the problem is that there is no way for users to verify whether the token count is actually accurate.

The paper points out the following:

  • The model’s tokenizer (the algorithm that splits text into tokens) is either proprietary or frequently changed.
  • Users cannot see how many tokens were consumed internally during the inference process.
  • The number of tokens included in the API response is based on “self-reporting,” with no mechanism for third-party auditing.

In other words, if the provider says, “This request used 500 tokens,” users have no choice but to believe it. Whether the true figure was 300 tokens, or whether 2,000 tokens were consumed internally in the inference chain, users have no way of knowing.

The simulation in the paper demonstrated that up to 1,469% inflation is technically possible, meaning a $100 bill could become $1,569.
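The relationship between the two headline figures is simple arithmetic: an inflation rate of 1,469% adds 14.69 times the true cost on top of the original bill. A short sketch (the function name is hypothetical):

```python
def inflated_bill(true_cost: float, inflation_pct: float) -> float:
    """Return the billed amount when the true cost is inflated by inflation_pct percent."""
    return true_cost * (1 + inflation_pct / 100)


if __name__ == "__main__":
    # The article's figures: a $100 true cost with 1,469% inflation.
    print(f"${inflated_bill(100, 1469):,.2f}")
```

Put differently, the billed amount is about 15.7 times the true cost.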

Inflation Is Not Just a Matter of Malice

It’s important to clarify that this is not a claim that “AI vendors are intentionally overcharging.”

The essence of the problem is that the billing system as a whole remains unverifiable, and this has been left unaddressed.

Let’s outline some scenarios that could realistically occur:

1. “Silent Price Increases” Due to Tokenizer Changes
Every time the model is updated, the tokenizer may change as well. A text that was processed as 200 tokens in an old version could become 280 tokens in a new version — such changes can happen without notice. Although the price per token remains the same, it effectively results in a 40% price increase.

2. Internal Token Consumption in the Chain of Thought
Reasoning models such as OpenAI’s o1 or o3 “think” before answering, consuming a large number of tokens internally before generating a response. Users have no means of assessing whether the amount charged for these internal reasoning tokens is reasonable. When a model reports that it “thought it through,” nobody outside can tell whether that cost 50 tokens or 5,000.

3. Bloating of System Prompts
When you call the API, the provider may automatically insert its own system prompts, such as safety filters and operational instructions. If these system prompts grow, every request consumes extra tokens you did not intend to use, and those tokens are also billed.

None of these can be outright labeled as “fraud.” However, they create a structure where costs inflate in areas beyond user control.
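The first scenario can be quantified. The sketch below computes the effective price increase from a change in token count, and applies the same measure to a fixed set of sample texts under two tokenizers. The counter functions passed in are placeholders for whichever tokenizers you are comparing. They are assumptions for illustration, not any vendor's real tokenizer.

```python
from typing import Callable, Sequence


def effective_price_increase(old_tokens: int, new_tokens: int) -> float:
    """Return the fractional cost increase when the per-token price is unchanged."""
    if old_tokens <= 0:
        raise ValueError("old_tokens must be positive")
    return (new_tokens - old_tokens) / old_tokens


def tokenizer_drift(
    samples: Sequence[str],
    count_old: Callable[[str], int],
    count_new: Callable[[str], int],
) -> float:
    """Compare total token counts for the same texts under two tokenizers."""
    old_total = sum(count_old(text) for text in samples)
    new_total = sum(count_new(text) for text in samples)
    return effective_price_increase(old_total, new_total)


if __name__ == "__main__":
    # The article's example: 200 tokens before the update, 280 after.
    print(f"{effective_price_increase(200, 280):.0%}")
```

Keeping a small, fixed set of representative prompts and re-running a check like this after each model update is one way to notice a silent increase.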

For SMEs, This Is Not Just a Matter of a Few Thousand Yen a Month

You might think, “We don’t use it that much, so it doesn’t concern us.”

But consider this: when local SMEs start to seriously integrate AI into their operations, API usage can skyrocket.

For example, consider the following cases:

  • Customer support chatbot: 5,000 inquiries per month × average 800 tokens (input and output combined) = 4 million tokens
  • Automatic summarization of daily reports: 50 employees × 20 business days × 1,500 tokens = 1.5 million tokens
  • Automatic generation of estimates: 200 cases per month × 2,000 tokens = 400,000 tokens

This totals 5.9 million tokens. While it may seem like a few dozen dollars a month at the GPT-4o level, if multiple models are used, inference models are utilized, or image processing is added, the monthly cost could balloon to several hundred to several thousand dollars. Over a year, that amounts to hundreds of thousands to several million yen.
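The totals above can be reproduced with a few lines. Because the article gives only combined input and output counts, this sketch brackets the GPT-4o cost between two extremes: every token billed at the input rate, and every token billed at the output rate. The variable and function names are illustrative.

```python
# Monthly token volumes from the three example workloads.
MONTHLY_TOKENS = {
    "support_chatbot": 5_000 * 800,            # inquiries x tokens each
    "daily_report_summaries": 50 * 20 * 1_500, # employees x days x tokens
    "estimate_generation": 200 * 2_000,        # cases x tokens each
}

# GPT-4o prices in USD per million tokens, as quoted in the article.
GPT4O_INPUT_PRICE = 2.50
GPT4O_OUTPUT_PRICE = 10.00

TOTAL_TOKENS = sum(MONTHLY_TOKENS.values())


def monthly_cost_bounds(total_tokens: int) -> tuple[float, float]:
    """Return (low, high) USD bounds: all tokens as input vs. all as output."""
    millions = total_tokens / 1_000_000
    return millions * GPT4O_INPUT_PRICE, millions * GPT4O_OUTPUT_PRICE


if __name__ == "__main__":
    low, high = monthly_cost_bounds(TOTAL_TOKENS)
    print(f"{TOTAL_TOKENS:,} tokens, between ${low:.2f} and ${high:.2f} per month")
```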

What if an “invisible markup” of 20% to 30% sits on top of that? On an annual bill of several hundred thousand yen or more, that means tens of thousands to hundreds of thousands of yen disappearing each year. For an SME, the upper end of that range starts to approach the cost of a part-time employee.

What Anthropic’s $965 Billion Valuation Means

Let’s return to the topic of Anthropic.

$965 billion. Approximately 140 trillion yen. More than three times Toyota’s market capitalization. This valuation is based on expectations for future API revenues.

In other words, investors are betting on the premise that companies around the world will continue to pay for tokens.

Viewed dispassionately, the opacity of token billing is a problem that vendors have no incentive to fix. If billing were more transparent, users would optimize their usage and cut unnecessary token consumption, which would reduce vendor revenue.

To justify a valuation of $965 billion, token consumption must continue to rise. Users need to be aware of this structural conflict of interest.

So, What Should SMEs Do?

It’s not enough to simply say, “Let’s demand transparency.” Here are actionable steps you can take starting today.

1. Create a System to Measure Token Consumption Yourself

Don’t take the token count in the API response at face value. Use an open-source tokenizer (such as tiktoken, which covers OpenAI models) to count the tokens in your input text yourself, then cross-check the result against the token count you were billed for. A significant discrepancy points to an “invisible cost.”
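A minimal sketch of such a cross-check follows. It assumes the tiktoken package is installed for the local count. The 5% tolerance and the helper names are assumptions chosen for illustration. Keep in mind that tiktoken only reflects OpenAI tokenizers, and that chat-style requests typically add a few formatting tokens per message, so a small gap is expected.

```python
def count_tokens_openai(text: str, model: str = "gpt-4o") -> int:
    """Count tokens locally with tiktoken (requires `pip install tiktoken`)."""
    import tiktoken  # imported lazily so the helpers below work without it

    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))


def discrepancy_ratio(local_count: int, reported_count: int) -> float:
    """Return how far the reported count exceeds the local count, as a fraction."""
    if local_count <= 0:
        raise ValueError("local_count must be positive")
    return (reported_count - local_count) / local_count


def is_suspicious(local_count: int, reported_count: int, tolerance: float = 0.05) -> bool:
    """Flag a request whose reported count exceeds the local count by more than tolerance."""
    return discrepancy_ratio(local_count, reported_count) > tolerance


if __name__ == "__main__":
    # Hypothetical numbers: 400 tokens counted locally, 500 reported by the API.
    print(f"{discrepancy_ratio(400, 500):.0%}", is_suspicious(400, 500))
```

This check covers input text only. Internal reasoning tokens cannot be recounted from the outside, which is exactly the limitation the paper describes.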

2. Compare Costs for the Same Task Across Multiple Vendors

Run the same prompt against OpenAI, Anthropic, Google, and local LLMs, and compare token counts and fees. You may find surprising differences. Using Claude for one task and Gemini for another, or moving routine processing to local LLMs, can in some cases cut costs by more than half.

3. Think in Terms of “Task Cost” Rather Than “Token Price”

What really matters is not “how much is one token” but “how much does it cost to create one estimate?” Even if the token price is low, if a model uses a lot of tokens due to redundant output, the task cost will be high. Conversely, a model that accurately answers with fewer tokens may ultimately be cheaper, even if its token price is higher.
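The point can be illustrated with two made-up model profiles. All prices and token counts below are hypothetical and are not measurements of any real model. They only show how a cheaper per-token price can still lead to a higher cost per task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelProfile:
    name: str
    input_price: float      # USD per million input tokens
    output_price: float     # USD per million output tokens
    avg_input_tokens: int   # measured per task on your own workload
    avg_output_tokens: int


def task_cost(profile: ModelProfile) -> float:
    """Return the average USD cost of completing one task with this model."""
    total = (
        profile.avg_input_tokens * profile.input_price
        + profile.avg_output_tokens * profile.output_price
    )
    return total / 1_000_000


# Hypothetical profiles: one cheap but verbose, one pricier but concise.
VERBOSE_CHEAP = ModelProfile("verbose-cheap", 1.00, 4.00, 1_500, 3_000)
CONCISE_PREMIUM = ModelProfile("concise-premium", 3.00, 15.00, 1_500, 500)

if __name__ == "__main__":
    for profile in (VERBOSE_CHEAP, CONCISE_PREMIUM):
        print(f"{profile.name}: ${task_cost(profile):.4f} per task")
```

In this made-up case the model with the higher token price is cheaper per task, because it needs far fewer output tokens.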

4. Consider Local LLMs as an Option

For sensitive data or routine processing, many cases can be handled well enough by small models that run locally (such as Llama, Phi, or Gemma). This eliminates API fees for those workloads, although electricity and maintenance costs remain. The initial GPU investment is around 300,000 to 500,000 yen. If you currently spend roughly 25,000 to 42,000 yen or more per month on those API calls, you can recoup the investment in about a year.
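Whether the investment pays off within a year depends on how much API spending it actually replaces. A rough break-even sketch, where the running-cost parameter (electricity, maintenance) is an assumption you would need to estimate yourself:

```python
def breakeven_months(
    hardware_cost_yen: float,
    monthly_api_yen: float,
    monthly_running_yen: float = 0.0,
) -> float:
    """Return the months needed for API savings to cover the hardware cost."""
    monthly_saving = monthly_api_yen - monthly_running_yen
    if monthly_saving <= 0:
        return float("inf")  # the local setup never pays for itself
    return hardware_cost_yen / monthly_saving


if __name__ == "__main__":
    # Hypothetical case: 400,000 yen GPU, 40,000 yen/month API bill replaced,
    # 5,000 yen/month in electricity and upkeep.
    print(f"{breakeven_months(400_000, 40_000, 5_000):.1f} months")
```

With a small API bill, the break-even point can stretch well beyond a year, so it is worth running the numbers before buying hardware.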

5. Create Monthly API Usage Reports

Each month, chart how much each task costs and which model it runs on. This alone will reveal waste. The most dangerous state is using AI “just because it seems convenient.”
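A report like this does not require special tooling. The sketch below assumes you log one record per API call with the hypothetical fields `month`, `task`, `model`, and `cost_usd`, and sums them into a ranked list.

```python
from collections import defaultdict


def monthly_report(records: list[dict]) -> list[tuple[tuple[str, str, str], float]]:
    """Sum cost per (month, task, model) and return rows sorted by cost, highest first."""
    totals: dict[tuple[str, str, str], float] = defaultdict(float)
    for record in records:
        key = (record["month"], record["task"], record["model"])
        totals[key] += record["cost_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical log entries; in practice these come from your own call logs.
    sample = [
        {"month": "2025-01", "task": "chatbot", "model": "model-a", "cost_usd": 0.50},
        {"month": "2025-01", "task": "chatbot", "model": "model-a", "cost_usd": 0.25},
        {"month": "2025-01", "task": "estimates", "model": "model-b", "cost_usd": 0.125},
    ]
    for (month, task, model), cost in monthly_report(sample):
        print(f"{month}  {task:<10} {model:<8} ${cost:.3f}")
```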

The Dangers of a World Where Tokens Become “Commodity Futures”

As a supplementary note, there are movements to treat AI tokens as subjects of derivative (futures) trading, similar to gold or oil.

While this may seem less relevant to SMEs, the structure deserves caution. If tokens become speculative assets, their prices could become volatile, which creates a risk of API prices suddenly skyrocketing. Companies that signed up for market-linked electricity pricing plans and were hit by price spikes faced a similar problem.

If “fixed price plans” or “annual contracts with price locks” are available, SMEs should opt for those. Only large companies with the capacity to absorb such risks should take on volatility.

Conclusion: Question, Measure, Compare

Anthropic’s $965 billion figure symbolizes the excitement in the AI industry. However, the fuel for that excitement is the token fees paid by users around the world.

The paper highlights the structural opacity of the billing mechanism. Whether or not any malice is involved, continuing to pay unverifiable bills every month is a management risk.

What SMEs need to do is simple:

  • Question: Don’t take the token counts in API bills at face value.
  • Measure: Calculate token counts yourself and understand discrepancies.
  • Compare: Evaluate task costs across multiple vendors, including local LLMs.

Large companies can afford dedicated teams for cost optimization. SMEs do not have that luxury, so they need to protect themselves with systems and routines instead. Monthly reports, vendor comparisons, and the use of local LLMs can all be achieved without significant investment.

There is no need to continue paying the asking price of AI vendors to benefit from AI.
