Google x Blackstone 700 Billion Yen, NVIDIA 4-Bit Learning — When Will ‘API Prices Halve’ Come Amidst the Giants’ Clash? Small and Medium Enterprises Shouldn’t Wait; Start with Profitable Operations at Current Rates
Conclusion
Let’s get straight to the point. API prices will definitely decrease. But if you wait for them to drop before acting, you will be too late.
In June 2025, the battle among giants over AI infrastructure intensified. Google and the leading investment firm Blackstone established a new AI cloud company with an investment of approximately $5 billion (about 700 billion yen). In the same week, NVIDIA announced results for its “4-bit learning” technology (training AI models at 4-bit precision), which in theory can halve training costs.
When we juxtapose these two pieces of news, one structure becomes clear: the cost of using AI will continue to decrease irreversibly due to both upstream investment competition and technological innovation.
The questions are when prices will drop and by how much. And what should small and medium enterprises do now? There are countless articles that end with vague statements like “let’s keep an eye on this.” This article works through specific numbers and structures instead.
—
The Essence of the Google x Blackstone New Company — A Structure Intentionally Creating “Oversupply”
First, let’s clarify what this new company aims to do.
Google already has its own cloud (Google Cloud). Despite this, it has deliberately brought in external massive capital from Blackstone to create a separate company. Why?
The reason is simple. They want to speed up the physical expansion of data centers without using their own balance sheet. Blackstone is a specialist in real estate and infrastructure investment. It can expedite land acquisition, construction, and power contracts, in other words every part of the “physical layer”.
What Google will contribute is AI-specific chips centered around their self-developed TPU (Tensor Processing Unit) and the software stack that operates on them. In other words, it is a declaration to flood the market with AI infrastructure that does not rely on NVIDIA’s GPUs.
This also serves as a clarion call for direct price competition against the Microsoft x OpenAI alliance, Amazon AWS, and the NVIDIA GPU camp.
As supply increases, prices will fall. This is a principle of economics. Here, the significance of the 700 billion yen figure is crucial. This is not a scale for “just trying it out.” To recoup this investment, they need to sell services to a large number of customers. In other words, they have entered a phase of going for market share even at the cost of lowering prices.
—
NVIDIA’s 4-Bit Learning — “Halving Learning Costs” Is Not an Exaggeration
Next, let’s discuss NVIDIA’s 4-bit learning. Technically, this has a significant impact.
Traditionally, AI model training has mainly used FP16 (16-bit floating point) or BF16, and there has been a recent shift towards FP8 (8-bit). What NVIDIA has announced this time is training at 4 bits, which is half of that.
The specific results disclosed are as follows:
- A hybrid Mamba-Transformer model with 12B parameters
- Pre-training with 10 trillion tokens
- Accuracy nearly on par with an FP8 baseline
Halving the bit count means, on a back-of-the-envelope basis, that memory usage is halved and computational throughput can be up to doubled. If the GPU time required for training is halved, the cloud usage fees for that training will also approach half.
Of course, not all models will yield the same results, and stability in mass production environments will need further verification. However, the direction is clear: we are approaching an era where models of the same performance can be created at half the cost.
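The memory side of this back-of-the-envelope calculation can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it counts the model weights alone and ignores optimizer state, activations, and any scaling-factor overhead that low-precision formats carry. The function name `weight_memory_gb` is made up for this example.

```python
# Rough weight-memory estimate for a 12B-parameter model at different precisions.
# Assumption: weights only; optimizer state, activations, and per-block
# scaling factors are ignored, so real training memory is considerably larger.

PARAMS = 12_000_000_000  # 12B parameters, as in the disclosed result


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Memory needed to hold the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for name, bits in [("FP16/BF16", 16), ("FP8", 8), ("FP4", 4)]:
    print(f"{name:>9}: {weight_memory_gb(PARAMS, bits):.0f} GB")
```

Each halving of the bit width halves the figure: 24 GB, 12 GB, then 6 GB. Whether training time and cloud fees fall by the same ratio depends on the hardware and the model, which is exactly the caveat noted above.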
What we should consider here is “what happens when learning costs decrease”.
The answer is that the cost of providing APIs will decrease. If costs go down, in a competitive environment, the selling price (i.e., API price) will also decrease. The simultaneous progress of Google x Blackstone’s supply expansion and NVIDIA’s efficiency technology makes it harder to find reasons why API prices wouldn’t drop.
—
How Low Will API Prices Go? — Insights from Past Performance
“We understand that prices will drop. But by how much?”
This is likely the most pressing question. Let’s take a look at past performance.
API prices for OpenAI’s GPT-4-class models (input tokens) have moved as follows:
- March 2023 (when GPT-4 was released): approximately $30 per 1 million tokens
- May 2024 (GPT-4o): approximately $5
- Currently in 2025 (latest GPT-4o): approximately $2.5
In two years, the price fell to about 1/12 of its starting level. On an annualized basis, that is a drop of roughly 70% each year (the square root of 1/12 is about 0.29, so each year’s price is about 29% of the previous year’s).
Assuming this pace continues, it would not be surprising if by mid-2026 prices fall to about 1/3 to 1/5 of current levels. That would mean a world where 1 million tokens cost roughly $0.5 to $0.85, or about 70 to 120 yen at the rate of roughly 140 yen per dollar implied by the investment figure above.
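For readers who want to check the arithmetic, here is a minimal Python sketch of the annualized decline and a one-year projection. It assumes a constant yearly rate of decline, which is a simplification, and an exchange rate of 140 yen per dollar (the rate implied by $5 billion being about 700 billion yen). The function name `annual_decline` is chosen for this example.

```python
# Annualized price decline implied by the figures quoted above.
START_PRICE_USD = 30.0  # GPT-4 at release, per 1M tokens (from the article)
END_PRICE_USD = 2.5     # latest GPT-4o, per 1M tokens (from the article)
YEARS = 2.0             # the article's "two years"
YEN_PER_USD = 140.0     # assumption: rate implied by $5 billion = 700 billion yen


def annual_decline(start: float, end: float, years: float) -> float:
    """Constant yearly fractional drop that takes `start` to `end` in `years`."""
    return 1.0 - (end / start) ** (1.0 / years)


decline = annual_decline(START_PRICE_USD, END_PRICE_USD, YEARS)
projected_usd = END_PRICE_USD * (1.0 - decline)  # one more year at the same pace

print(f"Annual decline: {decline:.0%}")
print(f"Projected price in one year: ${projected_usd:.2f} "
      f"(about {projected_usd * YEN_PER_USD:.0f} yen)")
```

Under these assumptions the decline works out to about 71% a year, and one more year at that pace puts the price near $0.72 per 1 million tokens, or roughly 100 yen, which sits inside the 1/3 to 1/5 range.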
If small and medium enterprises are currently spending several thousand to tens of thousands of yen monthly on the ChatGPT API, it is highly likely that this will be reduced to less than one-third of that amount within a year.
However, there are caveats. Cutting-edge models (like GPT-5 or Gemini 2.0 Ultra) will be expensive right after their release. The drop will occur for “1-2 generations older models”. In other words, the performance level of the current GPT-4o will become available at “rock-bottom prices”. For operational purposes in small and medium enterprises, cutting-edge models are often unnecessary. This is nothing but good news.
—
Should Small and Medium Enterprises “Wait or Act Now” — The Answer is Clear
“If prices are going to drop further, wouldn’t it be better to wait?”
The answer to this question is clear: Don’t wait.
There are three reasons for this.
1. Even at current prices, API usage is overwhelmingly cheaper than labor.
For example, consider internal inquiry responses. Suppose a part-time employee spends 20 hours a month on this task. At an hourly wage of 1,200 yen, that amounts to 24,000 yen a month. If this is replaced with an AI chatbot, the API costs would only be around 2,000 to 5,000 yen per month. Even at current rates, that is roughly one-fifth of the labor cost or less.
Even if the price were to halve, 2,500 yen would only become 1,250 yen. Is it worth waiting six months to save an extra 1,250 yen a month? During those six months you keep paying labor costs of 24,000 yen × 6 months = 144,000 yen (about 129,000 yen even after subtracting the API fees you would have paid), which is far greater.
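The same comparison can be written out as a short Python sketch. It uses the figures from the example above and makes one loud simplification: the chatbot is assumed to replace the full 20 hours of work, with no setup cost and no human review time. All variable names are illustrative.

```python
# Cost of waiting six months for API prices to halve.
# Assumption: AI fully replaces the 20 hours/month of work (no setup or review time).
HOURS_PER_MONTH = 20
HOURLY_WAGE_YEN = 1_200
API_COST_NOW_YEN = 2_500
API_COST_LATER_YEN = API_COST_NOW_YEN / 2  # the "price halves" scenario
WAIT_MONTHS = 6

labor_cost_yen = HOURS_PER_MONTH * HOURLY_WAGE_YEN       # 24,000 yen per month
net_saving_now_yen = labor_cost_yen - API_COST_NOW_YEN   # saving if you start today
forgone_yen = net_saving_now_yen * WAIT_MONTHS           # given up by waiting
extra_saving_yen = API_COST_NOW_YEN - API_COST_LATER_YEN # monthly gain from the lower price
months_to_recover = forgone_yen / extra_saving_yen

print(f"Saving forgone while waiting: {forgone_yen:,.0f} yen")
print(f"Extra monthly saving after the price cut: {extra_saving_yen:,.0f} yen")
print(f"Months needed to make up the difference: {months_to_recover:.1f}")
```

With these numbers, waiting gives up 129,000 yen in net savings to gain 1,250 yen a month, a gap that would take more than 100 months to close. Starting now also does not forfeit the price cut, since the lower rate applies to existing users too.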
2. The value of “familiarity” is greater than cost.
Incorporating AI into operations requires time for prompt adjustments, reviewing workflows, and getting employees accustomed to it. This “learning period” cannot be shortened regardless of how much prices drop.
A company that starts now will have a one-year advantage in operational know-how over a company that starts a year later. For small and medium enterprises, this difference can be more critical than for large corporations. This is because small and medium enterprises operate on the premise of “running with fewer people”. Whether one person can master it can change the productivity of the entire organization.
3. Companies that say “I’ll start when it gets cheaper” will not start even when it does.
This may sound harsh, but it is the reality. I have seen many companies spend three years saying, “Just a little longer.” Technology prices keep falling, so there will always be a reason to wait for the next drop. The moment when it feels “sufficiently cheap” never comes.
—
So What Specifically Should You Start With?
Even when told to “act now,” many people are unsure where to begin. I hear this often.
Here’s a simple framework to consider:
“List the routine tasks that take more than 10 hours a month for people to do.”
- Assistance in creating estimates
- Searching and responding to internal manuals
- Summarizing daily reports and other reports
- Drafting emails
- First responses to job applicants
Such tasks can be automated or semi-automated for just a few thousand yen a month at current API prices. Perfection is not necessary. Let AI handle 70% of the tasks previously done by humans, with the remaining 30% checked by a person. This alone will significantly free up the time of the responsible person.
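The screening step above can be sketched as a small Python script. The monthly hours in the task list are hypothetical numbers invented for illustration; the 10-hour threshold, the 70% AI share, and the 1,200 yen hourly wage come from the text. Treat it as a starting template, not a measurement.

```python
from dataclasses import dataclass

HOURLY_WAGE_YEN = 1_200  # wage used in the earlier example
THRESHOLD_HOURS = 10     # "more than 10 hours a month"
AI_SHARE = 0.7           # AI handles 70%, a person checks the remaining 30%


@dataclass
class Task:
    name: str
    hours_per_month: float


# Hypothetical hours; replace with your own measurements.
tasks = [
    Task("Estimate drafting", 15),
    Task("Internal manual lookups", 20),
    Task("Report summaries", 12),
    Task("Email drafting", 8),
    Task("First replies to applicants", 5),
]


def candidates(all_tasks: list[Task], threshold: float = THRESHOLD_HOURS) -> list[Task]:
    """Tasks above the threshold, largest time sink first."""
    picked = [t for t in all_tasks if t.hours_per_month > threshold]
    return sorted(picked, key=lambda t: t.hours_per_month, reverse=True)


for task in candidates(tasks):
    freed_hours = task.hours_per_month * AI_SHARE
    print(f"{task.name}: {freed_hours:.1f} h freed per month, "
          f"worth about {freed_hours * HOURLY_WAGE_YEN:,.0f} yen")
```

Sorting by hours puts the biggest time sink at the top, which is a reasonable first task to hand to AI.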
What to do with the freed-up time? Focus on sales, enhance customer support, or think of new products. Understanding this not as “cost reduction” but as “time redistribution” is the essence of AI utilization for small and medium enterprises.
—
Summary — Turning the Giants’ Clash into a Tailwind for Small and Medium Enterprises
The 700 billion yen investment by Google x Blackstone and NVIDIA’s 4-bit learning are infrastructure competitions among large corporations. This is not a matter directly involving small and medium enterprises.
However, as a result of this competition, API prices will definitely decrease. There is a proven track record: prices fell to about 1/12 over the past two years. There is a significant possibility that they will fall further, to 1/3 to 1/5 of current levels, in the next year.
Yet, I repeat: Don’t wait.
Even at current prices, API usage is overwhelmingly cheaper than labor. And the asset of “familiarity,” which cannot be bought with money, can only be acquired by those who start early.
While the giants are battling to lower prices, there is one thing small and medium enterprises should do: choose one task today and let AI handle it. A 2,000 yen experiment can change the way you work in six months.
The clash of infrastructure is not something to watch from the sidelines. It is something to seize the benefits from as quickly as possible.