The Cost of AI Inference Has Crashed by 90% in a Year. In an Era Where ‘Enterprise-Level AI’ Can Be Acquired for Just 50,000 Yen a Month, What Should SMEs Do?

By Kai

Conclusion First: AI Is No Longer Just a Tool for the Privileged

DeepSeek V4, GPT-5.5, and 8B-class small models reaching practical utility: if what is happening in the AI industry as of spring 2025 had to be summarized in one phrase, it would be this:

“The bottom has fallen out of inference costs.”

This is not just a technical matter; it is a seismic shift in business. AI processing that used to cost 300,000 yen per month last year now runs at just 30,000 yen. What used to take large companies tens of millions of yen to build in analytical infrastructure can now be done with a single API contract.

I want to ask: Does your company still think that “AI is for large enterprises only”?

What Happened: The Simultaneous Release of DeepSeek V4 and GPT-5.5

In April 2025, two major announcements coincided.

China’s DeepSeek released V4, its new flagship model, as open source. Inference performance has improved significantly over the previous-generation V3, with especially strong benchmark results in coding and structured data processing. Because it is open source, anyone can download it and run it on their own servers. Used via the API, it costs about $0.40 per million input tokens and about $1.60 per million output tokens.

In the same week, OpenAI released GPT-5.5. A successor to o3 and GPT-4o, it delivers dramatically improved quality in complex reasoning, data analysis, and long-form text generation. Notably, despite the higher performance, GPT-5.5 is priced at or below the level of GPT-4o, whose input cost was $2.50 per million tokens.

In other words, performance has increased while prices have decreased. Both at the same time.

This is the essence of the “model war.” As DeepSeek and OpenAI compete on price and performance, it is the users—especially small and medium-sized enterprises (SMEs) that previously could not afford such technology—who benefit.

A Numerical Look at the Collapse of Inference Costs

How much have costs changed specifically? Let’s summarize the trends over the past year and a half.

Period          | Representative Model | Input per Million Tokens | Output per Million Tokens
End of 2023     | GPT-4 Turbo          | about $10.00             | about $30.00
Throughout 2024 | GPT-4o               | about $2.50              | about $10.00
April 2025      | DeepSeek V4 (API)    | about $0.40              | about $1.60
April 2025      | GPT-5.5              | about $2.00              | about $8.00

The transition from GPT-4 Turbo to DeepSeek V4 API has resulted in a 96% reduction in input costs and a 95% reduction in output costs.

Let’s translate this into practical terms.

For example, consider a local manufacturing company that automatically classifies and summarizes 200 daily order emails using AI to generate daily reports. Assuming an average of 1,000 tokens for input and 500 tokens for output per email, and calculating over 22 working days in a month:

  • Monthly Input Tokens: 200 emails × 1,000 tokens × 22 days = 4.4 million tokens
  • Monthly Output Tokens: 200 emails × 500 tokens × 22 days = 2.2 million tokens

Cost in the GPT-4 Turbo era:
4.4M input × $10.00 + 2.2M output × $30.00 ≈ $110 (around 17,000 yen)

Cost with the DeepSeek V4 API:
4.4M input × $0.40 + 2.2M output × $1.60 ≈ $5.30 (around 800 yen)

The cost has dropped from 17,000 yen to 800 yen per month. This is a 95% cost reduction.
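The arithmetic above can be sanity-checked in a few lines of Python; the prices are the per-million-token figures from the table, and the token volumes are the example’s own assumptions:

```python
# Monthly token volumes for the example manufacturer:
# 200 emails/day, 1,000 input + 500 output tokens each, 22 working days.
EMAILS_PER_DAY = 200
WORKING_DAYS = 22
IN_TOKENS_PER_EMAIL = 1_000
OUT_TOKENS_PER_EMAIL = 500

in_millions = EMAILS_PER_DAY * IN_TOKENS_PER_EMAIL * WORKING_DAYS / 1_000_000   # 4.4
out_millions = EMAILS_PER_DAY * OUT_TOKENS_PER_EMAIL * WORKING_DAYS / 1_000_000  # 2.2

def monthly_cost(in_price_per_m: float, out_price_per_m: float) -> float:
    """USD per month at the given per-million-token prices."""
    return in_millions * in_price_per_m + out_millions * out_price_per_m

gpt4_turbo = monthly_cost(10.00, 30.00)   # about $110
deepseek_v4 = monthly_cost(0.40, 1.60)    # about $5.28
print(f"GPT-4 Turbo: ${gpt4_turbo:.2f}, DeepSeek V4: ${deepseek_v4:.2f}")
```

Swapping in any other row of the table gives the same workload at that model’s price point.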

For this level of processing, it is no longer a question of “whether to implement it.” It’s just the cost of a few cans of coffee.

Another Revolution: Small Models

Alongside the price collapse of large models, another change is underway: small models in the 8B class have reached practical utility.

Models with around 8 billion parameters, such as Llama 3.1 8B and Gemma 2 9B, along with even smaller models like the 3.8B-parameter Phi-3 Mini, are being released one after another. What is remarkable about these models?

They can run on a single PC.

If you have a PC equipped with an NVIDIA RTX 4060 (priced around 40,000 yen), you can run inference locally. There are zero API costs. Since there is no need to send data externally, they can be used for tasks involving personal information.

For example, a local labor consulting office has started using an 8B model locally to draft employment regulations. What used to take a veteran consultant three hours to create a first draft can now be completed in 30 minutes with AI generating the initial draft. The API costs are zero. The only investment required was an additional GPU for 50,000 yen to the existing PC.

Consulting work billed at 3 million yen is being displaced by a 50,000 yen GPU. This is not a hypothetical; it is already happening.

What Happens After “It Became Cheaper”

Now we get to the main point. The reduction in costs is merely a means. What is crucial is the structural discussion about what happens after costs go down.

1. The Barrier to “Testing” Disappears

At 800 yen per month, no approval is needed. Business owners can start with a simple, “Let’s give it a try.” There is no need to spend six months running a PoC (proof of concept) like large enterprises do. The speed of decision-making in SMEs directly translates to the speed of AI implementation. This is a structural advantage that large companies cannot replicate.

2. “Person-Dependence” Breaks Down

Think of the know-how that existed only in veteran employees’ heads: the key points of an estimate, the patterns for handling complaints, the considerations specific to each client. Feed these into AI prompts, and even a newcomer can reach roughly 70% of a veteran’s accuracy. Operations no longer stop when someone leaves. For local SMEs, where resignation risk is an even more pressing problem than recruitment difficulty, this is the greatest insurance available.
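As a sketch of what “feeding know-how to AI” can look like in practice, here is a minimal prompt template that embeds a veteran’s house rules into every request; the checklist items are invented placeholders, not real rules:

```python
# Illustrative house rules, as a veteran might dictate them.
VETERAN_CHECKLIST = [
    "For client A, always quote delivery in business days, not calendar days.",
    "Complaints about late shipments get a phone call before any email.",
    "Estimates over 1M yen need a line-item breakdown of material costs.",
]

def knowhow_prompt(task: str) -> str:
    """Build a system prompt so any employee's draft follows the house rules."""
    rules = "\n".join(f"- {rule}" for rule in VETERAN_CHECKLIST)
    return (f"You are assisting with: {task}\n"
            f"Always follow these house rules:\n{rules}")

print(knowhow_prompt("drafting an estimate for client A"))
```

The point of the pattern: the checklist lives in a file anyone can edit and review, not in one person’s head.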

3. The Meaning of “Outsourcing” Changes

Website updates, social media posts, simple data aggregation, meeting minutes: many tasks that used to be outsourced for 100,000 to 300,000 yen per month can now be brought in-house with AI. The point is not only the saving on outsourcing fees. The time spent placing orders, confirming deliverables, and requesting revisions disappears as well, and that is actually the biggest benefit.

So, What Should We Do?

We don’t need abstract discussions. Here are three concrete actions that SMEs should take starting tomorrow.

1. First, Try Replacing One Task with AI

There’s no need to think about company-wide implementation. Choose one repetitive task you do every day and throw it at ChatGPT or DeepSeek’s API. Email classification, daily report summarization, automatic FAQ responses—anything works. You can experiment for under 1,000 yen a month.
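As a sketch of what “throwing a task at the API” looks like, the snippet below classifies one order email through an OpenAI-compatible chat endpoint. The URL, model name, and environment variable are assumptions for illustration; DeepSeek and OpenAI both accept this request shape:

```python
import json
import os
import urllib.request

# Assumed endpoint and model name; any OpenAI-compatible
# chat-completions API accepts this payload format.
API_URL = "https://api.deepseek.com/chat/completions"
MODEL = "deepseek-chat"

def build_request(email_body: str) -> dict:
    """Build a chat-completions payload that classifies one order email."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": "Classify this order email as one of: new_order, "
                        "change, cancellation, inquiry. Then add a one-line summary."},
            {"role": "user", "content": email_body},
        ],
        "temperature": 0,  # deterministic output for a classification task
    }

def classify(email_body: str) -> str:
    """POST the request and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(email_body)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['DEEPSEEK_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Loop this over a day’s inbox and you have the daily-report pipeline from the cost example above.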

2. Try Local Operation of Small Models

If you have tasks that handle sensitive data in-house, use tools like Ollama to run an 8B model locally. If you lack GPUs, add one for 50,000 yen. An environment where AI can be used without sending data externally directly reduces security costs for SMEs.
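A minimal local sketch, assuming an Ollama daemon is running on its default port (11434) with an 8B model already pulled (for example `llama3.1:8b`; the model name and prompt here are illustrative). Nothing in this flow leaves the machine:

```python
import json
import urllib.request

# Ollama's local REST endpoint; requires `ollama pull llama3.1:8b` beforehand.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "llama3.1:8b"

def build_payload(document: str) -> dict:
    """Request payload for Ollama's /api/chat endpoint."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": "Draft employment regulations from the notes below."},
            {"role": "user", "content": document},
        ],
        "stream": False,  # return one complete JSON response, not a stream
    }

def draft_locally(document: str) -> str:
    """Send the notes to the local model and return its draft."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(document)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

Because the endpoint is localhost, the sensitive notes never touch an external API, which is the whole point of running a small model in-house.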

3. Think About “Which Person-Dependence to Break” Rather Than “What to Make AI Do”

It’s meaningless to just look at a list of AI tool functions. List out tasks in your company that “cannot be done without that person.” Apply AI to those. The goal is not “to implement AI” but to “increase the reproducibility of operations.”

The Winner of the Model War Is Not the AI Creators

The battle between DeepSeek and OpenAI will intensify further. Google’s Gemini, Anthropic’s Claude, and Meta’s Llama will join the fray, and the price competition will not stop.

However, the winner of this war will not be the companies that create AI. It will be the companies that “fully utilize” AI.

And what is needed to fully utilize AI is not massive investments, AI specialists, or the latest GPU clusters. It is the insight from the field that recognizes, “This could be useful for our operations.”

This insight is overwhelmingly possessed by local SMEs that sweat it out in the field every day, rather than large corporations in Tokyo.

The collapse of inference costs is not merely the democratization of technology. It is a shift of power toward the people on the ground.

The weapons are ready. The only question left is whether to use them.
