The Implications of Nvidia Becoming a ‘Software Company’—What Really Changes for SMEs is ‘Computational Costs’ and ‘Barriers to Entry’
Let’s get straight to the point: The era of ‘GPU = expensive’ is coming to an end.
Nvidia is no longer just a hardware company.
You might think, “What are you talking about? They still sell GPUs!” Indeed, the H100 and B200 are selling like hotcakes. However, Nvidia’s fundamental competitive advantage is no longer in the chips themselves, but in the software ecosystem known as CUDA (Compute Unified Device Architecture).
What does this mean for small and medium-sized enterprises (SMEs)? In short, it means:
We have entered a phase where the ‘cost of running AI’ is structurally decreasing.
The AI infrastructure that large companies used to build at costs of tens to hundreds of millions of yen is becoming dramatically cheaper due to software optimization. Surprisingly, the biggest beneficiaries will not be large corporations, but SMEs.
—
Nvidia’s ‘Core Business’ is Now CUDA
The reason Nvidia’s GPUs hold an overwhelming market share is not just due to chip performance. The development environment known as CUDA has effectively become the standard for AI and machine learning.
AMD and Intel are also releasing AI-targeted chips. Some products are competitive in terms of performance. But developers are not adopting them. Why? Because the code assets written in CUDA are too vast, and the switching costs are too high.
This is similar to the structure that Microsoft built with Windows. It’s not just the OS itself, but the applications and ecosystem that run on it that create lock-in. For Nvidia, CUDA serves this purpose, and the software layer built on top of the hardware is the real core.
In other words, Nvidia’s strategy has shifted from “making good GPUs” to “creating overwhelming value through software on GPUs.” This shift is leading to a dramatic reduction in computational costs.
—
The Real Meaning of ‘20% Faster on the Same Hardware’
Let’s look at some specific numbers.
In 2025, a technology called ‘TwELL (Trimming Weights with Efficient Layer-Level Sparsity for LLMs)’, co-developed by Sakana AI and Nvidia, was announced.
What this technology does is simple. It exploits activation sparsity: within the feedforward layers of large language models (LLMs), it identifies neurons that contribute little or nothing to the computation and skips them. The architecture itself does not change; it merely eliminates wasted computation.
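To make the idea concrete, here is a toy sketch of activation sparsity in a feedforward layer. This is not the actual TwELL algorithm; the sizes, weights, and the simple "skip zero activations after ReLU" rule are illustrative assumptions only.

```python
import random

random.seed(0)

def relu(x):
    return x if x > 0.0 else 0.0

def ffn_forward(x, w_up, w_down):
    # Up-projection + ReLU. Any neuron whose activation is exactly zero
    # contributes nothing downstream, so the down-projection skips it.
    hidden = [relu(sum(xi * w for xi, w in zip(x, row))) for row in w_up]
    active = [i for i, h in enumerate(hidden) if h != 0.0]
    out = [sum(hidden[i] * w_down[i][j] for i in active)
           for j in range(len(w_down[0]))]
    return out, len(active) / len(hidden)

d_model, d_hidden = 4, 16  # toy dimensions, far smaller than a real LLM
x = [random.uniform(-1, 1) for _ in range(d_model)]
w_up = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(d_hidden)]
w_down = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(d_hidden)]

out, density = ffn_forward(x, w_up, w_down)
print(f"fraction of neurons actually used: {density:.0%}")
```

With random weights, roughly half the ReLU activations are zero, so roughly half the down-projection work can be skipped without changing the result at all.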
The results are as follows:
- Inference speed: 20.5% improvement
- Training speed: 21.9% improvement
The hardware remains the same. With software optimization alone, speeds have increased by about 20%.
How you interpret this—whether as “20% faster” or “the same processing can be done 20% cheaper”—changes the perspective.
For SMEs, the latter is crucial. An AI inference cost of 100,000 yen per month drops to 80,000 yen. This saves 240,000 yen annually. If you run 10 processes, that’s 2.4 million yen saved each year, all occurring without any additional investment, purely through software optimization.
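The arithmetic above can be written out directly, using the article's own figures:

```python
# Worked version of the cost arithmetic above (figures from the text).
monthly_cost = 100_000      # yen per process per month, before optimization
efficiency_gain = 0.20      # ~20% from software optimization alone
processes = 10

monthly_saving = monthly_cost * efficiency_gain        # 20,000 yen
annual_saving_per_process = monthly_saving * 12        # 240,000 yen
annual_saving_total = annual_saving_per_process * processes

print(int(annual_saving_total))  # 2400000 yen per year, no new hardware
```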
Moreover, this is just the beginning. Software optimization compounds over time. While hardware generations change every 2-3 years, software improvements occur in a matter of months. The rate of cost reduction is clearly faster than in the hardware-driven era.
—
Agent AI Triggers ‘Inference Cost Explosion’—This is Why Software Optimization Matters
Another structural change to note is the rise of AI agents.
Until now, AI usage has primarily been about “one question, one answer.” You ask ChatGPT, and it provides a response. Inference is completed in one go.
However, agent AI operates differently. It breaks down complex tasks, performs inference repeatedly in multiple steps, calls external tools, verifies results, and retries as necessary. For a single task, inference might run 10 or 20 times.
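The loop described above can be sketched in a few lines. Everything here is hypothetical: `call_llm` is a stub standing in for a real model API, and the three-step plan is hard-coded purely to show how one task fans out into many inference calls.

```python
def call_llm(prompt):
    """Stub: a real agent would call a hosted or local model here."""
    if prompt.startswith("plan:"):
        return ["read email", "draft quote", "request approval"]
    return "ok"

def run_agent(task):
    calls = 0
    steps = call_llm(f"plan: {task}"); calls += 1        # 1) decompose the task
    for step in steps:
        result = call_llm(f"do: {step}"); calls += 1     # 2) execute each step
        check = call_llm(f"verify: {result}"); calls += 1  # 3) verify (a real
        # agent would retry here if the check failed)
    return calls

print(run_agent("respond to inquiry"))  # 7 inference calls, vs 1 for plain Q&A
```

Even this toy agent makes seven model calls for a single task; real agents with retries and tool calls multiply that further.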
When Nvidia’s CEO Jensen Huang states that “the demand for inference computation will increase 100-fold in the future,” it is in this context. As agents become more widespread, inference costs will explode.
This raises an important question:
In a world where inference costs increase 100-fold, how valuable is a 20% cost reduction through software optimization?
The answer is clear. If inference volume increases 100-fold, a 20% efficiency gain saves the equivalent of 20 times today’s entire spend. The value of software optimization grows in direct proportion to inference volume.
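A quick sanity check of that "20 times" claim, with the spend normalized to 1:

```python
original_cost = 1.0      # today's total inference spend (normalized)
future_volume = 100      # the projected 100-fold growth in inference demand
saving_rate = 0.20       # 20% efficiency gain from software optimization

future_cost = original_cost * future_volume
saving = future_cost * saving_rate
print(saving / original_cost)  # 20.0 -> twenty times today's entire bill
```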
This is the structural reason why Nvidia is shifting to become a “software company.” Simply selling hardware is no longer sufficient. If they do not continue to enhance inference efficiency through software, customer costs will balloon, halting the very utilization of AI.
—
So, What Can SMEs Gain?
Let’s move from abstract discussions to concrete matters.
1. The Cost of ‘Using AI’ Will Continue to Decrease. Don’t Wait, Start Using It Now
With advancements in software optimization, the cost of using cloud AI will certainly keep falling. OpenAI’s API pricing alone shows this: the cost per token has dropped to roughly one-tenth of what it was when GPT-4 was announced about a year ago. This trend will accelerate.
Waiting for prices to drop further is the worst strategy. As costs decrease, competitors will also start using it. Companies that adopt it early and integrate it into their operations will build a barrier to entry through their know-how. While AI costs will decrease, the experience needed to effectively utilize AI cannot be bought with money.
2. ‘Running AI on In-House Servers’ Will Become Realistic
As inference efficiency improves, practical AI can run without expensive GPU servers. Already, models in the class of Meta’s Llama 3.1 8B can run on a gaming PC in the 100,000 yen range. If optimization technologies like TwELL become widespread, practical-level inference will be possible on even lighter hardware.
This is significant for SMEs. Without paying tens of thousands of yen monthly for cloud APIs, they can run AI on a local setup with an initial investment of 200,000 to 300,000 yen, keeping their data in-house.
For example, a regional manufacturer could run inspection AI, or a professional services firm could automate contract reviews. For use cases like these, there is less and less need to rely on the cloud.
3. ‘AI Agents’ Will Provide Greater Benefits to SMEs
The essence of agent AI is substituting for human labor: it autonomously handles multi-step work.
Large corporations have people to spare for such work; SMEs do not. Therefore, the impact of automation through agent AI is greater for SMEs.
For instance, the following scenarios are becoming a reality:
- Reading the content of inquiry emails, automatically generating quotes, and sending approval requests via Slack to superiors
- Compiling monthly billing data, detecting anomalies, and generating reports to be sent via email
- Reading job application documents, screening them, and listing candidates for interviews
These tasks would have been considered a “200,000 yen system project” just six months ago. Now, using AI agent frameworks (like LangChain, CrewAI, etc.), they can be built with development costs of just a few tens of thousands of yen and a few days of work.
—
What Happens in a World Where ‘Software is the Center of Value’
Finally, let’s summarize the larger structural changes.
Nvidia becoming a software company means that the source of AI value has shifted from ‘having computational power’ to ‘effectively using computational power.’
This is good news for SMEs.
Having computational power is a capital game, where large corporations hold a significant advantage. However, effectively using computational power is a battle of wisdom. Even small companies that deeply understand on-the-ground challenges and can appropriately integrate AI can win.
A regional food manufacturer with 300 employees could run demand forecasting AI by combining its order data with weather data, reducing waste loss by 30%. This is a world where SMEs with a sense of the field can implement projects that would take a large corporation’s DX department a year to complete in just three months.
Now that the hardware barrier has lowered, what will determine the outcome is the quality of the question ‘How to use AI?’
That question can best be posed by those closest to the field—the SME owners themselves.
—
What to Do Starting Today
- Write down three of your company’s ‘repetitive tasks.’ Consider whether they can be automated with AI agents.
- Research the API fees for cloud AI. Compare them to the prices from six months ago. Update your cost awareness.
- Try out a local LLM on one machine. Install Ollama on a PC priced in the 100,000 yen range and experiment with your own data.
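As a starting point for that last item, here is a minimal sketch of querying a local Ollama server from Python over its REST API. It assumes Ollama is installed and running (`ollama serve`) and that the `llama3.1:8b` model has already been pulled; swap in whatever model and prompt fit your data.

```python
import json
import urllib.request

# Assumes a local Ollama server at the default port; model name is an example.
payload = json.dumps({
    "model": "llama3.1:8b",
    "prompt": "Why is the sky blue?",
    "stream": False,
}).encode("utf-8")

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```

No data leaves your machine: the prompt and the response stay on the local PC, which is exactly the in-house advantage discussed above.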
“Getting hands-on first” is the strongest strategy. The benefits of the software shift that Nvidia is advancing on a trillion-yen scale will not come to you if you wait. Only those companies that actively seek them out will reap the rewards.