The Era of ‘AI as Someone Else’s GPU’ is Over—How Small Models, Local Execution, and a Monthly Cost of 50,000 Yen Will Revolutionize the Competitive Landscape for SMEs
Conclusion
Let’s get straight to the point: The “ownership cost” of AI has collapsed.
Just two years ago, if you wanted to seriously use AI, you had to brace yourself for monthly cloud fees ranging from hundreds of thousands to millions of yen. The costs included the API fees for GPT-4, GPU instance charges on AWS, and data transfer fees—all of which were expenses for renting “someone else’s GPU.”
Now, AI can run on your own PC for just 50,000 yen a month.
Most small and medium-sized enterprise (SME) owners still do not grasp the significance of this change. This is not merely a story about cost reduction. The very structure of where the value of AI resides has changed.
What Happened—Three Major Shifts
1. Small Models Have Reached a “Usable” Level
Until 2023, the prevailing wisdom in AI was “the bigger, the smarter.” GPT-4 is estimated to have 1.8 trillion parameters and to require GPU clusters costing tens of millions of yen just to run.
However, the situation changed dramatically in 2024. Microsoft’s “Phi-3 Mini” achieved performance equivalent to GPT-3.5 with 3.8 billion parameters. Meta’s “Llama 3 8B” can run on smartphones. Google’s “Gemma 2B” can even perform inference on devices as small as a Raspberry Pi.
What specifically has changed?
- Practical accuracy can be achieved with less than 1/100th of the parameters.
- The required GPU memory is now under 16GB (i.e., it can run on commercially available gaming PCs).
- In some cases, inference speed is even faster than through cloud APIs.
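As a rough sanity check on the memory figure above, the weights of a model take about (number of parameters) x (bits per parameter) / 8 bytes. The sketch below assumes the nominal parameter counts from the model names and counts weights only; the context cache and runtime overhead are ignored, so real usage will be somewhat higher. The function name is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1e9


# Nominal parameter counts taken from the model names (an assumption).
MODELS = [("Gemma 2B", 2e9), ("Phi-3 Mini", 3.8e9), ("Llama 3 8B", 8e9)]

for name, params in MODELS:
    fp16 = weight_memory_gb(params, 16)  # unquantized 16-bit weights
    q4 = weight_memory_gb(params, 4)     # 4-bit quantized weights
    print(f"{name}: about {fp16:.1f} GB at 16-bit, about {q4:.1f} GB at 4-bit")
```

Under these assumptions an 8 billion parameter model needs about 16 GB at 16-bit precision but only about 4 GB when quantized to 4 bits, which is why quantized models in this class fit on consumer GPUs.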
In the research world, frameworks like Falconer have emerged, establishing methods for efficiently distilling the knowledge of large models into smaller ones. Reports indicate a 90% reduction in inference costs and a 20-fold increase in processing speed.
This is not about compromising because of size. It’s a reversal: “Because it’s small, it’s faster, cheaper, and can run locally.”
2. The Democratization of Hardware Has Accelerated
Apple’s introduction of robust local AI inference support with the M4 chip was symbolic. A MacBook Pro with 16GB of unified memory, priced in the 200,000 yen range, can comfortably run models in the 7 billion parameter range.
Windows desktops equipped with NVIDIA’s RTX 4060 (retail price around 40,000 yen) can achieve similar results using tools like llama.cpp or Ollama.
In other words, the initial investment required for the hardware to run AI has dropped to between 100,000 and 300,000 yen.
Cloud GPU instances (e.g., AWS p4d.24xlarge) cost about 5,000 yen per hour, so using one for just 100 hours a month would cost 500,000 yen. In contrast, running models on your own PC costs less than 5,000 yen a month, including electricity.
An initial investment of 300,000 yen amortized over six months comes to 50,000 yen per month. Adding a running cost of about 5,000 yen keeps the total in the 50,000 yen range for the first six months, and from the seventh month onward only the 5,000 yen running cost remains.
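The arithmetic above can be written out as a small sketch. All figures are the ones quoted in this section; the constant and function names are hypothetical. Note that the strict sum for the first six months is 55,000 yen (50,000 yen of amortization plus 5,000 yen of running cost), which is rounded here to about 50,000 yen.

```python
# Figures quoted in the text (all in yen).
CLOUD_YEN_PER_HOUR = 5_000
CLOUD_HOURS_PER_MONTH = 100
HARDWARE_YEN = 300_000
LOCAL_RUNNING_YEN_PER_MONTH = 5_000
AMORTIZATION_MONTHS = 6


def cloud_total(months: int) -> int:
    """Cumulative cloud GPU cost after the given number of months."""
    return CLOUD_YEN_PER_HOUR * CLOUD_HOURS_PER_MONTH * months


def local_total(months: int) -> int:
    """Cumulative local cost: hardware up front plus monthly running cost."""
    return HARDWARE_YEN + LOCAL_RUNNING_YEN_PER_MONTH * months


def local_monthly_cost(month: int) -> int:
    """Local cost booked in a given month (1-based), hardware amortized."""
    amortization = HARDWARE_YEN // AMORTIZATION_MONTHS if month <= AMORTIZATION_MONTHS else 0
    return amortization + LOCAL_RUNNING_YEN_PER_MONTH


for month in (1, 6, 7, 12):
    print(f"month {month}: local {local_monthly_cost(month):,} yen, "
          f"cloud {CLOUD_YEN_PER_HOUR * CLOUD_HOURS_PER_MONTH:,} yen")

print(f"12-month totals: local {local_total(12):,} yen, cloud {cloud_total(12):,} yen")
```

Over twelve months, these assumptions give 360,000 yen for the local setup against 6,000,000 yen for the cloud instance.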
3. The Open Source Ecosystem Has Matured
Having the models alone is not enough; what’s crucial is that the “toolset” to integrate them into business operations is now in place.
- Ollama: Launches local LLMs with a single command. Installation to inference can be completed in 5 minutes.
- LM Studio: Allows users to select and download models via a GUI, enabling immediate use of a chat UI.
- LocalAI: Can set up a local OpenAI-compatible API, allowing existing GPT integration tools to work seamlessly.
- LangChain / LlamaIndex: Can vectorize internal documents to build RAG (Retrieval-Augmented Generation).
All of these tools are free, with zero licensing fees.
Two years ago, setting up such an environment required a specialized ML engineer; now, an employee with a bit of IT knowledge can complete it in half a day.
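To give a sense of how little glue code is involved, here is a minimal sketch that sends a prompt to a locally running Ollama server using only the Python standard library. It assumes Ollama is listening on its default address (`http://localhost:11434`) and that a model has already been pulled; the model name `llama3` and the helper names are illustrative assumptions, not a recommendation.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build the HTTP request; "stream": False asks for a single JSON reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_llm(prompt: str, model: str = "llama3", timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running, `ask_local_llm("Summarize yesterday's sales report in three bullet points.")` returns the model's answer as a string. If no server is reachable, `urlopen` raises an error, so a real script would want to catch that and report it.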
So, What Changes for SMEs?
Now we get to the main point.
Reversal Structure 1: Large Corporations’ “AI Investments” Become a Handicap
Large corporations have already invested tens of millions to billions of yen in cloud AI infrastructure. Annual contracts with vendors, internal approval processes, and security reviews weigh heavily as “switching costs.”
In contrast, SMEs can start from scratch. They have no entanglements and can begin as soon as next week.
While large corporations take three months to get approval, SMEs can create a working prototype in three days.
This speed difference becomes a critical advantage in an era where technological change is rapid.
Reversal Structure 2: Proximity to Data Becomes a Weapon
The performance of AI is not determined solely by the size of the model. “How well you can provide context” is what makes the difference.
SMEs have the advantage of being close to the data from the field. The CEO speaks directly with customers. The expertise of veteran craftsmen is just a desk away. Daily sales reports can directly translate into customer insights.
By feeding this information into a local LLM through a RAG setup, they can generate far more relevant responses than a generic AI chatbot from a large corporation.
For instance, if a local construction company feeds its local LLM with 10 years’ worth of construction reports (500 cases), and asks, “What are the precautions for foundation work in winter on this area’s soil?” it will receive specific answers based on its own track record—something GPT-4 would not provide. There is definitely a domain where the combination of proprietary data and small models can outperform general large-scale models.
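The retrieval step behind this construction example can be sketched in a few lines. This is a simplified illustration, not a production pipeline: real RAG systems usually rank documents with embedding vectors, whereas the sketch below swaps in plain word-overlap cosine similarity so that it runs with no dependencies. The report texts are made-up placeholders and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, reports: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k reports most similar to the question."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(reports, key=lambda r: cosine(q_vec, Counter(tokenize(r))), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, reports: list[str], top_k: int = 2) -> str:
    """Assemble the prompt that would be sent to the local LLM."""
    context = "\n".join(f"- {r}" for r in retrieve(question, reports, top_k))
    return f"Answer using only these past construction reports:\n{context}\n\nQuestion: {question}"


# Made-up placeholder reports standing in for the company's own records.
REPORTS = [
    "Foundation work in winter on clay soil: concrete curing was extended because of frost.",
    "Roof repair in summer: work was scheduled for early morning to avoid heat.",
    "Interior renovation of an office: noise limits required weekend work.",
]

question = "What are the precautions for foundation work in winter on this area's soil?"
print(build_prompt(question, REPORTS, top_k=1))
```

The point of the design is that the model itself stays generic; the company-specific value comes from which reports are retrieved and placed into the prompt.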
Reversal Structure 3: “Eliminating Dependency” Can Be Achieved for 50,000 Yen
The biggest challenge for SMEs is dependency on key personnel. If a veteran leaves, operations can grind to a halt. Manuals exist, but no one reads them. Transitions can take six months.
By feeding the veteran’s know-how into a local LLM, you can create a “copy of a veteran employee who answers questions 24/7.” While not perfect, being able to provide answers with 70-80% accuracy dramatically changes the onboarding process for new hires.
Previously, outsourcing such knowledge management systems would cost between 3 million and 5 million yen. Even using SaaS would cost over 100,000 yen per month. Now, it can be built in-house for just 50,000 yen a month (effectively 5,000 yen a month from the seventh month onward).
3 million yen has become 50,000 yen. This cost difference of well over ten times changes the reality of AI adoption for SMEs.
Caution—It’s Not a Silver Bullet
Before getting too excited, there are a few important points to keep in mind.
- Small models have limitations. Complex multi-step reasoning, referencing the latest information, and multilingual support—these are still areas where large cloud models have the advantage. It’s essential to discern the appropriate use cases.
- Security is your own responsibility. You need to manage the security that cloud vendors previously guaranteed. However, the benefit of “data not leaving the company” is a plus for SMEs handling confidential information.
- Systematization of operations is essential. Implementation is not the end. Who will update the models? How will you maintain data freshness? If this becomes dependent on individuals, it defeats the purpose.
So, What Should You Do?
Here are three things you can start doing tomorrow.
1. First, install Ollama and try running a small model on your PC. It takes about 15 minutes and is free. Experience the reality of “AI running locally” for yourself.
2. Compile a list of the “most frequently asked questions” within your organization. This is the first task that should be delegated to AI. Automating FAQ responses, drafting estimates, summarizing meeting minutes—start small.
3. Allocate a budget of 50,000 yen a month and set a three-month experimental period. If it doesn’t yield results, you can stop. Even if it fails, that’s only 150,000 yen (50,000 yen x 3 months), roughly the cost of three drinking parties.
The One Thing SMEs Should Do in the Era of “Owning AI”
AI has shifted from being something to “borrow” to something to “own.”
The essence of this change is that the value of AI has moved from “the intelligence of the model” to “the proximity of data” and “the speed of deployment.”
And these two factors are strengths that SMEs already possess.
There’s no need to mimic large corporations. There’s no need to build an AI infrastructure costing hundreds of millions of yen. Just feed your company’s data into your own PC and start using it tomorrow. That’s all you need to do.
There’s only one thing to focus on: “Just start experimenting.” That’s it.
The technology has caught up. The costs have collapsed. Now, it’s just a matter of whether you take action or not.