The Break-Even Point for Local AI is 2.6 Years—From ‘Renting AI’ to ‘Owning AI,’ the Cost Structure of Small and Medium Enterprises is Flipped

By Kai

Cloud AI: Pay 50,000 Yen a Month or Own It?

To get straight to the point: if a company with 30 employees is spending over 600,000 yen a year on cloud AI, it should seriously consider the option of ‘owning AI’ right now. The break-even point is 2.6 years. In other words, from the third year onward, the money previously spent on the cloud is saved, apart from electricity and other running costs.

This figure has been made possible by the cost disruption of open models represented by DeepSeek and the rapid decline in hardware prices. The option of ‘local AI’—that is, running AI on the company’s own machines—is no longer exclusive to large corporations. For regional small and medium enterprises, this marks a structural turning point.

What DeepSeek Has Disrupted—The Common Belief That ‘AI is Expensive’

What has DeepSeek accomplished? In short, it has enabled models with GPT-4 class performance to be run at an extraordinarily low cost.

OpenAI’s GPT-4 Turbo costs about $10 (approximately 1,500 yen) per million input tokens and $30 (approximately 4,500 yen) per million output tokens. In contrast, DeepSeek-V3 costs just $0.27 (around 40 yen) per million input tokens and $1.10 (around 165 yen) per million output tokens. Even when looking solely at API usage, the cost is less than one-tenth.
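The price gap can be checked with a few lines of Python. This is a minimal sketch: the per-million-token prices are the ones quoted above, the 150 yen per dollar rate is an assumption inferred from the yen figures in the text, and the workload of one million input plus one million output tokens is purely illustrative.

```python
# Assumption: exchange rate implied by "$10 is approximately 1,500 yen"
YEN_PER_USD = 150

# USD per one million tokens, as quoted in the text
PRICES = {
    "gpt-4-turbo": {"input": 10.00, "output": 30.00},
    "deepseek-v3": {"input": 0.27, "output": 1.10},
}


def api_cost_yen(model, input_tokens, output_tokens):
    """Return the API cost in yen for the given token counts."""
    price = PRICES[model]
    usd = (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000
    return usd * YEN_PER_USD


# Illustrative workload: one million tokens in, one million tokens out
gpt = api_cost_yen("gpt-4-turbo", 1_000_000, 1_000_000)
ds = api_cost_yen("deepseek-v3", 1_000_000, 1_000_000)
print(f"GPT-4 Turbo: {gpt:,.0f} yen / DeepSeek-V3: {ds:,.0f} yen / ratio: {gpt / ds:.1f}x")
```

Swapping in your own monthly input and output token counts gives a rough idea of what a pure API bill would look like at these list prices.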

However, the real impact is not the API fees. DeepSeek’s model uses open weights—meaning it can be downloaded and run on the company’s own machines. This eliminates usage-based charges entirely, leaving only electricity costs and hardware depreciation. This is where the distinction between ‘renting’ and ‘owning’ becomes clear.

Breaking Down the 2.6-Year Break-Even Point

Let’s do some specific calculations.

In the case of Cloud AI (Renting):

  • Assuming 30 employees use an average of 2,000 tokens (input and output combined) per day
  • Monthly total token usage: 30 employees × 2,000 tokens × 22 working days = approximately 1.32 million tokens
  • For GPT-4 Turbo, the monthly cost is roughly 30,000 to 50,000 yen
  • Annually, that amounts to 360,000 to 600,000 yen. It’s not uncommon for usage to exceed 1 million yen annually.

In the case of Local AI (Owning):

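  • To run a 70B-parameter distilled DeepSeek model (for example, DeepSeek-R1-Distill-Llama-70B) in quantized form, a single NVIDIA RTX 4090 (24GB of VRAM, approximately 300,000 yen) can serve as the starting point, with part of the model offloaded to system RAM.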
  • If you build a workstation with two RTX 4090s for a bit more headroom, the hardware cost will be around 800,000 to 1 million yen.
  • Monthly electricity costs will be about 5,000 to 8,000 yen, even when running 24 hours a day.
  • Annual running costs will be approximately 70,000 to 100,000 yen.
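The electricity figure can be sanity-checked with simple arithmetic. The sketch below assumes an average draw of 300 W and a rate of 30 yen per kWh; neither number comes from the article, so substitute your own measurements and tariff.

```python
def monthly_electricity_yen(avg_watts, yen_per_kwh, hours_per_day=24, days=30):
    """Estimate the monthly electricity cost of a machine that is always on."""
    kwh = avg_watts / 1000 * hours_per_day * days
    return kwh * yen_per_kwh


# Assumed values, not from the article: 300 W average draw, 30 yen per kWh
monthly = monthly_electricity_yen(avg_watts=300, yen_per_kwh=30)
annual = monthly * 12
print(f"Monthly: about {monthly:,.0f} yen / Annual: about {annual:,.0f} yen")
```

Under these assumptions the result lands inside the 5,000 to 8,000 yen monthly range given above. A machine that idles most of the day would come in lower.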

In other words, compare the ‘owning’ model (an initial investment of 1 million yen plus annual running costs of 100,000 yen) with the ‘renting’ model at 600,000 yen a year. Owning saves 500,000 yen a year, so the 1 million yen investment is recouped in about 2 years. For companies with high usage, it may even be possible to recover the costs in 1.5 years. Conversely, if usage is low, it could take over 3 years. An average of 2.6 years is a very realistic figure.
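The payback calculation is easy to reproduce for your own numbers. The following is a minimal sketch: the function name `payback_years` and the three scenarios are illustrative, built from the cost ranges quoted in this section rather than from any real company's data.

```python
def payback_years(hardware_yen, cloud_annual_yen, running_annual_yen):
    """Years needed for avoided cloud fees to cover the hardware purchase."""
    annual_saving = cloud_annual_yen - running_annual_yen
    if annual_saving <= 0:
        # Owning never pays off if running costs match or exceed the cloud bill
        return float("inf")
    return hardware_yen / annual_saving


# Scenarios assembled from the figures in this section (all in yen)
base = payback_years(1_000_000, 600_000, 100_000)     # 2.0 years
heavy = payback_years(1_000_000, 1_000_000, 100_000)  # cloud bill of 1 million yen a year
light = payback_years(1_000_000, 360_000, 100_000)    # cloud bill of 360,000 yen a year

for label, years in [("base", base), ("heavy use", heavy), ("light use", light)]:
    print(f"{label}: {years:.1f} years")
```

The result is most sensitive to the annual cloud bill, which is why adding up your actual spending is the first thing to do.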

Once the hardware is paid off, about 500,000 yen is saved every year. Five such years add up to 2.5 million yen. This is a significant sum for small and medium enterprises.

768GB Memory and 1 Trillion Parameters—The Era Where ‘Gigantic AI’ Can Run Locally

“But doesn’t local AI have lower performance?”

This concern is valid. However, technological advancements are rapidly alleviating this worry.

Recent experiments have reported success in running a 1 trillion parameter LLM in a single GPU environment using 768GB of system memory (not GPU memory, but regular RAM). The processing speed is about 4 tokens per second. While this may be slow for real-time conversations, it is sufficiently practical for batch processing tasks such as document summarization, report generation, and data analysis.
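A back-of-the-envelope calculation shows why 768GB matters. The sketch below estimates the memory needed for model weights alone at different precisions and the time to generate text at a given speed. The precision levels and the 1,000-token output length are assumptions for illustration; the report does not say which precision was used, and real deployments need extra memory for the context cache and runtime overhead.

```python
def weights_gb(params, bits_per_param):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def generation_seconds(tokens, tokens_per_second):
    """Time needed to generate the given number of tokens."""
    return tokens / tokens_per_second


for name, params in [("70B", 70e9), ("1T", 1e12)]:
    for bits in (16, 8, 4):
        print(f"{name} model at {bits}-bit: about {weights_gb(params, bits):,.0f} GB")

# Hypothetical batch job: a 1,000-token summary at the reported 4 tokens per second
print(f"1,000-token summary: about {generation_seconds(1000, 4):.0f} seconds")
```

Under these assumptions, a 1 trillion parameter model fits in 768GB only when quantized to roughly 4 bits per parameter, and a few minutes per document is acceptable for overnight batch work.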

Of course, small and medium enterprises do not need to purchase machines with 768GB of memory. The key point is that “the technical ceiling continues to rise.” Today’s 70B parameter models deliver performance equal to or greater than last year’s cutting-edge models at less than one-tenth of the cost. This trend is accelerating and shows no signs of stopping.

Three Reasons Small and Medium Enterprises Can Win with Local AI

Now, let’s get to the main point. Local AI is actually more beneficial for small and medium enterprises than for large corporations. There are three reasons for this.

1. No Need to Expose Data Externally

Regional small and medium enterprises handle customer information, estimates, and contracts with business partners, and sending such data to a cloud AI service carries high psychological and legal hurdles. With local AI, the data never leaves the company’s machines. For companies that have delayed AI adoption because “we can’t expose our data,” this is a decisive difference.

2. Ability to Specialize in Company Operations

Cloud AI is generic. It can do a little bit of everything but struggles with “industry-specific terminology” and “internal company rules.” With local AI, you can train it on your own manuals and past proposals to create a company-specific AI. Customizations that large corporations used to spend tens of millions of yen on can now be done for just a few hundred thousand yen.

3. Faster Decision-Making

How many levels of approval does a large corporation need to implement AI? Security reviews, legal checks, vendor selection. It’s not uncommon for this to take six months. In a small or medium enterprise, if the president says, “Let’s do it,” action can be taken as soon as next week. When technology democratizes, the first beneficiaries are the ‘organizations that can move quickly.’

So, What Should Be Done?

I propose three steps.

Step 1: Visualize Current AI Costs (Something You Can Do Today)

Add up what you pay each month for ChatGPT’s Team plan, Copilot licenses, and API usage. If the total exceeds 300,000 yen a year, local AI is worth considering.

Step 2: Test on a Small Scale (Can Be Done in a Week)

Install Ollama on your PC and try running small models of DeepSeek or Llama (around 8B parameters). It’s free. Experience the reality of “AI running locally” firsthand. You’ll be surprised by the performance.
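Once Ollama is running, it can also be called from a script through its local HTTP API. The sketch below uses only the Python standard library and assumes Ollama's default address (`http://localhost:11434`) and a model tag of `llama3.1:8b` that has already been pulled; the tag is an example, so check the Ollama model library for what is currently available. The helper names `build_request` and `ask_local_model` are made up for this illustration.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="llama3.1:8b"):
    """Build the HTTP request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask_local_model(prompt, model="llama3.1:8b"):
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=300) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running Ollama server with the model already pulled):
# print(ask_local_model("Summarize the benefits of running AI locally in two sentences."))
```

Nothing in this exchange leaves the machine, which makes it a convenient way to try the model on a real internal document during the test week.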

Step 3: Calculate the Profit and Loss for Full Implementation (Can Be Done in a Month)

Based on the costs from Step 1 and the experience from Step 2, calculate the initial investment and payback period for ‘owning a GPU machine.’ Hardware costs will be around 800,000 to 1 million yen, with annual running costs of 100,000 yen. If it can be recouped in 2 to 3 years compared to the annual costs of cloud AI, then it’s a GO.

This Trend Will Not Stop

Six months ago, local AI was seen as a “hobby for tech enthusiasts.” Now, it is becoming a “business decision.”

DeepSeek has disrupted costs, the performance of open models is improving every month, and hardware prices are dropping every year. While the costs of ‘renting AI’ accumulate the more you use it, the costs of ‘owning AI’ decrease over time. Companies that have recognized this asymmetry are starting to take action.

A break-even point of 2.6 years. This number will likely drop to 1.5 years by next year. If there’s a reason to wait, it’s because “it will get a little cheaper,” but in the meantime, payments to the cloud will continue.

It’s time to seriously consider the option of ‘owning AI.’
