LLMs Run in Browsers, Raspberry Pi Gains AI, and News Summarized in 325 Lines—Calculating the Monthly Costs of the “Era Without AI Servers”

By Kai

Monthly 500 Yen vs. Monthly Tens of Thousands of Yen. The Cost Structure of AI Changes with How It Is “Run”

We may no longer need servers to use AI.

Enter “WebLLM,” which runs LLMs (Large Language Models) directly in the browser. The “AI HAT+” adds an AI inference chip to the Raspberry Pi 5, which costs under 10,000 yen. And there’s a project that automates news summarization with just 325 lines of Python code.

Put these three together and one thing becomes clear: the cost of using AI is changing dramatically. Processing that used to cost tens of thousands of yen a month through cloud APIs can now run for a few hundred yen a month in electricity.

For small and medium-sized enterprises (SMEs), this is not just “interesting technology news.” It is a business decision.

WebLLM—Turning Browsers into AI Servers

WebLLM is an open-source engine that performs LLM inference within the browser. It utilizes WebGPU to run the model directly on the user’s local GPU.

In other words, no API keys or servers are required. Simply opening a browser allows chat AI to operate.

What changes?

  • API usage fees become zero. You are no longer charged per request, as you are with cloud APIs such as OpenAI’s or Claude.
  • Data stays in-house. There’s no need to send customer information or internal documents to the cloud, which removes one of the most common worries among local SMEs: “Is our data safe?”
  • It works even with a slow internet connection. Once the initial model download (a few GB) is complete, inference can be done offline.

Of course, there are limitations. There is a cap on the model size that can run in a browser, so performance on par with GPT-4 is not achievable. Currently, the practical range is models in the Llama 3 8B or Phi-3 class, with parameter counts in the single-digit billions.
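To make the “a few GB” download figure concrete, here is a rough back-of-the-envelope sketch. It assumes 4-bit quantized weights, which is an assumption for illustration and not something the projects above prescribe; real model files also carry some overhead beyond the raw weights.

```python
def approx_model_size_gb(num_params: float, bits_per_weight: int) -> float:
    """Rough weight-only size in GB (10^9 bytes).

    Ignores tokenizer files, metadata, and runtime memory overhead.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Parameter counts: 8B for a Llama 3 8B class model, 3.8B for Phi-3-mini.
# The 4-bit setting is an assumed quantization level.
for name, params in [("8B-class model", 8e9), ("Phi-3-mini-class model", 3.8e9)]:
    size = approx_model_size_gb(params, bits_per_weight=4)
    print(f"{name}: roughly {size:.1f} GB of weights at 4-bit")
```

Under these assumptions an 8B model lands at around 4 GB, which is why models much larger than this class quickly become impractical to download and hold in a browser.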

But consider this: Is GPT-4 really necessary for internal inquiry responses, generating templates, or summarizing meeting minutes? There are actually many tasks where an 8B model is sufficient.

Using the “highest performance model” and using a model that is “adequate for business needs” are entirely different matters. SMEs will succeed when they can make the latter judgment.

Raspberry Pi 5 + AI HAT+—Building a “Company AI Server” for 25,000 Yen

The Raspberry Pi 5 is a compact computer priced at around 10,000 yen. Adding the “AI HAT+” expansion board with Hailo’s AI accelerator (also around 10,000 yen) gives it AI inference performance of up to 26 TOPS (26 trillion operations per second). Note that the 26 TOPS figure applies to the Hailo-8 model; the Hailo-8L model listed in the breakdown below is the 13 TOPS version.

Here’s a realistic breakdown of the initial investment:

Item | Price (approx., including tax)
Raspberry Pi 5 (8GB) | About 12,000 yen
AI HAT+ (Hailo-8L equipped) | About 10,000 yen
microSD, power supply, case, etc. | About 3,000 yen
Total | About 25,000 yen

The monthly operating cost is essentially just the electricity bill. The official power supply for the Raspberry Pi 5 is rated at 27W, so treat that as the upper bound. Including the AI HAT+, assume around 30W. Even running 24 hours a day, the monthly electricity cost is about 700 yen (calculated at 32 yen per kWh).
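The electricity figure can be checked with a few lines of arithmetic. This is a minimal sketch using the article’s numbers (30W, 24 hours a day, 32 yen per kWh); the 30-day month is an assumption.

```python
def monthly_electricity_yen(
    watts: float,
    hours_per_day: float = 24,
    days: int = 30,          # assumed month length
    yen_per_kwh: float = 32,
) -> float:
    """Electricity cost for a device drawing a constant load."""
    kwh = watts / 1000 * hours_per_day * days
    return kwh * yen_per_kwh


# 30W around the clock: 21.6 kWh a month
print(round(monthly_electricity_yen(30)))  # 691
```

Because 30W is a worst-case draw, the real bill is likely to be lower.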

What can this setup do?

  • Automated inspection through image recognition. Analyzing camera footage from the production line in real time to detect defective products.
  • Visitor counting and flow analysis. Automatically aggregating the number of people and dwell time from store camera footage.
  • Real-time transcription of audio. Automating the recording of meetings and phone calls.

Importantly, all of this operates without an internet connection. It can complete tasks at the edge, even in factories or rural stores with unstable connections.

Using cloud image recognition APIs (like AWS Rekognition or Google Vision AI) incurs ongoing costs of hundreds to thousands of yen per 1,000 images. For a factory processing 100,000 images a month, the API costs alone can reach tens of thousands of yen. With the Raspberry Pi, the initial cost is 25,000 yen and the monthly electricity bill is about 700 yen. At API costs of 10,000 yen a month or more, it pays for itself within about three months.
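How quickly the hardware pays for itself depends heavily on the per-image rate you would otherwise pay. The sketch below computes the payback period for a few assumed rates; the three rates are illustrative values within the “hundreds to thousands of yen per 1,000 images” range, not quotes from any provider.

```python
def payback_months(
    initial_yen: float, cloud_monthly_yen: float, edge_monthly_yen: float
) -> float:
    """Months until the one-off hardware cost is covered by monthly savings."""
    saving = cloud_monthly_yen - edge_monthly_yen
    if saving <= 0:
        raise ValueError("edge setup never pays back at these rates")
    return initial_yen / saving


IMAGES_PER_MONTH = 100_000
INITIAL_YEN = 25_000      # Raspberry Pi 5 + AI HAT+ + accessories
EDGE_MONTHLY_YEN = 700    # electricity

for yen_per_1000 in (100, 300, 1000):  # assumed API rates
    api_monthly = IMAGES_PER_MONTH / 1000 * yen_per_1000
    months = payback_months(INITIAL_YEN, api_monthly, EDGE_MONTHLY_YEN)
    print(f"{yen_per_1000} yen/1,000 images: API {api_monthly:,.0f} yen/month, "
          f"payback in {months:.1f} months")
```

Plugging in your own image volume and the rate from your provider’s price list gives a more reliable answer than any general rule of thumb.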

325 Lines of Python Code: Ending Person-Dependent “Information Gathering”

The third example is an AI news summarization tool that operates with 325 lines of Python code.

The mechanism is simple. It automatically collects articles from RSS feeds and news sites, then generates summaries using a local LLM (or minimal API usage). Every morning, a summary of the news relevant to your industry arrives.
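The collect-then-summarize flow can be sketched with the standard library alone. This is not the 325-line project itself: the sample feed, the function names, and the `first_sentence` placeholder (which stands in for a local LLM or API call) are all hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Hypothetical sample feed; a real script would download this from an RSS URL.
SAMPLE_RSS = """<rss version="2.0"><channel><title>Example Feed</title>
<item><title>First headline</title><link>https://example.com/1</link>
<description>First sentence of the story. More detail follows here.</description></item>
<item><title>Second headline</title><link>https://example.com/2</link>
<description>Another story begins. It continues for a while.</description></item>
</channel></rss>"""


def parse_items(rss_xml: str) -> list[dict]:
    """Extract title, link, and description from each RSS <item>."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
            "description": item.findtext("description", default="").strip(),
        }
        for item in root.iter("item")
    ]


def first_sentence(text: str) -> str:
    """Placeholder summarizer: swap in a local LLM or API call here."""
    if not text:
        return ""
    return text.split(". ")[0].rstrip(".") + "."


def build_digest(rss_xml: str, summarize: Callable[[str], str] = first_sentence) -> str:
    """Produce one line per article in a consistent format."""
    return "\n".join(
        f"- {it['title']}: {summarize(it['description'])} ({it['link']})"
        for it in parse_items(rss_xml)
    )


print(build_digest(SAMPLE_RSS))
```

Passing the summarizer in as a function keeps the feed handling independent of whichever model ends up doing the summarizing.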

325 lines. To give a sense of scale, that is a manageable amount for someone who can write simple Excel macros. No specialist AI engineers are needed.

There’s a reason this resonates with SMEs on the ground.

In many companies, checking “industry news” depends on specific individuals. The president spends 30 minutes every morning reading the Nikkei. The sales manager gathers information on X. If that person is absent, the flow of information stops.

The 325-line script breaks this dependence on individuals. Every morning, at a set time, from designated sources, summaries arrive in a consistent format. It doesn’t depend on any one person, and it’s reproducible. This is the essence of systematization.

What about costs? If everything runs on a local LLM, it’s the roughly 700 yen a month of the Raspberry Pi setup described above. Even with a cloud API configuration, summarizing 50 news articles a day comes to 1,500 summaries a month. Calculated at the API pricing for GPT-4o mini, that is about 100 to 300 yen per month.
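The API estimate can be reproduced with a small token-cost calculation. Every number in this sketch other than the 1,500 summaries a month is an assumption: the token counts per article, the per-million-token prices, and the exchange rate are placeholders that should be replaced with the provider’s current price list and your own measurements.

```python
def monthly_api_cost_yen(
    summaries_per_month: int,
    input_tokens_each: int,
    output_tokens_each: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
    yen_per_usd: float,
) -> float:
    """Pay-as-you-go cost for a summarization workload, converted to yen."""
    input_usd = summaries_per_month * input_tokens_each / 1_000_000 * usd_per_million_input
    output_usd = summaries_per_month * output_tokens_each / 1_000_000 * usd_per_million_output
    return (input_usd + output_usd) * yen_per_usd


cost = monthly_api_cost_yen(
    summaries_per_month=50 * 30,   # 50 articles a day
    input_tokens_each=2_000,       # assumed article length
    output_tokens_each=300,        # assumed summary length
    usd_per_million_input=0.15,    # assumed price, check the current list
    usd_per_million_output=0.60,   # assumed price, check the current list
    yen_per_usd=150,               # assumed exchange rate
)
print(f"about {cost:.0f} yen per month")
```

Longer articles or a weaker yen push the figure up, which is consistent with the range rather than a single number quoted above.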

Now convert the president’s 30 minutes into money. If the president earns 10 million yen a year, the hourly rate is about 5,000 yen (assuming roughly 2,000 working hours a year). That’s 2,500 yen for 30 minutes. Over 20 business days a month, that’s 50,000 yen. You can automate a task worth 50,000 yen a month for just 300 yen. This is what it means for “cost structures to change dramatically.”
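The same conversion works for any recurring task, so it is worth having as a reusable calculation. In this sketch the 2,000 working hours a year is an assumption chosen because it reproduces the 5,000 yen hourly rate used above.

```python
def monthly_task_value_yen(
    annual_salary_yen: float,
    minutes_per_day: float,
    business_days_per_month: int = 20,
    working_hours_per_year: float = 2_000,  # assumed
) -> float:
    """Labor value of a daily task, priced at the person's hourly rate."""
    hourly_rate = annual_salary_yen / working_hours_per_year
    return hourly_rate * (minutes_per_day / 60) * business_days_per_month


value = monthly_task_value_yen(annual_salary_yen=10_000_000, minutes_per_day=30)
print(f"{value:,.0f} yen per month")  # 50,000 yen per month
```

Running it for a manager’s or a sales representative’s salary shows which daily routines are worth automating first.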

Monthly Cost Comparison: “Cloud Pay-As-You-Go” vs. “Zero Infrastructure AI”

Based on these three technologies, let’s compare monthly costs for a hypothetical SME with 30 employees.

Pattern A: Cloud API Dependent

Item | Monthly Cost
LLM API usage (internal chat, summarization, etc.) | About 15,000 to 30,000 yen
Image recognition API (inspection, analysis, etc., 50,000 images a month) | About 5,000 to 15,000 yen
Cloud storage and transfer fees | About 2,000 to 5,000 yen
Total | About 22,000 to 50,000 yen/month

Costs increase in proportion to usage. They can spike during busy periods.

Pattern B: Zero Infrastructure AI

Item | Cost
Raspberry Pi 5 + AI HAT+ (initial) | About 25,000 yen
PC for WebLLM (reuse existing PC) | 0 yen
Monthly electricity cost | About 700 yen
News summarization API (minimum usage) | About 300 yen
Monthly Total | About 1,000 yen/month

The initial investment of 25,000 yen can be recouped in three months.

At the upper end of the range, monthly costs drop from 50,000 yen to 1,000 yen. That’s a factor of 50. Over a year, that’s a difference of about 600,000 yen. For an SME with 30 employees, 600,000 yen is not just a year-end party budget; it’s enough to pay a new employee for a month.
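The totals and the annual difference follow directly from the line items in the two tables. The sketch below recomputes them; the dictionary layout and variable names are illustrative only, while the figures are the ones listed above.

```python
# (low, high) monthly estimates in yen for Pattern A
pattern_a = {
    "LLM API usage": (15_000, 30_000),
    "Image recognition API": (5_000, 15_000),
    "Cloud storage and transfer": (2_000, 5_000),
}
# Monthly costs in yen for Pattern B (hardware is a one-off cost, excluded here)
pattern_b = {
    "Electricity": 700,
    "News summarization API": 300,
}

a_low = sum(low for low, _ in pattern_a.values())
a_high = sum(high for _, high in pattern_a.values())
b_monthly = sum(pattern_b.values())

annual_diff_high = (a_high - b_monthly) * 12
ratio_high = a_high / b_monthly

print(f"Pattern A: {a_low:,} to {a_high:,} yen/month")
print(f"Pattern B: {b_monthly:,} yen/month")
print(f"Annual difference at the upper end: {annual_diff_high:,} yen")
print(f"Cost ratio at the upper end: {ratio_high:.0f}x")
```

The upper-end annual difference comes out at 588,000 yen, which is the “about 600,000 yen” quoted above; at the lower end of Pattern A the difference is smaller, so it is worth running the numbers with your own usage.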

However, It’s Not All-Powerful. Criteria for Choosing Between the Two

To avoid misunderstandings, let’s be clear: zero infrastructure AI is not all-powerful.

  • For tasks requiring advanced reasoning (legal review of contracts, complex data analysis) → Large-scale cloud models are necessary.
  • If you want to train on a large amount of data across various domains → Local resources may be insufficient.
  • When both real-time performance and high accuracy are required → There are limits to edge-only solutions.

The criterion is simple: “Is GPT-4 class performance really necessary for this task?” If the answer is No, then zero infrastructure is sufficient. And for perhaps 80% of daily tasks in SMEs, the answer is likely to be No.

So, What Should We Do?

I propose three steps.

1. Start with the 325-line news summarization. No investment is required; it runs on an existing PC with Python. The goal is for at least one person in the company to experience first-hand how AI changes day-to-day work. This will be the starting point.

2. Try WebLLM in-house. Just open a browser. Experiment with handing over “daily tasks that don’t require much thought,” such as internal FAQ responses or template email creation, to AI. Since data doesn’t leave the premises, it’s easier to get approval from the IT department.

3. Once you see the effects, build out the edge with Raspberry Pi. Offload tasks that require constant operation, like image recognition or audio processing, to the edge. With an investment of 25,000 yen, you can be freed from cloud billing for those tasks.

There’s no need to do everything at once. Start small and demonstrate cost savings with numbers. Those numbers will provide the basis for the next investment.

The term “democratization of AI” has become a cliché, but what’s really happening is a collapse in AI prices. Server costs are disappearing, API costs are shrinking, and what remains is mostly the electricity bill. SMEs that recognize this structural change will be in a stronger position. While large companies wait for internal approvals, those who act first will win.
