Open Source AI Surpasses GPT-5.5 — Will the Arrival of GLM-5.2 Make ‘Zero Monthly API Fees’ a Reality? What Small and Medium Enterprises Should Consider Now

Conclusion First: "Free AI" Has Surpassed GPT-5.5 The open-source LLM "GLM-5.2" announced by the Chinese AI lab "Z.ai"

By Kai

June 19, 2026 | Last updated June 19, 2026

June 20, 2026

A World Where GPU Costs Are Halved Is Coming — What Amazon’s AI Chip Outsourcing and Baseten’s $1.5 Billion Funding Mean for Small and Medium Enterprises

May 24, 2026

Laid-off Talent from Big Corporations Becomes the Strongest Asset for SMEs—The ‘Reversal Structure’ Created by Mass Layoffs at Cloudflare, Meta, and Standard Chartered

Conclusion First: “Free AI” Has Surpassed GPT-5.5

The open-source LLM “GLM-5.2” announced by the Chinese AI lab “Z.ai” has outperformed OpenAI’s latest model, GPT-5.5, in benchmarks.

If that were all, it would just be another story of “the Chinese are doing well again.” But this time, the structure is different. GLM-5.2 is open-source. This means that anyone can download and use it. The API cost is zero. There are no monthly fees. Commercial use is also possible depending on the licensing terms.

For small and medium enterprises that are currently paying 30,000 yen a month to OpenAI, this is not just a “savings story.” It fundamentally disrupts the entire expenditure structure.

What Makes GLM-5.2 Impressive — A Look at the Numbers

Let’s summarize the basic specs of GLM-5.2.

Parameter Count: 753B (753 billion)
Architecture: Mixture of Experts (MoE), with approximately 40B active parameters
Use Case: Specialized in text generation
License: Open Source (Apache 2.0)

MoE is a system that activates only the necessary “experts” based on the input, rather than using all parameters at all times. Even with a massive model of 753B, only about 40B are actively used, which helps keep inference costs down.

Notably, in coding task benchmarks, GLM-5.2 reportedly achieves scores equal to or greater than GPT-5.5 while operating at one-sixth the cost in API terms. If we assume the input token price for GPT-5.5 is $15 per 1M tokens, processing equivalent to GLM-5.2 would cost around $2.5.

Of course, benchmarks are not infallible. The “usability” and “accuracy in Japanese” in practical applications are separate issues. However, the fact that open-source has matched top-tier commercial models is significant in itself.

Is “Zero API Fees” Real? — The Hidden Costs Small and Medium Enterprises Might Overlook

Now we get to the main point. The statement “it’s free because it’s open-source” is half true and half false.

To run GLM-5.2 in-house, the following costs are incurred.

1. GPU (Hardware) Costs

To run a 753B MoE model, at least several GPUs with 80GB VRAM are required. For NVIDIA A100 (80GB), you would need about 2 to 4 units, and for H100, around 2 units.

Purchasing 4 A100s: Approximately 6 to 8 million yen
Renting cloud GPUs (e.g., AWS p4d.24xlarge): Monthly cost of 500,000 to 800,000 yen

Paying 500,000 yen a month for GPU costs to eliminate a 30,000 yen API fee is counterproductive. This is the biggest pitfall.

2. Exploring Practical Solutions with Quantization

However, using a technique called quantization can change the situation. This method sacrifices some model accuracy to significantly reduce the required VRAM. With 4-bit quantization, it may be possible to run on about 2 RTX 4090s (VRAM 24GB each).

Purchasing 2 RTX 4090s: Approximately 600,000 to 800,000 yen
Electricity costs: Monthly about 5,000 to 8,000 yen (if running 24/7)

With this setup, the initial investment would be 700,000 yen plus a monthly electricity cost of 8,000 yen. If the API fee is 30,000 yen per month, it would take about 2.5 years to break even. However, the extent of accuracy degradation due to quantization needs to be verified in your own use case.

3. Labor Costs and Verification Efforts

Another often-overlooked factor is the labor costs for setup and operation.

Deploying an open-source model involves tasks such as environment setup, inference server configuration, prompt tuning, and output quality verification. Outsourcing to an AI engineer could cost 300,000 to 500,000 yen per project, and even if done in-house, there are learning costs for the responsible personnel.

This is the biggest bottleneck for small and medium enterprises. Technically feasible, but “there are no people who can do it.”

Organizing Realistic Options

Option	Estimated Monthly Cost	Initial Investment	Required Skills
GPT-5.5 API	30,000 yen or more	Zero	Low
GLM-5.2 (Cloud GPU)	500,000 to 800,000 yen	Zero	Medium to High
GLM-5.2 (In-house GPU with Quantization)	8,000 yen	700,000 yen	High
GLM-5.2 (Via API Service)	5,000 yen or more	Zero	Low

The last row is particularly noteworthy. Open-source models like GLM-5.2 are increasingly being offered at low costs through third-party API services. Services like Together AI, Fireworks AI, and Groq allow you to use open-source models at one-third to one-sixth the cost of GPT-5.5. Even without owning GPUs, you can still benefit from cost reductions.

ChatGPT Market Share Falls Below 50% — What’s Happening?

Another crucial piece of data is that ChatGPT’s market share has fallen below 50%.

A year ago, ChatGPT held an overwhelming share of the generative AI chat market. Now, it has lost the majority to the rise of Claude, Gemini, and open-source competitors.

This is not a story of “OpenAI becoming weaker.” It signifies that the performance gap in AI is narrowing, and we are entering an era where “it doesn’t make much difference which AI you use.”

When performance differences diminish, what will separate the contenders? Cost and optimization for one’s own business.

This presents an opportunity for small and medium enterprises.

The “Reversal Structure” for Small and Medium Enterprises

Large companies enter into annual contracts with OpenAI or Google, build dedicated environments, and invest hundreds of millions of yen to implement AI. Small and medium enterprises do not need to mimic this.

Now that the performance of open-source AI has caught up with commercial models, small and medium enterprises have a “way of fighting that only they can do because they are small.”

1. Narrowing Use Cases Dramatically Reduces Costs

Large companies seek “AI that can do everything.” Small and medium enterprises are different. “Automatic generation of estimates,” “drafting inquiry emails,” “summarizing meeting minutes” — when the use cases are clear, smaller models suffice. The 40B active parameters of GLM-5.2 may even be excessive. A specialized model with 7B to 14B parameters could run on a single RTX 4060 (around 50,000 yen).

2. No Need to Expose Data Externally

Using OpenAI’s API means that your company data passes through OpenAI’s servers. Customer information, partner data, internal know-how. Not every company can say, “We don’t mind.” Running an open-source model in-house means that data never leaves your premises. This is a significant reassurance for small and medium enterprise owners, even more than cost savings.

3. Question the Conventional Wisdom of “Paying Monthly for AI”

The model of continuously paying monthly for SaaS is a fantastic business for providers. But what about for users? 30,000 yen per month for 12 months equals 360,000 yen annually. Over five years, that’s 1.8 million yen. During that time, open-source models will undergo generational changes and continue to improve in performance.

The turning point from “continuing to pay” to “owning it yourself” is arriving right now.

So, What Should You Do?

Here are three things you can start doing today.

1. First, take stock of your company’s AI usage fees. ChatGPT Plus (monthly $20 per person), API fees, and other AI tools. Do you have an accurate grasp of how much you are paying each month?

2. Try GLM-5.2 via API. Before purchasing GPUs for your company, try throwing your current prompts at services like Together AI or Fireworks AI. If it performs comparably to GPT-5.5, switching could reduce costs to one-third to one-sixth.

3. There’s no rush to “run it in-house.” Quantization and local deployment can be confirmed for effectiveness after step 2. The technical hurdles are certainly lowering. In six months, it will become even easier.

Open-source AI has surpassed top commercial models. This is not a temporary phenomenon but a structural change. The rationality of continuously paying monthly for APIs is diminishing at this very moment.

I want to ask: Will your company continue to pay OpenAI 30,000 yen next month?

TOPICS

WORLD INSIGHT

Open Source AI Surpasses GPT-5.5 — Will the Arrival of GLM-5.2 Make ‘Zero Monthly API Fees’ a Reality? What Small and Medium Enterprises Should Consider Now

Conclusion First: “Free AI” Has Surpassed GPT-5.5

What Makes GLM-5.2 Impressive — A Look at the Numbers

Is “Zero API Fees” Real? — The Hidden Costs Small and Medium Enterprises Might Overlook

1. GPU (Hardware) Costs

2. Exploring Practical Solutions with Quantization

3. Labor Costs and Verification Efforts

Organizing Realistic Options

ChatGPT Market Share Falls Below 50% — What’s Happening?

The “Reversal Structure” for Small and Medium Enterprises

1. Narrowing Use Cases Dramatically Reduces Costs

2. No Need to Expose Data Externally

3. Question the Conventional Wisdom of “Paying Monthly for AI”

So, What Should You Do?

POPULAR ARTICLES

The Young Prince Who Bears the Future Throne: His True Self and Determined Resolve

MVP Shohei Ohtani and Joe Maddon, the Creator of True Two-Way

Self-Defense Forces Arrive in a Town Without Doctors, and the Museum Disappears—A Memorandum on How the ‘National Hand’ Reaches Iwakuni

Young Japanese Player Challenges the Unsung World of India’s Pro Kabaddi

Related Articles

“Reasoning Models Lie” — What Small and Medium Enterprises Should Know Before Losing Hundreds of Thousands to AI’s Answers

A Company of 30 Engineers Creates an App with AI, While Zuckerberg Cuts 8,000 Jobs—Why Small and Medium Enterprises Benefit Structurally in an Era Where the Meaning of ‘Headcount’ Has Broken Down

The Era of Running Multiple AIs on a Single GPU: Technology Halving Inference Costs Makes “50,000 Yen Monthly AI Operations” a Reality

LLM Inference Costs Cut by 70%, Cash Savings of 95%—Five Things Companies Still Paying 100,000 Yen a Month Should Do in a Week Where ‘AI is Expensive’ is Over

POPULAR ARTICLES

The Young Prince Who Bears the Future Throne: His True Self and Determined Resolve

MVP Shohei Ohtani and Joe Maddon, the Creator of True Two-Way

Self-Defense Forces Arrive in a Town Without Doctors, and the Museum Disappears—A Memorandum on How the ‘National Hand’ Reaches Iwakuni

Young Japanese Player Challenges the Unsung World of India’s Pro Kabaddi

TOPICS

WORLD INSIGHT