Korea’s $1 Trillion, Amazon’s 100 Billion Yen, Etched’s $5 Billion: When and How Much Will AI Chip Investments Lower API Costs for Small and Medium Enterprises?
Conclusion
Let’s get straight to the point: “Prices will go down” is certain. The questions are “when” and “by how much.”
Korea is set to invest $1 trillion (approximately 150 trillion yen) in memory chips and AI infrastructure. Amazon has launched a new organization, “FDE,” with a budget of 100 billion yen, aimed at promoting the rapid deployment of AI agents to businesses. AI chip startup Etched has achieved a valuation of $5 billion and already has $1 billion in orders.
These three news items may seem to concern only large corporations and the tech industry. For SMEs, however, they raise just one question.
“How much will these investments ultimately reduce the API costs that local small and medium enterprises (SMEs) have to pay?”
Currently, OpenAI’s GPT-4o costs $2.50 per 1 million input tokens and $10 per 1 million output tokens. Claude 3.5 Sonnet costs $3 and $15, respectively. It is not uncommon for SMEs to pay tens of thousands of yen in API fees each month. Running a single chatbot for customer inquiries can cost between 50,000 and 100,000 yen a month.
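To see how per-token prices turn into a monthly bill, here is a minimal sketch in Python. The prices are the ones quoted above; the workload (100 million input tokens and 20 million output tokens a month) and the exchange rate of 150 yen to the dollar are illustrative assumptions, not measured figures.

```python
# Per-million-token list prices quoted in the article (USD).
PRICES_USD_PER_MTOK = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}

# Assumed exchange rate, implied by "$1 trillion is about 150 trillion yen".
USD_TO_JPY = 150


def monthly_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the monthly API cost in USD for the given token volumes."""
    price = PRICES_USD_PER_MTOK[model]
    input_cost = (input_tokens / 1_000_000) * price["input"]
    output_cost = (output_tokens / 1_000_000) * price["output"]
    return input_cost + output_cost


# Hypothetical chatbot workload: 100M input tokens, 20M output tokens per month.
for model in PRICES_USD_PER_MTOK:
    usd = monthly_cost_usd(model, 100_000_000, 20_000_000)
    print(f"{model}: ${usd:,.2f} per month, about {usd * USD_TO_JPY:,.0f} yen")
```

Under these assumptions the bill lands at roughly 67,500 yen for GPT-4o and 90,000 yen for Claude 3.5 Sonnet, which is the same order of magnitude as the chatbot example above. Your own token volumes will differ, so treat the numbers as a template rather than a forecast.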
How will API fees, the “electricity bill” of AI, change as a result of upstream investments? Let’s break down these three movements and consider their implications for SMEs.
—
Wave 1: Korea’s $1 Trillion Memory Investment Will Lower “Component Costs”
What is Happening
The South Korean government, along with companies like Samsung and SK Hynix, has announced plans to invest a total of $1 trillion in enhancing memory chip manufacturing capabilities and building AI data centers by 2047.
Why is memory important? AI inference requires a large number of parameters to be read from and written to memory quickly. High Bandwidth Memory (HBM) in particular is an essential component of AI-oriented GPUs. Currently, SK Hynix holds about 50% of HBM supply and Samsung about 40%, so the two companies effectively control the market.
The problem is that demand significantly exceeds supply. NVIDIA’s H100/H200 and AMD’s MI300X are selling like hotcakes, keeping HBM prices high. This drives up GPU prices, increases data center operational costs, and ultimately gets passed on to API fees.
When Will SMEs Feel the Impact?
Semiconductor fabrication facilities (fabs) take at least 2 to 3 years from the start of construction to full operation. The impact of Korea’s investment on supply will likely be felt in the market as early as late 2027 to 2028.
However, there is something important to note here. Even if HBM supply increases, it does not necessarily mean that API fees will decrease correspondingly. This is because the demand for AI itself is growing explosively. If supply doubles but demand triples, prices will not fall.
A realistic outlook suggests that by around 2028, HBM supply will begin to catch up with demand, potentially leading to a 20-30% decrease in GPU prices compared to current levels. If this is reflected in API fees, we can expect a decrease of about 10-15% in inference costs.
For an SME currently paying 100,000 yen in API fees, this translates to a reduction of 10,000 to 15,000 yen per month, or 120,000 to 180,000 yen annually. While not dramatic, it will gradually make a difference.
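The arithmetic behind this estimate can be sketched as follows. The 100,000 yen fee and the percentage ranges come from the text above; the "pass-through" ratio is simply what those two ranges imply when divided, not an independently sourced figure.

```python
def monthly_and_annual_savings(monthly_fee_yen: int, cut_percent: int) -> tuple[int, int]:
    """Return (monthly, annual) savings in yen for a given percentage fee cut."""
    monthly = monthly_fee_yen * cut_percent // 100
    return monthly, monthly * 12


def implied_pass_through(gpu_cut_percent: int, api_cut_percent: int) -> float:
    """Share of a GPU price cut that shows up as an API fee cut."""
    return api_cut_percent / gpu_cut_percent


# (GPU price cut, API fee cut) pairs taken from the article's low and high cases.
for gpu_cut, api_cut in [(20, 10), (30, 15)]:
    monthly, annual = monthly_and_annual_savings(100_000, api_cut)
    share = implied_pass_through(gpu_cut, api_cut)
    print(
        f"GPU -{gpu_cut}% -> API -{api_cut}% (pass-through {share:.0%}): "
        f"{monthly:,} yen/month, {annual:,} yen/year"
    )
```

In both cases the implied pass-through is 50%: the estimate assumes that about half of any GPU price cut reaches the API bill. If providers pass on less, the savings shrink proportionally.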
—
Wave 2: Amazon’s FDE Organization Will Lower “Usage Costs”
What is Happening
Amazon’s newly established “FDE (Fast Deployment Engineering)” team sends engineers to client companies to implement and deploy AI agents within weeks. With an investment of 100 billion yen, Amazon aims to drastically lower the barriers for AI adoption in businesses.
At first glance, this seems like a service aimed at large corporations. However, the essence is different.
What Amazon is trying to achieve is a price disruption of the “cost of AI adoption” itself. Until now, SMEs looking to integrate AI into their operations had to pay external system integrators or consultants millions to tens of millions of yen. Alternatively, they could hire AI talent in-house, which is not realistic for many local SMEs.
The movement of FDE indicates that the standardization of AI adoption is progressing. If tasks like customer inquiries, inventory forecasting, and invoice processing can be packaged and implemented within weeks, the cost of adoption will change dramatically.
Impact on SMEs
For the time being, FDE itself is unlikely to reach SMEs directly. However, this movement will have two ripple effects on SMEs.
The first is a change in AWS pricing. Amazon is accelerating the development of its own chips, “Trainium” and “Inferentia,” to reduce its dependence on NVIDIA. The strategy is to secure customers through FDE while lowering costs with its own chips. As of 2025, AWS Bedrock’s inference costs have already dropped by approximately 40-60% compared with a year earlier, and this trend is expected to continue.
The second is a shift in the “market perception of AI adoption.” If large companies begin low-cost implementations through FDE within weeks, local system integrators will have no choice but to compete on the same level. The world where “AI adoption costs 5 million yen” will soon become obsolete.
The timeline suggests that between late 2025 and 2026, AWS Bedrock’s inference costs are likely to drop by an additional 20-30%. For SMEs, this will be a significant moment when not only the “unit price” of APIs but also the “total cost of implementation” decreases substantially.
—
Wave 3: The Rise of Etched Means “Competition” Will Break Prices
What is Happening
Etched is a startup developing ASICs (Application-Specific Integrated Circuits) specialized for the Transformer architecture. With a valuation of $5 billion, it already holds $1 billion in orders.
Why is this important? The current AI inference market is almost entirely dominated by NVIDIA. This monopoly keeps prices high. If specialized chip manufacturers like Etched emerge, the cost structure of inference processing itself will change.
According to Etched, its chip “Sohu” achieves up to 20 times the throughput in Transformer inference processing compared to NVIDIA’s H100. If even half of this claim is realized, inference costs will drop dramatically.
And it’s not just Etched. Startups like Groq, Cerebras, and SambaNova are also raising funds for inference-specialized chips. Including Google’s TPU, Amazon’s Trainium/Inferentia, and Microsoft’s Maia, the monopoly held by NVIDIA is beginning to crack.
Impact on SMEs
As competition intensifies, prices will fall. This is an economic principle.
What’s important is that SMEs do not need to buy chips directly. The competition among chips will translate into price competition among cloud providers, ultimately reaching SMEs as lower API fees.
If inference-specialized chips reach widespread adoption between 2027 and 2028, API fees could drop by 50-70% compared to current levels.
This is not just a prediction; it is an extension of trends that are already under way. OpenAI’s API prices have dropped by about 90% over the past two years. Performance on par with GPT-3.5 is now available at less than one-tenth of what GPT-4 cost two years ago. If intensifying chip competition adds to this trend, the decline will accelerate further.
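To put the 90% figure in yearly terms, the sketch below converts a cumulative price drop into an equivalent constant annual rate. It assumes prices fall by the same fraction every year, which is a simplification; real price cuts arrive in irregular steps.

```python
def annualized_decline(total_drop: float, years: float) -> float:
    """Convert a cumulative price drop into a constant yearly decline rate.

    total_drop is a fraction (0.90 means prices fell by 90%).
    """
    remaining = 1.0 - total_drop  # fraction of the original price still paid
    return 1.0 - remaining ** (1.0 / years)


rate = annualized_decline(0.90, 2)
print(f"A 90% drop over 2 years equals about {rate:.1%} per year")
```

A 90% drop over two years works out to roughly 68% per year. That puts the 50-70% reduction discussed above in context: it would be a continuation of the recent pace rather than a break from it.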
Imagine a world where monthly API costs of 100,000 yen drop to 30,000 to 50,000 yen. This will be the tipping point that transforms AI usage for SMEs from “let’s try it out” to “it’s a standard practice.”
—
What Do We See When We Layer the Three Waves?
| Period | Main Factors | Estimated Change in API Fees (vs. Current) |
|---|---|---|
| Late 2025 – 2026 | Adoption of Amazon’s own chips, competition among clouds | 20-30% lower |
| 2027 – 2028 | Increase in Korean memory supply, full-scale adoption of inference-specialized chips | 50-70% lower |
| After 2028 | Normalization of competition, commoditization | 70-80% lower |
By around 2028, it is highly likely that AI will be usable at less than one-third of current API fees.
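As a rough illustration, the scenarios in the table can be applied to a concrete bill. The sketch below uses a hypothetical current fee of 100,000 yen a month; the percentage ranges are the article's estimates, not guaranteed outcomes.

```python
# (period, smallest expected cut in %, largest expected cut in %) from the table.
SCENARIOS = [
    ("Late 2025 - 2026", 20, 30),
    ("2027 - 2028", 50, 70),
    ("After 2028", 70, 80),
]


def projected_fee_range(current_fee_yen: int, min_cut: int, max_cut: int) -> tuple[int, int]:
    """Return (lowest, highest) projected monthly fee in yen."""
    lowest = current_fee_yen * (100 - max_cut) // 100
    highest = current_fee_yen * (100 - min_cut) // 100
    return lowest, highest


CURRENT_FEE_YEN = 100_000  # hypothetical monthly bill at today's prices

for period, min_cut, max_cut in SCENARIOS:
    lowest, highest = projected_fee_range(CURRENT_FEE_YEN, min_cut, max_cut)
    print(f"{period}: {lowest:,} to {highest:,} yen per month")
```

For a 100,000 yen bill this gives 70,000 to 80,000 yen in the first period, 30,000 to 50,000 yen in the second, and 20,000 to 30,000 yen after 2028.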
—
So, What Should SMEs Do Now?
“If prices are going to drop, why not just wait?” This is a misconception, for two reasons.
First, AI is not something to start using “after it becomes cheaper”; it is something that becomes cheaper “while you are using it.” If you do not experiment with what AI can do now and build it into your workflows, you will not be able to use it effectively when prices fall. Cheaper tools are worthless to a company that does not know how to use them.
Second, competitors won’t wait. If you think there are no competitors just because you are a local SME, you are mistaken. What if a similar business in the neighboring prefecture automates customer support with an API cost of 50,000 yen a month and raises its after-hours inquiry response rate to 80%? In a region facing labor shortages, that difference could be fatal.
Here are three things you should do now:
- Start small. Choose one task that can operate with an API cost of 10,000 to 30,000 yen and try to automate it. It could be customer inquiries, meeting minutes, or drafting estimates—anything will do.
- Monitor cloud price fluctuations. AWS Bedrock, Google Vertex AI, Azure OpenAI—each company’s pricing revisions occur quarterly. The same processing could be half the price six months later.
- Create a list of what you want AI to do. Even if something is too expensive to do now, it may become realistic in two years. Having that list will change how quickly you can act when prices drop.
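The wish list in the third item can be kept as a simple data structure and rechecked whenever prices change. The following is a minimal sketch; the task names, cost estimates, and the 30,000 yen budget are hypothetical placeholders to replace with your own figures.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    monthly_cost_yen: int  # estimated API cost at today's prices


def affordable_tasks(tasks: list[Task], budget_yen: int, price_cut_percent: int) -> list[str]:
    """Return names of tasks whose projected cost fits within the budget."""
    result = []
    for task in tasks:
        projected = task.monthly_cost_yen * (100 - price_cut_percent) // 100
        if projected <= budget_yen:
            result.append(task.name)
    return result


# Hypothetical wish list with rough cost estimates.
WISH_LIST = [
    Task("meeting minutes", 20_000),
    Task("customer inquiries", 80_000),
    Task("inventory forecasting", 150_000),
]

for cut in (0, 70, 80):
    names = affordable_tasks(WISH_LIST, budget_yen=30_000, price_cut_percent=cut)
    print(f"Price cut {cut}%: {names}")
```

With a 30,000 yen budget, only meeting minutes fits at today's prices. A 70% cut brings customer inquiries within reach, and an 80% cut makes all three tasks affordable.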
—
Summary: Upstream Investments in the Trillions Will Reach Downstream Monthly Costs in the Thousands
Korea’s $1 trillion, Amazon’s 100 billion yen, and Etched’s $5 billion. These are not distant tales from another world. As semiconductor supply increases, chip competition intensifies, and cloud prices fall, the API fees your company pays will also decrease. This causal relationship is certain.
The only questions are “when” and “by how much.” And whether you can benefit from it will depend on whether you are preparing to act when that time comes.
Trillions are moving upstream. What SMEs downstream should do is prepare to ride the wave the moment it arrives. Starting with a 10,000 yen experiment is a good way to begin.