The Shocking Drop in Music Production Costs: From 300,000 Yen to 5,000 Yen—How Spotify and Stability AI are Disrupting the Notion that ‘Creativity is Expensive’

By Kai


Conclusion First

The production cost of promotional content is about to drop by 98%.

When local small and medium-sized enterprises (SMEs) outsource music for in-store BGM or social media videos, the going rate is typically between 200,000 and 300,000 yen per song. Hiring a production company to create a corporate podcast with narration costs between 100,000 and 150,000 yen per episode. Commissioning a short YouTube video from a video production company costs anywhere from 100,000 to 500,000 yen per video.

With the changes now underway, however, these costs are set to fall to between 5,000 and 10,000 yen.
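As a quick check on the 98% figure, here is a minimal Python sketch that uses only the song prices quoted above. The function name is my own, and pairing the ends of the two ranges this way is an assumption made for illustration.

```python
def reduction_pct(before_yen: int, after_yen: int) -> float:
    """Percentage saved when a cost falls from before_yen to after_yen."""
    return (before_yen - after_yen) * 100 / before_yen


# Best case from the quoted ranges: a 300,000 yen song replaced by 5,000 yen.
best_case = reduction_pct(300_000, 5_000)

# Most conservative pairing of the same ranges: 200,000 yen replaced by 10,000 yen.
conservative = reduction_pct(200_000, 10_000)

print(f"{best_case:.1f}% and {conservative:.1f}%")
```

This prints 98.3% and 95.0%, so the headline 98% corresponds to the best-case pairing, and even the conservative pairing is a 95% drop.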

“AI can now create content.” Hearing this, you might dismiss it as just another buzzword. But this time, it’s different. Specific services have emerged that make clear who can create what, and at what cost. Moreover, the biggest beneficiaries of this shift are not large corporations but SMEs.

What Happened: Three Key News Items

1. Spotify and UMG’s AI Remix Agreement

Spotify has signed a licensing agreement with Universal Music Group (UMG), allowing Premium members to generate covers and remixes of existing songs using AI.

Previously, if you wanted to arrange an existing song, you had to negotiate master rights, hire an arranger, and pay for recording, which added up to at least several hundred thousand yen. Now, within the framework of the license, AI handles the processing. The key point is that the costs and hassles of rights management are structurally eliminated.

2. Stability AI’s Six-Minute Song Generation Model

Stability AI has unveiled a new audio generation model capable of producing original songs up to six minutes long. By specifying a text prompt like “bright acoustic, suitable for cafes, BPM 100,” a corresponding song is generated.

Six minutes is long enough to clear the practical threshold: it covers most commercial uses, such as in-store BGM, event music, and backing tracks for video. Furthermore, generation via the API costs on the order of a few dozen to a few hundred yen. What used to cost 300,000 yen to outsource can now be done for a few hundred yen. This is not a marginal price difference; it is a structural change.

3. YouTube Shorts Remix Feature

YouTube’s newly introduced Shorts remix feature allows users to remix video materials from other creators using AI to generate new short videos. The processes of sourcing materials, editing, and exporting are all completed within the platform.

Spotify’s AI podcast generation feature, “Spotify Studio,” is also noteworthy. Based on users’ listening histories and information from connected apps, an AI agent automatically generates daily briefings and podcasts.

The Essence is Not “Cost Reduction” but Changing the “Types of Things You Can Do”

Here, I want to pause and reflect.

“The cost has dropped from 300,000 yen to 5,000 yen—what a bargain!”—this is not the point.

When costs drop by 98%, you start doing things you never did before. This is the essence.

A local restaurant changes its original BGM seasonally. A construction company shares dedicated short videos with music for each project. An accounting firm distributes weekly tax topics via podcast.

What was previously considered out of reach for a company of your size suddenly becomes an option when the cost barrier disappears. The quality difference between brand content that large companies used to create for 5 million yen and content that SMEs can now create with AI for under 10,000 yen is no longer significant for the audience.

In other words, we are witnessing the first moment at which the quality of content no longer differentiates large corporations from SMEs.

Five Steps for SMEs to Start Acting Today

Enough abstraction. Here is specifically what to do.

Step 1: Narrow Down the Purpose to One

“I want to increase my social media followers,” “I want to boost foot traffic,” “I want to increase job applications”: first, decide on just one objective. Trying to do everything at once leaves every tool half-used.

Step 2: Create Your Own Music

Use an AI music generation tool, such as Stability AI’s audio model, to create BGM that fits your company’s tone. Simply put “industry + vibe + purpose” into the prompt. Generation takes just a few minutes and costs a few hundred yen. First, create three variations and let your staff choose.

In the era of outsourcing: 200,000 to 300,000 yen per song, with a delivery time of 2 to 3 weeks.
In the case of AI self-production: a few hundred yen per song, taking about 10 minutes.
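The “industry + vibe + purpose” formula from Step 2 can be turned into a small prompt template. The sketch below is only string assembly: `build_prompt` and its parameters are hypothetical names of mine, and it calls no real API. Paste the resulting text into whichever generation tool you use.

```python
from typing import Optional


def build_prompt(industry: str, vibe: str, purpose: str,
                 bpm: Optional[int] = None) -> str:
    """Assemble an 'industry + vibe + purpose' text prompt (hypothetical helper)."""
    parts = [f"{vibe} music for a {industry}", f"suitable for {purpose}"]
    if bpm is not None:
        parts.append(f"BPM {bpm}")
    return ", ".join(parts)


# Three variations for staff to choose from: same shop and purpose, different vibe.
variations = [
    build_prompt("cafe", vibe, "in-store background music", bpm=100)
    for vibe in ("bright acoustic", "mellow jazz", "soft lo-fi")
]

for prompt in variations:
    print(prompt)
```

Varying only one element at a time, here the vibe, makes it easier for staff to say what they actually prefer.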

Step 3: Mass Produce Short Videos

Overlay the music created in Step 2 onto footage shot with a smartphone. Post these on YouTube Shorts or Instagram Reels. Elaborate editing is unnecessary. Aim for three posts a week.

In the era of hiring production companies: 100,000 to 500,000 yen per video, limited to one per month.
With AI utilization: nearly zero additional cost per video, making it possible to produce more than three per week.

One video per month versus twelve per month. This difference in output will create a decisive gap in six months.

Step 4: Share Your “Expertise” via Podcast

Utilize AI agents like Spotify Studio or other AI voice synthesis tools to turn your company’s expertise into a podcast. Draft the script using ChatGPT, generate the audio with AI, and let AI tools handle the editing automatically.

In the era of hiring production companies: 100,000 to 150,000 yen per episode, limited to one per month.
With AI utilization: costs range from 1,000 to 3,000 yen per episode, making it possible to produce one per week.

The era is coming where local experts (accountants, labor consultants, construction companies, farmers) will have their own podcasts. This is something large companies find hard to replicate because the “real voices from the field” are what give content its value.

Step 5: Distribute, Analyze the Numbers, and Iterate

Distribute the content created on Spotify, YouTube, and social media. The key is to “think after you publish.” Look at the numbers—views, engagement rates, conversions to foot traffic or inquiries—and adjust the content for the following week based on these metrics.
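The weekly review can be as simple as a small table of numbers. The following Python sketch ranks a week of posts by inquiries first and engagement rate second. The post titles, the figures, and that choice of ranking are all made-up assumptions for illustration; use whichever metric matches the single purpose you picked in Step 1.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    views: int
    engagements: int  # likes + comments + shares
    inquiries: int    # inquiries or visits attributed to the post

    @property
    def engagement_rate(self) -> float:
        # Guard against division by zero for posts with no views yet.
        return self.engagements / self.views if self.views else 0.0


def rank_posts(posts):
    """Sort posts by inquiries, then by engagement rate, best first."""
    return sorted(posts, key=lambda p: (p.inquiries, p.engagement_rate),
                  reverse=True)


# Hypothetical numbers for one week of posts.
week = [
    Post("Kitchen tour", views=1200, engagements=60, inquiries=1),
    Post("Seasonal menu", views=800, engagements=72, inquiries=3),
    Post("Staff intro", views=1500, engagements=45, inquiries=0),
]

best = rank_posts(week)[0]
print(f"Make more like: {best.title} ({best.engagement_rate:.1%} engagement)")
```

With these sample numbers the post with the fewest views ranks first, which is the point of looking past raw view counts.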

The cost of maintaining this cycle will be around 10,000 to 20,000 yen per month. This means that the annual production cost for promotional content will drop from the outsourcing era’s 2 to 5 million yen to just 150,000 to 250,000 yen.
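To see where the annual figures come from, here is a minimal Python sketch that multiplies the monthly range out over twelve months and compares it with the outsourcing-era range. A straight multiplication gives 120,000 to 240,000 yen, slightly below the rounded annual range quoted above. The variable names are my own.

```python
MONTHS_PER_YEAR = 12

# Ranges quoted in the article, in yen (low, high).
monthly_ai = (10_000, 20_000)
annual_outsourced = (2_000_000, 5_000_000)

# Straight twelve-month multiplication of the monthly AI cost.
annual_ai = tuple(m * MONTHS_PER_YEAR for m in monthly_ai)

# Smallest saving: cheapest outsourcing year against the priciest AI year.
smallest_saving = annual_outsourced[0] - annual_ai[1]
# Largest saving: priciest outsourcing year against the cheapest AI year.
largest_saving = annual_outsourced[1] - annual_ai[0]

print(f"AI per year: {annual_ai[0]:,} to {annual_ai[1]:,} yen")
print(f"Saving per year: {smallest_saving:,} to {largest_saving:,} yen")
```

Even in the least favorable pairing, the saving is about 1.76 million yen a year.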

Two Risks to Be Aware Of

The Boundary of Copyright and Licensing

Spotify’s AI remix operates within the licensing agreement with UMG. This means that commercial use outside of Spotify is a separate matter. When using AI-generated music for in-store BGM or commercials, it is essential to check the terms of use. The commercial licensing of songs generated by Stability AI also varies depending on the plan. “Just because it’s AI-generated doesn’t mean it’s copyright-free.” Underestimating this can lead to serious consequences.

The Trap of “AI-Likeness”

As AI-generated content increases, audiences may unconsciously skip over things that feel “AI-like.” The key to differentiation is to use AI to create a foundation and then layer on your unique “real voices from the field,” “human faces,” and “local context.” AI is a tool, not the star of the show.

What Becomes Clear Structurally

Finally, let’s take a step back and think.

What these developments are collapsing is the assumption that “creative production requires paying experts a lot of money.” As the production costs for music, videos, and podcasts approach zero, the axis of competition will shift completely from “whether you can create it” to “what you can convey.”

This is great news for SMEs. Large companies may have the budget for production costs, but they lack the “real on-the-ground experience.” Local SMEs possess the “30 years of experience in this region,” “closeness to customers,” and “craftsmanship”—none of which can be generated by AI.

In other words:

As production costs approach zero, those who possess substance will win.

SMEs have that substance. All that’s left is to put it out there. The “cost of putting it out” is now on the verge of disappearing.

I encourage you to create one piece of music using AI. It will take just ten minutes. You will surely realize, “Ah, this is really going to change things.”
