Production Failures Caused by AI-Generated Code: Three Principles to Avoid Breaking Internal Systems with ‘Vibe Coding’

By Kai

Bluesky Crashed: The Cause Was ‘AI-Generated Code’

The decentralized social network Bluesky recently experienced a series of service outages. The term that quickly spread among users as a point of criticism was “vibe coding.”

Vibe coding refers to a coding technique where developers give AI vague instructions like “make it feel like this” and use the generated code without fully understanding it. It’s a tongue-in-cheek term that implies writing code based on a “vibe.”

There is debate over whether vibe coding was truly the cause of the Bluesky outages. However, the issue highlighted by this incident is clear: When AI-generated code is deployed in a production environment without human verification, systems can break.

This is not just a concern for large corporations; in fact, small and medium-sized enterprises (SMEs) are at greater risk.

Why SMEs Are the Most Vulnerable

Currently, there is a rapid trend among SMEs toward in-house system development using AI. The reasons are clear:

  • System development that would cost several million yen when outsourced can be done for tens of thousands of yen using AI coding tools.
  • Tools like GitHub Copilot, Cursor, and Claude Code provide “AI engineers” for a monthly fee of just a few thousand yen.
  • For SMEs that cannot hire engineers, AI becomes the only development resource.

This trend itself is sound. The problem is that AI is being allowed to write code without any quality-control mechanisms in place.

Large corporations have a culture of code reviews, automated testing systems, staging environments, and pre-deployment checklists.

In a company with 20 employees, if the president is running a system “created by ChatGPT” in production, none of these mechanisms exist. There is no one to verify whether the AI-generated code is correct. While they think, “It’s working, so it’s fine,” data can disappear or security vulnerabilities can emerge.

What Are the Specific ‘Quality Issues’ with AI Coding?

Let’s outline the problems with code generated by AI:

1. ‘Works but is Fragile’ Code
AI is good at generating “working code,” but not at producing “robust code.” It often has weak error handling, fails to address edge cases (unexpected inputs), and can cause memory leaks—resulting in a proliferation of code that “usually works but breaks under specific conditions.”
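As a hypothetical illustration (the function and file names here are invented for this example), compare a typical "works on the happy path" helper with a version that actually handles the edge cases:

```python
import json
from pathlib import Path

# Fragile: works when the file exists and is valid JSON, but crashes
# on a missing file or malformed content -- typical unreviewed AI output.
def load_config_fragile(path):
    return json.loads(Path(path).read_text())

# Robust: the same task with the edge cases handled explicitly.
def load_config_robust(path, default=None):
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return default if default is not None else {}
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Corrupt config: fall back rather than crash the whole service.
        return default if default is not None else {}
```

The two functions behave identically in a demo, which is exactly why the fragile version survives review-free deployment until the first missing or corrupt file takes the system down.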

2. Security Vulnerabilities
As pointed out on Hacker News, there have been reports of AI-generated code containing vulnerabilities like SQL injection and cross-site scripting. AI tends to prioritize “functionality” over security, often neglecting it.
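To make the SQL injection risk concrete, here is a minimal sketch using Python's built-in `sqlite3` module (the table and data are invented for the example). The unsafe pattern is one AI tools are known to emit; the safe version differs by a single line:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Vulnerable pattern: user input concatenated straight into the SQL string.
def find_user_unsafe(name):
    return conn.execute(
        f"SELECT role FROM users WHERE name = '{name}'"
    ).fetchall()

# Safe pattern: a parameterized query; the driver escapes the input.
def find_user_safe(name):
    return conn.execute(
        "SELECT role FROM users WHERE name = ?", (name,)
    ).fetchall()

# A classic injection payload that makes the WHERE clause always true,
# dumping every row through the unsafe version:
payload = "' OR '1'='1"
```

Calling `find_user_unsafe(payload)` returns every user's role, while `find_user_safe(payload)` correctly returns nothing. Both "work" for normal input, which is why this class of bug slips through vibe-coded projects.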

3. Lack of Understanding of ‘Why It Was Written That Way’
This is the most troublesome issue. When a human engineer writes code, you can ask why a particular implementation was chosen. AI-generated code lacks this rationale, so when problems arise, identifying the cause takes several times longer than usual.

4. Inaccuracy of Comments
Multiple studies have reported that comments generated by AI do not always match the actual behavior of the code. If you trust the comments and modify the code, it can lead to secondary issues where other parts break.

Three Principles to Avoid Breaking Internal Systems

So, should SMEs avoid using AI coding altogether? Not at all. It’s a matter of how it’s used. By adhering to the following three principles, SMEs can enjoy the productivity of AI while preventing quality degradation.

Principle 1: Treat AI-Generated Code as a ‘Draft’

Never deploy AI-generated code directly into production. This is the fundamental principle.

As a concrete operational rule: AI-generated pull requests (proposed code changes) must always be reviewed by a human before merging.

Even if you say, “We don’t have any engineers,” minimal checks are still possible:

  • Ask AI, “Are there any security issues with this code?” (Have another AI review it)
  • Request AI to “write test code for this code” and run the tests
  • Before going live, run it in a staging environment (test environment) for at least one week

Having AI write the code and then having another AI review it can prevent many critical bugs.
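As a sketch of the "have AI write test code, then run it" step, suppose the AI generated the following pricing helper (both the function and the tests are hypothetical examples). You would ask the AI for tests like these and run them yourself, with `pytest` or even as a plain script, before merging:

```python
# Hypothetical AI-generated helper under review.
def apply_discount(price: float, rate: float) -> float:
    """Return price after applying a discount rate between 0 and 1."""
    if not 0 <= rate <= 1:
        raise ValueError("rate must be between 0 and 1")
    if price < 0:
        raise ValueError("price must be non-negative")
    return round(price * (1 - rate), 2)

# Tests requested from the AI, then executed by a human before merge.
def test_apply_discount():
    assert apply_discount(1000, 0.1) == 900.0
    assert apply_discount(0, 0.5) == 0.0
    # Edge cases: invalid rates must be rejected, not silently accepted.
    for bad_rate in (-0.1, 1.5):
        try:
            apply_discount(1000, bad_rate)
        except ValueError:
            pass
        else:
            raise AssertionError("invalid rate was accepted")
```

The point is not the tests themselves but the ritual: a human runs them and reads the failures, which forces at least a minimal understanding of what the code is supposed to do.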

Principle 2: Create a Governance File

A “governance file” is a document that instructs AI coding tools to “write code according to our company’s rules.”

Recent research applying governance files to 50 repositories found that AI output accuracy reached 96.4%. This is a significant improvement compared to when no governance file was used.

The minimum content that SMEs should include in their governance file is as follows:

  • Specify the languages and frameworks to be used: “Use Python’s FastAPI”; “Use Next.js for the frontend”
  • Security rules: “Always validate user input”; “Do not store passwords in plain text”
  • Coding standards: “Function names should be written in snake_case”; “Keep functions under 50 lines”
  • Prohibitions: “Do not hard-code external API keys in the code”; “Do not write code that directly connects to the production database”

Simply placing this file in the project’s root directory can dramatically improve the quality of AI output. It takes 1-2 hours to create and has a lasting effect.

Principle 3: Establish a Feedback Loop

If issues arise from AI-generated code, document them and reflect them in the governance file. This should be done continuously.

Specifically, implement the following cycle once a month:

1. List bugs and outages caused by AI-generated code over the past month
2. Classify the causes of each bug (security, error handling, performance, etc.)
3. Add rules for preventing recurrence to the governance file
4. Measure the quality of AI output for the next month

If this cycle is repeated for three months, the quality of AI output will improve dramatically. AI, by its nature, follows rules when they are given; the real problem is failing to provide those rules.
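Steps 2 and 3 of the cycle can be done with almost no tooling. Assuming you keep even a simple incident log (the log format and entries below are hypothetical), a few lines of Python will tally which category keeps recurring, and therefore which governance rule to add next:

```python
from collections import Counter

# Hypothetical incident log for the month: (description, category).
incidents = [
    ("login form accepted empty password", "security"),
    ("CSV import crashed on empty file", "error handling"),
    ("report page timed out over 10k rows", "performance"),
    ("API key found committed in repo", "security"),
]

# Step 2 of the cycle: classify and count the causes.
counts = Counter(category for _, category in incidents)

# Step 3: the most frequent category points at the next rule
# to add to the governance file.
worst_category, n = counts.most_common(1)[0]
print(f"Add a rule targeting: {worst_category} ({n} incidents)")
```

A spreadsheet works just as well; what matters is that the classification feeds back into the governance file rather than staying in someone's head.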

‘Cheap to Build’ and ‘Safe to Use’ Are Two Different Things

AI coding has dramatically reduced the system development costs for SMEs. What used to cost 5 million yen to outsource can now be done for 500,000 yen. This is a fact.

However, what if a system built for 500,000 yen causes outages in production, resulting in the loss of customer data? The recovery costs, compensation to customers, and damage to trust—these losses will not be limited to 500,000 yen.

“Cheap to build” and “safe to use” are entirely different matters. The three principles introduced here are what bridge that gap.

We cannot stop AI from writing code, nor do we need to. However, I hope that starting today, a culture of treating AI output as a ‘draft’ rather than a ‘finished product’ takes root within your organization.

The Bluesky outage is not someone else’s problem. Implement these three principles before the same thing happens to your company’s internal systems. That is the only way to enjoy the benefits of AI while maintaining quality.
