How Much Will It Cost to Decide to ‘Pull Back’ AI? — Ford Brings Back Humans, Godot Rejects Code, and Review Summaries Erase Complaints
The Next Tough Decision After “We Implemented AI”
There are countless stories about implementing AI. However, discussions about “stopping AI” rarely surface.
The decision to stop is far more challenging. When introducing AI, there are noble justifications like “efficiency” and “cost reduction.” In contrast, withdrawing involves the painful admission of “failure.” As a result, companies often continue using AI, allowing the damage to escalate.
In recent weeks, three cases of pulling back from AI, or being forced to confront its limits, have surfaced at almost the same time: Ford, Godot, and TripAdvisor. They differ in industry and scale, but they share a common structure: “Humans can no longer verify the results produced by AI.”
For small and medium-sized enterprises (SMEs), this is not someone else’s problem. It is often more serious for them than for large corporations, because SMEs lack the resources to redo the work.
—
Ford: Writing Off $500 Million in AI Cameras and Bringing Back Humans for $70 Million a Year
Ford had introduced hundreds of AI cameras for design checks and quality inspections on its manufacturing lines, with an investment of about $500 million. However, what occurred on the ground was a chain of defects.
Parts that the AI cameras judged “problem-free” turned out to have flaws (false negatives). Conversely, the cameras flagged non-defective items as “defective” (false positives), halting the production line. Both scenarios are catastrophic: the former leads directly to quality incidents, while the latter destroys production efficiency.
Ford’s decision was clear: rehire the veteran engineers known as “graybeards.” The annual cost was about $70 million. In effect, Ford wrote off the $500 million investment and switched to $70 million a year in personnel costs.
What’s noteworthy here is the structure of the numbers. The $500 million for AI cameras covered the initial investment plus maintenance, tuning, and the costs associated with false positives, all of which ballooned over time. In contrast, the $70 million for veteran engineers is a direct investment in “the quality of judgment.” Tacit knowledge built on experience (“This sound is unusual,” “This gloss indicates uneven painting”) cannot currently be replicated by AI cameras.
To translate this to an SME: suppose you introduce an AI inspection camera for 3 million yen. If false positives keep stopping the line and handling them costs 200,000 yen a month in personnel time, that adds up to 2.4 million yen a year. During the second year, the cumulative handling cost exceeds the initial investment. The notion that “AI is cheaper” can flip the moment operational costs are factored in.
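The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function name `breakeven_month` is hypothetical, and it assumes the false-positive handling cost stays constant every month, which real operations rarely guarantee.

```python
def breakeven_month(initial_investment: int, monthly_overhead: int) -> int:
    """Return the first month in which cumulative overhead exceeds the initial investment.

    Assumption (not from the article): overhead is the same every month.
    """
    if monthly_overhead <= 0:
        raise ValueError("monthly_overhead must be positive")
    # After n months the cumulative overhead is n * monthly_overhead.
    # The smallest n with n * monthly_overhead > initial_investment is floor(ratio) + 1.
    return initial_investment // monthly_overhead + 1


initial_investment = 3_000_000   # yen, AI inspection camera
monthly_overhead = 200_000       # yen, personnel cost caused by false positives

annual_overhead = monthly_overhead * 12
print(annual_overhead)                                        # 2400000
print(breakeven_month(initial_investment, monthly_overhead))  # 16
```

With the article’s figures, the overhead equals the initial investment at month 15 and exceeds it in month 16, that is, partway through the second year.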
—
Godot: A Quality Declaration of “We Will Not Accept AI-Written Code”
The open-source game engine Godot has explicitly stated that it will not accept contributions (pull requests) of AI-generated code.
The reason is simple: Developers who write code with AI cannot explain its behavior. When asked during reviews, “Why did you include this process?” they cannot respond. If bugs arise, they cannot trace the cause. In other words, there is no “responsible party” for the code.
For open-source projects, this is a matter of survival. To ensure the quality of code contributed by developers worldwide, it is a minimum requirement that “the person who wrote it understands it.” AI-generated code may appear to work at first glance, but it can exhibit unexpected behavior in edge cases or contain security vulnerabilities. Only someone who understands the code can discover these issues.
The same structure is occurring in system development for SMEs. Developers generate code using ChatGPT or Copilot and deploy it to production simply because “it worked.” Initially, there are no issues. However, when bugs surface six months later, no one can read that code. If the company hires an external contractor to fix it, the contractor has to start from scratch, so the cost can rival that of new development.
The cost of writing code with AI has dropped dramatically. The cost of maintaining AI-written code has not. Overlooking this can turn a short-term saving in development costs into a long-term explosion in maintenance costs.
If SMEs use AI code generation, they need to establish at least one rule: “Human reviewers must add comments to AI-written code, documenting why this implementation was chosen.” This alone can substantially reduce maintenance costs. It is simple to do, but skipping it can mean a difference of hundreds of thousands of yen later.
—
TripAdvisor: AI Summarized ‘Food Poisoning Hotels’ as ‘Clean’
TripAdvisor’s newly implemented AI review summarization feature ended up concealing serious facts.
Specific examples revealed by investigations include:
- A hotel facing a lawsuit for food poisoning was summarized as “spotless.”
- A resort with reports of sexual harassment by staff was rated as having “friendly service.”
Why does this happen? AI summarization picks up on the “majority opinion.” If 95 out of 100 reviews say, “It was clean,” the 5 reviews mentioning “I got food poisoning” are dismissed as statistical noise. However, for consumers, those 5 reviews are the most critical information.
AI summarization averages toward the majority opinion. In business, however, it is always the “outliers” that are fatal.
This is directly applicable to customer review management for SMEs. More companies are using AI to summarize Google Maps reviews for internal sharing. However, AI may summarize as “generally favorable,” overlooking a serious complaint. What if that complaint involved health department issues? What if it spread on social media?
Using AI to summarize reviews is efficient. However, simply adding the rule that “negative reviews must be read in their original form by a human” can significantly reduce the risk. The cost is nearly zero. This is a matter of process design, not technology.
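One way to picture the “negative reviews go to a human” rule is a small filter that runs alongside any summarizer. The sketch below is a minimal illustration under stated assumptions: the `Review` structure, the two-star threshold, and the keyword list are all hypothetical and would need tuning for a real business.

```python
from dataclasses import dataclass


@dataclass
class Review:
    rating: int  # star rating, 1 (worst) to 5 (best)
    text: str


# Hypothetical keyword list; a real one should come from your own complaint history.
RISK_KEYWORDS = ("food poisoning", "harassment", "lawsuit", "injury")


def needs_human_review(review: Review, max_rating: int = 2) -> bool:
    """Flag a review if its rating is low or its text mentions a risk keyword."""
    text = review.text.lower()
    return review.rating <= max_rating or any(k in text for k in RISK_KEYWORDS)


def escalate(reviews: list[Review]) -> list[Review]:
    """Return the reviews a person must read in their original form."""
    return [r for r in reviews if needs_human_review(r)]


# The article's scenario: 95 positive reviews and 5 reports of food poisoning.
reviews = [Review(5, "It was clean")] * 95 + [Review(1, "I got food poisoning")] * 5
print(len(escalate(reviews)))  # 5
```

The point of the design is that the five outlier reviews are never filtered through a summary at all. The AI can still summarize everything, but the flagged originals reach a person regardless of what the summary says.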
—
Establish Numerical Criteria for Deciding to ‘Pull Back’ AI
What the three cases have in common is that the breakdown begins the moment humans can no longer verify AI outputs.
So where should SMEs make the decision to “pull back”? I propose checking the following three numbers monthly.
① Time Spent by Humans Correcting AI Outputs (per month)
Compare this with the time spent on that task before implementation. If correction time exceeds 50% of the original task time, it’s a red flag. Rather than reducing work, AI is creating a new task of “cleaning up after AI.”
② Losses Caused by AI’s Erroneous Outputs (per month)
Quantify the damages (complaint handling, rework, damage to reputation) that would occur if false outputs were released as is. If this exceeds the costs saved by implementing AI, immediate withdrawal should be considered.
③ Cost of the Alternative Without AI (per month)
Always keep track of what it would cost per month if humans handled the task. In Ford’s case, the annual maintenance cost for AI exceeded $100 million, while switching back to humans cost only $70 million. This comparison made the withdrawal decision clear-cut.
If you have these three numbers, the “right time to stop” can be determined numerically rather than by intuition.
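The three monthly checks can be written down as a small function so the review becomes mechanical. This is a sketch, not a prescribed tool: the class and field names are hypothetical, and the third rule (flag when the AI’s running cost exceeds the human alternative) is an assumption inferred from the Ford comparison, since the text only says to track that number.

```python
from dataclasses import dataclass


@dataclass
class MonthlyAIMetrics:
    correction_hours: float        # (1) hours humans spent correcting AI output
    baseline_hours: float          # hours the task took before AI was introduced
    loss_from_errors: float        # (2) estimated loss from erroneous AI output
    savings_from_ai: float         # cost saved by using AI this month
    ai_running_cost: float         # maintenance, tuning, licences
    human_alternative_cost: float  # (3) cost if humans did the task instead


def withdrawal_flags(m: MonthlyAIMetrics, correction_ratio_limit: float = 0.5) -> list[str]:
    """Return the names of the withdrawal criteria that were exceeded this month."""
    flags = []
    # (1) Correction time above 50% of the original task time is a red flag.
    if m.correction_hours > correction_ratio_limit * m.baseline_hours:
        flags.append("correction_time")
    # (2) Losses from erroneous output exceed what the AI saves.
    if m.loss_from_errors > m.savings_from_ai:
        flags.append("loss_amount")
    # (3) Assumed rule: running the AI costs more than the human alternative.
    if m.ai_running_cost > m.human_alternative_cost:
        flags.append("alternative_cost")
    return flags


# Illustrative figures only (yen and hours are made up for the example).
example = MonthlyAIMetrics(
    correction_hours=30, baseline_hours=40,
    loss_from_errors=100_000, savings_from_ai=300_000,
    ai_running_cost=250_000, human_alternative_cost=400_000,
)
print(withdrawal_flags(example))  # ['correction_time']
```

An empty list means no criterion was exceeded this month. How many flags should trigger a stop is a management decision that the code deliberately leaves open.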
—
The Advantage of SMEs Making Quick Withdrawal Decisions
Large corporations struggle with withdrawing from AI due to deep decision-making hierarchies and the involvement of “the department that promoted the implementation.” Even Ford took time to make its decision.
SMEs are different. When the CEO thinks, “This isn’t working,” they can stop it next week. No bureaucratic approvals or board meetings are needed. This speed of decision-making is the greatest weapon for SMEs in the AI era.
It is more challenging to “pull back” AI than to “implement” it. Therefore, it is crucial to set “withdrawal criteria” before implementation. Check the three numbers monthly. If they exceed the criteria, stop. It’s simple, yet very few SMEs are doing this.
Conversely, just doing this can create a significant gap between companies that are “at the mercy of AI” and those that “master AI.”
This is not a technical issue; it is a matter of management judgment.
—
So, What Should We Do?
- Set numerical “withdrawal criteria” when implementing AI. Track three numbers: correction time, loss amount, and the cost of the alternative.
- Limit AI outputs to a range that can be verified by humans. If it cannot be verified, do not implement it.
- Do not leave negative information (complaints, defects, bugs) to AI. Humans must review this information in its original form.
- Review the numbers monthly; if they indicate problems, stop next month. The speed of decision-making in SMEs is a weapon that large corporations do not have.
Using AI is a means, not an end. The courage to “pull back” is where a manager’s true capabilities are revealed, more so than the courage to “implement.”