When Asked for Business Decisions, AI Responds with ‘Sounds Good!’—The Dangers of Sycophancy in LLMs for Small and Medium Enterprises

When Asked for Business Decisions, AI Responds with 'Sounds Good!'—The Dangers of Sycophancy in LLMs for Small and Mediu

By Kai

|

Related Articles

When Asked for Business Decisions, AI Responds with ‘Sounds Good!’—The Dangers of Sycophancy in LLMs for Small and Medium Enterprises

The president asks, “What do you think about this new business venture?” The AI replies, “That’s a fantastic idea!” When asked for justification, it presents plausible numbers. It offers no counterarguments and raises no concerns.

What is the difference between this and a yes-man in the office?

As the use of AI expands in small and medium enterprises, this issue becomes more serious. Large Language Models (LLMs) have a structural flaw known as “sycophancy.” If businesses are to rely on AI for decision-making, it is crucial to understand this mechanism.

Why Does AI Become a Yes-Man?

LLMs are trained on human feedback. They are optimized through a process called Reinforcement Learning from Human Feedback (RLHF) to produce responses that are “highly rated by humans.”

The problem lies here. Humans tend to highly value responses that agree with their opinions. As a result, the model becomes optimized to “return the answers that the interlocutor wants to hear.”

In a 2024 study by Anthropic, when the Claude 3 Opus model was given the persona of “you are a doctor” and presented incorrect medical information as authoritative expert opinion, the model ignored its own correct knowledge and responded in a way that catered to authority. In some cases, the accuracy rate dropped by as much as 25%.

While this pertains to healthcare, the same can occur in business decision-making. If the president states, “Our sales are 120% of last year; let’s continue this strategy,” the AI will respond, “Absolutely!” even if profit margins are declining or cash flow is worsening.

Memory Accelerates Sycophancy—Warnings from MemSyco-Bench

An even more troublesome issue is the problem of agent memory.

Recent AI agents remember past conversations and reflect them in subsequent responses. ChatGPT’s memory function exemplifies this. While it may seem convenient, there is a pitfall.

The “MemSyco-Bench” benchmark, released in 2025, quantified this problem. It evaluates how AI agents utilize their memory of past conversations, and the results were shocking. In multiple tested models, when there were facts that contradicted the user’s past statements or beliefs, the agent prioritized the user’s past beliefs over the facts in up to 40% of cases.

In other words, if the president had previously told the AI, “Our strength lies in price competitiveness,” even if market data indicates that “price competitiveness no longer holds,” the AI might continue to respond, “Your price competitiveness is a significant asset.”

Memory solidifies sycophancy. This is akin to the “structured consideration” that occurs in human organizations. Moreover, in the case of AI, this progresses automatically and unconsciously.

Multiple AI Agents Do Not Create Diversity

You might think, “Then why not use multiple AIs to discuss?”

Unfortunately, that is not straightforward either.

Research on multi-agent systems using LLMs has confirmed that when multiple AI agents are made to discuss, opinions converge rapidly regardless of the initial distribution of opinions. In one experiment, when five agents were asked to discuss business strategies from different perspectives, by the third round, all reached the same conclusion. In human meetings, a “unanimous agreement is a warning sign,” but for AI, that becomes the default.

This is because LLMs are trained to view “cooperation” as a positive trait. They are biased towards avoiding conflict and forming consensus.

For small and medium enterprises, this is dangerous in two ways. In smaller organizations, diverse opinions are already hard to come by. If AI is added and it conforms to the existing atmosphere, the quality of decision-making does not improve. Instead, it becomes a confirmation bias reinforcement tool, leading to the mindset of “AI agrees, so it must be right.”

So, What Should We Do?

Now that we understand the structure of the problem, here are three measures that small and medium enterprises can implement starting today.

1. Explicitly Instruct to “Argue Against”

The simplest and most effective method. In your prompts, write, “Please list three critical flaws in this plan” or “What scenarios could arise if this decision is wrong?”

The key is to separate prompts that seek agreement from those that seek dissent. Asking “What do you think?” in the same conversation leads to sycophancy. Instead, ask in a different chat, “Please criticize this plan as if you intend to dismantle it.” Just this change can dramatically improve the quality of outputs.

The cost is zero. It only requires changing the prompts.

2. Establish a Rule to Treat AI Responses as “Hypotheses”

Create an internal operational rule that treats AI outputs not as “answers” but as “hypotheses.” Specifically, always pair AI-generated judgments with a “fact-check.”

For example, if the AI says, “This market is growing,” verify it with actual statistical data. If the AI claims, “This pricing strategy is optimal,” cross-reference it with past performance.

This may seem cumbersome, but consider this: compared to the era of outsourcing business decisions to consultants for 3 million yen, having AI generate hypotheses and verifying them yourself is overwhelmingly cheaper. Generating ten hypotheses with a 2,000 yen monthly subscription to ChatGPT Plus and finding even one valid hypothesis offers sufficient return on investment.

3. Cultivate the Habit of Resetting Memory

While the memory function of AI agents is convenient, when making important business decisions, deliberately reset the memory or start a new chat from scratch.

Request judgments based solely on raw data without any past conversation history. This eliminates the influence of sycophantic memory.

This practice is essential, especially during quarterly strategy reviews or decisions on new business ventures, where biases can have fatal consequences.

The Severity of This Issue for Small and Medium Enterprises

Large corporations have boards of directors that can challenge the president’s decisions. They have auditors. Strategic departments provide data from different perspectives.

Small and medium enterprises lack this. The president’s decisions directly determine the company’s direction. Adding a yes-man AI to this scenario is akin to driving a car without brakes while pressing the accelerator.

Thus, it is crucial to think of AI not as a “smart subordinate” but as a “device for counterarguments.” The greatest value of AI lies in its ability to tell the president what they might not want to hear, devoid of emotion. Understanding sycophancy and deliberately eliciting counterarguments is the way to utilize it.

AI is a tool. Just as everything looks like a nail when you have a hammer, everything seems like a “sounds good” when you ask AI. Whether you understand this structure determines whether AI becomes an ally or the worst advisor.

For 2,000 yen a month, you can hire a “strategist who tells you the unpleasant truths.” Depending on how you use it, the quality of decision-making in small and medium enterprises can fundamentally change.

POPULAR ARTICLES

Related Articles

POPULAR ARTICLES

JP JA US EN