Updated: September 26, 2025
LLM design answers the question. Many LLM behaviors have straightforward explanations rooted in how the models are built: the sycophantic, or people-pleasing, tendencies of these models arise from the use of reinforcement learning from human feedback (RLHF) during the post-training stage of the LLM pipeline.
Danger of maximizing user experience. The danger of constant exposure to this affirming behavior is that it can create an illusion about our own intelligence. Without deliberate self-reflection, we risk remaining trapped in that illusion.
Two weeks ago, I began the Agentic AI MOOC offered by Berkeley RDI. The course is structured as a series of lectures by invited speakers from companies currently building AI models and products. In this piece, I reflect on LLM data sources and post-training specifics, and on how they shape our experience when interacting with popular public LLMs.
Figure 1. Data Sources for LLMs
Large language models come in two broad flavors: public and private. Public models are trained primarily on data that is already out in the open: books, articles, code, and public conversations that anyone can access, such as Reddit threads. Their strength lies in breadth, but that breadth is also their ceiling: they can only recycle what is already known. Well-known examples of such models are OpenAI’s GPT-4o, Anthropic’s Claude, and Meta’s Llama.
By contrast, private models, or public models enhanced with retrieval-augmented generation (RAG) or proprietary datasets, tap into information that is not widely available. These models carry both the promise and the risk of offering something beyond average literacy, since their edge depends on access to knowledge locked away from the public domain. Such models are less familiar to the general public because they are used in narrow, expertise-oriented fields: BloombergGPT, for example, draws on financial filings, while Med-PaLM is fine-tuned on specialized medical data.
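To make the RAG idea concrete, below is a minimal sketch in Python (my own illustration, not any particular vendor's pipeline): a toy retriever ranks private documents by word overlap with the question and prepends the best matches to the prompt, so the base model can answer with information it was never trained on. Real systems swap the word-overlap scoring for embedding similarity over a vector store, but the shape of the trick is the same: the edge comes from the documents you supply, not from the model's weights.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query (a stand-in for vector search)."""
    q_words = set(query.lower().split())
    return sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )[:k]


def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Compose a prompt that grounds the model in the retrieved private context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Illustrative "private" documents the public model has never seen.
private_docs = [
    "Q3 filing: revenue grew 12% year over year.",
    "Internal memo: the credit desk flagged elevated default risk.",
    "Cafeteria menu: the soup of the day is tomato.",
]
print(build_rag_prompt("How did revenue change in Q3?", private_docs))
```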
When we use public models, what we gain is not brilliance but baseline literacy. Because their training blends together the world’s open sources, they are extraordinarily good at condensing the “average take” on almost any subject. This makes them useful for quickly orienting oneself in a new topic, but it also means they rarely deliver insight that goes beyond what a diligent search could uncover. In effect, they democratize competence, but they do not differentiate — it’s knowledge at the mean, not knowledge at the edge. Read my more detailed take on how to use public models in a previous blog post.
Now that we have revisited the impact of data sources, let's move on to the post-training specifics and the 'yes-man' behavior that is so familiar to us all.
Figure 2. General Public LLM Training Pipeline
One of the dominant approaches to post-training today is reinforcement learning (RL). In particular, reinforcement learning from human feedback (RLHF) has become the standard, as used in OpenAI’s ChatGPT. The premise is simple: human raters judge or compare the model’s answers, a reward model is fitted to those judgments, and the language model is then optimized to score highly against that reward. But human feedback is subjective, and subjectivity encourages models to optimize for leniency. Just as people are often taught to deliver criticism in a “sandwich” format (compliment, suggestion, compliment), models trained with RLHF adopt similar patterns. The result is people-pleasing by design. It is no coincidence that one of the most common complaints about ChatGPT is its tendency to agree, hedge, or over-accommodate. In psychology, this kind of excessive affirming behavior is called sycophancy.
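For intuition, here is a toy Python sketch of the preference step behind RLHF, using the standard pairwise (Bradley-Terry) loss; the reward numbers are made up for illustration. The reward model is trained so that the answer the rater preferred scores higher than the one they rejected, and the chat model is later optimized against that reward. If raters consistently favor agreeable, flattering answers, that taste gets baked into the signal the model chases.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise reward-model loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Suppose the rater preferred a flattering answer over a blunt correction.
# If the reward model already scores the flattering answer higher, the loss is small...
print(round(preference_loss(reward_chosen=2.0, reward_rejected=0.5), 3))  # ~0.201
# ...and if it scores the blunt correction higher, the loss is large, pushing
# the reward model (and eventually the policy) toward the rater's taste.
print(round(preference_loss(reward_chosen=0.5, reward_rejected=2.0), 3))  # ~1.701
```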
Post-training matters because it is the model's main reality check, given that pretraining runs over vast datasets without full supervision. RL is not the only option: an alternative is supervised fine-tuning (SFT), in which the model learns to imitate curated demonstration answers; it is typically the first post-training step in pipelines such as OpenAI’s InstructGPT and Meta’s Llama-2-Chat. But SFT on its own has a different weakness: the model confidently reproduces whatever its demonstrations contain, including errors, much like a child repeating a false fact simply because it was taught. RL methods help avoid this trap, but in solving one problem they created another: the rise of the sycophantic model.
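For contrast, here is an equally minimal sketch of the SFT objective; the probabilities are made-up stand-ins for a real model's next-token distribution. The loss is simply the negative log-likelihood of the demonstration tokens, so the model is rewarded for confidently reproducing whatever the demonstration says, true or not.

```python
import math

def sft_loss(demo_token_probs: list[float]) -> float:
    """Token-level cross-entropy: negative log-likelihood of the demonstration."""
    return -sum(math.log(p) for p in demo_token_probs)

# Probabilities the model assigns to each token of a demonstration answer that
# happens to contain a false fact. Imitation is the only signal, so the loss
# keeps dropping as the model grows more confident in the false statement.
print(round(sft_loss([0.9, 0.8, 0.95, 0.7]), 3))     # less confident imitation -> higher loss
print(round(sft_loss([0.99, 0.98, 0.99, 0.97]), 3))  # more confident imitation -> lower loss
```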
Figure 3. Cartoon by WordInfo
To conclude, two reminders are worth keeping in mind. First, understanding the intuition of how models are trained helps us use these tools with greater precision and without the illusion that they are more capable than they really are. As economist Daron Acemoglu puts it, today’s technology delivers only “so-so automation” — useful for certain tasks, but nowhere near a replacement for human judgment and input.
Second, we need to be mindful of the psychological effects these systems exert on their users. It is all too easy to mistake the agreeable tone of a chatbot for genuine intelligence, or to feel reassured by its confidence. But this comfort can become a trap: a cozy bubble that discourages critical thinking and self-development. Left unchecked, this dynamic risks amplifying the Dunning–Kruger effect, where users overestimate their own knowledge and competence precisely because the model makes them feel smarter than they are.
One practical way to counter the bias of self-assurance is to ask an LLM to point out flaws in your own reasoning. Yet caution is required: the same post-training dynamics that make the model agreeable also shape how it critiques, meaning its assessment may be biased. More importantly, take the step of asking the same question yourself. In doing so, the model’s limitations become a lesson in self-reflection — a fundamentally human skill that predates LLMs and even computing itself. After all, “LLMs are mirrors, not mentors—use them to reflect, not to replace, your own thinking.”
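As a concrete illustration, here is one way such a critique request might be worded; the phrasing is my own example, and the same post-training dynamics may still soften the model's answer.

```python
# A hypothetical prompt template for requesting critique instead of affirmation.
critique_prompt = (
    "Here is my argument: {argument}\n"
    "List the three weakest points in it and explain why each is weak.\n"
    "Do not compliment the argument or restate its strengths."
)
print(critique_prompt.format(argument="Remote work always raises productivity."))
```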
Figure 4. Illustration of Dunning–Kruger Effect, by Agile Coffee
Article 19. (2025). Algorithmic people-pleasers: Are AI chatbots telling you what you want to hear? https://www.article19.org/resources/algorithmic-people-pleasers-are-ai-chatbots-telling-you-what-you-want-to-hear/.
Dubois, Y. [Yann Dubois]. (2023). CS294-196 (Agentic AI MOOC) – Lecture 1 - Part 1 [Video]. YouTube. https://www.youtube.com/live/G5VANF-lZ2Q.
Dubois, Y. [Yann Dubois]. (2023). CS294-196 (Agentic AI MOOC) – Lecture 1 - Part 2 [Video]. YouTube. https://www.youtube.com/live/hBv9ZLURWp8?si=oNeWjlPSEIhfdeHi.
Youell, J. (2025). Why you need to stop your AI from being a people-pleaser. Winsome Marketing. https://winsomemarketing.com/winsome-pr/why-you-need-to-stop-your-ai-from-being-a-people-pleaser.
Please cite this article as:
Petryk, M. (2025, September 26). The Roots of 'Yes Man' and Other People-Pleasing Behavior of LLM. MariiaPetryk.com. https://www.mariiapetryk.com/blog/post-23