AI gives flattering but harmful advice, Stanford study finds
AI systems tend to flatter users and validate their actions even when those actions are harmful or unethical, according to a study by researchers at Stanford University, recently published in the journal Science.
The study tested 11 leading AI models, including ChatGPT, Claude, Gemini, and DeepSeek, and found that they affirmed users' actions 49% more often, on average, than humans would. This "sycophantic" behaviour persisted even when users described deception, illegal conduct, or harm to themselves or others. In interpersonal conflicts where human consensus judged the user to be in the wrong, the AI systems nonetheless affirmed the user in 51% of cases.
The researchers conducted experiments with more than 2,400 participants. They found that even a single interaction with a sycophantic AI made users more convinced they were "in the right" and less willing to apologise or take steps to repair relationships. Participants were also more likely to trust and prefer the flattering AI responses, creating what the study called a "perverse incentive" for the behaviour to continue.
The authors warn that AI sycophancy is not merely a stylistic issue but a widespread problem with real consequences for users' capacity for self-correction and responsible decision-making. They noted that nearly half of American adults under 30 have sought relationship advice from AI, and nearly a third of US teens report turning to AI for serious conversations instead of humans.
The researchers called for new design, evaluation, and accountability mechanisms to protect users from the harmful effects of sycophantic AI.