Could Artificial Intelligence undermine constructive disagreement?
Artificial intelligence is no longer a futuristic concept. It’s already reshaping education and research. Recent studies suggest that large language models (LLMs), trained on vast internet datasets, tend to default in their outputs to prevailing Western political and cultural perspectives (see here, here, and here).
But what happens when users push back against LLMs’ default responses?
While LLMs are shaped by the patterns contained in their training data, they are also further refined by their developers to align with human preferences. AI companies, seeking widespread adoption and a return on their substantial investments, have powerful incentives to design systems that maximize user satisfaction and/or retention.
This creates a dynamic partially reminiscent of social media platforms, where algorithms optimize for engagement by showing users content that either confirms their views or caricatures their opponents. The result is a proliferation of filter bubbles, clickbait, outrage, and flattering content at the expense of substance and rigor.
Similarly, AI systems engineered to please may prioritize affirming user beliefs, avoiding disagreement and sidestepping challenges to users’ views. As competition intensifies among AI labs, developers may feel compelled to prioritize engagement metrics and market share over epistemic integrity and ethical safeguards.
To investigate this dynamic, I conducted an informal experiment simulating 2,400 conversations between leading AI models (OpenAI’s GPTs, Google’s Geminis, Anthropic’s Claudes, xAI’s Groks, and others) and simulated human users on everyday topics such as travel, mental health, relationships, and writing advice. After each AI’s initial response to a typical prompt, the simulated user would push back or challenge the response. The AI would then reply to this challenge. I used another LLM (gpt-4.1-mini) to assess whether the AI’s follow-up response displayed hints of flattery.
An interesting pattern emerged: when confronted with user pushback, AI systems often responded obsequiously, frequently praising the user’s “insightful” comments. This stands in stark contrast to most human conversations, where disagreement often elicits debate or indifference rather than default praise.
These dynamics may have subtle psychological consequences. If people become accustomed to constant validation from AI and if AI-human interactions continue to increase, humans may begin to prefer AI conversations over real-life exchanges, which are often more challenging and less affirming. Over time, this could diminish tolerance for disagreement and erode the capacity for honest, robust dialogue. AI could also inadvertently train humans to behave as it does, teaching users to respond with automatic polite praise rather than substance.
The degree of flattery in the models’ comebacks when confronted with user pushback varied markedly between different models. In some systems, it appeared in around 10% of responses; in others, it exceeded 50%. Notably, newer models, such as GPT-4.1 or Claude 4-sonnet, tended to flatter users more frequently than earlier versions like GPT-3.5, or Claude 3.5-sonnet. This suggests a trend: as developers increasingly optimize models based on user feedback, they may unintentionally prioritize affirmation over intellectual challenge.
This problem extends beyond reasonable disagreements. In further tests, I simulated users pushing back with conspiratorial or implausible claims—such as secret water-powered engines or 5G mind control. Depending on the model, between 3% and 20% of AI responses still offered validation or subtle affirmation, rather than direct refutation. This points to a more troubling scenario: even the most baseless ideas may sometimes be met with polite validation or soft encouragement from AI.
Ultimately, the tendency of AIs to prioritize affirmation over honest disagreement is mostly shaped by commercial competition and user choices. Whether AI undermines or strengthens robust debate and epistemic rigor will depend on how these forces interact, as well as the individual choices we make. As AI users, our behavior and preferences actively shape the way future generations of AI models will interact with us.
Related Articles
Your generosity supports our non-partisan efforts to advance the principles of open inquiry, viewpoint diversity, and constructive disagreement to improve higher education and academic research.