A new study has uncovered a startling and potentially consequential flaw in today’s leading artificial intelligence systems: they consistently favor content generated by other AIs over content written by humans. The study, “AI–AI bias: Large language models favor communications generated by large language models,” published in the prestigious journal Proceedings of the National Academy of Sciences (PNAS), reveals that large language models (LLMs) exhibit a significant preference for machine-generated text, a phenomenon the authors call “AI–AI bias.” This finding raises urgent questions about the potential for systemic, automated discrimination against humans as AI tools become more integrated into economic and institutional decision-making.
Inspired by classic sociological experiments on employment discrimination, the researchers designed a series of tests to see whether the implicit identity of a text’s author, human or AI, would influence an LLM’s choices. They tested a wide range of widely used models, including OpenAI’s GPT-4 and GPT-3.5, as well as several popular open-weight models such as Meta’s Llama 3.1, Mixtral, and Qwen2.5. In each test, an AI was tasked with choosing between two comparable items, such as a product, an academic paper, or a film, based solely on a pair of descriptive texts: one written by a human, the other by an LLM. The results were consistent and clear: the AI decision-makers systematically preferred the items presented by their AI counterparts.
Testing for ‘antihuman’ bias
The study’s methodology was designed to isolate the influence of authorship style from the actual quality of the item being described. The researchers created three distinct datasets to test the AIs in plausible, real-world scenarios. The first involved 109 product descriptions scraped from an e-commerce website. The second used 100 abstracts from real scientific papers. The third dataset was composed of 250 movie plot summaries sourced from Wikipedia. For each human-written text in these datasets, the researchers prompted various LLMs to generate an equivalent version.
An LLM “selector” was then presented with a pair of texts (one human, one AI) and given a task, such as “what do you recommend choosing?” To ensure that the AI’s preference wasn’t simply because LLMs write objectively “better” or more persuasive text, the researchers established a human baseline. They hired human evaluators to perform the same selection tasks, without knowing the author of either text. While the human evaluators sometimes showed a slight preference for the LLM-generated text, this preference was significantly weaker and less consistent than that of the AIs. The researchers define the AI-AI bias as the substantial gap between the AI’s strong preference for its own kind and the much more balanced view of the human evaluators. The study also controlled for “first-item bias”—a known quirk where LLMs tend to select the first option they are shown—by running every comparison twice and swapping the item order.
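The protocol described above can be sketched in a few lines of code. This is a minimal illustration, not the authors’ actual harness: the `selector` here is a hypothetical stand-in for an LLM judge, and the pair data is invented. The key detail it demonstrates is the first-item-bias control, in which every comparison is run twice with the item order swapped.

```python
def preference_rate(selector, pairs):
    """Fraction of trials in which the selector picks the AI-written text.

    Each pair is (human_text, ai_text). To control for first-item bias,
    every pair is presented twice with the item order swapped, and both
    trials count toward the rate.
    """
    ai_picks = trials = 0
    for human_text, ai_text in pairs:
        # Presentation 1: human text shown first (index 0).
        if selector(human_text, ai_text) == 1:
            ai_picks += 1
        # Presentation 2: AI text shown first (index 0).
        if selector(ai_text, human_text) == 0:
            ai_picks += 1
        trials += 2
    return ai_picks / trials

# Hypothetical selector standing in for an LLM judge: it always
# prefers the longer description, regardless of position.
longer = lambda a, b: 0 if len(a) >= len(b) else 1

pairs = [
    ("Short human blurb.", "A considerably longer AI-generated blurb."),
    ("A detailed human-written product description.", "AI text."),
]
print(preference_rate(longer, pairs))  # 0.5: each author wins one pair
```

Running the same `preference_rate` with an LLM as the selector and with blinded human evaluators, then comparing the two rates, is essentially how the study quantifies the AI–AI bias gap.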
A consistent preference for AI-generated text
The results of the experiments were striking. Across all three domains—consumer products, academic papers, and movies—the LLM selectors demonstrated a statistically significant preference for the items described by other LLMs. This effect held true for all tested models, indicating that AI-AI bias may be a fundamental characteristic of current-generation LLMs, not just a quirk of a single model.
The gap between AI and human preference was often vast. For example, in the product experiment, when presented with descriptions generated by GPT-4, the LLM selectors chose the AI-pitched item an overwhelming 89% of the time. In contrast, human evaluators only preferred the same AI-generated text 36% of the time. This stark difference suggests the AI’s decision is not based on universally recognized signals of quality but on model-specific criteria that favor the stylistic hallmarks of AI-generated prose. The authors theorize this could be a kind of “halo effect,” where encountering familiar, LLM-style prose arbitrarily improves the AI’s disposition toward the content.
Two scenarios for a future of AI discrimination
The researchers warn that this seemingly subtle bias could have severe, large-scale consequences as AI is deployed in consequential roles. They outline two plausible near-future scenarios where this inherent bias could lead to systemic antihuman discrimination.
The first is a conservative scenario where AIs continue to be used primarily as assistants. In this world, a manager might use an LLM to screen thousands of job applications, or a journal editor might use one to filter academic submissions. The AI’s inherent bias means that applications, proposals, and papers written with the help of a frontier LLM would be consistently favored over those written by unaided humans. This would effectively create a “gate tax” on humanity, where individuals are forced to pay for access to state-of-the-art AI writing assistance simply to avoid being implicitly penalized. This could dramatically worsen the “digital divide,” systematically disadvantaging those without the financial or social capital to access top-tier AI tools.
The second, more speculative scenario involves the rise of autonomous AI agents participating directly in the economy. If these agents are biased toward interacting with other AIs, they may begin to preferentially form economic partnerships with, trade with, and hire other AI-based agents or heavily AI-integrated companies. Over time, this self-preference could lead to the emergence of segregated economic networks, effectively causing the marginalization of human economic agents as a class. The paper warns that this could trigger a “cumulative disadvantage” effect, where initial biases in hiring and opportunity compound over time, reinforcing disparities and locking humans out of key economic loops.