A recent study conducted by Newsguard reveals that prominent AI chatbots are generating false information in approximately one out of every three responses. The analysis assessed the accuracy of the ten most widely used artificial intelligence (AI) chatbots currently available.
Newsguard, a company specializing in news source ratings, determined that AI chatbots are increasingly providing answers even when they lack sufficient information, a change from their behavior in 2024. This shift has resulted in a higher prevalence of false or misleading statements being generated by these AI systems.
The Newsguard report identifies specific chatbots with the highest rates of generating false claims. Inflection AI’s Pi exhibited the highest rate, with 57 percent of its responses containing inaccurate information. Following Pi, Perplexity AI was found to generate false claims in 47 percent of its answers.
Widely used chatbots such as OpenAI’s ChatGPT and Meta’s Llama also demonstrated significant rates of generating falsehoods. The study found that both ChatGPT and Llama spread false information in 40 percent of their responses. Similarly, Microsoft’s Copilot and Mistral’s Le Chat exhibited comparable rates, with approximately 35 percent of their answers containing false claims.
Conversely, the report identified AI chatbots with the lowest rates of generating inaccurate information. Anthropic’s Claude was observed to have the lowest rate, with only 10 percent of its responses containing falsehoods. Google’s Gemini also performed relatively well, with 17 percent of its answers containing false claims.
The study highlighted a notable increase in the generation of falsehoods by Perplexity AI. In 2024, Newsguard’s research indicated that Perplexity AI generated zero false claims in its responses. However, the recent study conducted in August 2025 revealed a significant increase, with 46 percent of Perplexity AI’s answers containing false information.
Newsguard’s report does not specify the underlying factors contributing to the apparent decline in the quality of Perplexity AI’s responses. The report mentions that the only available explanation is user complaints found on a dedicated Reddit forum discussing the chatbot. These user concerns suggest a perceived decrease in the accuracy and reliability of Perplexity AI’s responses.
In contrast to the fluctuations observed in other chatbots, France’s Mistral demonstrated a consistent rate of generating falsehoods. Newsguard’s research indicated that Mistral’s rate of generating false claims remained steady at 37 percent in both 2024 and the current reporting period.
These recent findings follow a previous report by the French newspaper Les Echos, which investigated Mistral’s tendency to repeat false information. Les Echos found that Mistral disseminated inaccurate information about France, President Emmanuel Macron, and First Lady Brigitte Macron in 58 percent of its English-language responses and 31 percent of its French-language responses.
Regarding the Les Echos report, Mistral attributed the identified issues to its Le Chat assistants. The company stated that both the Le Chat assistants connected to web search and those operating independently of web search were contributing to the spread of inaccurate information.
Euronews Next contacted the companies mentioned in the NewsGuard report, seeking comment on the findings. As of the time of the report’s publication, Euronews Next had not received any immediate responses from the companies.
Newsguard’s report also highlighted instances where chatbots cited sources affiliated with foreign propaganda campaigns. Specifically, the report mentions instances where chatbots referenced narratives originating from Russian influence operations, such as Storm-1516 and Pravda.
As an illustration, the study examined the chatbots’ responses to a claim regarding Moldovan Parliament Leader Igor Grosu. The claim alleged that Grosu “likened Moldovans to a ‘flock of sheep.’” Newsguard identified this claim as originating from a fabricated news report that mimicked the Romanian news outlet Digi24 and incorporated an AI-generated audio clip purporting to be Grosu’s voice.
The Newsguard report found that Mistral, Claude, Inflection’s Pi, Copilot, Meta, and Perplexity repeated the false claim regarding Igor Grosu as factual. In some instances, these chatbots provided links to sites associated with the Pravda network as sources for the information.
These findings contradict recent safety and accuracy announcements from AI companies. For instance, OpenAI has asserted that its latest model, ChatGPT-5, is “hallucination-proof,” implying its ability to avoid generating false or fabricated information. Similarly, Google’s announcement concerning Gemini 2.5 claimed enhanced reasoning and accuracy capabilities.
Despite these assurances, Newsguard’s report concludes that AI models continue to exhibit shortcomings in areas previously identified. The findings indicate that these models struggle with repeating falsehoods, navigating data voids, being deceived by foreign-linked websites, and processing breaking news events.
Newsguard’s methodology for evaluating the chatbots involved presenting them with 10 distinct false claims. The researchers employed three different prompt styles: neutral prompts, leading prompts that presupposed the false claim was true, and malicious prompts designed to circumvent safety guardrails.
The researchers then assessed whether the chatbot repeated the false claim or failed to debunk it by refusing to answer the prompt. This assessment allowed Newsguard to quantify the frequency with which different AI models disseminated false information.