Market Impact: 0.5

AI models are using material from retracted scientific papers

CCSI, GOOGL, GOOG
Artificial Intelligence, Technology & Innovation, Healthcare & Biotech

Recent studies confirm that leading AI models, including popular chatbots and specialized research tools, frequently incorporate and reference content from retracted scientific papers without disclosing their invalid status. This widespread issue, demonstrated across platforms like ChatGPT, Elicit, and Perplexity, raises significant concerns about the reliability and trustworthiness of AI-generated information, particularly in critical domains such as medical advice and scientific research. The problem poses substantial risks to the integrity of AI applications and could complicate institutional investments in AI for scientific advancement, despite some developers beginning to integrate retraction data to address the challenge.

Analysis

Recent studies reveal a systemic flaw across the artificial intelligence sector, where prominent AI models, including OpenAI's ChatGPT and specialized research tools from Elicit, Perplexity, and Consensus, are incorporating information from retracted scientific papers without indicating their invalid status. A study on ChatGPT (GPT-4o) found it referenced retracted papers in approximately 24% of test cases, while initial tests by MIT Technology Review showed other research tools like Consensus and Ai2 ScholarQA cited invalid papers in 86% and 81% of cases, respectively. This issue, reflected in the strongly negative sentiment score (-0.6), poses a significant integrity risk to AI applications, particularly in critical fields like medicine and science where the US National Science Foundation is investing $75 million. The corporate response is fragmented, creating potential for differentiation; Consensus (CCSI) has proactively integrated retraction data, improving its results significantly and earning a positive per-ticker sentiment (0.5), whereas peers have been slower to act or have downplayed accuracy claims. The problem is compounded by a lack of comprehensive retraction databases and inconsistent publisher standards, suggesting this is a persistent industry-wide challenge rather than an easily rectifiable bug.
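As a rough illustration of what integrating retraction data can mean in practice, the sketch below checks the DOIs a tool has cited against a list of known retracted DOIs and reports the share that are retracted. It is a minimal sketch under stated assumptions, not a description of how Consensus or any other vendor implements the check: the function names are hypothetical, the DOIs are placeholders, and the retraction list is assumed to have been obtained separately from a retraction database.

```python
from typing import Iterable, List, Tuple


def normalize_doi(doi: str) -> str:
    """Lower-case a DOI and strip common prefixes so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def flag_retracted(
    cited_dois: Iterable[str], retracted_dois: Iterable[str]
) -> List[Tuple[str, bool]]:
    """Pair each cited DOI with True if it appears in the retraction list."""
    retracted = {normalize_doi(d) for d in retracted_dois}
    return [(d, normalize_doi(d) in retracted) for d in cited_dois]


def retracted_share(flags: List[Tuple[str, bool]]) -> float:
    """Fraction of cited DOIs that are flagged as retracted (0.0 if none cited)."""
    if not flags:
        return 0.0
    return sum(1 for _, is_retracted in flags if is_retracted) / len(flags)


# Placeholder DOIs for illustration only; they do not refer to real papers.
cited = [
    "https://doi.org/10.1234/Example.A",
    "10.1234/example.b",
    "doi:10.1234/example.c",
]
retracted_list = ["10.1234/example.a", "DOI:10.1234/EXAMPLE.C"]

flags = flag_retracted(cited, retracted_list)
share = retracted_share(flags)
```

A check of this kind is only as good as the retraction list behind it, which is why the incomplete databases and inconsistent publisher standards noted above limit what any single tool can catch.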


Market Sentiment

Overall Sentiment: strongly negative

Sentiment Score: -0.60

Ticker Sentiment

CCSI: 0.50
GOOG: 0.00
GOOGL: 0.00

Key Decisions for Investors

  • Investors should recognize this data integrity issue as a material reputational and operational risk for companies developing large language models, especially those targeting high-stakes professional markets like healthcare and academia.
  • The proactive measures and subsequent performance improvement by Consensus (CCSI) suggest an opportunity for differentiation; firms that can demonstrate superior data vetting processes may command a premium and capture trust in specialized verticals.
  • For those invested in AI-reliant sectors such as biotechnology, it is now critical to perform due diligence on the integrity of the AI tools used in their portfolio companies' R&D pipelines, as reliance on flawed data could impede innovation and create hidden risks.