Reddit accuses Perplexity of stealing user posts, expanding data rights battle with AI industry

Reddit has filed a lawsuit against AI company Perplexity, alleging illegal scraping of its user posts to train AI models, marking a significant escalation in data-rights disputes between content owners and the AI industry. The social media platform, which has already monetized its vast user-generated content through licensing deals with firms like Google and OpenAI, generating nearly 10% of its revenue, claims Perplexity and co-defendants bypassed technological protections to steal data. Perplexity denies the allegations, asserting it only summarizes public content and does not train models, while accusing Reddit of "extortion" and using the lawsuit as leverage in its data monetization strategy.

Analysis

Reddit (RDDT) has sued AI company Perplexity, alleging that it illegally scraped user-generated content to train its AI models. The suit is a significant development in the ongoing data-rights disputes between content owners and the AI industry. The complaint, filed in New York federal court, accuses Perplexity and three associated entities of bypassing technological protections and masking their identities to extract copyrighted material. This action follows Reddit's previous lawsuit against AI startup Anthropic, underscoring its proactive stance on data monetization and intellectual property.

The litigation highlights the growing importance of data licensing for Reddit, with AI deals contributing nearly 10% of its revenue in February through agreements with firms like OpenAI and Alphabet (GOOGL, GOOG). Perplexity, however, denies training AI models on Reddit's data, asserting that it only summarizes and cites public discussions. It views the lawsuit as an attempt by Reddit to "extort" payments for lawfully accessed public information and suggests the legal action is a "show of force" in Reddit's broader data negotiation strategy with major AI partners.

This case represents a critical test for intellectual property rights in the AI era, as Reddit's vast repository of moderated human conversation is highly valued for training large language models. The outcome could significantly influence future data licensing models and the operational strategies of AI developers, potentially setting precedents for how public data is accessed and monetized. The mixed sentiment and moderate market impact score reflect the uncertainty surrounding the legal battle's resolution and its broader implications for both content platforms and AI innovators.
