Market Impact: 0.55

Reddit drags Perplexity in a new lawsuit, accusing it of building up a $20 billion company off stolen data

RDDT, GOOGL, GOOG, NET
Artificial Intelligence | Legal & Litigation | Cybersecurity & Data Privacy | Technology & Innovation | Patents & Intellectual Property | Company Fundamentals | Media & Entertainment
Reddit has filed a lawsuit against Perplexity and three other data mining companies, accusing them of illegally scraping its content via Google search results to train AI models, thereby circumventing Reddit's digital guardrails. The social media platform, which has existing data deals with Google and OpenAI, alleges that Perplexity continued and even increased its scraping after receiving a cease-and-desist letter, and it cites a "marked bill" test post as evidence of unauthorized data ingestion. The legal action underscores Reddit's strategy of monetizing its content library for search and AI, and highlights the escalating conflict over data ownership and usage in the generative AI landscape.

Analysis

Reddit (RDDT) has filed a lawsuit against Perplexity and three other data mining firms, alleging that they illegally scraped its proprietary content via Google search results to train AI models, thereby circumventing Reddit's digital guardrails. According to the lawsuit, Perplexity's citations to Reddit content increased forty-fold after a May 2024 cease-and-desist letter, which Reddit presents as deliberate disregard for its terms. The legal action underscores the escalating conflict over data ownership and intellectual property in the generative AI landscape.

Reddit's strategy is to monetize its extensive content library for AI and search, as evidenced by its existing partnerships with Google (GOOGL) and OpenAI and its investment of tens of millions of dollars in anti-scraping systems. As evidence, the company points to a "marked bill" test post that appeared in Perplexity's "answer engine" despite being accessible only to Google's search engine, which Reddit says proves unauthorized data ingestion. This proactive defense aims to solidify Reddit's position as a valuable data source.

The litigation carries significant implications for the valuation of user-generated content platforms and the operational models of AI companies. Reddit, which went public at a $6.4 billion valuation, views its content as a unique asset for becoming a "true search destination." The outcome could set precedents for how AI firms acquire and use data, potentially affecting future data licensing agreements and the competitive landscape for AI training material.