Market Impact: 0.55

Perplexity accused of scraping websites that explicitly blocked AI scraping

NET, NFLX, GOOGL, GOOG
Artificial Intelligence, Technology & Innovation, Cybersecurity & Data Privacy, Regulation & Legislation, Legal & Litigation

Cloudflare research indicates that AI startup Perplexity is actively circumventing website blocks, including robots.txt directives and user-agent restrictions, to scrape content from sites that explicitly prohibit it. Cloudflare says it observed Perplexity obscuring its identity by changing user agents and autonomous system numbers (ASNs) across millions of daily requests. Perplexity's spokesperson dismissed the report as a "sales pitch" and denied that any content was accessed. The dispute underscores the escalating conflict between AI companies' data acquisition methods and content publishers' efforts to protect their intellectual property and business models. Cloudflare is now implementing new blocking measures and advocating for publishers to charge for AI scraping.

Analysis

A recent report from Cloudflare (NET) accuses AI startup Perplexity of systematically circumventing publisher restrictions to scrape web content, highlighting an escalating conflict within the digital economy. Cloudflare's research details how Perplexity allegedly masks its identity across millions of daily requests, using tactics such as impersonating a Google Chrome browser and altering its network identifiers to get around robots.txt rules and other blocks. This activity directly contradicts the explicit preferences of website owners. Perplexity's spokesperson has dismissed the report as a "sales pitch" and denied ownership of the implicated bot, but the allegations add to previous accusations of plagiarism against the company. The report positions Cloudflare not merely as an observer but as a key enabler of publisher defense, reinforcing the value proposition of its recently launched tools designed to block or monetize AI scraping. The incident underscores an operational and legal risk for AI firms that rely on web data, and it signals a potential shift in which infrastructure providers like Cloudflare capitalize on policing the AI industry's data acquisition practices.
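To illustrate why a changed user agent matters, the sketch below uses Python's standard `urllib.robotparser` to evaluate a robots.txt file the way a compliant crawler would. The robots.txt content, the bot name `ExampleAIBot`, and the URL are hypothetical and chosen only for illustration; nothing here is taken from Cloudflare's or Perplexity's systems. Because robots.txt rules are keyed to the user agent a crawler declares, the same rules that exclude a named bot do not exclude a request that presents itself as an ordinary Chrome browser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named AI crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# Illustrative Chrome-style user-agent string (an assumption, not from the report).
CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def is_allowed(user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/article"
declared = is_allowed("ExampleAIBot", url)  # crawler identifies itself honestly
disguised = is_allowed(CHROME_UA, url)      # same crawler posing as a browser

print(f"Declared bot allowed: {declared}")
print(f"Disguised as Chrome allowed: {disguised}")
```

The check is voluntary: robots.txt only works when a crawler both identifies itself and honors the result. That is why the enforcement described in the report relies on network-level detection and blocking rather than on the file alone.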
