Reddit Sues Perplexity AI Over Massive Data Scraping Claims

Reddit Launches Major Lawsuit Targeting AI Data Scraping

Reddit has sued the AI company Perplexity AI and three other companies in New York federal court, alleging that they illegally scraped user content for commercial purposes. The complaint describes an “industrial-scale, unlawful” effort to collect Reddit comments and interactions to train AI systems without permission.

The case highlights rising tensions between AI developers and web platforms over who may access and use public data. Reddit says that Perplexity and its partners violated its terms of service and circumvented security safeguards to profit from user-generated content.

Source: ExtremeTech

AI Firms and Proxy Services Named in the Case

The complaint also names the Lithuanian company Oxylabs UAB, the Texas-based company SerpApi, and a web domain named AWMProxy, which Reddit describes as a “former Russian botnet.” Reddit alleges that these parties worked together to get around its anti-scraping mechanisms and gather large amounts of user data.

The complaint likens their conduct to that of “would-be bank robbers” who cannot get into a vault and instead go after an armored vehicle. Reddit said that the companies used proxy networks to hide their identities and locations while taking data from Google’s cached Reddit pages.

Reddit Alleges Violations of Security and Data Protections

Ben Lee, Reddit’s chief legal officer, said, “Scrapers get around technological protections to steal data and then sell it to people who want training material.” He said that Reddit’s vast archive of user discussions is one of the largest collections of human discourse in the world, which makes it a valuable source of AI training data.

Reddit says that Perplexity knowingly bought data from unlicensed scrapers instead of negotiating a licensing deal. Reddit said that the defendants “circumvented Google’s controls” and used deceptive methods to hide their scraping activity.


Perplexity Responds to Reddit’s Allegations

Perplexity AI runs an AI “answer engine” that competes with ChatGPT and Google. In response to the complaint, the company said it had not yet received it but would “vigorously fight for users’ rights to freely and fairly access public knowledge.”

Perplexity said in a public statement that its operations are consistent with “principled and responsible” AI research. The company stressed that it aims to give accurate answers using freely available data and accused Reddit of trying to restrict access to publicly available information.

Broader Context: AI Industry Under Scrutiny

This is Reddit’s second lawsuit against an AI company in 2025. The first, filed in June, was against Anthropic, the company that created the Claude chatbot. That lawsuit was moved from California state court to federal court, where it is set to be heard in January.

Reddit’s legal fight reflects growing disputes in the tech industry over how AI companies obtain data to train large language models. AI companies use the rich conversational content found on sites like Reddit, Wikipedia, and news sites to make their models’ output more accurate and fluent.

Reddit’s Licensing Deals with Major AI Companies

Reddit has formal licensing agreements with Google, OpenAI, and a few other companies, in contrast to Perplexity and the others accused of scraping without permission. These contracts let companies pay to access Reddit’s huge database of user comments, which helps ensure that the data is used in a way that follows copyright and platform rules.

These deals were an important source of revenue for Reddit before it went public last year. The platform’s decision to charge for access to its data shows how valuable human-generated content is for training AI models.

Implications for Data Ethics and AI Regulation

The case could set a major precedent for how courts decide what public data can be used for in AI. If Reddit wins, AI developers may have to obtain explicit data licenses before collecting online material, which would change how large models are developed.

As AI technology advances, legal experts say that cases like Reddit v. Perplexity will become increasingly common. The outcome could influence global norms for data governance and shape how open the internet will be for AI-driven innovation in the future.

