The Unlikely Copyright Crusader Taking on Meta’s AI Data Machine

When we think of the battle to protect copyrighted material from the insatiable appetite of artificial intelligence, we usually picture bestselling authors, independent illustrators, or major news publishers. But one of the most revealing legal challenges to Silicon Valley's data-scraping practices is currently being spearheaded by an entirely different kind of plaintiff: an adult entertainment company.
A federal judge recently rejected Meta’s attempt to dismiss a lawsuit filed by Strike 3 Holdings, a company operating several popular adult content sites. The lawsuit alleges that Meta systematically scraped the internet to feed its AI models, downloading over 81 terabytes of data from Anna’s Archive—a massive repository of pirated books, movies, and, as it turns out, pornography. According to the plaintiff's investigation, 47 IP addresses belonging to Meta were used to torrent nearly 2,400 of its videos over 6,000 times between 2018 and 2025.
Meta’s defense was surprisingly simple, if slightly embarrassing. The tech giant argued that Strike 3 Holdings couldn't prove the videos were actually intended for AI training. Instead, Meta suggested the downloads might simply be the work of "rogue employees" using the company network to watch adult content on company time.
Judge Eumi K. Lee, however, saw through the defense, pointing to digital footprints that illustrate the blind, algorithmic nature of AI data scraping. The court noted that Meta’s IP addresses were downloading files based on broad keyword matching rather than human curation. The smoking gun? On a single day in December 2022, Meta's network downloaded several adult films featuring the word "teen" in the title, alongside episodes of the animated series Teenage Mutant Ninja Turtles and the film Teen Titans Go! To the Movies. As the judge noted, suggesting that individual employees independently decided to torrent a bizarre mix of pornography and children's cartoons simultaneously "strains credulity."
Beyond the dramatic details, this ruling marks a significant moment in AI copyright law. The judge determined that whether Meta actually fed these specific videos into its AI models is legally irrelevant to the motion at hand. The very act of torrenting the files—which involves illegally downloading and simultaneously distributing (seeding) them to others—is enough to establish a plausible claim of copyright infringement.
This legal pivot is crucial. For the past year, tech companies and creators have been locked in complex debates about whether an AI model "memorizes" content or simply "learns" from it. By focusing purely on the mechanics of how the data was acquired in the first place, this lawsuit bypasses the black box of AI entirely. It serves as a stark reminder that while AI technology may be novel, the traditional laws governing how files are copied and distributed across the internet still apply. As AI companies continue their aggressive pursuit of training data, they may find that their indiscriminate vacuuming leaves behind a very traditional trail of legal liability.
Key Points
- A federal judge allowed an adult content company's copyright lawsuit against Meta to proceed.
- Meta's defense that 'rogue employees' downloaded the videos was undermined by evidence of automated, keyword-based scraping.
- The scraping algorithms indiscriminately downloaded everything from adult films to 'Teenage Mutant Ninja Turtles' based on shared keywords.
- The court ruled that the act of torrenting the files is actionable copyright infringement, regardless of whether the data was ultimately used in an AI model.
Why It Matters
By focusing on the illegal acquisition of data rather than how the AI processes it, this case provides a potential blueprint for other copyright holders to hold tech giants accountable without having to decipher the technical complexities of AI training.