AI model training

Meta may have torrented over 80 terabytes of pirated books to train its AI models

How AI models should be trained has been a subject of debate for some time now, with much of the focus on whether publicly posted social media content is ripe for the picking. Now a new lawsuit suggests that Meta has been using pirated ebooks as a data source.

Emails serving as evidence in a copyright case against Meta appear to show that the Facebook owner torrented scores of terabytes of data from a number of online sources. The newly released unredacted emails mention Anna’s Archive, Z-Library and LibGen.

By Sofia Elizabella Wyciślik-Wilson - February 7, 2025

Cloudflare introduces AI Audit to help websites manage AI access and content usage

Cloudflare has introduced AI Audit, a new set of tools aimed at helping websites manage how artificial intelligence (AI) models access and use their content. AI Audit lets content creators see how their content is being used by AI models and take steps to control access. Additionally, Cloudflare is working on a pricing feature that will enable creators to set a price for AI companies that use their content for model training and retrieval-augmented generation (RAG).

Many website owners may not be aware of how frequently AI bots scan their content, often without compensating the creator. AI Audit is designed to give control back to content owners, allowing them to block AI bots, access analytics on content usage, and negotiate agreements for the use of their content by AI models.

By Brian Fagioli - September 23, 2024
