Articles about AI model training

Meta may have torrented over 80 terabytes of pirated books to train its AI models


Just how AI models should be trained has been a subject of debate for some time now, with much of the focus on whether publicly posted social media content is ripe for the picking. Now a new lawsuit suggests that Meta has been using pirated ebooks as a data source.

Emails serving as evidence in a copyright case against Meta appear to show that the Facebook owner torrented scores of terabytes of data from a number of online sources. Among the sites named in the newly released, unredacted emails are Anna’s Archive, Z-Library and LibGen.


Cloudflare introduces AI Audit to help websites manage AI access and content usage

Cloudflare has introduced AI Audit, a new set of tools aimed at helping websites manage how artificial intelligence (AI) models access and use their content. AI Audit lets content creators see how their content is being used by AI models and take steps to control access. Cloudflare is also working on a pricing feature that will let creators set a price for AI companies that use their content for model training and retrieval-augmented generation (RAG).

Many website owners may not be aware that AI bots scan their content frequently, often without compensating the creator. AI Audit is designed to give control back to content owners, allowing them to block AI bots, view analytics on how their content is used, and negotiate agreements for its use by AI models.


BetaNews, your source for breaking tech news, reviews, and in-depth reporting since 1998.


© 1998-2025 BetaNews, Inc. All Rights Reserved.