How LLMs could revolutionize ad blocking

Ad blocking software has traditionally relied on filter lists. These need regular, time-consuming maintenance to keep them up to date and effective.
There have been previous attempts to modernize ad blocking using machine learning, but new research from AdGuard looks at whether large language models (LLMs) could improve the way ad blocking works.
Filter lists rely on either network rules, which block requests to specific domains, or cosmetic rules, which use CSS selectors to hide certain elements on the page. If kept up to date, these techniques are effective, but that updating requires considerable human effort, often from a community working for free. Attempts to replicate this with machine learning have proved expensive, and the resulting models have been vulnerable to attacks.
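To make the two rule types concrete, here is a minimal Python sketch of how a blocker might apply them. It is an illustration only, not AdGuard's implementation: the domain names, the selectors, and the function names `is_blocked` and `cosmetic_stylesheet` are all made up for this example.

```python
from urllib.parse import urlparse

# Hypothetical network rules: domains whose requests should be blocked.
BLOCKED_DOMAINS = {"ads.example.com", "tracker.example.net"}

# Hypothetical cosmetic rules: CSS selectors for elements to hide.
COSMETIC_SELECTORS = [".ad-banner", "#sponsored-box"]


def is_blocked(url: str) -> bool:
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Check the host and each parent domain: a.b.c -> a.b.c, b.c, c
    return any(".".join(parts[i:]) in BLOCKED_DOMAINS for i in range(len(parts)))


def cosmetic_stylesheet(selectors: list[str]) -> str:
    """Build a CSS snippet that hides every element matching the selectors."""
    return ", ".join(selectors) + " { display: none !important; }"
```

Both lists are static, which is why they need constant upkeep: a new ad domain or a renamed CSS class slips through until someone adds a rule for it.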
So what advantages do LLMs offer? Their capabilities range from generating high-quality text to analyzing data, creating images and videos, writing code, and supporting complex workflows. AdGuard has carried out experiments using LLMs for techniques such as blocking by meaning -- both textual and visual -- and using these to supplement filter lists.
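As a rough illustration of how blocking by meaning could supplement a filter list, the Python sketch below asks a model only when no filter rule has matched. Everything in it is an assumption made for the example: the prompt wording, the names `should_hide` and `parse_verdict`, and `fake_model`, which is a keyword stand-in so the code runs without a real LLM. It does not describe how AdGuard's prototype works.

```python
from typing import Callable

# Hypothetical prompt; a real system would tune this wording.
PROMPT_TEMPLATE = (
    "Answer YES or NO. Is the following page text an advertisement "
    "or sponsored content?\n\n{text}"
)


def parse_verdict(reply: str) -> bool:
    """Treat a model reply as 'this is an ad' only if it starts with YES."""
    return reply.strip().upper().startswith("YES")


def should_hide(text: str, matched_filter_list: bool,
                ask_model: Callable[[str], str]) -> bool:
    """Hide an element if a filter rule matched, otherwise ask the model."""
    if matched_filter_list:
        return True  # cheap path: no model call needed
    return parse_verdict(ask_model(PROMPT_TEMPLATE.format(text=text)))


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    # Look only at the page text that follows the instruction line.
    page_text = prompt.split("\n\n", 1)[1].lower()
    return "YES" if "buy now" in page_text else "NO"
```

Checking the filter list first keeps the slower, costlier model call for content the existing rules do not already cover.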
AdGuard’s lead developer, Maxim Topciu, writes on the company’s blog, “…LLMs allow us to move beyond simple pattern matching and actually understand the meaning of web content. This opens up a completely new, semantic approach to filtering. And second, LLMs give us rapid prototyping. Ideas that used to take months of engineering can now be tested in a matter of hours. While there are still practical challenges to solve, this new approach allows us to rethink what’s possible in the world of content filtering.”
You can read more on the AdGuard blog. There’s also a working prototype browser extension for Chrome using LLM ad blocking which you can download and try out for yourself -- though note that this is still experimental.
Image credit: Oleksandr Shpak/Dreamstime