New fully open and transparent large language model launches -- it’s Swiss, of course


The Swiss have something of a reputation for being methodical -- particularly when it comes to things like banking -- so it’s no surprise that they take a similar approach to creating a large language model.
EPFL, ETH Zurich and the Swiss National Supercomputing Centre (CSCS) have today released Apertus, a large-scale, open, multilingual LLM. The name -- Latin for ‘open’ -- highlights its distinctive feature: the entire development process, including the architecture, model weights, and training data and recipes, is openly accessible and fully documented.
As a fully open language model, Apertus allows researchers, professionals and enthusiasts to build on the model and adapt it to their specific needs, as well as to inspect any part of the training process. This distinguishes Apertus from models that make only selected components accessible.
“Apertus is not a conventional case of technology transfer from research to product. Instead, we see it as a driver of innovation and a means of strengthening AI expertise across research, society and industry,” says Thomas Schulthess, director of CSCS and professor at ETH Zurich.
The upcoming Swiss {ai} Weeks hackathons will be the first opportunity for developers to experiment hands-on with Apertus, test its capabilities, and provide feedback for improvements to future versions. Swisscom will provide hackathon participants with a dedicated interface, making it easier to interact with the model. As of today, Swisscom business customers can access the Apertus model via Swisscom’s sovereign Swiss AI platform.
For people outside Switzerland, the Public AI Inference Utility will make Apertus accessible as part of a global movement for public AI.
“Apertus is built for the public good. It stands among the few fully open LLMs at this scale and is the first of its kind to embody multilingualism, transparency, and compliance as foundational design principles,” says Imanol Schlag, technical lead of the LLM project and research scientist at ETH Zurich.
Apertus is designed with transparency at its core, ensuring full reproducibility of the training process. Alongside the models, the research team has published a range of resources: comprehensive documentation and source code for the training process, the datasets used, and model weights including intermediate checkpoints -- all released under a permissive open-source license that also allows commercial use.
Future versions aim to expand the model family, improve efficiency, and explore domain-specific adaptations in fields such as law, climate, health and education. They are also expected to integrate additional capabilities, while maintaining strong standards for transparency.
You can find out more on the ETH Zurich site.