ETH Zurich and EPFL will release a large language model (LLM) developed on public infrastructure. Trained on the “Alps” supercomputer at the Swiss National Supercomputing Centre (CSCS), the new LLM marks a milestone in open-source AI and multilingual excellence.

"The model will be fully open: source code and weights will be publicly available, and the training data will be transparent and reproducible, supporting adoption across science, government, education, and the private sector. This approach is designed to foster both innovation and accountability.

A distinctive feature of the model is its capability in over 1000 languages. “We have emphasised making the models massively multilingual from the start,” says Antoine Bosselut.

The base model was trained on a large text dataset in over 1500 languages (approximately 60% English and 40% non-English), as well as code and mathematics data. Given the representation of content from all languages and cultures, the resulting model maintains the highest global applicability."

https://ethz.ch/en/news-and-events/eth-news/news/2025/07/a-language-model-built-for-the-public-good.html
#llm #ai #switzerland