Mimir: Large-scale Multilingual Concept Modeling

ArXiv CS.CL | 2026-05-26 | News | en | Authors: Elio Musacchio, Lucia Siciliani, Pierpaolo Basile

Abstract

arXiv:2605.25263v1 (announce type: new). Current language modeling approaches are built around tokens. Text corpora are split into tokens, and models are trained by performing computations on these tokens, such as predicting the next token given the preceding ones as context. This paradigm has become the standard in modern language modeling, largely because of the strong performance of token-based architectures. However, recent work has begun to question both how language models process and derive meaning from tokens and whether operating at a coarser granularity could advance the field. This has led to the idea of Concept Modeling: training models directly for next-concept prediction rather than next-token prediction. The goal is to change the input from tokens to concepts, forcing the underlying language model to shift its granularity from fine-grained tokens to broad concepts. In this work, we introduce Mimir, a 1.
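
The shift in granularity described in the abstract can be illustrated with a toy sketch. It is not the authors' method: the abstract does not say how Mimir defines or encodes a concept, so the code below assumes, purely for illustration, that whitespace-separated words stand in for tokens and whole sentences stand in for concepts. The function names `next_token_pairs` and `next_concept_pairs` are hypothetical.

```python
import re


def next_token_pairs(text):
    """Build (context, target) training pairs at token granularity.

    Assumption: whitespace-separated words stand in for real subword tokens.
    """
    tokens = text.split()
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def next_concept_pairs(text):
    """Build (context, target) training pairs at a coarser granularity.

    Assumption: each sentence stands in for one "concept". A real system
    would more likely map each unit to a vector representation.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences = [s.strip() for s in parts if s.strip()]
    return [(sentences[:i], sentences[i]) for i in range(1, len(sentences))]


text = "Models read tokens. Concepts are broader units. Granularity changes the task."
token_pairs = next_token_pairs(text)
concept_pairs = next_concept_pairs(text)

# The same text yields many fine-grained examples but few coarse ones.
print(len(token_pairs), len(concept_pairs))
```

In both cases the training objective has the same shape, predicting the next unit from the preceding ones. Only the unit changes, which is the shift in granularity the abstract refers to.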
