What Makes Two Language Models Think Alike?

ArXiv CS.CL, 2026-06-05
Authors: Louis Jalouzot, Christophe Pallier, Emmanuel Chemla, Yair Lakretz

Abstract

arXiv:2406.12620v3

Do architectural and training differences influence the way models represent and process language? Traditional similarity metrics tell us whether two models share a similar representational geometry, but they cannot explain why. Here, we propose a new, simple approach to address this question. This approach maps neural activity in each model layer onto a set of interpretable linguistic features and quantifies how much each of them drives similarities and differences between models. We use this approach to compare 43 language models across 10 families, including decoder Transformers, State-Space Models, and Recurrent Neural Networks. We find that model-level similarity is driven most strongly by release date, a proxy for general LLM development, and model family, suggesting that linguistic signatures are not primarily shaped by scale or architecture class.
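The idea of mapping layer activity onto interpretable features and asking which features separate two models can be illustrated with a toy example. The sketch below is not the authors' method: it assumes a ridge-regularised linear probe, in-sample R² as the per-feature score, and models with the same number of layers. The function names (`feature_r2`, `signature`, `per_feature_gap`), the synthetic data, and the three unnamed features are all hypothetical.

```python
import numpy as np

def feature_r2(acts, feats, ridge=1e-3):
    """In-sample R^2 of a ridge linear probe from activations to each feature.

    acts:  (n_tokens, n_units) activations of one layer
    feats: (n_tokens, n_feats) interpretable linguistic features
    """
    X = acts - acts.mean(axis=0)
    Y = feats - feats.mean(axis=0)
    # Closed-form ridge solution: W = (X^T X + ridge * I)^-1 X^T Y
    W = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ Y)
    resid = Y - X @ W
    return 1.0 - (resid ** 2).sum(axis=0) / (Y ** 2).sum(axis=0)

def signature(layers, feats):
    """Stack per-layer probe scores into a (n_layers, n_feats) profile."""
    return np.stack([feature_r2(a, feats) for a in layers])

def per_feature_gap(sig_a, sig_b):
    """Mean absolute R^2 difference across layers, one value per feature.

    Assumption: both models have the same number of layers.
    """
    return np.abs(sig_a - sig_b).mean(axis=0)

# Synthetic demo (all data here is made up for illustration).
rng = np.random.default_rng(0)
n_tokens, n_units = 500, 16
feats = rng.normal(size=(n_tokens, 3))  # three hypothetical linguistic features

def toy_layer(encoded):
    # Activations are a random linear mix of the encoded features plus noise.
    mix = rng.normal(size=(len(encoded), n_units))
    noise = 0.1 * rng.normal(size=(n_tokens, n_units))
    return feats[:, encoded] @ mix + noise

model_a = [toy_layer([0, 1]), toy_layer([0, 1])]  # encodes features 0 and 1
model_b = [toy_layer([0]), toy_layer([0])]        # encodes feature 0 only

gap = per_feature_gap(signature(model_a, feats), signature(model_b, feats))
print(gap)
```

In this toy setup the largest gap falls on feature 1, the only feature that one model encodes and the other does not, which is the kind of attribution a purely geometric similarity score would not provide.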
