Abstract
arXiv:2406.12620v3
Do architectural and training differences influence the way models represent and process language? Traditional similarity metrics tell us whether two models share a similar representational geometry, but they cannot explain why. Here, we propose a new, simple approach to address this question. The approach maps neural activity in each model layer onto a set of interpretable linguistic features and quantifies how much each feature drives similarities and differences between models. We use it to compare 43 language models across 10 families, including decoder Transformers, State-Space Models, and Recurrent Neural Networks. We find that model-level similarity is driven most strongly by release date, a proxy for general LLM development, and by model family, suggesting that linguistic signatures are not primarily shaped by scale or architecture class.