Details
- Source site: ArXiv CS.AI
- Authors: Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó
- Article type: NEWS
- Language: en
- Publication date: 2026-06-09
Abstract
arXiv:2603.22473v2. Hybrid language models combine softmax attention with linear-time sequence mechanisms such as state-space or linear-attention layers, but the functional contribution of each component type remains insufficiently characterized. We study component-level ablation in two sub-1B hybrid language models, Qwen3.5-0.8B and Falcon-H1-0.5B, using likelihood-based evaluation, downstream benchmarks, layer-wise interventions, random controls, and representation-level diagnostics. Across the tested models, removing either attention or the alternative sequence-processing pathway substantially degrades performance, indicating that both component types contribute to model behavior. Likelihood metrics are especially sensitive to the linear-attention or state-space pathway, while downstream benchmark degradation depends on task and architecture.
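The idea of component-level ablation with a random control can be illustrated with a toy sketch. This is not the authors' code and does not load either model: the layers are hypothetical random tanh blocks in a residual stack, the 3:1 layer pattern is an arbitrary choice rather than the layout of Qwen3.5-0.8B or Falcon-H1-0.5B, and the "labels" are simply the full toy model's own argmax. All names (`make_layer`, `forward`, `mean_nll`) are made up for illustration. The sketch only shows the shape of the comparison: mean negative log-likelihood for the full stack, with each component type removed, and with an equally sized random set of layers removed.

```python
import math
import random

def make_layer(kind, dim, rng):
    # Hypothetical stand-in for a real block: a fixed random linear map (tanh applied later).
    w = [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
    return {"kind": kind, "w": w}

def apply_layer(layer, h):
    return [math.tanh(sum(wij * hj for wij, hj in zip(row, h))) for row in layer["w"]]

def forward(layers, h, ablated=frozenset()):
    # Residual stack: an ablated layer adds nothing, so h passes through unchanged.
    for i, layer in enumerate(layers):
        if i in ablated:
            continue
        update = apply_layer(layer, h)
        h = [a + b for a, b in zip(h, update)]
    return h

def nll(logits, target):
    # Negative log-likelihood of `target` under softmax(logits), computed stably.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def mean_nll(layers, data, ablated=frozenset()):
    return sum(nll(forward(layers, x, ablated), y) for x, y in data) / len(data)

rng = random.Random(0)
dim = 8
# Arbitrary toy pattern (an assumption, not taken from either model in the abstract).
kinds = ["linear", "linear", "linear", "attention"] * 2
layers = [make_layer(k, dim, rng) for k in kinds]
inputs = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(32)]

# Toy labels: the full model's own argmax, so the unablated model is the reference.
data = []
for x in inputs:
    out = forward(layers, x)
    data.append((x, out.index(max(out))))

attention_ids = frozenset(i for i, k in enumerate(kinds) if k == "attention")
linear_ids = frozenset(i for i, k in enumerate(kinds) if k == "linear")
# Random control: remove as many layers as the attention ablation, chosen at random.
random_ids = frozenset(rng.sample(range(len(layers)), len(attention_ids)))

results = {
    "full": mean_nll(layers, data),
    "no_attention": mean_nll(layers, data, attention_ids),
    "no_linear": mean_nll(layers, data, linear_ids),
    "random_control": mean_nll(layers, data, random_ids),
}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

The random control matters because removing any layers usually hurts; comparing a targeted ablation against a random one of the same size helps separate the effect of the component type from the effect of simply having fewer layers. The numbers printed by this toy carry no information about the real models.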