Component Ablation for Efficient Hybrid Language Model Architectures: Performance, Resilience, and Compression Implications

ArXiv CS.AI | 2026-06-09 | NEWS | en | Authors: Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó

Details

Source site: ArXiv CS.AI
Authors: Hector Borobia, Elies Seguí-Mas, Guillermina Tormo-Carbó
Article type: NEWS
Language: en
Publication date: 2026-06-09

Abstract

arXiv:2603.22473v2 (announce type: replace-cross)

Hybrid language models combine softmax attention with linear-time sequence mechanisms such as state-space or linear-attention layers, but the functional contribution of each component type remains insufficiently characterized. We study component-level ablation in two sub-1B hybrid language models, Qwen3.5-0.8B and Falcon-H1-0.5B, using likelihood-based evaluation, downstream benchmarks, layer-wise interventions, random controls, and representation-level diagnostics. Across the tested models, removing either attention or the alternative sequence-processing pathway substantially degrades performance, indicating that both component types contribute to model behavior. Likelihood metrics are especially sensitive to the linear-attention or state-space pathway, while downstream benchmark degradation depends on task and architecture.
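
To make the idea of component-level ablation concrete, here is a minimal Python sketch of zero ablation in a toy residual stack. It is not the authors' code and does not touch Qwen3.5-0.8B or Falcon-H1-0.5B. The branch functions (`attention_like`, `ssm_like`), the input vector, and the single-token perplexity metric are all hypothetical stand-ins chosen for illustration. The only assumption about the method is that ablating a component type means dropping its contribution to the residual stream and comparing a likelihood metric against the intact model.

```python
import math
from typing import Callable, Dict, FrozenSet, List, Sequence

Vector = List[float]
Branch = Callable[[Vector], Vector]


def forward(x: Vector,
            layers: Sequence[Dict[str, Branch]],
            ablated: FrozenSet[str] = frozenset()) -> Vector:
    """Run a toy residual stack, skipping every branch type named in `ablated`."""
    h = list(x)
    for layer in layers:
        update = [0.0] * len(h)
        for name, branch in layer.items():
            if name in ablated:
                continue  # zero ablation: this branch adds nothing to the residual stream
            update = [u + d for u, d in zip(update, branch(h))]
        h = [v + u for v, u in zip(h, update)]
    return h


def nll(logits: Vector, target: int) -> float:
    """Negative log-likelihood of `target` under softmax(logits), computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[target]


# Hypothetical stand-ins for the two pathway types (illustration only).
def attention_like(h: Vector) -> Vector:
    return [0.5 * v for v in h]  # rescales the current state


def ssm_like(h: Vector) -> Vector:
    return [1.0] + [0.0] * (len(h) - 1)  # pushes logit 0 upward


layers = [{"attention": attention_like, "ssm": ssm_like} for _ in range(2)]
x, target = [0.2, 0.1, 0.0], 0

conditions = {
    "baseline": frozenset(),
    "no_attention": frozenset({"attention"}),
    "no_ssm": frozenset({"ssm"}),
    "neither": frozenset({"attention", "ssm"}),
}

# Single-token perplexity per ablation condition (lower is better).
results = {
    label: math.exp(nll(forward(x, layers, ablated), target))
    for label, ablated in conditions.items()
}

for label, ppl in results.items():
    print(f"{label:>12}: perplexity = {ppl:.3f}")
```

In this toy setup each ablation raises perplexity relative to the intact stack, which is the kind of comparison the abstract describes. The relative sizes of the gaps are an artifact of the hand-picked branches and say nothing about the real models.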
