Rescaling MLM-Head for Neural Sparse Retrieval (Article)

ArXiv CS.AI · 2026-06-18 · NEWS · en · Authors: Youngjoon Jang, Seongtae Hong, Jonah Turner, Heuiseok Lim

Details

Source site
ArXiv CS.AI
Authors
Youngjoon Jang, Seongtae Hong, Jonah Turner, Heuiseok Lim
Article type
NEWS
Language
en
Publication date
2026-06-18

Abstract

arXiv:2606.18811v1 Announce Type: cross Abstract: Learned sparse retrieval (LSR) models such as SPLADE have traditionally used BERT-style masked language models as backbone encoders. A natural expectation is that replacing BERT with stronger pretrained encoders should improve retrieval effectiveness. However, we find that under standard SPLADE training recipes, backbones with large MLM-head L2 norms can suffer performance degradation and even training collapse. We identify this failure as a scale mismatch in the MLM head: SPLADE directly uses MLM-head outputs to construct sparse lexical representations, and query-document relevance is computed by an unnormalized dot product over these representations. As a result, an inflated MLM-head scale can amplify sparse activations, distort matching scores, and destabilize contrastive training under common training settings.
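
The scale sensitivity described in the abstract can be illustrated with a small NumPy sketch. It assumes the commonly used SPLADE activation (log-saturated ReLU over MLM logits, max-pooled across token positions) and uses random toy logits in place of a real encoder. The `scale` multiplier is a hypothetical stand-in for a larger MLM-head output scale; it is not the rescaling method proposed in the paper, which this abstract does not spell out.

```python
import numpy as np

def splade_vector(logits):
    """Pool token-level MLM logits (seq_len, vocab) into one vocab-sized vector."""
    # Assumed SPLADE-style activation: log(1 + ReLU(logit)), max over token positions.
    return np.log1p(np.maximum(logits, 0.0)).max(axis=0)

def score(query_logits, doc_logits):
    """Unnormalized dot product between query and document representations."""
    return float(splade_vector(query_logits) @ splade_vector(doc_logits))

rng = np.random.default_rng(0)
query_logits = rng.normal(size=(4, 50))   # toy input: 4 query tokens, vocabulary of 50
doc_logits = rng.normal(size=(12, 50))    # toy input: 12 document tokens

base = score(query_logits, doc_logits)

# Hypothetical multiplier that mimics an inflated MLM-head output scale.
scaled = {
    scale: score(scale * query_logits, scale * doc_logits)
    for scale in (1.0, 4.0, 16.0)
}

for scale, value in scaled.items():
    print(f"scale={scale:>5}: score={value:.2f}")
```

Because the dot product is unnormalized, multiplying the logits by a larger factor raises every positive activation and therefore the matching score, even though the set of active vocabulary terms is unchanged. The log saturation dampens this growth but does not remove it.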
