LongAttnComp: Cross-Family Context Compression for Long-Context Reasoning

arXiv cs.CL | 2026-06-02 | Authors: Mengmeng Ji, Ravi Shanker Raju, Jonathan Lingjie Li, Chen Wu

Abstract

arXiv:2606.01336v1. Announce type: new.

As real-world applications increasingly require processing inputs of 100k+ tokens, the gap between context length and inference efficiency has become a critical bottleneck. Context compression offers a way to reduce prefill costs while preserving task accuracy. However, existing training-free attention-based methods leave substantial gaps on demanding long-context tasks such as code reasoning. We present LongAttnComp, a long-context adaptation of AttnComp that fine-tunes a lightweight cross-attention scoring layer and introduces token-level chunking, a token-budget top-p algorithm, positional reordering, and a format-agnostic query parser. We further design a two-stage fine-tuning recipe for the compressor: Stage 1 builds a general retrieval foundation from NIAH-style data, and Stage 2 extends it with multi-hop and reasoning data for broader long-context task coverage.
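The abstract names token-level chunking, a token-budget top-p algorithm, and positional reordering but does not spell out how they work. The sketch below is one plausible reading, not the authors' implementation: the function names `chunk_tokens`, `select_chunks`, and `compress` are hypothetical, the chunk scores are hand-written placeholders standing in for the output of the fine-tuned cross-attention scoring layer, and scores are assumed to be non-negative.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split a token sequence into fixed-size chunks (the last may be shorter)."""
    return [list(tokens[i:i + chunk_size]) for i in range(0, len(tokens), chunk_size)]


def select_chunks(
    scores: Sequence[float],
    lengths: Sequence[int],
    top_p: float,
    token_budget: int,
) -> List[int]:
    """Pick chunk indices by descending score (hypothetical selection rule).

    Stops once the kept chunks cover a `top_p` fraction of the total score
    mass, and skips any chunk that would exceed `token_budget`.
    Returns indices sorted by original position (positional reordering).
    """
    total = sum(scores)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    mass, used = 0.0, 0
    for i in order:
        if mass >= top_p * total:
            break  # enough score mass has been covered
        if used + lengths[i] > token_budget:
            continue  # this chunk does not fit in the remaining budget
        kept.append(i)
        mass += scores[i]
        used += lengths[i]
    return sorted(kept)  # restore document order


def compress(
    tokens: Sequence[str],
    scores: Sequence[float],
    chunk_size: int,
    top_p: float,
    token_budget: int,
) -> List[str]:
    """Chunk, select, and concatenate the kept chunks in original order."""
    chunks = chunk_tokens(tokens, chunk_size)
    if len(scores) != len(chunks):
        raise ValueError("need exactly one score per chunk")
    kept = select_chunks(scores, [len(c) for c in chunks], top_p, token_budget)
    return [tok for i in kept for tok in chunks[i]]


if __name__ == "__main__":
    toks = [f"t{i}" for i in range(10)]
    # Placeholder scores; a real compressor would compute these from the query.
    print(compress(toks, [0.1, 0.5, 0.05, 0.35], chunk_size=3, top_p=0.8, token_budget=5))
```

Sorting the kept indices before concatenation is what keeps the surviving chunks in their original document order, which is the assumed meaning of positional reordering here.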
