Answer Presence Drives RAG Rewriting Gains

ArXiv CS.AI, 2026-06-06, NEWS, en
Authors: Yuejie Li, Yueying Hua, Ke Yang, Li Zhang, Yueping He, Yueping He, Ruiqi Li, Bolin Chen, Tao Wang, Bowen Li, Chengjun Mao

Abstract

arXiv:2606.05633v1 (announce type: new)

Retrieval-augmented QA pipelines often route retrieved passages through an LLM rewriter before a smaller reader, lifting F1 by tens of points on multi-hop benchmarks. This gain is typically credited to improved evidence quality. We ask whether the lift is causally driven by the gold answer string appearing in the rewritten context rather than by curation per se, using a controlled intervention audit. For each rewritten context we re-run the reader after one of four controlled edits to the compile output: removing the gold answer span, replacing a length-matched random non-answer span (placebo), or injecting the gold answer into rewrites where it was absent (at the prefix or at a midpoint sentence boundary). Across twelve completed (cell, baseline) intervention runs spanning three reader families (Qwen2.5-7B, Qwen3.5-35B, GLM-4.
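The four context edits named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the case-insensitive string matching, the sentence splitter, and the choice to implement the placebo as deleting a length-matched non-answer span are all assumptions made for illustration, and the reader re-run itself is omitted.

```python
import random
import re


def remove_gold(context: str, gold: str) -> str:
    """Removal edit: delete every occurrence of the gold answer string.
    Case-insensitive matching is an assumption, not taken from the paper."""
    return re.sub(re.escape(gold), "", context, flags=re.IGNORECASE)


def placebo_remove(context: str, gold: str, rng: random.Random) -> str:
    """Placebo edit (assumed form): delete a random span of len(gold)
    characters that does not overlap any gold answer occurrence."""
    n = len(gold)
    gold_spans = [
        m.span()
        for m in re.finditer(re.escape(gold), context, flags=re.IGNORECASE)
    ]
    # Keep only start offsets whose n-character window avoids all gold spans.
    candidates = [
        i
        for i in range(len(context) - n + 1)
        if all(i + n <= start or i >= end for start, end in gold_spans)
    ]
    if not candidates:
        return context  # no valid placebo span; leave the context unchanged
    start = rng.choice(candidates)
    return context[:start] + context[start + n:]


def inject_prefix(context: str, gold: str) -> str:
    """Injection edit: place the gold answer at the start of the context."""
    return f"{gold}. {context}"


def inject_midpoint(context: str, gold: str) -> str:
    """Injection edit: place the gold answer at the middle sentence boundary.
    The regex sentence splitter is a rough, hypothetical stand-in."""
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    mid = len(sentences) // 2
    return " ".join(sentences[:mid] + [f"{gold}."] + sentences[mid:])
```

In an audit of this kind, each edited context would be passed back to the same reader and the change in F1 compared against the unedited rewrite; the placebo serves as a control for the effect of deleting any text of that length.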
