RICE-PO: Turning Retrieval Interactions into Credit Signals for Reasoning Agents

Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM

arXiv:2605.26352v1 Announce Type: new

Abstract: Retrieval is increasingly moving from one-shot matching toward interactive reasoning, where language agents iteratively inspect evidence, reformulate queries, and search again. Training such agents raises a credit-assignment challenge: executable actions such as queries or summaries can be directly evaluated by the retriever, while latent reasoning steps are not direc