Sparrow: Sparse Rollout for Stable and Efficient Long-context RL of Large Language Models (Event)

PRODUCT_LAUNCH · 2026-06-09 · Impact: MEDIUM

arXiv:2606.08446v1 Announce Type: cross Abstract: Despite being powerful, reinforcement learning with verifiable rewards (RLVR) induces extremely long chains of thought (CoT), making it computationally expensive. Since the per-step cost of RLVR is dominated by long-context rollout generation, sparse attention offers a promising way to accelerate dense rollout. However, sparse rollouts require a delicate stability-efficiency tradeoff
