Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR (Event)

Name: Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR
Start: 2026-05-29

Type: PRODUCT_LAUNCH
Date: 2026-05-29
Impact: MEDIUM

arXiv:2602.12642v2
Announce Type: replace
Abstract: Reward-maximizing RL methods have been shown to enhance the reasoning performance of LLMs, but they often reduce generation diversity. Recent works address this issue by adopting GFlowNets, training LLMs to match a target distribution while jointly learning its partition function. In contrast to prior works that treat this partition functi

Category: Artificial Intelligence
