AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning

Event: PRODUCT_LAUNCH | Date: 2026-05-27 | Impact: MEDIUM

arXiv:2605.18592v2
Announce Type: replace-cross

Abstract: Rubric-based reward shaping provides interpretable and editable reward signals for fine-tuning LLMs via reinforcement learning (RL), but existing adaptive rubric methods typically update criteria from local evidence such as the current batch or instance-level comparisons. This local view discards diagnostic information produced during training, m