Beyond pass@k: Redundancy-Aware RLVR for Multi-Sample Code Generation

ArXiv CS.CL, 2026-05-28. Authors: Le Bronnec Florian, Alexandre Verine, Rio Yokota, Benjamin Negrevergne

Abstract

arXiv:2605.28022v1. LLMs for code generation are commonly evaluated in repeated-sampling settings using Pass@k, where multiple candidate programs are executed against unit tests under a finite sampling budget. While recent verifier-based reinforcement learning (RLVR) methods improve executable correctness, how these objectives affect redundancy among sampled programs remains poorly understood. In this work, we study implementation-level redundancy in code generation using JPlag, a plagiarism-detection system for code. Across models and benchmarks, we show that correctness-only RLVR often concentrates generations around repeated implementations, whereas Pass@k-aware objectives maintain lower redundancy and improve larger-budget performance. Motivated by these observations, we augment RLVR with direct anti-redundancy rewards based on JPlag similarity.
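For readers unfamiliar with the two quantities the abstract contrasts, the following is a minimal sketch, not the authors' implementation. `pass_at_k` is the widely used unbiased Pass@k estimator, 1 - C(n-c, k) / C(n, k), for n samples of which c pass the unit tests. `mean_pairwise_similarity` is a hypothetical redundancy measure that uses Python's `difflib` as a crude stand-in for JPlag similarity; the function names and the choice of averaging over all pairs are assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from itertools import combinations
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate from n samples, c of which pass all tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no passing program.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pairwise_similarity(programs: list[str]) -> float:
    """Average pairwise text similarity in [0, 1].

    Assumption: difflib's character-level ratio stands in for JPlag,
    which compares token structure and is not reproduced here.
    """
    pairs = list(combinations(programs, 2))
    if not pairs:
        return 0.0  # fewer than two programs: no redundancy to measure
    total = sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs)
    return total / len(pairs)
```

A batch of samples that all pass but are near-identical would score high on both functions, which is the concentration effect the abstract attributes to correctness-only RLVR.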
