SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

ArXiv CS.CL | 2026-06-02 | Authors: Ibragim Badertdinov, Maksim Nekrashevich, Anton Shevtsov, Alexander Golubev

Abstract

arXiv:2602.23866v2. Software engineering (SWE) agents are improving rapidly, with recent gains largely driven by reinforcement learning (RL). However, RL training is constrained by the scarcity of large-scale task collections with reproducible execution environments and reliable test suites. Although a growing number of benchmarks have emerged, datasets suitable for training remain limited in scale and diversity, or target only a small set of high-resource language ecosystems. We introduce SWE-rebench V2, a language-agnostic automated pipeline for harvesting executable real-world SWE tasks and constructing RL training environments at scale. The pipeline synthesizes repository-specific installation and test procedures via an interactive setup agent, and filters unsound instances using an ensemble of LLM judges, validated against human-verified SWE-bench annotations.
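The abstract does not say how the judge ensemble combines its verdicts. As a rough illustration only, the sketch below assumes a simple vote threshold: a task is kept when at least `min_votes` judges call it sound. The names `Judge`, `is_sound`, and `filter_tasks` are hypothetical, and the judges are plain Python predicates standing in for LLM calls; none of this is taken from the paper's implementation.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a judge maps a task description to True (sound) or False (unsound).
# In a real pipeline each judge would wrap an LLM call; here they are plain predicates.
Judge = Callable[[str], bool]


def is_sound(task: str, judges: Sequence[Judge], min_votes: int) -> bool:
    """Return True if at least `min_votes` judges consider the task sound."""
    votes = sum(1 for judge in judges if judge(task))
    return votes >= min_votes


def filter_tasks(tasks: Sequence[str], judges: Sequence[Judge], min_votes: int) -> List[str]:
    """Keep only the tasks that pass the vote threshold (an assumed aggregation rule)."""
    return [task for task in tasks if is_sound(task, judges, min_votes)]


if __name__ == "__main__":
    # Toy stand-in judges, purely for illustration.
    judges = [
        lambda t: "tests" in t,   # mentions a test suite
        lambda t: len(t) > 15,    # description is not trivially short
        lambda t: "bug" in t,     # names a concrete defect
    ]
    tasks = ["fix parser bug; tests included", "vague request"]
    print(filter_tasks(tasks, judges, min_votes=2))
```

Raising `min_votes` makes the filter stricter at the cost of discarding more borderline tasks; validating the threshold against human-verified labels, as the abstract describes for SWE-bench annotations, is one way to choose it.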

