SWE-Explore: Benchmarking How Coding Agents Explore Repositories

ArXiv CS.CL | 2026-06-08 | NEWS | en
Authors: Shaoqiu Zhang, Yuhang Wang, Jialiang Liang, Yuling Shi, Wenhao Zeng, Maoquan Wang, Shilin He, Ningyuan Xu, Siyu Ye, Kai Cai, Xiaodong Gu

Details

Source site: ArXiv CS.CL
Authors: Shaoqiu Zhang, Yuhang Wang, Jialiang Liang, Yuling Shi, Wenhao Zeng, Maoquan Wang, Shilin He, Ningyuan Xu, Siyu Ye, Kai Cai, Xiaodong Gu
Article type: NEWS
Language: en
Publication date: 2026-06-08

Original text

Abstract

arXiv:2606.07297v1 (announce type: cross)

Repository-level coding benchmarks such as SWE-bench have driven a rapid surge in the capabilities of coding agents. Yet they usually treat coding tasks as a holistic, binary prediction problem (e.g., resolved or unresolved), neglecting fine-grained agent capabilities such as repository understanding, context retrieval, code localization, and bug diagnosis. In this paper, we introduce SWE-Explore, a benchmark that isolates the evaluation of repository exploration, a critical capability of coding agents. Given a repository and an issue, SWE-Explore asks an explorer to return a ranked list of relevant code regions under a fixed line budget. SWE-Explore covers 848 issues across 10 programming languages and 203 open-source repositories.
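To make the task format in the abstract concrete, here is a minimal Python sketch of a ranked list of code regions cut off at a fixed line budget. The abstract does not specify the benchmark's actual data format or budget rule, so the `Region` type, the `truncate_to_budget` helper, the file paths, and the choice to clip a partially fitting region are all illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A contiguous span of lines in one file (hypothetical format)."""
    path: str
    start: int  # 1-indexed, inclusive
    end: int    # 1-indexed, inclusive

    @property
    def num_lines(self) -> int:
        return self.end - self.start + 1


def truncate_to_budget(ranked: list[Region], budget: int) -> list[Region]:
    """Keep regions in rank order until the line budget is used up.

    Assumption: a region that only partly fits is clipped to the
    remaining lines rather than dropped. Overlapping regions are not
    merged in this sketch.
    """
    kept: list[Region] = []
    remaining = budget
    for region in ranked:
        if remaining <= 0:
            break
        take = min(region.num_lines, remaining)
        kept.append(Region(region.path, region.start, region.start + take - 1))
        remaining -= take
    return kept


# Hypothetical explorer output, most relevant region first.
ranked = [
    Region("src/parser.py", 40, 79),         # 40 lines
    Region("src/lexer.py", 10, 39),          # 30 lines
    Region("tests/test_parser.py", 1, 50),   # 50 lines
]
result = truncate_to_budget(ranked, budget=60)
```

With a budget of 60 lines, the first region is kept whole, the second is clipped to its first 20 lines, and the third is dropped. Under a rule like this, ranking matters because lower-ranked regions are the first to be cut.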
