SWE-Explore: Benchmarking How Coding Agents Explore Repositories (Event)
OPEN_SOURCE | 2026-06-08 | Impact: MEDIUM
arXiv:2606.07297v1 (announce type: cross)
Abstract: Repository-level coding benchmarks such as SWE-bench have driven a rapid surge in the capabilities of coding agents. Yet they usually treat coding tasks as a holistic, binary prediction problem (e.g., resolved or unresolved), neglecting fine-grained agent capabilities such as repository understanding, context retrieval, code localization, and bug diagnosis. In this paper, we introd