BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?
Event type: PRODUCT_LAUNCH | Date: 2026-05-27 | Impact: MEDIUM
arXiv:2603.03194v2 (announce type: replace)
Abstract: Current code-agent benchmarks primarily evaluate localized issue resolution within a single target repository, leaving many software engineering tasks that require external knowledge or broader repository-level changes under-tested. We introduce BeyondSWE, a 500-instance benchmark drawn from 246 real-world GitHub repositories to evaluate code agents beyond single-reposito
Source: ArXiv CS.CL, 2026-05-27