ADK Arena: Evaluating Agent Development Kits via LLM-as-a-Developer

ArXiv CS.AI | 2026-06-06 | NEWS | en | Authors: Jintao Huang, Xiaomin Li, Gaurav Mittal, Yu Hu

Details

Source site
ArXiv CS.AI
Authors
Jintao Huang, Xiaomin Li, Gaurav Mittal, Yu Hu
Article type
NEWS
Language
en
Publication date
2026-06-06

Abstract

arXiv:2606.05548v1 (announce type: cross)

The rapid proliferation of Agent Development Kits (ADKs), SDK-level frameworks for building LLM-powered autonomous agents, has outpaced any empirical understanding of how framework choice affects agent performance. We propose LLM-as-a-Developer, a methodology that replaces human developers with an LLM coding agent that learns each framework's API from documentation, writes agent code, and iteratively repairs it through a validate-and-feedback loop until tests pass. By holding the developer constant and varying only the framework, generation effort becomes a quantitative proxy for API usability and the resulting agents provide a controlled measure of framework effectiveness. We implement this in ADK Arena, a fully automated pipeline with per-framework Docker isolation, a three-level validation pipeline, and benchmark adapters for SWE-bench, τ²-bench, Terminal-Bench, and MCP-Atlas.
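The validate-and-feedback loop described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: `develop_agent`, `LoopResult`, `toy_generate`, and `toy_validate` are hypothetical names, the LLM is replaced by a toy generator, and the three-level validation is collapsed into a single validator because the abstract does not specify its levels. The attempt count stands in for the "generation effort" that the abstract treats as a usability proxy.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LoopResult:
    code: str
    attempts: int  # hypothetical effort metric: drafts needed before passing
    passed: bool


def develop_agent(
    generate: Callable[[str, Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    docs: str,
    max_attempts: int = 5,
) -> LoopResult:
    """Draft code from docs, then repair it until validation passes."""
    feedback: Optional[str] = None
    code = ""
    for attempt in range(1, max_attempts + 1):
        code = generate(docs, feedback)  # first draft, or repair using feedback
        feedback = validate(code)        # None means every check passed
        if feedback is None:
            return LoopResult(code, attempt, True)
    return LoopResult(code, max_attempts, False)


# Toy stand-ins (assumptions) so the sketch runs without an LLM or Docker.
def toy_generate(docs: str, feedback: Optional[str]) -> str:
    # The first draft is wrong; the "repair" fixes it once feedback arrives.
    return "def run(): return 'ok'" if feedback else "def run(): pass"


def toy_validate(code: str) -> Optional[str]:
    namespace: dict = {}
    exec(code, namespace)
    return None if namespace["run"]() == "ok" else "run() must return 'ok'"


result = develop_agent(toy_generate, toy_validate, docs="(framework docs)")
print(result.attempts, result.passed)
```

With the toy stand-ins, the first draft fails validation and the second passes. In a real comparison, the same `generate` would presumably be held fixed while `docs` and the validation environment vary per framework, so differences in attempt counts could be attributed to the framework rather than the developer.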
