Can Generalist Agents Automate Data Curation? 文章

ArXiv CS.CV2026-06-04NEWSen作者: Feiyang Kang, Hanze Li, Adam Nguyen, Mahavir Dabas, Jiaqi W. Ma, Frederic Sala, Dawn Song, Ruoxi Jia

查看原文 →

关系图谱

摘要

arXiv:2606.04261v1 Announce Type: cross Abstract: Curating training data is among the most consequential yet labor-intensive parts of modern AI development: practitioners iteratively propose, implement, evaluate, and revise data policies against noisy benchmark feedback. We ask whether generalist coding agents can automate this data-curation loop. We introduce *Curation-Bench*, an agent-centric benchmark that fixes the model, training recipe, and evaluation suite while giving agents command-line access to inspect data, implement policies, submit them to a fixed training/evaluation pipeline, and revise. In a vision-language instruction-tuning instantiation, out-of-the-box agents reach strong published data-selection baselines within ten iterations. However, trajectory analysis reveals a persistent *execution-research gap*: agents mainly tune local policy variants rather than explore new policy families, even when given strategy guides and paper references.

Can Generalist Agents Automate Data Curation? 文章

摘要

相关事件查看全部 (1)

相关公司

相关人物

相关产品查看全部 (1)

相关技术