Harness-Bench: Measuring Harness Effects across Models in Realistic Agent Workflows

ArXiv CS.AI · 2026-05-28 · NEWS · en · Authors: Yilun Yao, Xinyu Tan, Chao-Hsuan Liu, Yaoming Li, Zhengyang Wang, Wenhan Yu, Zhewen Tan, Yuxuan Tian, Guangxiang Zhao, Lin Sun, Xiangzheng Zhang, Tong Yang
