Harness-Bench: Measuring Harness Effects across Models in Realistic Agent Workflows
arXiv cs.AI, 2026-05-28
Authors: Yilun Yao, Xinyu Tan, Chao-Hsuan Liu, Yaoming Li, Zhengyang Wang, Wenhan Yu, Zhewen Tan, Yuxuan Tian, Guangxiang Zhao, Lin Sun, Xiangzheng Zhang, Tong Yang