DataClawBench: An Agent Benchmark for Exploratory Real-World Financial Data Analysis

ArXiv CS.AI, 2026-05-28
Authors: Qiaohong Zhang, Weihao Ye, Jialong Chen, Yi Luo, BoYuan Li, Bowen Deng, Zibin Zheng, Jianhao Lin, Wei-Shi Zheng, Chuan Chen

Abstract

arXiv:2605.02503v3

Autonomous data analysis agents are increasingly expected to conduct exploratory analysis with limited human guidance about the data. However, existing benchmarks typically evaluate such agents in prior-guided settings that provide pre-selected data sources, explicit data schemas, or cleaned data, and therefore understate the exploratory burden. To evaluate this realistic exploratory data analysis task, we introduce DataClawBench, a benchmark built from financial think-tank consulting scenarios in which agents must independently explore unfamiliar, noisy, cross-domain data and produce verifiable conclusions. DataClawBench provides a unified real-world data environment with approximately 2.06 million records across the enterprise, industry, and policy domains, with the data's native noise preserved.
