LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis (Event)

Name: LongDS-Bench: On the Failure of Long-Horizon Agentic Data Analysis
Start: 2026-06-01

PRODUCT_LAUNCH, 2026-06-01, Impact: MEDIUM

arXiv:2605.30434v1 (Announce Type: cross)

Abstract: Real-world data analysis is inherently iterative, yet existing benchmarks mostly evaluate isolated or short interactive tasks, leaving agents' ability to track evolving analytical context over long horizons untested. We introduce LongDS, a benchmark for long-horizon, multi-turn data analysis where agents must maintain, update, restore, and compose evolving analytical states. Long

Artificial Intelligence
