AvalancheBench: Evaluating Enterprise Data Agents Through Latent World Recovery (Event)

PRODUCT_LAUNCH | 2026-05-26 | Impact: MEDIUM

arXiv:2605.24183v1 | Announce Type: cross

Abstract: We introduce AvalancheBench, a benchmark for evaluating enterprise data agents through latent world recovery. AvalancheBench improves on existing benchmarks in three ways. First, it evaluates analytical understanding rather than pipeline completion: systems are scored on whether they recover the segments, drivers, temporal events, and relationships that explai