DisasterBench: Benchmarking LLM Planning under Typed Tool Interface Constraints

Event: PRODUCT_LAUNCH | Date: 2026-05-28 | Impact: MEDIUM

arXiv:2605.27957v1 | Announce Type: new

Abstract: Disasters cause severe societal impacts, demanding rapid coordination of heterogeneous AI tools, from satellite analysis to flood prediction and damage assessment, into coherent multi-step workflows. As LLMs increasingly serve as orchestrators of such pipelines, effective coordination requires more than selecting semantically plausible tools: LLMs must generate executa
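
To illustrate what a typed tool interface constraint might look like in practice, here is a minimal sketch. It is not taken from the paper: the tool names, the type labels, and the `first_type_error` helper are all hypothetical, and the check assumes a simple linear plan in which each tool consumes exactly the previous tool's output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """A tool with one declared input type and one declared output type."""
    name: str
    input_type: str
    output_type: str


def first_type_error(plan, start_type):
    """Return the index of the first step whose input type does not match
    the type produced so far, or None if the whole chain is consistent."""
    current = start_type
    for i, tool in enumerate(plan):
        if tool.input_type != current:
            return i
        current = tool.output_type
    return None


# Hypothetical tools, named only for illustration.
segment = Tool("segment_satellite_image", "SatelliteImage", "WaterMask")
predict = Tool("predict_flood_extent", "WaterMask", "FloodMap")
assess = Tool("assess_damage", "FloodMap", "DamageReport")

valid_plan = [segment, predict, assess]
# Semantically plausible, but skips the step that produces a FloodMap.
invalid_plan = [segment, assess]

print(first_type_error(valid_plan, "SatelliteImage"))
print(first_type_error(invalid_plan, "SatelliteImage"))
```

In this sketch the second plan picks tools that sound reasonable for a flood scenario, yet it fails the check at step 1 because `assess_damage` expects a `FloodMap` and receives a `WaterMask`.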