GroupTravelBench: Benchmarking LLM Agents on Multi-Person Travel Planning

ArXiv CS.CL, 2026-05-26
Authors: Xiang Cheng, Yulan Hu, Lulu Zheng, Zheng Pan, Xin Li, Yong Liu

Abstract

arXiv:2605.25200v1 (announce type: new)

Travel planning is a realistic task for evaluating the planning and tool-use abilities of LLM agents. However, existing benchmarks typically assume a single user, which sidesteps one of the hardest aspects of real-world scenarios: an agent's ability to identify and resolve conflicts among multiple users. To address this gap, we introduce GroupTravelBench, the first benchmark for multi-user, multi-turn travel planning. Based on real user profiles, POI data, and ticket price data, we synthesize 650 tasks and divide them into three difficulty levels. Beyond the standard abilities of single-user itinerary planning, such as multi-step reasoning and tool use, our benchmark further evaluates three key capabilities required of travel agents: (i) elicitation: proactively engaging in multi-turn dialogue to gather preferences from each user;
