Does The Way You Plan Matter? An Empirical Study of Planning Representations for LLM Web Agents

ArXiv CS.CL, 2026-05-29, NEWS, en. Authors: Alejandra Zambrano, Sara Vera Marjanovic, Imene Kerboua, Xing Han Lù, Leila Kosseim

Details

Source site: ArXiv CS.CL
Authors: Alejandra Zambrano, Sara Vera Marjanovic, Imene Kerboua, Xing Han Lù, Leila Kosseim
Article type: NEWS
Language: en
Publication date: 2026-05-29

Abstract

arXiv:2605.29927v1 (announce type: new)

Despite recent advances, LLM-based web agents still struggle with limited exploration, omission of critical steps, and sensitivity to task constraints. Prior work suggests that many of these failures stem from weaknesses in planning, yet the impact of alternative natural-language plan representations remains unexplored. To address this, we introduce PlanAhead, a static planner-executor framework that evaluates the impact of plan representation on agent performance. We first automatically categorize WebArena tasks into 3 difficulty levels, enabling consistent difficulty grading without human annotation. We then systematically evaluate 4 plan representations (sequential subgoals, narrative, pseudocode, and checklist) on the tasks categorized as hard, across different families of multimodal LLM-powered agents (OpenAI, Alibaba, and Google).
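
To make the four representation names concrete, here is a minimal Python sketch that renders one list of plan steps in each style and places the result in an executor prompt. Everything in it is an illustrative assumption rather than the PlanAhead implementation: the example steps, the function names, the exact formatting of each style, and the prompt layout are all hypothetical. The abstract does not specify any of these details.

```python
from typing import Callable, Dict, List

# Hypothetical plan steps for a made-up web task (not taken from the paper).
STEPS: List[str] = [
    "Open the orders page",
    "Filter orders by status 'Pending'",
    "Report the total number of results",
]


def as_sequential_subgoals(steps: List[str]) -> str:
    """Numbered list: one subgoal per line, in execution order."""
    return "\n".join(f"{i}. {s}" for i, s in enumerate(steps, start=1))


def as_checklist(steps: List[str]) -> str:
    """Unchecked boxes the executor could tick off as it goes."""
    return "\n".join(f"[ ] {s}" for s in steps)


def as_narrative(steps: List[str]) -> str:
    """A single prose sentence chaining the steps together."""
    parts = [s[0].lower() + s[1:] for s in steps]
    if len(parts) == 1:
        return f"First, {parts[0]}."
    return "First, " + ", then ".join(parts[:-1]) + f", and finally {parts[-1]}."


def as_pseudocode(steps: List[str]) -> str:
    """Code-like rendering; do(...) is a placeholder, not a real API."""
    body = "\n".join(f"    do({s!r})" for s in steps)
    return "def plan():\n" + body


# Assumed mapping from representation name to renderer.
RENDERERS: Dict[str, Callable[[List[str]], str]] = {
    "sequential subgoals": as_sequential_subgoals,
    "narrative": as_narrative,
    "pseudocode": as_pseudocode,
    "checklist": as_checklist,
}


def build_executor_prompt(task: str, steps: List[str], representation: str) -> str:
    """Render the plan once and embed it in the executor prompt.

    Assumption: "static" is read here as "the plan is produced up front and
    not revised during execution"; the abstract does not define the term.
    """
    plan = RENDERERS[representation](steps)
    return f"Task: {task}\n\nPlan ({representation}):\n{plan}\n"


if __name__ == "__main__":
    for name in RENDERERS:
        print(build_executor_prompt("Count pending orders", STEPS, name))
```

Holding the steps fixed and varying only the renderer reflects the comparison the abstract describes, where the variable under study is the plan representation.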
