TraceGraph: Shared Decision Landscapes for Diagnosing and Improving Agent Trajectories

arXiv cs.AI | 2026-06-01 | Authors: Junjie Nian, Kang Chen, Ge Zhang, Yixin Cao, Yugang Jiang

Abstract

arXiv:2605.31308v1

Agent benchmarks increasingly record rich interaction trajectories, yet evaluation often reduces each rollout to a pass rate or reward score. We introduce TraceGraph, a graph-based framework that turns released multi-model agent trajectories into shared decision landscapes. For each task, TraceGraph builds a graph over observable action-observation states from pooled rollouts before model identity is introduced. It then overlays outcome-informed productive cores and trap regions, and summarizes each rollout with three events: Access, Trap exposure, and Repair. Across trajectories spanning five benchmark splits, TraceGraph profiles reveal navigation differences hidden by aggregate scores and show that splits differ in whether they reward avoiding traps or recovering from them.
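The pipeline described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not say how states are canonicalized or how productive cores and trap regions are defined. Everything below is an assumption made for illustration. That covers the rollout format (`model`, `states`, `success`), the success-rate thresholds `core_min` and `trap_max`, the `min_visits` filter, and the function names. States are assumed to be pre-hashed action-observation strings.

```python
from collections import defaultdict


def build_graph(rollouts):
    """Pool all rollouts of one task into a state graph, ignoring model identity."""
    edges = defaultdict(set)               # state -> set of successor states
    visits = defaultdict(lambda: [0, 0])   # state -> [successful rollouts, all rollouts]
    for r in rollouts:
        states = r["states"]
        for src, dst in zip(states, states[1:]):
            edges[src].add(dst)
        for s in set(states):              # count each rollout once per state
            visits[s][1] += 1
            visits[s][0] += int(r["success"])
    return edges, visits


def label_regions(visits, core_min=0.8, trap_max=0.3, min_visits=2):
    """Hypothetical outcome overlay: threshold each state's rollout success rate."""
    core, trap = set(), set()
    for state, (ok, total) in visits.items():
        if total < min_visits:
            continue
        rate = ok / total
        if rate >= core_min:
            core.add(state)
        elif rate <= trap_max:
            trap.add(state)
    return core, trap


def summarize(states, core, trap):
    """Reduce one rollout to three events: Access, Trap exposure, Repair."""
    access = any(s in core for s in states)
    first_trap = next((i for i, s in enumerate(states) if s in trap), None)
    exposed = first_trap is not None
    # Assumed definition of Repair: reaching the core after the first trap state.
    repair = exposed and any(s in core for s in states[first_trap + 1:])
    return {"access": access, "trap": exposed, "repair": repair}


def profile_by_model(rollouts, core, trap):
    """Model identity enters only here, after the shared graph is labeled."""
    counts = defaultdict(lambda: {"n": 0, "access": 0, "trap": 0, "repair": 0})
    for r in rollouts:
        row = counts[r["model"]]
        row["n"] += 1
        for name, hit in summarize(r["states"], core, trap).items():
            row[name] += int(hit)
    return dict(counts)


# Made-up rollouts for a single task (not data from the paper).
rollouts = [
    {"model": "m1", "states": ["start", "open", "edit", "done"], "success": True},
    {"model": "m2", "states": ["start", "open", "edit", "done"], "success": True},
    {"model": "m1", "states": ["start", "loop", "loop", "fail"], "success": False},
    {"model": "m2", "states": ["start", "loop", "open", "edit", "done"], "success": True},
    {"model": "m3", "states": ["start", "loop", "fail"], "success": False},
    {"model": "m3", "states": ["start", "loop", "fail"], "success": False},
]

edges, visits = build_graph(rollouts)
core, trap = label_regions(visits)
profiles = profile_by_model(rollouts, core, trap)
```

In this toy data, `m1` and `m2` both hit the trap region once, but only `m2` recovers from it. A pass rate alone would not separate avoiding a trap from repairing after one, which is the kind of distinction the abstract says the profiles expose.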
