Adaptive Multimodal Agents-Based Framework for Automatic Workflow Execution

arXiv cs.CL, 2026-05-28. Authors: Susanna Cifani, Mario Luca Bernardi, Marta Cimitile

Abstract

arXiv:2605.28607v1 (cross-listed)

Modern information systems require autonomous agents capable of navigating complex workflows, yet current methodologies often struggle with the transition from structured metadata parsing to general environmental perception. While the integration of multimodal large language models (MLLMs) has enabled agents to interact directly with graphical user interfaces (GUIs), existing approaches typically treat task sequences as discrete, linear episodes. This fragmentation prevents agents from capturing the underlying transition topology, which limits their effectiveness in novel or non-stationary scenarios. To address this, we propose a novel multimodal multi-agent framework that achieves automatic workflow execution through a distinct two-phase pipeline. First, during an offline discovery phase, the architecture adaptively constructs a topological knowledge base from fragmented execution logs.
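The abstract does not specify how the topological knowledge base is built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each log fragment is a list of hypothetical (state, action, next_state) triples, merges the fragments into one directed transition graph, and uses a plain breadth-first search to recover an action sequence that no single fragment contains. The function names and the sample states are illustrative assumptions.

```python
from collections import defaultdict, deque


def build_transition_graph(fragments):
    """Merge log fragments into one directed graph: state -> {action: next_state}.

    Assumption: each fragment is a list of (state, action, next_state) triples.
    """
    graph = defaultdict(dict)
    for fragment in fragments:
        for state, action, next_state in fragment:
            graph[state][action] = next_state
    return dict(graph)


def find_action_path(graph, start, goal):
    """Breadth-first search for a shortest action sequence from start to goal."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in graph.get(state, {}).items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, actions + [action]))
    return None  # goal is not reachable from start


# Two hypothetical fragments that share the "dashboard" state.
fragments = [
    [("login", "submit_credentials", "dashboard")],
    [
        ("dashboard", "open_reports", "reports"),
        ("reports", "click_export", "export_dialog"),
    ],
]

graph = build_transition_graph(fragments)
path = find_action_path(graph, "login", "export_dialog")
print(path)  # ['submit_credentials', 'open_reports', 'click_export']
```

Treating the logs as a graph rather than as separate linear episodes is what lets the search stitch the two fragments together through their shared state. A real system would additionally need a way to decide when two observed GUI screens count as the same state, which this sketch sidesteps by using string labels.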
