GenClaw: Code-Driven Agentic Image Generation

ArXiv CS.CV | 2026-05-29 | Authors: Junyan Ye, Jun He, Zilong Huang, Dongzhi Jiang, Xuan Yang, Rui Chen, Weijia Li

Abstract

arXiv:2605.30248v1 (announce type: new)

Image generation models have evolved from text-conditioned pixel synthesis toward multimodal agents endowed with visual comprehension and tool invocation capabilities. Yet existing agents remain at the mercy of the underlying black-box image models. Their workflow is trapped in a repetitive cycle of prompt rewriting to refine generations, leaving them with no mechanism to manipulate the canvas directly. In essence, the potential of LLMs to serve as a genuine "brush" for precise visual construction remains largely untapped. In this paper, we propose GenClaw, a code-driven agentic image generation paradigm that empowers the agent to create like a human artist: first conceptualizing, then sketching, and finally coloring. Specifically, the agent first constructs the conceptual knowledge and context through search and reasoning. It then uses code (e.g., SVG, HTML, Three.js) to render executable visual sketches.
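To make the "sketching with code" step concrete, the following is a minimal sketch of how a program can emit an SVG line drawing from a structured layout. It is not GenClaw's implementation: the `build_sketch` helper, the shape list, and the canvas size are all hypothetical and chosen only for illustration. It uses nothing beyond Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def build_sketch(width, height, shapes):
    """Return an SVG document string for a list of (tag, attributes) pairs.

    Hypothetical helper for illustration; not taken from the paper.
    """
    root = ET.Element(
        "svg",
        {
            "xmlns": SVG_NS,
            "width": str(width),
            "height": str(height),
            "viewBox": f"0 0 {width} {height}",
        },
    )
    for tag, attrs in shapes:
        # SVG attribute values are strings, so convert numbers explicitly.
        ET.SubElement(root, tag, {k: str(v) for k, v in attrs.items()})
    return ET.tostring(root, encoding="unicode")


# Assumed layout an agent might produce after the conceptualizing step:
# a ground strip, a sun, and a mountain, drawn as outlines only.
outline = {"fill": "none", "stroke": "black"}
shapes = [
    ("rect", {"x": 0, "y": 120, "width": 200, "height": 80, **outline}),
    ("circle", {"cx": 150, "cy": 50, "r": 20, **outline}),
    ("polygon", {"points": "40,120 80,60 120,120", **outline}),
]

svg_text = build_sketch(200, 200, shapes)

# Re-parse the output to confirm it is well-formed XML before any later stage uses it.
parsed = ET.fromstring(svg_text)
```

Because the sketch is ordinary markup, each element's position and size can be edited directly, which is the kind of canvas control that prompt rewriting alone does not offer.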

Related events

GenClaw: Code-Driven Agentic Image Generation
2026-05-29 | PRODUCT_LAUNCH | Impact: MEDIUM
