Inspiration Seeds: Learning Non-Literal Visual Combinations for Generative Exploration 文章

ArXiv CS.CV2026-05-26NEWSen作者: Kfir Goldberg, Elad Richardson, Yael Vinker

摘要

arXiv:2602.08615v3 Announce Type: replace Abstract: While generative models have become powerful tools for image synthesis, they are typically optimized for executing carefully crafted textual prompts, offering limited support for the open-ended visual exploration that often precedes idea formation. In contrast, designers frequently draw inspiration from loosely connected visual references, seeking emergent connections that spark new ideas. We propose Inspiration Seeds, a generative framework that shifts image generation from final execution to exploratory ideation. Given two input images, our model produces diverse, visually coherent compositions that reveal latent relationships between inputs, without relying on user-specified text prompts. Our approach is feed-forward, trained on synthetic triplets of decomposed visual aspects derived entirely through visual means: we use CLIP Sparse Autoencoders to extract editing directions in CLIP latent space and isolate concept pairs.