Multi-Concept Customization of Text-to-Image Diffusion (paper)

2023. Citations: 555

Related topics: Generative Adversarial Networks and Image Synthesis; Image Retrieval and Classification Techniques; Advanced Image and Video Retrieval Techniques

Abstract

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts while enabling fast tuning (~6 minutes). Additionally, we can jointly train for multiple concepts or combine multiple fine-tuned models into one via closed-form constrained optimization. Our fine-tuned model generates variations of multiple new concepts and seamlessly composes them with existing concepts in novel settings. Our method outperforms or performs on par with several baselines and concurrent works in both qualitative and quantitative evaluations, while being memory and computationally efficient.
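The idea of "only optimizing a few parameters in the text-to-image conditioning mechanism" can be illustrated with a small sketch. It assumes that the conditioning mechanism means cross-attention between image features and text embeddings, and that the tuned subset is the key and value projections of those layers; the abstract itself does not name the parameters. The parameter names and sizes below are hypothetical and do not come from a real model checkpoint.

```python
# Hypothetical parameter inventory: name -> number of scalar weights.
# Names and sizes are invented for illustration only.
PARAMS = {
    "block0.self_attn.to_q.weight": 320 * 320,
    "block0.self_attn.to_k.weight": 320 * 320,
    "block0.self_attn.to_v.weight": 320 * 320,
    "block0.cross_attn.to_q.weight": 320 * 320,
    "block0.cross_attn.to_k.weight": 320 * 768,  # projects text embeddings to keys
    "block0.cross_attn.to_v.weight": 320 * 768,  # projects text embeddings to values
    "block0.ff.weight": 320 * 1280,
    "block0.conv.weight": 320 * 320 * 9,
}


def is_trainable(name: str) -> bool:
    """Assumed rule: tune only key/value projections inside cross-attention."""
    return ".cross_attn." in name and (".to_k." in name or ".to_v." in name)


def split_params(params: dict) -> tuple:
    """Partition the inventory into (trainable, frozen) dictionaries."""
    trainable = {n: c for n, c in params.items() if is_trainable(n)}
    frozen = {n: c for n, c in params.items() if n not in trainable}
    return trainable, frozen


def trainable_fraction(params: dict) -> float:
    """Share of all scalar weights that would receive gradient updates."""
    trainable, _ = split_params(params)
    return sum(trainable.values()) / sum(params.values())


if __name__ == "__main__":
    trainable, frozen = split_params(PARAMS)
    print("trainable:", sorted(trainable))
    print("frozen:", len(frozen), "tensors")
    print(f"fraction trainable: {trainable_fraction(PARAMS):.3f}")
```

In a real training setup the same name-based filter would decide which tensors have gradients enabled; everything else stays frozen, which is what would keep tuning fast and memory-light under these assumptions.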

Authors (4 of 5 listed)

Richard Zhang

Jun-Yan Zhu

Eli Shechtman

Bingliang Zhang
