Abstract
arXiv:2505.16915v3
While recent Text-to-Image (T2I) models show impressive capabilities in synthesizing images from brief descriptions, they struggle with the long, detailed prompts required for professional applications. We present DetailMaster, a comprehensive benchmark for evaluating T2I capabilities on long prompts with complex compositional requirements, accompanied by an automated data construction pipeline and an evaluation workflow. Comprising expert-validated prompts averaging 284.89 tokens, our benchmark introduces four critical evaluation dimensions: Character Attributes, Structured Character Locations, Multi-Dimensional Scene Attributes, and Spatial/Interactive Relationships.
Related events
DetailMaster: Can Your Text-to-Image Model Handle Long Prompts?
2026-06-02 | PRODUCT_LAUNCH | Impact: MEDIUM