Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment

arXiv cs.CV · 2026-05-28 · Authors: Zhuchenyang Liu, Yao Zhang, Yu Xiao
