Disentanglement-Based Equivariant Learning for Compositional VQA (Event)
BREAKTHROUGH | 2026-06-02 | Impact: HIGH
arXiv:2606.02168v1 Announce Type: new
Abstract: Compositional visual question answering (VQA) is a challenging yet fundamental task that requires models to comprehend novel combinations of previously learned concepts. Current methods often overlook the disentanglement of the underlying concepts and are limited in their ability to capture the compositional variation mechanism. Moreover, the state-of-