Formalizing the Binding Problem 文章

ArXiv CS.CV2026-06-03NEWSen作者: Lianghuan Huang, Yihao Li, Saeed Salehi, Yingshan Chang, Ansh Soni, Konrad P. Kording

摘要

arXiv:2606.03976v1 Announce Type: new Abstract: Representations of the world, arguably, contain information about features (e.g. something is blue, something is a circle) but also information about which features are part of the same object (e.g. the circle is blue), which we call binding information. Any system with the ability to understand scenes with multiple objects must be able to solve the binding problem: it needs to know which features belong together. However, despite work showing that Vision Transformers (ViTs) know which patches belong together, it is not known whether current deep learning models learn to exhibit binding information, i.e., for features. We may believe that there is not much binding information, after all misattributing features to wrong objects is a common failure of ViT-based architectures, especially in scenes with objects sharing features.

相关事件

暂无数据

相关公司

暂无数据

相关人物

暂无数据

相关产品

暂无数据