On the Intrinsic Limits of Transformer Image Embeddings in Non-Solvable Spatial Reasoning
Event: PRODUCT_LAUNCH | 2026-05-28 | Impact: MEDIUM
arXiv:2601.03048v2 Announce Type: replace
Abstract: Vision Transformers (ViTs) excel at semantic recognition but exhibit systematic failures in spatial reasoning tasks such as mental rotation. While these failures are often attributed to data scale, this work argues that the limitation arises from the intrinsic circuit complexity of the architecture. By formalizing spatial understanding as learning a Group Homomorphism Probl