Is an Image Also Worth 16x16=256 Superpixels? A Framework for Attentional Image Classification

ArXiv CS.CV | 2026-05-27 | Authors: Pedro Henrique da Costa Avelar, Anderson R. Tavares, Luís C. Lamb

Abstract

arXiv:2605.27144v1. Superpixel-based image classification has traditionally leveraged graph neural networks (GNNs) for processing irregular image representations. Recent advances in computer vision, driven by Vision Transformers (ViTs), have introduced new paradigms in self-attentional models, surpassing convolutional neural networks (CNNs) in various tasks. However, a synergistic connection between GNNs, superpixels, and transformers remains unexplored. In this work, we propose Superpixel Transformers (SPT), a novel framework that unifies superpixel-based image classification and ViTs. SPT generalizes the Superpixel Image Classification with Graph Attention Networks (SICGAT) model and ViT to support arbitrary superpixel-based chunking strategies, connectivity graphs, and positional encodings.
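The three ingredients named in the abstract (a chunking strategy, a connectivity graph, and positional information) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `grid_labels`, `segment_tokens`, and `adjacency` are hypothetical, and the use of mean color as the segment feature and normalized centroids as positions are assumptions made only for illustration. The sketch treats a regular 16x16 grid as one particular chunking, expressed as a per-pixel label map. Because the other two functions depend only on that label map, a label map from a superpixel algorithm such as SLIC could in principle be substituted.

```python
import numpy as np

def grid_labels(height, width, grid):
    """Assumed chunking: assign each pixel to one cell of a grid x grid layout."""
    rows = np.arange(height) * grid // height
    cols = np.arange(width) * grid // width
    return rows[:, None] * grid + cols[None, :]

def segment_tokens(image, labels):
    """Per-segment mean color (feature) and normalized centroid (position)."""
    h, w, c = image.shape
    n = labels.max() + 1
    flat = labels.ravel()
    counts = np.bincount(flat, minlength=n).astype(float)
    pixels = image.reshape(-1, c)
    # Sum each channel per segment, then divide by the segment's pixel count.
    feats = np.stack(
        [np.bincount(flat, weights=pixels[:, k], minlength=n) for k in range(c)],
        axis=1,
    ) / counts[:, None]
    ys, xs = np.mgrid[0:h, 0:w]
    cy = np.bincount(flat, weights=ys.ravel(), minlength=n) / counts / h
    cx = np.bincount(flat, weights=xs.ravel(), minlength=n) / counts / w
    return feats, np.stack([cy, cx], axis=1)

def adjacency(labels):
    """Connectivity graph: two segments are linked if they share a pixel border."""
    n = labels.max() + 1
    adj = np.zeros((n, n), dtype=bool)
    pairs = (
        (labels[:, :-1], labels[:, 1:]),   # horizontally adjacent pixels
        (labels[:-1, :], labels[1:, :]),   # vertically adjacent pixels
    )
    for a, b in pairs:
        mask = a != b
        adj[a[mask], b[mask]] = True
        adj[b[mask], a[mask]] = True
    return adj

# Example: a random 64x64 RGB image chunked into 16x16 = 256 segments.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))
labels = grid_labels(64, 64, 16)
feats, pos = segment_tokens(image, labels)
adj = adjacency(labels)
```

Under these assumptions, `feats` and `pos` would play the role of token features and positional inputs, while `adj` could serve as an attention mask or graph structure. How SPT actually combines them is described in the paper, not here.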
