JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search
Event: PRODUCT_LAUNCH
Date: 2026-05-27
Impact: MEDIUM
arXiv:2605.26636v1
Announce Type: new
Abstract: We introduce JetViT, a novel family of hybrid-architecture Vision Transformer (ViT) models that match the accuracy of state-of-the-art full-attention vision foundation models while achieving substantially higher inference efficiency on high-resolution images. At the core of our approach is Post-Training Attention Search, a post-training acceleration framework