From Per-Image Low-Rank to Encoding Mismatch: Rethinking Feature Distillation in Vision Transformers
Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM
arXiv:2511.15572v3 Announce Type: replace
Abstract: Feature-map knowledge distillation (KD) transfers internal representations well between comparably sized Vision Transformers (ViTs), but it often fails in compression. We revisit this failure and uncover a paradox. Sample-wise SVD shows that each image is highly compressible, which seems to suggest that a narrow student with a linear projector