PEEK: Picking Essential frames via Efficient Knowledge distillation · Event

BREAKTHROUGH · 2026-06-01 · Impact: HIGH

arXiv:2605.31029v1 · Announce Type: new

Abstract: Video-language models can process only a limited number of frames, making frame selection a key bottleneck for efficient video captioning. Most captioning pipelines still rely on uniform sampling, which is computationally cheap but agnostic to visual content. Adaptive frame sampling has recently emerged as a promising approach for selecting the most informative frames from a video

PEEK: Picking Essential frames via Efficient Knowledge distillation · Related companies

Vance (COMPANY)
Ron (COMPANY)
arXiv (NONPROFIT)
Tempora (RESEARCH_INSTITUTE)
ACT (NONPROFIT)
FIND (NONPROFIT)
Unifor (NONPROFIT)
VIA (COMPANY)