PEEK: Picking Essential frames via Efficient Knowledge distillation
Event: PRODUCT_LAUNCH, 2026-06-01. Impact: MEDIUM
arXiv:2605.31029v1 (announce type: new)
Abstract: Video-language models can process only a limited number of frames, making frame selection a key bottleneck for efficient video captioning. Most captioning pipelines still rely on uniform sampling, which is computationally cheap but agnostic to visual content. Adaptive frame sampling has recently emerged as a promising approach for selecting the most informative frames from a video
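
For context, the uniform sampling baseline mentioned in the abstract can be sketched in a few lines. This is only an illustration of the generic baseline, not PEEK's method; the function name `uniform_frame_indices` and its parameters are hypothetical and do not come from the paper.

```python
def uniform_frame_indices(num_frames: int, budget: int) -> list[int]:
    """Pick up to `budget` evenly spaced frame indices from a video.

    Hypothetical helper illustrating content-agnostic uniform sampling:
    the choice depends only on the frame count, never on what the frames show.
    """
    if num_frames <= 0 or budget <= 0:
        return []
    if budget >= num_frames:
        # The budget covers the whole video, so keep every frame.
        return list(range(num_frames))
    # Split the video into `budget` equal segments and take the
    # centre frame of each segment.
    segment = num_frames / budget
    return [int(segment * i + segment / 2) for i in range(budget)]


if __name__ == "__main__":
    print(uniform_frame_indices(100, 4))  # [12, 37, 62, 87]
```

Because the indices are fixed by the frame count alone, a short but important event that falls between two sampled positions is missed, which is the limitation adaptive frame sampling aims to address.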
Related coverage: ArXiv CS.CV, 2026-06-01