SRL-CLIP: Efficient CLIP Video Adaptation via Structured Semantic Role Labels

Event: PRODUCT_LAUNCH | 2026-05-27 | Impact: MEDIUM

arXiv:2401.07669v3 Announce Type: replace

Abstract: Adapting CLIP for videos has gained popularity due to its semantically rich representation. While CLIP is a good starting point, it typically undergoes post-pretraining (contrastive finetuning) on large video narration or caption datasets (e.g., HowTo100M, WebVid2.5M). However, such narrations or captions often lack the comprehensive information needed to represent a vide