X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval · Paper
2022 · Proceedings of the 30th ACM International Conference on Multimedia · Citations: 258
Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition