Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding (paper)
2023 · Citations: 438
Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques