Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding (paper)

2023 · Citations: 438
Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
