Video-LLaVA: Learning United Visual Representation by Alignment Before Projection (paper)
2024 · 231 citations
Human Pose and Action Recognition · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques