VL-BERT: Pre-training of Generic Visual-Linguistic Representations (paper)

2020 · International Conference on Learning Representations · Citations: 306
Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning

VL-BERT: Pre-training of Generic Visual-Linguistic Representations · Related Techniques