ViLBERT: Pretraining Task-Agnostic Visiolinguistic Representations for Vision-and-Language Tasks (Paper)
2019 · Neural Information Processing Systems · Cited by 907
Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling