Image as a Foreign Language: BEIT Pretraining for Vision and Vision-Language Tasks · Paper

2023 · Cited by 468
Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection

Image as a Foreign Language: BEIT Pretraining for Vision and Vision-Language Tasks · Authors