Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions (paper)
2019 · Citations: 217
Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition