Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions (paper)

2019 · Citations: 217

Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Human Pose and Action Recognition

Artificial Intelligence

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions · Authors

Rita Cucchiara

Lorenzo Baraldi

Marcella Cornia