Jointly Modeling Deep Video and Compositional Text to Bridge Vision and Language in a Unified Framework (paper)
2015, Proceedings of the AAAI Conference on Artificial Intelligence, cited by 319
Multimodal Machine Learning Applications; Video Analysis and Summarization; Generative Adversarial Networks and Image Synthesis