Automatic analysis of multimodal group actions in meetings (paper)

2005, IEEE Transactions on Pattern Analysis and Machine Intelligence. Cited by 346.
Speech and dialogue systems; Music and Audio Processing; Speech and Audio Processing

Abstract

This paper investigates the recognition of group actions in meetings. A framework is employed in which group actions result from the interactions of the individual participants. The group actions are modeled using different HMM-based approaches, where the observations are provided by a set of audiovisual features monitoring the actions of individuals. Experiments demonstrate the importance of taking interactions into account in modeling the group actions. It is also shown that the visual modality contains useful information, even for predominantly audio-based events, motivating a multimodal approach to meeting analysis.
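
As a rough illustration of HMM-based recognition of group actions, the sketch below decodes a sequence of group actions from per-frame observations using the Viterbi algorithm. It is not the paper's model: the abstract names no specific architecture, so this assumes a single discrete-observation HMM, and the state names (`monologue`, `discussion`), the observation symbols (`one_speaker`, `many_speakers`), and every probability are hypothetical values chosen for the example.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most likely hidden-state sequence for a discrete HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][o]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Hypothetical group actions and model parameters (assumptions, not from the paper).
STATES = ["monologue", "discussion"]
LOG_START = to_log({"monologue": 0.5, "discussion": 0.5})
LOG_TRANS = to_log({
    "monologue": {"monologue": 0.9, "discussion": 0.1},
    "discussion": {"monologue": 0.1, "discussion": 0.9},
})
LOG_EMIT = to_log({
    "monologue": {"one_speaker": 0.9, "many_speakers": 0.1},
    "discussion": {"one_speaker": 0.4, "many_speakers": 0.6},
})

# Hypothetical per-frame observations derived from participant-level features.
frames = ["one_speaker"] * 3 + ["many_speakers"] * 3
decoded = viterbi(frames, STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(decoded)
```

In a multimodal variant of this sketch, each observation symbol would be built from both audio and visual features of the individual participants (for example, by quantizing a concatenated feature vector), with the decoding step left unchanged.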