Jean-Philippe Thiran: Multimodal signal analysis for audio-visual speech recognition



After a short introduction to our group and its main research topics, I will address the problem of audio-visual speech recognition, a typical example of multimodal signal analysis in which we want to extract and exploit information from two different but complementary signals: an audio channel and a video channel. We will discuss two important aspects of this analysis. First, we will present a new feature extraction algorithm based on information-theoretic principles and compare its performance with that of classical approaches in our multimodal context. Then we will discuss multimodal information fusion, i.e. how to combine information from the two channels for optimal classification.

