Figure 1 | EURASIP Journal on Advances in Signal Processing

Figure 1

From: Multi-pose lipreading and audio-visual speech recognition

Standard AV-ASR system. Structure of an audio-visual ASR system. The upper row corresponds to the audio system, where the features used for speech recognition are extracted and fed to the audio-visual integration block and classifier. The lower row corresponds to the lipreading system: first the mouth is tracked and a sequence of normalized mouth images is extracted; then the visual features are computed and passed to the audio-visual integration and classification blocks.
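
The data flow in the figure can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the article: the function names (`audio_features`, `visual_features`, `align`, `integrate`) are hypothetical, the two front ends are trivial placeholders rather than real speech or lip features, mouth tracking and normalization are assumed to have already produced the mouth images, integration is shown as plain feature concatenation, and the classifier is omitted.

```python
from typing import List, Sequence

Vector = List[float]


def audio_features(samples: Sequence[float], frame_len: int) -> List[Vector]:
    """Placeholder audio front end (assumption): mean energy per non-overlapping frame."""
    starts = range(0, len(samples) - frame_len + 1, frame_len)
    return [[sum(s * s for s in samples[i:i + frame_len]) / frame_len] for i in starts]


def visual_features(mouth_images: Sequence[Sequence[float]]) -> List[Vector]:
    """Placeholder visual front end (assumption): mean and range of pixel intensity
    for each already-normalized mouth image, given as a flat list of pixels."""
    return [[sum(img) / len(img), max(img) - min(img)] for img in mouth_images]


def align(visual: List[Vector], n_audio_frames: int) -> List[Vector]:
    """Repeat visual vectors so the slower video stream matches the audio frame count."""
    if not visual:
        raise ValueError("at least one visual feature vector is required")
    n = len(visual)
    return [visual[min(i * n // n_audio_frames, n - 1)] for i in range(n_audio_frames)]


def integrate(audio: List[Vector], visual: List[Vector]) -> List[Vector]:
    """Feature-level integration (assumption): concatenate audio and visual vectors per frame."""
    aligned = align(visual, len(audio))
    return [a + v for a, v in zip(audio, aligned)]


# Upper row of the figure: audio samples -> audio features.
audio = audio_features([1.0] * 8, frame_len=2)
# Lower row: normalized mouth images -> visual features.
visual = visual_features([[0.0, 1.0], [0.5, 0.5]])
# Integration block: one joint vector per audio frame, ready for a classifier.
fused = integrate(audio, visual)
```

In this toy run, four audio frames are paired with two video frames, so each visual vector is repeated twice before concatenation. Concatenation is only one way to combine the streams; a system could equally combine the outputs of separate audio and visual classifiers.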
