Figure 1 | EURASIP Journal on Advances in Signal Processing

Figure 1

From: Audio visual speech source separation via improved context dependent association model

Visual modality: data and parametrization. (a) Selected frames (k = 64, 128, 256, 512, 1024, 2048 and 4096) of the visual input V(k) from the training set of the poet verses corpus. (b) The top seven eigen-lips $W_{vi}$, i.e. those with the largest eigenvalues. (c) The effect of shifting the average image of all frames along the major principal axis ($W_{v1}$) by −3, −2, −1, 0, 1, 2 and 3 times the square root of the corresponding eigenvalue ($\sigma_1$), which produces a synthetic opening and closing of the lips.
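The parametrization in panels (b) and (c) can be illustrated with a minimal sketch, assuming the eigen-lips are obtained by ordinary principal component analysis of the flattened frames. The function names `eigen_lips` and `vary_along_first_axis`, the frame size, and the random placeholder frames are all hypothetical and are not taken from the article; real lip-region frames would replace the random array.

```python
import numpy as np

def eigen_lips(frames, n_components=7):
    """PCA of image frames (hypothetical helper, not the authors' code).

    frames: array of shape (n_frames, height, width).
    Returns the mean image, the leading principal axes reshaped as images,
    and the corresponding eigenvalues of the sample covariance.
    """
    n, h, w = frames.shape
    X = frames.reshape(n, h * w).astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centred data: rows of Vt are orthonormal principal axes,
    # ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s**2 / (n - 1)
    axes = Vt[:n_components].reshape(-1, h, w)
    return mean.reshape(h, w), axes, eigvals[:n_components]

def vary_along_first_axis(mean, axes, eigvals, steps=(-3, -2, -1, 0, 1, 2, 3)):
    """Shift the mean image along the first axis by k * sqrt(eigenvalue)."""
    scale = np.sqrt(eigvals[0])
    return np.stack([mean + k * scale * axes[0] for k in steps])

# Placeholder data (assumption): 50 random 8x12 "frames" instead of lip images.
rng = np.random.default_rng(0)
frames = rng.random((50, 8, 12))

mean, axes, eigvals = eigen_lips(frames)
synth = vary_along_first_axis(mean, axes, eigvals)
```

With real lip images, the rows of `synth` would correspond to the seven synthetic images of panel (c), and `axes` to the seven eigen-lips of panel (b). The SVD route is used here only because it avoids forming the full pixel covariance matrix; the article may compute the decomposition differently.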
