Figure 1
From: Audio visual speech source separation via improved context dependent association model

Visual modality: data and parametrization. (a) Some frames (k = 64, 128, 256, 512, 1024, 2048 and 4096) of the visual input V(k) from the training set of the poet verses corpus. (b) The top seven eigen-lips W_i^v, i.e. those with the largest eigenvalues. (c) The effect of varying the average image of all frames along the major principal axis (W_1^v) by −3, −2, −1, 0, 1, 2 and 3 times the square root of the corresponding eigenvalue (σ1), which results in synthetic opening and closing of the lips.
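
The eigen-lip parametrization described in this caption can be illustrated with a minimal sketch, assuming the eigen-lips are the principal axes of a standard PCA over flattened lip frames. The caption does not spell out the computation, so this is an assumption. The function names `eigen_lips` and `vary_along_axis`, the image size, and the random stand-in frames are all hypothetical and are not taken from the article or the poet verses corpus.

```python
import numpy as np

def eigen_lips(frames, n_components=7):
    """PCA over lip frames (assumed approach, not confirmed by the source).

    frames: array of shape (n_frames, height, width).
    Returns the mean image, the leading principal axes reshaped as images,
    and the corresponding eigenvalues of the sample covariance.
    """
    n, h, w = frames.shape
    X = frames.reshape(n, h * w).astype(np.float64)
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are unit-norm principal axes,
    # ordered by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s ** 2 / (n - 1)
    axes = Vt[:n_components].reshape(-1, h, w)
    return mean.reshape(h, w), axes, eigvals[:n_components]

def vary_along_axis(mean, axis, eigval, steps=(-3, -2, -1, 0, 1, 2, 3)):
    """Shift the mean image along one axis by multiples of sqrt(eigenvalue)."""
    scale = np.sqrt(eigval)
    return np.stack([mean + k * scale * axis for k in steps])

# Hypothetical stand-in data: 200 random 16x24 "frames", not real lip images.
rng = np.random.default_rng(0)
frames = rng.random((200, 16, 24))

mean_img, W, lam = eigen_lips(frames)            # analogue of panel (b)
synthetic = vary_along_axis(mean_img, W[0], lam[0])  # analogue of panel (c)
```

With real lip images, the seven images in `synthetic` would correspond to the sequence in panel (c); with the random frames used here they carry no visual meaning and only demonstrate the shapes and arithmetic involved.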