Table 3 WER [%] in terms of rooms and microphone distances on the REVERB challenge dev set using single-channel data and MFCC features

From: Effectiveness of dereverberation, feature transformation, discriminative training methods, and system combination approach for various reverberant environments

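|  | Feature | Type | SIMDATA Room 1 Near | SIMDATA Room 1 Far | SIMDATA Room 2 Near | SIMDATA Room 2 Far | SIMDATA Room 3 Near | SIMDATA Room 3 Far | SIMDATA Avg | REALDATA Room 1 Near | REALDATA Room 1 Far | REALDATA Avg |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Kaldi baseline | MFCC | ML | 10.96 | 12.56 | 15.70 | 34.21 | 19.61 | 39.24 | 22.05 | 48.53 | 47.37 | 47.95 |
| derev. |  |  | 12.41 | 14.68 | 14.03 | 27.16 | 16.39 | 33.85 | 19.75 | 47.04 | 44.57 | 45.81 |
| GMM | +LDA+MLLT | ML | 9.46 | 11.01 | 11.51 | 22.04 | 13.08 | 28.09 | 15.87 | 39.99 | 40.67 | 40.33 |
|  | +basis fMLLR |  | 7.77 | 10.00 | 9.76 | 19.28 | 11.05 | 24.90 | 13.79 | 33.00 | 35.54 | 34.27 |
|  |  | bMMI | 7.13 | 9.61 | 9.12 | 16.19 | 10.46 | 21.98 | 12.42 | 30.69 | 35.20 | 32.95 |
|  |  | f-bMMI | 6.27 | 8.73 | 8.28 | 14.89 | 9.37 | 19.54 | 11.18 | 28.32 | 31.31 | 29.82 |
|  |  | f-bMMI c | 7.06 | 9.05 | 8.58 | 14.96 | 10.16 | 20.43 | 11.71 | 29.01 | 31.72 | 30.37 |
|  | +SAT | ML | 8.87 | 11.21 | 9.71 | 19.89 | 10.95 | 24.04 | 14.11 | 36.06 | 36.23 | 36.15 |
|  |  | bMMI | 6.56 | 8.51 | 7.76 | 16.24 | 9.03 | 19.88 | 11.33 | 34.19 | 37.53 | 35.86 |
|  |  | f-bMMI | 5.88 | 7.60 | 7.25 | 14.59 | 8.09 | 17.51 | 10.15 | 31.63 | 34.72 | 33.18 |
|  |  | f-bMMI c | 6.07 | 7.82 | 7.22 | 14.89 | 8.43 | 17.51 | 10.32 | 32.38 | 35.27 | 33.83 |
| SGMM |  | ML | 6.47 | 9.07 | 8.18 | 17.11 | 9.55 | 20.40 | 11.80 | 33.13 | 34.93 | 34.03 |
|  |  | bMMI | 5.53 | 7.23 | 7.00 | 14.44 | 7.76 | 17.48 | 9.91 | 31.50 | 33.36 | 32.43 |
|  |  | bMMI c | 5.68 | 7.28 | 7.02 | 14.44 | 7.94 | 17.68 | 10.01 | 30.94 | 33.08 | 32.01 |
| DNN |  | CE | 6.71 | 8.85 | 8.70 | 15.58 | 9.15 | 19.07 | 11.34 | 30.88 | 35.82 | 33.35 |
|  |  | bMMI | 5.29 | 7.06 | 6.95 | 13.09 | 7.57 | 15.53 | 9.25 | 28.45 | 32.67 | 30.56 |
|  |  | bMMI c | 5.14 | 6.74 | 6.51 | 12.37 | 7.27 | 15.50 | 8.92 | 28.32 | 33.49 | 30.91 |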

  1. The proposed dereverberation method was used. Three types of acoustic models (GMM, SGMM, and DNN) were constructed with feature transformation (LDA + MLLT), adaptation (basis fMLLR and SAT), and discriminative training (bMMI and f-bMMI). The letter “c” after a training type (a subscript in the published table) denotes the proposed “complementary” system. Italicized values indicate the best system in each condition.
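
The Avg columns appear consistent with an unweighted mean of the corresponding Near and Far values; this is an observation from the numbers, not something the table states. The sketch below checks it for two SIMDATA rows copied from the table and computes the relative WER reduction between their averages. The helper names `mean_wer` and `relative_reduction` are hypothetical and introduced only for illustration.

```python
# WER values (%) copied from the table.
# Order: Room 1 Near, Room 1 Far, Room 2 Near, Room 2 Far, Room 3 Near, Room 3 Far.
# Each entry pairs the six per-condition values with the reported SIMDATA Avg.
SIMDATA = {
    "Kaldi baseline": ([10.96, 12.56, 15.70, 34.21, 19.61, 39.24], 22.05),
    "DNN bMMI c": ([5.14, 6.74, 6.51, 12.37, 7.27, 15.50], 8.92),
}


def mean_wer(values):
    """Unweighted arithmetic mean of per-condition WERs (assumed averaging rule)."""
    return sum(values) / len(values)


def relative_reduction(baseline, system):
    """Relative WER reduction of `system` over `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


for name, (per_condition, reported_avg) in SIMDATA.items():
    computed = mean_wer(per_condition)
    print(f"{name}: computed {computed:.2f}, reported {reported_avg:.2f}")

baseline_avg = SIMDATA["Kaldi baseline"][1]
dnn_avg = SIMDATA["DNN bMMI c"][1]
print(f"relative reduction: {relative_reduction(baseline_avg, dnn_avg):.1f}%")
```

Because the inputs are already rounded to two decimals, a recomputed mean can differ from the reported Avg in the last digit, so any comparison should allow a small tolerance.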