
Table 3 Average WER (%) obtained on the evaluation data set

From: Reverberant speech recognition exploiting clarity index estimation

 

| | Clean (Avg.) | Sim. (Avg.) | Real (Avg.) | Avg. |
|---|---|---|---|---|
| Clean-cond. | *12.21* | 52.22 | 89.17 | 48.04 |
| Multi-cond. | 30.13 | 29.50 | 56.94 | 34.67 |
| NIRA-CART | | | | |
| Clean&Multi cond. | 13.51 | 29.29 | 56.94 | 30.02 |
| C50FV | 28.65 | 29.72 | 56.84 | 34.37 |
| C50HLDA | 25.52 | 27.78 | 55.00 | 32.12 |
| MS3 | 22.17 | 27.22 | 54.57 | 30.82 |
| MS3+C50HLDA | 19.90 | 25.24 | 52.51 | 28.75 |
| MS5 | 22.32 | 26.35 | 54.38 | 30.35 |
| MS5+C50HLDA | 20.07 | 24.80 | 52.65 | 28.58 |
| MS8 | 21.57 | 26.10 | 53.17 | 29.80 |
| MS8+C50HLDA | 19.69 | 24.08 | 51.04 | 27.79 |
| MS11 | 21.10 | 26.04 | 56.62 | 30.26 |
| MS11+C50HLDA | 19.83 | 24.24 | 53.13 | 28.30 |
| MS14 | 21.34 | 25.97 | 55.13 | 30.02 |
| MS14+C50HLDA | 19.38 | 23.75 | 52.31 | 27.76 |
| MS18 | 21.96 | 25.97 | 55.85 | 30.32 |
| MS18+C50HLDA | 20.73 | 23.95 | 53.12 | 28.38 |
| NIRA-BLSTM | | | | |
| Clean&Multi cond. | 12.35 | 29.06 | 56.94 | 29.58 |
| C50FV | 28.75 | 29.65 | 56.91 | 34.37 |
| C50HLDA | 25.86 | 27.75 | 54.56 | 32.12 |
| MS3 | 20.67 | 26.79 | 53.44 | 29.98 |
| MS3+C50HLDA | 18.70 | 24.57 | 52.56 | 28.07 |
| MS5 | 21.33 | 26.24 | 53.87 | 29.93 |
| MS5+C50HLDA | 19.42 | 24.46 | 52.07 | 28.11 |
| MS8 | 19.97 | 25.51 | 53.14 | 29.03 |
| MS8+C50HLDA | 18.58 | 23.61 | *50.96* | 27.22 |
| MS11 | 18.64 | 25.40 | 54.73 | 28.90 |
| MS11+C50HLDA | 17.76 | 23.47 | 51.92 | 27.09 |
| MS14 | 18.99 | 25.09 | 54.31 | 28.75 |
| MS14+C50HLDA | 17.50 | *23.17* | 52.15 | *26.90* |
| MS18 | 18.40 | 25.08 | 56.00 | 28.89 |
| MS18+C50HLDA | 16.96 | 23.30 | 52.64 | 26.91 |

  1. The first two rows correspond to the baseline methods, and the remaining rows are the methods proposed in this work. The best result in each column is shown in italics.
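
To compare a proposed configuration against a baseline row, one common summary is the relative WER reduction. The source does not report this metric for the table, so the sketch below is only an illustration: the function name `relative_wer_reduction` and the choice of rows are assumptions, and the two input values are the overall "Avg." entries copied from Table 3.

```python
def relative_wer_reduction(baseline_wer: float, system_wer: float) -> float:
    """Return the relative WER reduction in percent.

    A positive value means the system has a lower WER than the baseline.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - system_wer) / baseline_wer


# Overall "Avg." values copied from Table 3 (hypothetical variable names).
multi_cond_avg = 34.67          # Multi-cond. baseline
ms14_c50hlda_blstm_avg = 26.90  # MS14+C50HLDA under NIRA-BLSTM

reduction = relative_wer_reduction(multi_cond_avg, ms14_c50hlda_blstm_avg)
print(f"Relative WER reduction: {reduction:.2f}%")
```

With these two entries the reduction comes out at roughly 22.4% relative to the Multi-cond. baseline. The same function can be applied to any other pair of cells in the same column.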