Table 2. WER (%) obtained using full batch processing for the one-channel, two-channel, and eight-channel tasks

From: Speech recognition in reverberant and noisy environments employing multiple feature extractors and i-vector speaker adaptation

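| System | SimData Room 1 Near | SimData Room 1 Far | SimData Room 2 Near | SimData Room 2 Far | SimData Room 3 Near | SimData Room 3 Far | SimData Avg. | RealData Room 1 Near | RealData Room 1 Far | RealData Avg. |
|---|---|---|---|---|---|---|---|---|---|---|
| One-channel task | | | | | | | | | | |
| MLIFD | 8 | 8.8 | 10.9 | 15.9 | 11.1 | 17.6 | 12 | 27 | 28 | 27.5 |
| RCGFB | 8.2 | 9.4 | 9.8 | 15 | 10.8 | 16.6 | 11.6 | 30.4 | 31.5 | 30.9 |
| MMFBl | 8.7 | 9.9 | 10.3 | 17.2 | 11.3 | 18.7 | 12.6 | 28.7 | 28.7 | 28.6 |
| MMFBp | 8.5 | 9.8 | 11.1 | 17.2 | 11.8 | 19.2 | 12.9 | 30.2 | 31.9 | 31 |
| RMFB | 8.4 | 9.1 | 9.7 | 15 | 10.8 | 17 | 11.6 | 31.8 | 31.3 | 31.5 |
| ITD-MFB | 7.6 | 8.8 | 10.4 | 14.5 | 9.8 | 16 | 11.1 | 31.9 | 33 | 32.4 |
| Baseline | 7.6 | 8.9 | 11.5 | 18.1 | 11.2 | 18.8 | 12.6 | 41 | 38 | 39.5 |
| ROVER-all | 6.7 | 7.3 | 8.4 | 11.8 | 8.7 | 12.7 | 9.3 | 23.8 | 24.8 | 24.3 |
| Two-channel task | | | | | | | | | | |
| MLIFD | 8.2 | 9 | 11.1 | 16.5 | 11.5 | 18.4 | 12.5 | 27.3 | 27.5 | 27.4 |
| RCGFB | 8.4 | 9.5 | 10.1 | 15.2 | 11.1 | 17.1 | 11.9 | 31.4 | 32.4 | 31.9 |
| MMFBl | 8.8 | 10.4 | 10.5 | 17.9 | 11.8 | 19.5 | 13.2 | 29.5 | 28.8 | 29.1 |
| MMFBp | 8.9 | 10 | 11 | 18.1 | 12.5 | 20.3 | 13.5 | 31.8 | 32.9 | 32.3 |
| RMFB | 8.4 | 9.1 | 10 | 15.4 | 10.8 | 17.3 | 11.9 | 32.6 | 31 | 31.8 |
| ITD-MFB | 7.8 | 9 | 10.5 | 15.1 | 10.4 | 16.1 | 11.5 | 33 | 32.6 | 31 |
| Baseline | 7.8 | 9.2 | 11.6 | 18.3 | 11.7 | 19.3 | 13 | 42.4 | 33 | 40.4 |
| ROVER-all | 6.6 | 7.4 | 8.1 | 11.2 | 8.5 | 12.2 | 9 | 22.6 | 24.2 | 23.4 |
| Eight-channel task | | | | | | | | | | |
| MLIFD | 7.5 | 8.3 | 10 | 14.1 | 10.4 | 15.9 | 11 | 23.8 | 24.4 | 24.1 |
| RCGFB | 8.1 | 9 | 9.1 | 14 | 10.3 | 15.3 | 11 | 27.9 | 28.7 | 28.3 |
| MMFBl | 8.5 | 9.5 | 9.5 | 16.1 | 10.8 | 17.4 | 12 | 26 | 26.2 | 26.1 |
| MMFBp | 8.4 | 9.2 | 10.2 | 16.1 | 11.4 | 18 | 12.2 | 27.5 | 28.7 | 28.1 |
| RMFB | 8.1 | 8.7 | 9.1 | 13.6 | 10 | 14.8 | 10.7 | 29.4 | 27.7 | 28.5 |
| ITD-MFB | 7.2 | 8.1 | 9.7 | 13.1 | 9.5 | 14.8 | 10.4 | 29.8 | 30.1 | 30 |
| Baseline | 7.6 | 8.4 | 10.6 | 17 | 10.6 | 17.9 | 12 | 37.8 | 36.8 | 37.3 |
| ROVER-all | 6.7 | 7.3 | 8 | 11.1 | 8.1 | 12.1 | 8.9 | 21.4 | 22 | 21.7 |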
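
One way to read the table is to express the gap between two systems as a relative WER reduction. The sketch below is an illustration only, not part of the original analysis: it copies the Baseline and ROVER-all values from the "Avg." columns above, and the names `AVG_WER` and `relative_reduction` are hypothetical helpers introduced here.

```python
# Average WER values copied from the "Avg." columns of Table 2.
# Layout (an assumption of this sketch): task -> system -> (SimData avg, RealData avg)
AVG_WER = {
    "one-channel": {"Baseline": (12.6, 39.5), "ROVER-all": (9.3, 24.3)},
    "two-channel": {"Baseline": (13.0, 40.4), "ROVER-all": (9.0, 23.4)},
    "eight-channel": {"Baseline": (12.0, 37.3), "ROVER-all": (8.9, 21.7)},
}


def relative_reduction(reference: float, system: float) -> float:
    """Relative WER reduction in percent: 100 * (reference - system) / reference."""
    return 100.0 * (reference - system) / reference


if __name__ == "__main__":
    for task, rows in AVG_WER.items():
        sim = relative_reduction(rows["Baseline"][0], rows["ROVER-all"][0])
        real = relative_reduction(rows["Baseline"][1], rows["ROVER-all"][1])
        print(f"{task}: SimData {sim:.1f}% relative, RealData {real:.1f}% relative")
```

The same helper can compare any two rows of the table by swapping in their values; a positive result means the second system has the lower WER.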