Table 6. WER (%) on the evaluation data using DNN-based acoustic models. "MAP" indicates whether the DNN-based MFCC feature mapping is applied

From: Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation

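| System | MAP | Sim. Room1A near | Sim. Room1A far | Sim. Room2A near | Sim. Room2A far | Sim. Room3A near | Sim. Room3A far | Real Room1 near | Real Room1 far | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|
| Single channel + Clean condition training | No | 10.6 | 19.3 | 23.2 | 69.3 | 30.2 | 74.6 | 68.0 | 66.2 | 45.2 |
| Single channel + Clean condition training | Yes | 9.3 | 10.6 | 12.7 | 21.4 | 16.5 | 25.1 | 40.2 | 39.0 | 21.9 |
| Single channel + Multi condition training | No | 8.7 | 9.4 | 10.5 | 16.5 | 13.4 | 20.0 | 35.4 | 34.3 | 18.5 |
| Single channel + Multi condition training | Yes | 8.9 | 8.8 | 8.8 | 13.9 | 11.4 | 15.5 | 32.2 | 32.7 | 16.5 |
| MVDR(2ch) + Multi condition training | No | 8.6 | 9.6 | 9.1 | 14.9 | 11.6 | 18.3 | 33.3 | 30.7 | 17.0 |
| MVDR(2ch) + Multi condition training | Yes | 8.5 | 8.6 | 7.9 | 12.4 | 10.1 | 14.8 | 29.1 | 29.1 | 15.1 |
| MVDR(8ch) + Multi condition training | No | 7.8 | 8.3 | 8.3 | 10.8 | 9.8 | 13.3 | 24.8 | 25.1 | 13.5 |
| MVDR(8ch) + Multi condition training | Yes | 7.5 | 8.2 | 7.4 | 9.7 | 8.9 | 11.3 | 22.7 | 24.4 | 12.5 |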
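
The numbers in Table 6 can be cross-checked with a few lines of Python. The sketch below is not taken from the paper: it assumes that the "Avg." column is the unweighted mean of the eight per-condition WERs (the reported values appear consistent with this up to rounding), and the names `TABLE6`, `mean_wer`, `avg_matches` and `relative_reduction` are hypothetical helpers introduced only for illustration. It also computes the relative reduction in average WER obtained by enabling the feature mapping for each system.

```python
# WER (%) copied from Table 6.
# Key: (system, MAP applied). Value: (eight per-condition WERs, reported Avg.).
# Condition order: simulated Room1A near/far, Room2A near/far, Room3A near/far,
# then real Room1 near/far.
TABLE6 = {
    ("Single channel + Clean", False): ([10.6, 19.3, 23.2, 69.3, 30.2, 74.6, 68.0, 66.2], 45.2),
    ("Single channel + Clean", True): ([9.3, 10.6, 12.7, 21.4, 16.5, 25.1, 40.2, 39.0], 21.9),
    ("Single channel + Multi", False): ([8.7, 9.4, 10.5, 16.5, 13.4, 20.0, 35.4, 34.3], 18.5),
    ("Single channel + Multi", True): ([8.9, 8.8, 8.8, 13.9, 11.4, 15.5, 32.2, 32.7], 16.5),
    ("MVDR(2ch) + Multi", False): ([8.6, 9.6, 9.1, 14.9, 11.6, 18.3, 33.3, 30.7], 17.0),
    ("MVDR(2ch) + Multi", True): ([8.5, 8.6, 7.9, 12.4, 10.1, 14.8, 29.1, 29.1], 15.1),
    ("MVDR(8ch) + Multi", False): ([7.8, 8.3, 8.3, 10.8, 9.8, 13.3, 24.8, 25.1], 13.5),
    ("MVDR(8ch) + Multi", True): ([7.5, 8.2, 7.4, 9.7, 8.9, 11.3, 22.7, 24.4], 12.5),
}


def mean_wer(values):
    """Unweighted mean over conditions (an assumption about how Avg. was formed)."""
    return sum(values) / len(values)


def avg_matches(values, reported, tol=0.06):
    """True if the unweighted mean agrees with the reported Avg. up to rounding."""
    return abs(mean_wer(values) - reported) <= tol


def relative_reduction(baseline, improved):
    """Relative WER reduction in percent, going from baseline to improved."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    for (system, use_map), (values, reported) in TABLE6.items():
        status = "consistent" if avg_matches(values, reported) else "MISMATCH"
        print(f"{system:25s} MAP={use_map!s:5s} mean={mean_wer(values):.2f} "
              f"reported={reported:.1f} {status}")

    # Effect of the feature mapping on the reported average WER, per system.
    for system in sorted({name for name, _ in TABLE6}):
        without_map = TABLE6[(system, False)][1]
        with_map = TABLE6[(system, True)][1]
        print(f"{system:25s} relative reduction with MAP: "
              f"{relative_reduction(without_map, with_map):.1f}%")
```

Computing the reduction on the reported averages rather than per condition keeps the comparison aligned with the "Avg." column; per-condition reductions can be obtained by applying `relative_reduction` element-wise to the two value lists of a system.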