Table 6 Average WERs (%) on the Eval. dataset for the single-channel task, compared with other teams. All teams used the multi-condition training dataset and the trigram language model

From: Environment-dependent denoising autoencoder for distant-talking speech recognition

| Team | Acoustic model | Feature | Dereverberation method | SimData | RealData | Ave. |
| --- | --- | --- | --- | --- | --- | --- |
| REVERB-challenge baseline | GMM | MFCC | CMVN | 25.27 | 47.48 | 36.38 |
| J. Alam et al. [45] | DNN | MFCC | Maximum likelihood inverse filtering-based dereverberation | 11.1 | 32.4 | 21.8 |
| Y. Tachioka et al. [46] | MMI-SGMM | MFCC and PLP | Single-channel dereverberation with estimation of reverberation time | 10.05 | 28.06 | 19.01 |
| This paper | MMI-SGMM | MFCC | One-step environment-dependent DAE | 7.04 | 28.66 | 17.85 |
| This paper | DNN | MFCC | One-step environment-dependent DAE | 7.46 | 28.11 | 17.79 |
| This paper | MMI-SGMM + DNN | MFCC | One-step environment-dependent DAE | 6.41 | 26.83 | 16.62 |
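
The table reports absolute WERs only. As a reading aid, the sketch below computes the relative WER reduction of each system with respect to the REVERB-challenge baseline, using the Ave. column values as given. The function name `relative_wer_reduction` and the dictionary labels are illustrative choices, not taken from the paper, and the paper itself may report improvements differently.

```python
# Average WERs (%) copied from the "Ave." column of Table 6.
# The labels are shorthand chosen for this sketch.
average_wer = {
    "REVERB-challenge baseline (GMM)": 36.38,
    "J. Alam et al. [45] (DNN)": 21.8,
    "Y. Tachioka et al. [46] (MMI-SGMM)": 19.01,
    "This paper (MMI-SGMM)": 17.85,
    "This paper (DNN)": 17.79,
    "This paper (MMI-SGMM + DNN)": 16.62,
}


def relative_wer_reduction(reference_wer, system_wer):
    """Return the relative WER reduction, in percent, with respect to reference_wer."""
    return 100.0 * (reference_wer - system_wer) / reference_wer


baseline = average_wer["REVERB-challenge baseline (GMM)"]
for name, wer in average_wer.items():
    reduction = relative_wer_reduction(baseline, wer)
    print(f"{name}: {wer:.2f}% WER, {reduction:.1f}% relative reduction")
```

Relative reduction is a convenient way to compare systems whose absolute WERs differ widely between SimData and RealData; the same function can be applied to either column by substituting the corresponding values.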