
Table 6 Average WERs (%) on the Eval. dataset for the single-channel dataset, compared with other teams. The multi-condition training dataset and the trigram language model were used by all teams.

From: Environment-dependent denoising autoencoder for distant-talking speech recognition

| Team | Acoustic model | Feature | Dereverberation method | SimData | RealData | Ave. |
|---|---|---|---|---|---|---|
| REVERB-challenge baseline | GMM | MFCC | CMVN | 25.27 | 47.48 | 36.38 |
| J. Alam et al. [45] | DNN | MFCC | Maximum likelihood inverse filtering-based dereverberation | 11.1 | 32.4 | 21.8 |
| Y. Tachioka et al. [46] | MMI-SGMM | MFCC and PLP | Single-channel dereverberation with estimation of reverberation time | 10.05 | 28.06 | 19.01 |
| This paper | MMI-SGMM | MFCC | One-step environment-dependent DAE | 7.04 | 28.66 | 17.85 |
| This paper | DNN | MFCC | One-step environment-dependent DAE | 7.46 | 28.11 | 17.79 |
| This paper | MMI-SGMM + DNN | MFCC | One-step environment-dependent DAE | 6.41 | 26.83 | 16.62 |
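
The Ave. column is easier to compare across teams when expressed as a relative WER reduction against the REVERB-challenge baseline. The sketch below is only an illustration of that arithmetic: the function name `relative_wer_reduction` and the dictionary `AVE_WER` are hypothetical, the numbers are copied from the Ave. column of Table 6, and relative reduction is a common reporting convention rather than a metric stated in the table itself.

```python
def relative_wer_reduction(reference_wer: float, system_wer: float) -> float:
    """Return the fraction of the reference WER removed by the system.

    A return value of 0.10 means a 10 % relative reduction.
    """
    if reference_wer <= 0:
        raise ValueError("reference_wer must be positive")
    return (reference_wer - system_wer) / reference_wer


# Average WERs (%) copied from the Ave. column of Table 6.
# The dictionary keys are illustrative labels, not identifiers from the paper.
AVE_WER = {
    "REVERB-challenge baseline": 36.38,
    "J. Alam et al. [45]": 21.8,
    "Y. Tachioka et al. [46]": 19.01,
    "This paper (MMI-SGMM)": 17.85,
    "This paper (DNN)": 17.79,
    "This paper (MMI-SGMM + DNN)": 16.62,
}

if __name__ == "__main__":
    baseline = AVE_WER["REVERB-challenge baseline"]
    for team, wer in AVE_WER.items():
        reduction = 100 * relative_wer_reduction(baseline, wer)
        print(f"{team}: {wer:.2f} % WER, {reduction:.1f} % relative reduction")
```

Passing a different reference value, for example the Ave. WER of Y. Tachioka et al. [46], gives the reduction relative to that system instead of the baseline.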