Environment-dependent denoising autoencoder for distant-talking speech recognition

EURASIP Journal on Advances in Signal Processing

Table 6 Average WERs (%) on the Eval. dataset for the single-channel task, compared with the results of other teams. The multi-condition training dataset and a trigram language model were used by all teams

Team	Acoustic model	Feature	Dereverberation method	SimData	RealData	Ave.
REVERB-challenge baseline	GMM	MFCC	CMVN	25.27	47.48	36.38
J. Alam et al. [45]	DNN	MFCC	Maximum likelihood inverse filtering-based dereverberation	11.1	32.4	21.8
Y. Tachioka et al. [46]	MMI-SGMM	MFCC and PLP	Single-channel dereverberation with estimation of reverberation time	10.05	28.06	19.01
This paper	MMI-SGMM	MFCC	One-step environment-dependent DAE	7.04	28.66	17.85
This paper	DNN	MFCC	One-step environment-dependent DAE	7.46	28.11	17.79
This paper	MMI-SGMM + DNN	MFCC	One-step environment-dependent DAE	6.41	26.83	16.62