TY - BOOK
AU - Huang, X.
AU - Acero, A.
AU - Hon, H.-W.
N1 - Foreword by R. Reddy
PY - 2001
DA - 2001//
TI - Spoken Language Processing: A Guide to Theory, Algorithm, and System Development
PB - Prentice Hall PTR
CY - New Jersey
ID - Huang2001
ER - 

TY - BOOK
AU - Wölfel, M.
AU - McDonough, J.
PY - 2009
DA - 2009//
TI - Distant Speech Recognition
PB - Wiley
CY - New Jersey
UR - https://doi.org/10.1002/9780470714089
DO - 10.1002/9780470714089
ID - Wölfel2009
ER - 

TY - STD
TI - K Kinoshita, M Delcroix, T Yoshioka, T Nakatani, A Sehr, W Kellermann, R Maas, in Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013 IEEE Workshop On. The reverb challenge: a common evaluation framework for dereverberation and recognition of reverberant speech (IEEE, 2013), pp. 1–4.
ID - ref3
ER - 

TY - STD
TI - E Vincent, J Barker, S Watanabe, J Le Roux, F Nesta, M Matassoni, in Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference On. The second ’CHiME’ speech separation and recognition challenge: datasets, tasks and baselines (IEEE, 2013), pp. 126–130.
ID - ref4
ER - 

TY - STD
TI - M Harper, in Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop On. The automatic speech recognition in reverberant environments (ASpIRE) challenge (IEEE, 2015), pp. 547–554.
ID - ref5
ER - 

TY - BOOK
AU - Brandstein, M.
AU - Ward, D.
PY - 2001
DA - 2001//
TI - Microphone Arrays: Signal Processing Techniques and Applications
PB - Springer
CY - Berlin
UR - https://doi.org/10.1007/978-3-662-04619-7
DO - 10.1007/978-3-662-04619-7
ID - Brandstein2001
ER - 

TY - STD
TI - J McDonough, M Wölfel, in Hands-Free Speech Communication and Microphone Arrays, 2008. HSCMA 2008. Distant speech recognition: bridging the gaps (IEEE, 2008), pp. 108–114.
ID - ref7
ER - 

TY - STD
TI - ML Seltzer, in Hands-Free Speech Communication and Microphone Arrays, 2008. HSCMA 2008. Bridging the gap: towards a unified framework for hands-free speech recognition using microphone arrays (IEEE, 2008), pp. 104–107.
ID - ref8
ER - 

TY - JOUR
AU - Wolf, M.
AU - Nadeu, C.
PY - 2014
DA - 2014//
TI - Channel selection measures for multi-microphone speech recognition
JO - Speech Comm
VL - 57
UR - https://doi.org/10.1016/j.specom.2013.09.015
DO - 10.1016/j.specom.2013.09.015
ID - Wolf2014
ER - 

TY - STD
TI - I Himawan, P Motlicek, S Sridharan, D Dean, D Tjondronegoro, in INTERSPEECH. Channel selection in the short-time modulation domain for distant speech recognition, (2015), pp. 741–745.
ID - ref10
ER - 

TY - STD
TI - TN Sainath, RJ Weiss, KW Wilson, A Narayanan, M Bacchiani, in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference On. Factored spatial and spectral multichannel raw waveform CLDNNs (IEEE, 2016), pp. 5075–5079.
ID - ref11
ER - 

TY - STD
TI - X Xiao, S Watanabe, H Erdogan, L Lu, J Hershey, ML Seltzer, G Chen, Y Zhang, M Mandel, D Yu, in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference On. Deep beamforming networks for multi-channel speech recognition (IEEE, 2016), pp. 5745–5749.
ID - ref12
ER - 

TY - BOOK
AU - Naylor, P. A.
AU - Gaubitch, N. D.
PY - 2010
DA - 2010//
TI - Speech Dereverberation
PB - Springer
CY - Berlin
UR - https://doi.org/10.1007/978-1-84996-056-4
DO - 10.1007/978-1-84996-056-4
ID - Naylor2010
ER - 

TY - JOUR
AU - Hinton, G. E.
AU - Salakhutdinov, R. R.
PY - 2006
DA - 2006//
TI - Reducing the dimensionality of data with neural networks
JO - Science
VL - 313
UR - https://doi.org/10.1126/science.1127647
DO - 10.1126/science.1127647
ID - Hinton2006
ER - 

TY - JOUR
AU - Hinton, G.
AU - Osindero, S.
AU - Teh, Y.-W.
PY - 2006
DA - 2006//
TI - A fast learning algorithm for deep belief nets
JO - Neural Comput
VL - 18
UR - https://doi.org/10.1162/neco.2006.18.7.1527
DO - 10.1162/neco.2006.18.7.1527
ID - Hinton2006
ER - 

TY - JOUR
AU - Xu, Y.
AU - Du, J.
AU - Dai, L.-R.
AU - Lee, C.-H.
PY - 2014
DA - 2014//
TI - An experimental study on speech enhancement based on deep neural networks
JO - IEEE Signal Process. Lett
VL - 21
UR - https://doi.org/10.1109/LSP.2013.2291240
DO - 10.1109/LSP.2013.2291240
ID - Xu2014
ER - 

TY - JOUR
AU - Narayanan, A.
AU - Wang, D. L.
PY - 2014
DA - 2014//
TI - Investigation of speech separation as a front-end for noise robust speech recognition
JO - IEEE/ACM Trans. Audio Speech Lang. Process.
VL - 22
UR - https://doi.org/10.1109/TASLP.2014.2305833
DO - 10.1109/TASLP.2014.2305833
ID - Narayanan2014
ER - 

TY - JOUR
AU - Xu, Y.
AU - Du, J.
AU - Dai, L.-R.
AU - Lee, C.-H.
PY - 2015
DA - 2015//
TI - A regression approach to speech enhancement based on deep neural networks
JO - IEEE/ACM Trans. Audio Speech Lang. Process.
VL - 23
UR - https://doi.org/10.1109/TASLP.2014.2364452
DO - 10.1109/TASLP.2014.2364452
ID - Xu2015
ER - 

TY - JOUR
AU - Han, K.
AU - Wang, Y.
AU - Wang, D. L.
AU - Woods, W. S.
AU - Merks, I.
AU - Zhang, T.
PY - 2015
DA - 2015//
TI - Learning spectral mapping for speech dereverberation and denoising
JO - IEEE/ACM Trans. Audio Speech Lang. Process.
VL - 23
UR - https://doi.org/10.1109/TASLP.2015.2416653
DO - 10.1109/TASLP.2015.2416653
ID - Han2015
ER - 

TY - STD
TI - M Karafiát, F Grézl, L Burget, I Szöke, J Černocký, in INTERSPEECH. Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge, (2015), pp. 2454–2458.
ID - ref20
ER - 

TY - JOUR
AU - Kinoshita, K.
AU - Delcroix, M.
AU - Gannot, S.
AU - Habets, E.
AU - Haeb-Umbach, R.
AU - Kellermann, W.
AU - Leutnant, V.
AU - Maas, R.
AU - Nakatani, T.
AU - Raj, B.
AU - Sehr, A.
AU - Yoshioka, T.
PY - 2016
DA - 2016//
TI - A summary of the REVERB challenge: state-of-the-art and remaining challenges in reverberant speech processing research
JO - EURASIP J. Adv. Signal Process.
VL - 2016
UR - https://doi.org/10.1186/s13634-016-0306-6
DO - 10.1186/s13634-016-0306-6
ID - Kinoshita2016
ER - 

TY - STD
TI - L Couvreur, C Couvreur, C Ris, in INTERSPEECH. A corpus-based approach for robust ASR in reverberant environments, (2000), pp. 397–400.
ID - ref22
ER - 

TY - STD
TI - T Haderlein, E Nöth, W Herbordt, W Kellermann, H Niemann, in Text, Speech and Dialogue. Using artificially reverberated training data in distant-talking ASR (Springer, 2005), pp. 226–233.
ID - ref23
ER - 

TY - STD
TI - M Ravanelli, M Omologo, in INTERSPEECH. Contaminated speech training methods for robust DNN-HMM distant speech recognition, (2015), pp. 756–760.
ID - ref24
ER - 

TY - STD
TI - X Feng, Y Zhang, J Glass, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. Speech feature denoising and dereverberation via deep autoencoders for noisy reverberant speech recognition (IEEE, 2014), pp. 1759–1763.
ID - ref25
ER - 

TY - STD
TI - M Mimura, S Sakai, T Kawahara, in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On. Deep autoencoders augmented with phone-class feature for reverberant speech recognition (IEEE, 2015), pp. 4365–4369.
ID - ref26
ER - 

TY - STD
TI - F Weninger, S Watanabe, J Le Roux, JR Hershey, Y Tachioka, J Geiger, B Schuller, G Rigoll, in REVERB Workshop, Florence, Italy. The MERL/MELCO/TUM system for the REVERB challenge using deep recurrent neural network feature enhancement, (2014), pp. 1–8.
ID - ref27
ER - 

TY - STD
TI - F Weninger, S Watanabe, Y Tachioka, B Schuller, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. Deep recurrent de-noising auto-encoder and blind de-reverberation for reverberated speech recognition (IEEE, 2014), pp. 4623–4627.
ID - ref28
ER - 

TY - STD
TI - M Mimura, S Sakai, T Kawahara, in INTERSPEECH. Speech dereverberation using long short-term memory, (2015), pp. 2435–2439.
ID - ref29
ER - 

TY - JOUR
AU - Hochreiter, S.
AU - Schmidhuber, J.
PY - 1997
DA - 1997//
TI - Long short-term memory
JO - Neural Comput.
VL - 9
UR - https://doi.org/10.1162/neco.1997.9.8.1735
DO - 10.1162/neco.1997.9.8.1735
ID - Hochreiter1997
ER - 

TY - STD
TI - T Gao, J Du, L-R Dai, C-H Lee, in Acoust Speech Signal Process (ICASSP), 2015 IEEE Int Conf. Joint training of front-end and back-end deep neural networks for robust speech recognition, (2015), pp. 4375–4379.
ID - ref31
ER - 

TY - STD
TI - A Narayanan, D Wang, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. Joint noise adaptive training for robust automatic speech recognition (IEEE, 2014), pp. 2504–2508.
ID - ref32
ER - 

TY - STD
TI - Y Xu, J Du, Z Huang, L-R Dai, C-H Lee, in INTERSPEECH. Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement, (2015), pp. 1508–1512.
ID - ref33
ER - 

TY - STD
TI - R Giri, ML Seltzer, J Droppo, D Yu, in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference On. Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning (IEEE, 2015), pp. 5014–5018.
ID - ref34
ER - 

TY - BOOK
AU - Kuttruff, H.
PY - 2009
DA - 2009//
TI - Room Acoustics
PB - CRC Press
CY - Florida
ID - Kuttruff2009
ER - 

TY - STD
TI - V Tyagi, C Wellekens, in Acoustics, Speech and Signal Processing (ICASSP), 2005 IEEE International Conference On. On desensitizing the mel-cepstrum to spurious spectral components for robust speech recognition (IEEE, 2005), pp. 529–532.
ID - ref36
ER - 

TY - STD
TI - P Ghahremani, B BabaAli, D Povey, K Riedhammer, J Trmal, S Khudanpur, in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference On. A pitch extraction algorithm tuned for automatic speech recognition (IEEE, 2014), pp. 2494–2498.
ID - ref37
ER - 

TY - STD
TI - T Yoshioka, N Ito, M Delcroix, A Ogawa, K Kinoshita, M Fujimoto, C Yu, WJ Fabian, M Espi, T Higuchi, et al, in Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop On. The NTT CHiME-3 system: advances in speech enhancement and recognition for mobile multi-microphone devices (IEEE, 2015), pp. 436–443.
ID - ref38
ER - 

TY - STD
TI - M Delcroix, T Yoshioka, A Ogawa, Y Kubo, M Fujimoto, N Ito, K Kinoshita, M Espi, T Hori, T Nakatani, A Nakamura, in Proc. REVERB Challenge Workshop. Linear prediction-based dereverberation with advanced speech enhancement and recognition technologies for the REVERB challenge, (2014).
ID - ref39
ER - 

TY - STD
TI - J Barker, R Marxer, E Vincent, S Watanabe, in Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop On. The third ’CHiME’ speech separation and recognition challenge: dataset, task and baselines (IEEE, 2015), pp. 504–511.
ID - ref40
ER - 

TY - JOUR
AU - Yoshioka, T.
AU - Nakatani, T.
AU - Miyoshi, M.
AU - Okuno, H. G.
PY - 2011
DA - 2011//
TI - Blind separation and dereverberation of speech mixtures by joint optimization
JO - IEEE Trans. Audio Speech Lang. Process
VL - 19
UR - https://doi.org/10.1109/TASL.2010.2045183
DO - 10.1109/TASL.2010.2045183
ID - Yoshioka2011
ER - 

TY - STD
TI - J Du, Q Wang, T Gao, Y Xu, L-R Dai, C-H Lee, in INTERSPEECH. Robust speech recognition with speech enhanced deep neural networks, (2014), pp. 616–620.
ID - ref42
ER - 

TY - STD
TI - J Du, Q Wang, Y-H Tu, X Bao, L-R Dai, C-H Lee, in Automatic Speech Recognition and Understanding (ASRU), 2015 IEEE Workshop On. An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework (IEEE, 2015), pp. 430–435.
ID - ref43
ER - 

TY - STD
TI - Y Tachioka, S Watanabe, in INTERSPEECH. Uncertainty training and decoding methods of deep neural networks based on stochastic representation of enhanced features, (2015), pp. 3541–3545.
ID - ref44
ER - 

TY - JOUR
AU - Ueda, Y.
AU - Wang, L.
AU - Kai, A.
AU - Ren, B.
PY - 2015
DA - 2015//
TI - Environment-dependent denoising autoencoder for distant-talking speech recognition
JO - EURASIP J. Adv. Signal Process
VL - 2015
UR - https://doi.org/10.1186/s13634-015-0278-y
DO - 10.1186/s13634-015-0278-y
ID - Ueda2015
ER - 

TY - JOUR
AU - Ren, B.
AU - Wang, L.
AU - Lu, L.
AU - Ueda, Y.
AU - Kai, A.
PY - 2016
DA - 2016//
TI - Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition
JO - Multimed. Tools Appl
VL - 75
UR - https://doi.org/10.1007/s11042-015-2849-1
DO - 10.1007/s11042-015-2849-1
ID - Ren2016
ER - 

TY - STD
TI - Y LeCun, BE Boser, JS Denker, D Henderson, RE Howard, WE Hubbard, LD Jackel, in Advances in Neural Information Processing Systems. Handwritten digit recognition with a back-propagation network (Citeseer, 1990), pp. 396–404.
ID - ref47
ER - 

TY - JOUR
AU - Allen, J. B.
AU - Berkley, D. A.
PY - 1979
DA - 1979//
TI - Image method for efficiently simulating small-room acoustics
JO - J. Acoust. Soc. Am
VL - 65
UR - https://doi.org/10.1121/1.382599
DO - 10.1121/1.382599
ID - Allen1979
ER - 

TY - JOUR
AU - Peterson, P.
PY - 1986
DA - 1986//
TI - Simulating the response of multiple microphones to a single acoustic source in a reverberant room
JO - J. Acoust. Soc. Am
VL - 80
UR - https://doi.org/10.1121/1.394357
DO - 10.1121/1.394357
ID - Peterson1986
ER - 

TY - STD
TI - M Matassoni, A Brutti, P Svaizer, in Acoustic Signal Enhancement (IWAENC), 2014 14th International Workshop On. Acoustic modeling based on early-to-late reverberation ratio for robust ASR (IEEE, 2014), pp. 263–267.
ID - ref50
ER - 