Table 4 WER (%) comparisons on RealData for distant-talking conditions with single-channel speech input

From: Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition

System           Room1   Room2   Room3   Avg
Single-channel systems
Multi-1          18.30   27.07   36.46   27.28
Multi-2          16.56   25.59   33.71   25.29
DNN-JT1          15.43   24.30   31.10   23.61
DNN-JT2          14.74   23.77   30.20   22.90
DNN-JT3          16.50   25.69   33.30   25.16
DNN-JT4          15.04   23.64   29.84   22.84
Multi-2(10HL)    16.90   27.04   34.89   26.28
Note: Room1 is a living room, Room2 is a conference room, and Room3 is a classroom.
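
The Avg column is consistent with an unweighted mean of the three per-room WERs, rounded to two decimals. The table does not state this, so treat it as an assumption. The short Python sketch below recomputes that mean from the table values and adds a hypothetical helper, `relative_reduction`, for comparing two systems. The helper and all variable names are illustrative and are not taken from the paper.

```python
# Per-room WER (%) for Room1, Room2, Room3, copied from Table 4.
wer = {
    "Multi-1": (18.30, 27.07, 36.46),
    "Multi-2": (16.56, 25.59, 33.71),
    "DNN-JT1": (15.43, 24.30, 31.10),
    "DNN-JT2": (14.74, 23.77, 30.20),
    "DNN-JT3": (16.50, 25.69, 33.30),
    "DNN-JT4": (15.04, 23.64, 29.84),
    "Multi-2(10HL)": (16.90, 27.04, 34.89),
}

# Avg column as reported in Table 4.
reported_avg = {
    "Multi-1": 27.28,
    "Multi-2": 25.29,
    "DNN-JT1": 23.61,
    "DNN-JT2": 22.90,
    "DNN-JT3": 25.16,
    "DNN-JT4": 22.84,
    "Multi-2(10HL)": 26.28,
}


def mean_wer(rooms):
    """Unweighted mean over rooms (assumed definition of the Avg column)."""
    return sum(rooms) / len(rooms)


def relative_reduction(baseline, system):
    """Relative WER reduction of `system` over `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


if __name__ == "__main__":
    for name, rooms in wer.items():
        print(f"{name:14s} mean={mean_wer(rooms):.2f} reported={reported_avg[name]:.2f}")
    # Example comparison: best joint-training system against the Multi-1 baseline.
    gain = relative_reduction(reported_avg["Multi-1"], reported_avg["DNN-JT4"])
    print(f"DNN-JT4 vs Multi-1: {gain:.1f}% relative WER reduction")
```

Relative reduction is used here only as a common way to compare WERs across systems; the table itself reports absolute WERs.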