Reverberant speech recognition exploiting clarity index estimation

EURASIP Journal on Advances in Signal Processing

Table 3 Average WER (%) obtained on the evaluation data set

Method	Clean Avg.	Sim. Avg.	Real Avg.	Avg.
Clean-cond.	12.21	52.22	89.17	48.04
Multi-cond.	30.13	29.50	56.94	34.67
NIRA-CART
Clean&Multi cond.	13.51	29.29	56.94	30.02
C50FV	28.65	29.72	56.84	34.37
C50HLDA	25.52	27.78	55.00	32.12
MS3	22.17	27.22	54.57	30.82
MS3+C50HLDA	19.90	25.24	52.51	28.75
MS5	22.32	26.35	54.38	30.35
MS5+C50HLDA	20.07	24.80	52.65	28.58
MS8	21.57	26.10	53.17	29.80
MS8+C50HLDA	19.69	24.08	51.04	27.79
MS11	21.10	26.04	56.62	30.26
MS11+C50HLDA	19.83	24.24	53.13	28.30
MS14	21.34	25.97	55.13	30.02
MS14+C50HLDA	19.38	23.75	52.31	27.76
MS18	21.96	25.97	55.85	30.32
MS18+C50HLDA	20.73	23.95	53.12	28.38
NIRA-BLSTM
Clean&Multi cond.	12.35	29.06	56.94	29.58
C50FV	28.75	29.65	56.91	34.37
C50HLDA	25.86	27.75	54.56	32.12
MS3	20.67	26.79	53.44	29.98
MS3+C50HLDA	18.70	24.57	52.56	28.07
MS5	21.33	26.24	53.87	29.93
MS5+C50HLDA	19.42	24.46	52.07	28.11
MS8	19.97	25.51	53.14	29.03
MS8+C50HLDA	18.58	23.61	50.96	27.22
MS11	18.64	25.40	54.73	28.90
MS11+C50HLDA	17.76	23.47	51.92	27.09
MS14	18.99	25.09	54.31	28.75
MS14+C50HLDA	17.50	23.17	52.15	26.90
MS18	18.40	25.08	56.00	28.89
MS18+C50HLDA	16.96	23.30	52.64	26.91

The first two rows correspond to the baseline methods; the remaining rows are the methods proposed in this work. The best result in each column is shown in italics.
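To compare a proposed configuration against a baseline row, one common summary is the relative WER reduction. The sketch below is only illustrative: the source does not state that it uses this measure, the function name `relative_wer_reduction` is hypothetical, and the pair of rows compared (the Multi-cond. baseline and NIRA-BLSTM MS14+C50HLDA, overall Avg. column) is chosen here as an example.

```python
def relative_wer_reduction(baseline_wer: float, proposed_wer: float) -> float:
    """Return the relative WER reduction in percent (positive = fewer errors).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return 100.0 * (baseline_wer - proposed_wer) / baseline_wer


# Overall average WER (%) values copied from Table 3.
multi_cond_baseline = 34.67   # Multi-cond. baseline, Avg. column
blstm_ms14_c50hlda = 26.90    # NIRA-BLSTM, MS14+C50HLDA, Avg. column

reduction = relative_wer_reduction(multi_cond_baseline, blstm_ms14_c50hlda)
print(f"Relative WER reduction: {reduction:.1f}%")
```

The same function can be applied to any pair of entries in a single column, for example the Real Avg. values, as long as both numbers come from the same evaluation condition.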