Table 4 Comparison of text-independent speaker recognition with related methods in recent years

From: Text-independent speaker recognition based on adaptive course learning loss and deep residual network

| Studies | Models | Encoding Layer | Loss | Vox1 EER (%) | Vox1-E EER (%) | Vox1-H EER (%) |
| --- | --- | --- | --- | --- | --- | --- |
| Xu et al. 2020 [34] | ResNet-50 | Average Dist. | Triplet+N-pair+Angular+softmax | 3.48 | - | - |
| Xie et al. 2019 [12] | Thin-ResNet-34 | GhostVLAD | Softmax | 3.22 | 3.13 | 5.06 |
| Yu et al. 2019 [35] | ResNet-50 | TAP | EAM-Softmax | 2.94 | - | - |
| Nagrani et al. 2020 [11] | Thin-ResNet-34 | GhostVLAD | Softmax | 2.87 | 2.95 | 4.93 |
| Jung et al. 2019 [36] | ResNet-34 | SPE | A-Softmax | 2.61 | - | - |
| Xiang et al. 2019 [32] | TDNN (x-vector) | - | AAM-Softmax | 2.24 | 2.76 | 4.73 |
| Chung et al. 2020 [23] | Fast ResNet-34 | TAP | AP | 2.22 | - | - |
| Kye et al. 2020 [33] | ResNet-34 | TAP | NP + Softmax | 2.08 | - | - |
| Ours | Res-CASP | CASP | ACLL | 1.76 | 1.91 | 3.24 |