Table 4 Comparison of text-independent speaker recognition with related methods in recent years

From: Text-independent speaker recognition based on adaptive course learning loss and deep residual network

     

| Studies | Models | Encoding Layer | Loss | Vox1 EER (%) | Vox1-E EER (%) | Vox1-H EER (%) |
|---|---|---|---|---|---|---|
| Xu et al. 2020 [34] | ResNet-50 | Average Dist. | Triplet + N-pair + Angular + Softmax | 3.48 | - | - |
| Xie et al. 2019 [12] | Thin-ResNet-34 | GhostVLAD | Softmax | 3.22 | 3.13 | 5.06 |
| Yu et al. 2019 [35] | ResNet-50 | TAP | EAM-Softmax | 2.94 | - | - |
| Nagrani et al. 2020 [11] | Thin-ResNet-34 | GhostVLAD | Softmax | 2.87 | 2.95 | 4.93 |
| Jung et al. 2019 [36] | ResNet-34 | SPE | A-Softmax | 2.61 | - | - |
| Xiang et al. 2019 [32] | TDNN (x-vector) | - | AAM-Softmax | 2.24 | 2.76 | 4.73 |
| Chung et al. 2020 [23] | Fast ResNet-34 | TAP | AP | 2.22 | - | - |
| Kye et al. 2020 [33] | ResNet-34 | TAP | NP + Softmax | 2.08 | - | - |
| Ours | Res-CASP | CASP | ACLL | 1.76 | 1.91 | 3.24 |