Table 2 Speech corpora used to train acoustic models

From: Free resources for forced phonetic alignment in Brazilian Portuguese based on Kaldi toolkit

| Dataset | Refs. | Hours | Words | Speakers |
|---|---|---|---|---|
| LapsStory | [11] | 5 h:18 m | 8257 | 5 |
| LapsBenchmark | [11] | 0 h:54 m | 2731 | 35 |
| Constitution | [44] | 8 h:58 m | 5330 | 1 |
| Consumer protection code | [44] | 1 h:25 m | 2003 | 1 |
| Spoltech LDC | [45] | 4 h:19 m | 1145 | 475 |
| West point LDC | [46] | 5 h:22 m | 484 | 70 |
| CETUC | [47] | 144 h:39 m | 3528 | 101 |
| Total | | 170 h:51 m | 14,518 | 687 |