Table 3 Parameters used for TDNN-F training

From: Free resources for forced phonetic alignment in Brazilian Portuguese based on Kaldi toolkit

| Parameter | Value |
| --- | --- |
| Number of TDNN-F layers | 12 |
| Number of epochs | 10 |
| Time strides (TDNN-F layers) | {1, 1, 1, 0, 3, 3, 3, 3, 3, 3, 3, 3} |
| Dimension | 768 on both TDNN-F and prefinal layers |
| Bottleneck dimension | 96 on TDNN-F layers, 192 on prefinal layer |
| Bypass scale | 0.66 |
| Frame subsampling factor | 3 |
| Regularization parameter | 0.015 for output layer, 0.03 otherwise |
| Learning rate | 0.002 down to 0.0002 |
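
To illustrate how the per-layer values in Table 3 might map onto a network configuration, the following is a minimal Python sketch that prints one `tdnnf-layer` line per TDNN-F layer. The option names (`l2-regularize`, `bypass-scale`, `dim`, `bottleneck-dim`, `time-stride`) follow the style of Kaldi's public chain recipes. The layer names (`tdnnf1` ... `tdnnf12`), the starting index, and the helper function itself are assumptions made for illustration, not the authors' actual training script.

```python
# Hyperparameters copied from Table 3.
TIME_STRIDES = [1, 1, 1, 0, 3, 3, 3, 3, 3, 3, 3, 3]  # one entry per TDNN-F layer
DIM = 768               # layer dimension
BOTTLENECK_DIM = 96     # bottleneck dimension of the TDNN-F layers
BYPASS_SCALE = 0.66
L2_REGULARIZE = 0.03    # value used for all layers except the output layer


def tdnnf_xconfig_lines(first_index=1):
    """Return one xconfig-style line per TDNN-F layer.

    The layer naming scheme and first_index are hypothetical; adjust them
    to match the surrounding network definition.
    """
    lines = []
    for offset, stride in enumerate(TIME_STRIDES):
        lines.append(
            f"tdnnf-layer name=tdnnf{first_index + offset} "
            f"l2-regularize={L2_REGULARIZE} bypass-scale={BYPASS_SCALE} "
            f"dim={DIM} bottleneck-dim={BOTTLENECK_DIM} time-stride={stride}"
        )
    return lines


if __name__ == "__main__":
    print("\n".join(tdnnf_xconfig_lines()))
```

The sketch covers only the TDNN-F layers. The prefinal layer (dimension 768, bottleneck 192) and the output layer (regularization 0.015) would need their own lines, and the epoch count, frame subsampling factor, and learning-rate range are passed to the training script rather than written into the network configuration.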