Table 1 Features employed for emotion detection from the acoustic signal

From: Predicting user mental states in spoken dialogue systems

| Groups | Features | Physiological changes related to emotion |
| --- | --- | --- |
| Pitch | Minimum value, maximum value, mean, median, standard deviation, value in the first voiced segment, value in the last voiced segment, correlation coefficient, slope, and error of the linear regression | Tension of the vocal folds and the subglottal air pressure |
| First two formant frequencies and their bandwidths | Minimum value, maximum value, range, mean, median, standard deviation, and value in the first and last voiced segments | Vocal tract resonances |
| Energy | Minimum value, maximum value, mean, median, standard deviation, value in the first voiced segment, value in the last voiced segment, correlation, slope, and error of the energy linear regression | Vocal effort, arousal of emotions |
| Rhythm | Speech rate, duration of voiced segments, duration of unvoiced segments, duration of longest voiced segment, and number of unvoiced segments | Duration and stress conditions |

References

Hansen [59], Ververidis and Kotropoulos [60], Morrison et al. [61] and Batliner et al. [62]
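
The contour statistics and rhythm measures listed in the table can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a per-frame contour (pitch or energy) in which `None` marks an unvoiced frame, takes the "value" of a voiced segment to be its mean, uses the population standard deviation, and reports the regression "error" as the root-mean-square residual. The function names, the dictionary keys, and the 10 ms frame step `hop_s` are hypothetical. Speech rate is left out because the table does not say which unit (syllables, words, or segments) it counts.

```python
import math
import statistics
from itertools import groupby


def runs(frames):
    """Split frames into consecutive (is_voiced, [(index, value), ...]) runs."""
    keyed = groupby(enumerate(frames), key=lambda p: p[1] is not None)
    return [(is_voiced, list(group)) for is_voiced, group in keyed]


def contour_features(frames):
    """Statistics of a pitch or energy contour; None marks an unvoiced frame."""
    voiced = [run for is_voiced, run in runs(frames) if is_voiced]
    points = [p for run in voiced for p in run]
    if len(points) < 2:
        raise ValueError("need at least two voiced frames")
    xs = [i for i, _ in points]
    ys = [v for _, v in points]

    # Least-squares line fitted to (frame index, value) over voiced frames.
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sq_residuals = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]

    return {
        "min": min(ys),
        "max": max(ys),
        "range": max(ys) - min(ys),
        "mean": mean_y,
        "median": statistics.median(ys),
        "std": statistics.pstdev(ys),  # assumption: population std
        # Assumption: a segment's "value" is the mean over its frames.
        "first_voiced": statistics.fmean(v for _, v in voiced[0]),
        "last_voiced": statistics.fmean(v for _, v in voiced[-1]),
        # Assumption: a flat contour is given correlation 0.
        "correlation": sxy / math.sqrt(sxx * syy) if syy > 0 else 0.0,
        "slope": slope,
        # Assumption: "error" is the root-mean-square residual.
        "regression_error": math.sqrt(statistics.fmean(sq_residuals)),
    }


def rhythm_features(frames, hop_s=0.01):
    """Durations in seconds, given a hypothetical frame step hop_s."""
    all_runs = runs(frames)
    voiced_lengths = [len(run) for is_voiced, run in all_runs if is_voiced]
    unvoiced_lengths = [len(run) for is_voiced, run in all_runs if not is_voiced]
    return {
        "voiced_duration": sum(voiced_lengths) * hop_s,
        "unvoiced_duration": sum(unvoiced_lengths) * hop_s,
        "longest_voiced_duration": max(voiced_lengths, default=0) * hop_s,
        "n_unvoiced_segments": len(unvoiced_lengths),
    }


if __name__ == "__main__":
    # Toy contour in Hz: two voiced segments separated by unvoiced frames.
    pitch = [None, 100.0, 110.0, 120.0, None, None, 130.0, 140.0, None]
    print(contour_features(pitch))
    print(rhythm_features(pitch))
```

The same `contour_features` function would apply to an energy contour, and, minus the regression entries, to each formant frequency or bandwidth track. The regression uses the frame index as the time axis, so the slope is expressed per frame rather than per second.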