From: Predicting user mental states in spoken dialogue systems
Groups | Features | Physiological changes related to emotion |
---|---|---|
Pitch | Minimum value, maximum value, mean, median, standard deviation, value in the first voiced segment, value in the last voiced segment, correlation coefficient, slope, and error of the linear regression | Tension of the vocal folds and the subglottal air pressure |
First two formant frequencies and their bandwidths | Minimum value, maximum value, range, mean, median, standard deviation and value in the first and last voiced segments | Vocal tract resonances |
Energy | Minimum value, maximum value, mean, median, standard deviation, value in the first voiced segment, value in the last voiced segment, correlation coefficient, slope, and error of the linear regression | Vocal effort, arousal of emotions |
Rhythm | Speech rate, duration of voiced segments, duration of unvoiced segments, duration of longest voiced segment and number of unvoiced segments | Duration and stress conditions |
References | Hansen [59], Ververidis and Kotropoulos [60], Morrison et al. [61], and Batliner et al. [62] | |
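
As a rough illustration of how the pitch, energy, and rhythm features in the table might be computed, the following minimal sketch works on a per-frame contour and a per-frame voicing mask. It is not the authors' implementation. The function names `contour_statistics` and `rhythm_features`, the 10 ms frame length, the use of the first and last voiced frames as the "first" and "last" values, the population standard deviation, and the choice of root-mean-square residual as the regression "error" are all assumptions made for the example. Speech rate is omitted because the table does not say how it is measured.

```python
import math
import statistics


def contour_statistics(values):
    """Summary statistics for one contour (e.g. pitch in Hz or energy).

    `values` holds one sample per voiced frame, in time order.
    Assumption: frames are evenly spaced, so the frame index serves as time.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two voiced frames")
    times = list(range(n))
    mean_t = statistics.fmean(times)
    mean_v = statistics.fmean(values)

    # Least-squares fit of value = slope * t + intercept.
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    s_vv = sum((v - mean_v) ** 2 for v in values)
    slope = s_tv / s_tt
    intercept = mean_v - slope * mean_t

    # Assumption: the regression "error" is the root-mean-square residual.
    residuals = [v - (slope * t + intercept) for t, v in zip(times, values)]
    rmse = math.sqrt(sum(r * r for r in residuals) / n)

    # Pearson correlation between time and value; 0.0 for a flat contour.
    corr = s_tv / math.sqrt(s_tt * s_vv) if s_vv > 0 else 0.0

    return {
        "min": min(values),
        "max": max(values),
        "mean": mean_v,
        "median": statistics.median(values),
        "std": statistics.pstdev(values),  # assumption: population std
        "first": values[0],   # assumption: first voiced frame
        "last": values[-1],   # assumption: last voiced frame
        "correlation": corr,
        "slope": slope,
        "regression_error": rmse,
    }


def rhythm_features(voiced, frame_s=0.01):
    """Segment durations from a per-frame voicing mask (True = voiced).

    `frame_s` is the frame length in seconds (hypothetical default: 10 ms).
    """
    runs = []  # each entry is [is_voiced, length_in_frames]
    for flag in voiced:
        if runs and runs[-1][0] == flag:
            runs[-1][1] += 1
        else:
            runs.append([flag, 1])
    voiced_durs = [length * frame_s for is_v, length in runs if is_v]
    unvoiced_durs = [length * frame_s for is_v, length in runs if not is_v]
    return {
        "voiced_durations": voiced_durs,
        "unvoiced_durations": unvoiced_durs,
        "longest_voiced": max(voiced_durs, default=0.0),
        "num_unvoiced_segments": len(unvoiced_durs),
    }
```

The same `contour_statistics` function can be applied to the pitch contour and to the energy contour, since the table lists the same statistics for both groups.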