Overlapping speech, utterance duration and affective content in HHI and HCI — A comparison
Citations
Leveraging LSTM Models for Overlap Detection in Multi-Party Meetings
"Alexa in the wild" - Collecting Unconstrained Conversations with a Modern Voice Assistant in a Public Environment.
Aggression recognition using overlapping speech
Anticipating the User: Acoustic Disposition Recognition in Intelligent Interactions
Human agency beliefs affect older adults' interaction behaviours and task performance when learning with computerised partners
References
Pauses, gaps and overlaps in conversations
Cognitive Infocommunications (Coginfocom)
Acoustic emotion recognition: A benchmark comparison of performances
SmartKom: foundations of multimodal dialogue systems
Turn-competitive incomings
Frequently Asked Questions (11)
Q2. What have the authors stated for future works in "Overlapping speech, utterance duration and affective content in HHI and HCI – A comparison"?
In their further research, the authors will develop a robust automatic identification of the different types of overlap. Together with the recognition of the user's affective state, this brings them a step closer to future Cognitive Infocommunication systems acting as companions to human users [13], [14].
Q3. What test was used to test the significance of the difference between the two utterance lengths?
The authors used the non-parametric Mann-Whitney U test to assess the significance of the difference between the utterance lengths.
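The Mann-Whitney U test compares two independent samples by pooling and ranking all values. Below is a minimal pure-Python sketch of the U statistic (rank-sum form, with mid-ranks for ties); it is an illustration of the test, not the authors' implementation, and `utterance_lengths_hhi` / `utterance_lengths_hci` in the usage comment are hypothetical variable names.

```python
def mann_whitney_u(a, b):
    """Mann-Whitney U statistic for two independent samples a and b."""
    # Pool both samples, remembering the source of each value.
    pooled = sorted((v, src) for src, sample in ((0, a), (1, b)) for v in sample)
    # Assign mid-ranks: tied values share the average of their rank positions.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + j + 1) / 2.0  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = mid_rank
        i = j
    # Rank sum of the first sample, then U1 = R1 - n1(n1+1)/2.
    n1, n2 = len(a), len(b)
    r1 = sum(r for r, (_, src) in zip(ranks, pooled) if src == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    # Report the smaller of U1 and U2 (the conventional test statistic).
    return min(u1, n1 * n2 - u1)

# Fully separated samples give the extreme value U = 0:
print(mann_whitney_u([1, 2, 3], [4, 5, 6]))  # → 0.0
# e.g. mann_whitney_u(utterance_lengths_hhi, utterance_lengths_hci)
```

The significance decision would additionally require a p-value (e.g. via the normal approximation for large samples), which libraries such as `scipy.stats.mannwhitneyu` provide.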
Q4. How many utterances are in the corpus?
The considered set of the SmartKom corpus contains 438 emotionally labeled dialogs with 12,076 utterances in total and 6,079 user utterances.
Q5. How many utterances are marked to contain overlapping speech?
Of the currently available 27,000 utterances in 1,600 dialogs 5,100 utterances (18.9%, 830 dialogs) are marked to contain overlapping speech.
Q6. What is the effect of overlapping speech?
For this investigation, the authors showed that overlapping speech goes along with changes in the affective states of dominance and valence in certain situations.
Q7. What annotation level is used for the Davero corpus?
This corpus has several annotation levels, of which the turn segmentation and an affective annotation based on the acoustic channel are used for their investigation [22].
Q8. What is the possible application of their investigations in HHI and HCI?
A possible application of their investigations in HHI and HCI is the identification of parts where the affective state changes, based on the knowledge of overlapping speech and the dialog course.
Q9. How did the corpus authors measure the annotation correctness?
The corpus authors only measured the annotation correctness by comparing the results of different annotation rounds rather than calculating an inter-rater agreement measure such as Krippendorff's alpha or Fleiss' kappa.
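Fleiss' kappa, one of the agreement measures mentioned in the answer, corrects raw per-item agreement for the agreement expected by chance. A minimal sketch of the standard formula follows; it is a textbook illustration, not part of the corpus authors' workflow, and the example rating matrix is invented.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a matrix counts[i][j] = number of raters
    assigning item i to category j (equal raters per item)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed per-item agreement P_i, averaged over items.
    p_items = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
               for row in counts]
    p_bar = sum(p_items) / n_items
    # Chance agreement P_e from the overall category proportions.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_cats)]
    p_e = sum(p * p for p in p_cat)
    return (p_bar - p_e) / (1 - p_e)

# Three raters, two items, two categories, perfect agreement → kappa = 1.0:
print(fleiss_kappa([[3, 0], [0, 3]]))  # → 1.0
```

Krippendorff's alpha additionally handles missing ratings and varying numbers of raters per item, which is why it is often preferred for real annotation campaigns.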
Q10. How many overlapping speech segments do the authors have?
The authors have 6,347 user utterances, of which 817 contain overlapping speech; only these overlapping utterances are taken into account.
Q11. How many annotators did the authors employ to conduct the affective assessment?
To conduct the affective assessment, the authors first employed a few annotators to manually segment the recordings into single dialogs including the speaker turns.