Open Access Proceedings Article
DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset
Yanran Li, Hui Su, Xiaoyu Shen, Wenjie Li, Ziqiang Cao, Shuzi Niu
- Vol. 1, pp 986-995
TL;DR: This paper develops DailyDialog, a high-quality multi-turn dialog dataset whose language is human-written and less noisy, and whose dialogues reflect everyday communication and cover various topics of daily life.
Abstract:
We develop a high-quality multi-turn dialog dataset, DailyDialog, which is intriguing in several aspects. The language is human-written and less noisy. The dialogues in the dataset reflect the way we communicate in daily life and cover various topics about our daily life. We also manually label the developed dataset with communication intention and emotion information. Then, we evaluate existing approaches on the DailyDialog dataset and hope it benefits the research field of dialog systems. The dataset is available at http://yanran.li/dailydialog
Citations
Posted Content
Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset
TL;DR: This work proposes a new benchmark for empathetic dialogue generation and EmpatheticDialogues, a novel dataset of 25k conversations grounded in emotional situations, and presents empirical comparisons of dialogue model adaptations for empathetic responding, leveraging existing models or datasets without requiring lengthy re-training of the full model.
Proceedings Article
GoEmotions: A Dataset of Fine-Grained Emotions
TL;DR: GoEmotions, the largest manually annotated dataset of 58k English Reddit comments, labeled for 27 emotion categories or Neutral is introduced, and the high quality of the annotations via Principal Preserved Component Analysis is demonstrated.
Proceedings Article
MojiTalk: Generating Emotional Responses at Scale
Xianda Zhou, William Yang Wang
TL;DR: This paper collects a large corpus of Twitter conversations that include emojis in the response and investigates several conditional variational autoencoders trained on these conversations, which allow emojis to be used to control the emotion of the generated text.
Proceedings Article
Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset
TL;DR: This article proposes a new benchmark for empathetic dialogue generation and EmpatheticDialogues, a novel dataset of 25k conversations grounded in emotional situations. Experiments indicate that dialogue models that use this dataset are perceived as more empathetic by human evaluators than models trained only on large-scale Internet conversation data.
Proceedings Article
An Analysis of Annotated Corpora for Emotion Classification in Text
TL;DR: A survey of the datasets is carried out, and a subset of corpora is better classified with models trained on a different corpus, which simplifies the choice of the most appropriate resources for developing a model for a novel domain.
References
Proceedings Article
A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues
Iulian Vlad Serban, Alessandro Sordoni, Ryan Lowe, Laurent Charlin, Joelle Pineau, Aaron Courville, Yoshua Bengio
TL;DR: The authors proposed a neural network-based generative architecture with stochastic latent variables that span a variable number of time steps to generate meaningful, long and diverse responses and maintain dialogue state.
News from OPUS — A collection of multilingual parallel corpora with tools and interfaces
TL;DR: This article introduces resources that have recently been added to OPUS and discusses the alignment of movie subtitles and the conversion of biomedical documents and localization data to a sentence-aligned XML format.
Proceedings Article
Learning End-to-End Goal-Oriented Dialog
TL;DR: In this article, an end-to-end dialog system based on memory networks is proposed for goal-oriented reservation systems, which can reach promising, yet imperfect, performance and learn to perform non-trivial operations.
Proceedings Article
A Robust System for Natural Spoken Dialogue
TL;DR: An evaluation of the system using time-to-completion and the quality of the final solution suggests that most native speakers of English can use the system successfully with virtually no training.
Proceedings Article
A Dataset for Research on Short-Text Conversations
TL;DR: This paper introduces a dataset of short-text conversations based on real-world instances from Sina Weibo, which provides a rich collection of instances for research on finding natural and relevant short responses to a given short text, and is useful for both training and testing conversation models.