Author

Hyunwoo Kim

Bio: Hyunwoo Kim is an academic researcher from Seoul National University. The author has contributed to research in topics: Pragmatics & Consistency (negotiation). The author has an h-index of 5, co-authored 10 publications receiving 122 citations.

Topics: Pragmatics, Consistency (negotiation), Automatic summarization, Empathy, Persona ...

Papers

Proceedings Article•DOI•

Abstractive Summarization of Reddit Posts with Multi-level Memory Networks

Byeongchang Kim¹, Hyunwoo Kim¹, Gunhee Kim¹•Institutions (1)

Seoul National University¹

01 Jun 2019

TL;DR: This article proposed a novel abstractive summarization model named multi-level memory networks (MMN), which is equipped with multilevel memory to store the information of text from different levels of abstraction.

Abstract: We address the problem of abstractive summarization in two directions: proposing a novel dataset and a new model. First, we collect Reddit TIFU dataset, consisting of 120K posts from the online discussion forum Reddit. We use such informal crowd-generated posts as text source, in contrast with existing datasets that mostly use formal documents as source such as news articles. Thus, our dataset could less suffer from some biases that key sentences usually located at the beginning of the text and favorable summary candidates are already inside the text in similar forms. Second, we propose a novel abstractive summarization model named multi-level memory networks (MMN), equipped with multi-level memory to store the information of text from different levels of abstraction. With quantitative evaluation and user studies via Amazon Mechanical Turk, we show the Reddit TIFU dataset is highly abstractive and the MMN outperforms the state-of-the-art summarization models.

86 citations

Proceedings Article•DOI•

Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness

Hyunwoo Kim¹, Byeongchang Kim¹, Gunhee Kim¹•Institutions (1)

Seoul National University¹

01 Nov 2020

TL;DR: Inspired by social cognition and pragmatics, this work endows existing dialogue agents with public self-consciousness on the fly through an imaginary listener, which leads the agents to refrain from uttering contradictions and improves the consistency of existing dialogue models.

Abstract: We explore the task of improving persona consistency of dialogue agents. Recent models tackling consistency often train with additional Natural Language Inference (NLI) labels or attach trained extra modules to the generative agent for maintaining consistency. However, such additional labels and training can be demanding. Also, we find even the best-performing persona-based agents are insensitive to contradictory words. Inspired by social cognition and pragmatics, we endow existing dialogue agents with public self-consciousness on the fly through an imaginary listener. Our approach, based on the Rational Speech Acts framework (Frank and Goodman, 2012), can enforce dialogue agents to refrain from uttering contradiction. We further extend the framework by learning the distractor selection, which has been usually done manually or randomly. Results on Dialogue NLI (Welleck et al., 2019) and PersonaChat (Zhang et al., 2018) dataset show that our approach reduces contradiction and improves consistency of existing dialogue models. Moreover, we show that it can be generalized to improve context-consistency beyond persona in dialogues.

34 citations

Proceedings Article•DOI•

ProsocialDialog: A Prosocial Backbone for Conversational Agents

Hyunwoo Kim, Youngjae Yu, Liwei Jiang, Ximing Lu, Daniel Khashabi, Gunhee Kim, Yejin Choi, Maarten Sap

25 May 2022

TL;DR: This work introduces ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms, along with a dialogue safety detection module, Canary, capable of generating RoTs given conversational context, and a socially-informed dialogue agent, Prost.

Abstract: Most existing dialogue systems fail to respond properly to potentially unsafe user utterances by either ignoring or passively agreeing with them. To address this issue, we introduce ProsocialDialog, the first large-scale multi-turn dialogue dataset to teach conversational agents to respond to problematic content following social norms. Covering diverse unethical, problematic, biased, and toxic situations, ProsocialDialog contains responses that encourage prosocial behavior, grounded in commonsense social rules (i.e., rules-of-thumb, RoTs). Created via a human-AI collaborative framework, ProsocialDialog consists of 58K dialogues, with 331K utterances, 160K unique RoTs, and 497K dialogue safety labels accompanied by free-form rationales. With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating RoTs given conversational context, and a socially-informed dialogue agent, Prost. Empirical results show that Prost generates more socially acceptable dialogues compared to other state-of-the-art language and dialogue models in both in-domain and out-of-domain settings. Additionally, Canary effectively guides conversational agents and off-the-shelf language models to generate significantly more prosocial responses. Our work highlights the promise and importance of creating and steering conversational AI to be socially responsible.

31 citations

Proceedings Article•

Curiosity-Bottleneck: Exploration By Distilling Task-Specific Novelty.

Youngjin Kim¹, Wontae Nam, Hyunwoo Kim¹, Ji-Hoon Kim², Gunhee Kim¹•Institutions (2)

Seoul National University¹, Naver Corporation²

24 May 2019

TL;DR: This work shows that Curiosity-Bottleneck learns an effective exploration strategy by robustly measuring the state novelty in distractive environments where state-of-the-art exploration methods often degenerate.

Abstract: Exploration based on state novelty has brought great success in challenging reinforcement learning problems with sparse rewards. However, existing novelty-based strategies become inefficient in real-world problems where observation contains not only task-dependent state novelty of our interest but also task-irrelevant information that should be ignored. We introduce an information-theoretic exploration strategy named Curiosity-Bottleneck that distills task-relevant information from observation. Based on the information bottleneck principle, our exploration bonus is quantified as the compressiveness of observation with respect to the learned representation of a compressive value network. With extensive experiments on static image classification, grid-world and three hard-exploration Atari games, we show that Curiosity-Bottleneck learns an effective exploration strategy by robustly measuring the state novelty in distractive environments where state-of-the-art exploration methods often degenerate.

25 citations

Journal Article•DOI•

SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

Hyunwoo Kim, Jack Hessel, Liwei Jiang, Ximing Lu, Youngjae Yu, Pei Zhou, Ronan LeBras, Malihe Alikhani, Gunhee Kim, Maarten Sap, Yejin Choi

20 Dec 2022-arXiv.org

TL;DR: The authors distill 1.5M socially-grounded dialogues from a large language model by contextualizing social commonsense knowledge from a knowledge graph and train a generalizable conversation model that is significantly more natural and consistent on unseen datasets.

Abstract: We present SODA: the first publicly available, million-scale high-quality social dialogue dataset. In contrast to most existing crowdsourced, small-scale dialogue corpora, we distill 1.5M socially-grounded dialogues from a large language model (InstructGPT; Ouyang et al., 2022). Dialogues are distilled by contextualizing social commonsense knowledge from a knowledge graph (Atomic10x; West et al., 2022). Human evaluation shows that dialogues in SODA are more consistent, specific, and (surprisingly) natural than those in prior human-authored datasets. Using SODA, we train COSMO: a generalizable conversation model that is significantly more natural and consistent on unseen datasets than best-performing conversation models (e.g., GODEL, BlenderBot-1, Koala, Vicuna). Experiments reveal COSMO is sometimes even preferred to the original human-written gold responses. Additionally, our results shed light on the distinction between knowledge-enriched conversations and natural social chitchats. We make our data, models, and code public.

15 citations

Cited by

Posted Content•

Big Bird: Transformers for Longer Sequences

Manzil Zaheer, Guru Guruganesh, Avinava Dubey¹, Joshua Ainslie, Chris Alberti, Santiago Ontañón, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed•Institutions (1)

Google¹

28 Jul 2020-arXiv: Learning

TL;DR: It is shown that BigBird is a universal approximator of sequence functions and is Turing complete, thereby preserving these properties of the quadratic, full attention model.

Abstract: Transformers-based models, such as BERT, have been one of the most successful deep learning models for NLP. Unfortunately, one of their core limitations is the quadratic dependency (mainly in terms of memory) on the sequence length due to their full attention mechanism. To remedy this, we propose, BigBird, a sparse attention mechanism that reduces this quadratic dependency to linear. We show that BigBird is a universal approximator of sequence functions and is Turing complete, thereby preserving these properties of the quadratic, full attention model. Along the way, our theoretical analysis reveals some of the benefits of having $O(1)$ global tokens (such as CLS), that attend to the entire sequence as part of the sparse attention mechanism. The proposed sparse attention can handle sequences of length up to 8x of what was previously possible using similar hardware. As a consequence of the capability to handle longer context, BigBird drastically improves performance on various NLP tasks such as question answering and summarization. We also propose novel applications to genomics data.

939 citations

Posted Content•

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Jingqing Zhang¹, Yao Zhao², Mohammad Saleh², Peter J. Liu²•Institutions (2)

Imperial College London¹, Google²

18 Dec 2019-arXiv: Computation and Language

TL;DR: This work proposes pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective, PEGASUS, and demonstrates it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores.

Abstract: Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples. Finally we validated our results using human evaluation and show that our model summaries achieve human performance on multiple datasets.

765 citations

Proceedings Article•

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Jingqing Zhang¹, Yao Zhao², Mohammad Saleh², Peter J. Liu²•Institutions (2)

Imperial College London¹, Google²

12 Jul 2020

TL;DR: This article proposed to pre-train a large Transformer-based encoder-decoder model with a self-supervised objective for abstractive text summarization and achieved state-of-the-art performance on 12 downstream summarization tasks.

457 citations

Journal Article•DOI•

Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

Sascha Rothe¹, Shashi Narayan¹, Aliaksei Severyn¹•Institutions (1)

Google¹

17 Jun 2020-Transactions of the Association for Computational Linguistics

TL;DR: A Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, GPT-2, and RoBERTa checkpoints is developed and an extensive empirical study on the utility of initializing the model, both encoder and decoder, with these checkpoints is conducted.

Abstract: Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By warm-starting from the publicly released checkpoints, NLP practitioners have pushed the ...

350 citations

Proceedings Article•DOI•

Extractive Summarization as Text Matching

Ming Zhong¹, Pengfei Liu¹, Yiran Chen¹, Danqing Wang¹, Xipeng Qiu¹, Xuanjing Huang¹•Institutions (1)

Fudan University¹

01 Jul 2020

TL;DR: This paper formulates the extractive summarization task as a semantic text matching problem, in which a source document and candidate summaries are matched in a semantic space.

Abstract: This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems. Instead of following the commonly used framework of extracting sentences individually and modeling the relationship between sentences, we formulate the extractive summarization task as a semantic text matching problem, in which a source document and candidate summaries will be (extracted from the original text) matched in a semantic space. Notably, this paradigm shift to semantic matching framework is well-grounded in our comprehensive analysis of the inherent gap between sentence-level and summary-level extractors based on the property of the dataset. Besides, even instantiating the framework with a simple form of a matching model, we have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1). Experiments on the other five datasets also show the effectiveness of the matching framework. We believe the power of this matching-based summarization framework has not been fully exploited. To encourage more instantiations in the future, we have released our codes, processed dataset, as well as generated summaries in https://github.com/maszhongming/MatchSum.

317 citations
