Showing papers on "Natural language understanding published in 2020"

PDF

Open Access

Proceedings Article•DOI•

TinyBERT: Distilling BERT for Natural Language Understanding

[...]

Xiaoqi Jiao¹, Yichun Yin², Lifeng Shang², Xin Jiang², Xiao Chen², Linlin Li², Fang Wang², Qun Liu² - Show less +4 more•Institutions (2)

Huazhong University of Science and Technology¹, Huawei²

01 Nov 2020

TL;DR: TinyBERT as discussed by the authors proposes a two-stage learning framework for TinyBERT, which performs transformer distillation at both the pre-training and task-specific learning stages to capture the general-domain as well as the task specific knowledge in BERT.

...read moreread less

Abstract: Language model pre-training, such as BERT, has significantly improved the performances of many natural language processing tasks. However, pre-trained language models are usually computationally expensive, so it is difficult to efficiently execute them on resource-restricted devices. To accelerate inference and reduce model size while maintaining accuracy, we first propose a novel Transformer distillation method that is specially designed for knowledge distillation (KD) of the Transformer-based models. By leveraging this new KD method, the plenty of knowledge encoded in a large “teacher” BERT can be effectively transferred to a small “student” TinyBERT. Then, we introduce a new two-stage learning framework for TinyBERT, which performs Transformer distillation at both the pre-training and task-specific learning stages. This framework ensures that TinyBERT can capture the general-domain as well as the task-specific knowledge in BERT. TinyBERT4 with 4 layers is empirically effective and achieves more than 96.8% the performance of its teacher BERT-Base on GLUE benchmark, while being 7.5x smaller and 9.4x faster on inference. TinyBERT4 is also significantly better than 4-layer state-of-the-art baselines on BERT distillation, with only ~28% parameters and ~31% inference time of them. Moreover, TinyBERT6 with 6 layers performs on-par with its teacher BERT-Base.

...read moreread less

665 citations

Proceedings Article•DOI•

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data.

[...]

Emily M. Bender¹, Alexander Koller²•Institutions (2)

University of Washington¹, Saarland University²

01 Jul 2020

TL;DR: It is argued that a system trained only on form has a priori no way to learn meaning, and a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.

...read moreread less

Abstract: The success of the large neural language models on many NLP tasks is exciting. However, we find that these successes sometimes lead to hype in which these models are being described as ``understanding'' language or capturing ``meaning''. In this position paper, we argue that a system trained only on form has a priori no way to learn meaning. In keeping with the ACL 2020 theme of ``Taking Stock of Where We've Been and Where We're Going'', we argue that a clear understanding of the distinction between form and meaning will help guide the field towards better science around natural language understanding.

...read moreread less

499 citations

Proceedings Article•DOI•

TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue

[...]

Chien-Sheng Wu¹, Steven C. H. Hoi¹, Richard Socher¹, Caiming Xiong¹•Institutions (1)

Salesforce.com¹

15 Apr 2020

TL;DR: The experimental results show that the pre-trained task- oriented dialogue BERT (ToD-BERT) surpasses BERT and other strong baselines in four downstream task-oriented dialogue applications, including intention detection, dialogue state tracking, dialogue act prediction, and response selection.

...read moreread less

Abstract: The underlying difference of linguistic patterns between general text and task-oriented dialogue makes existing pre-trained language models less useful in practice. In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling. To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling. We propose a contrastive objective function to simulate the response selection task. Our pre-trained task-oriented dialogue BERT (TOD-BERT) outperforms strong baselines like BERT on four downstream task-oriented dialogue applications, including intention recognition, dialogue state tracking, dialogue act prediction, and response selection. We also show that TOD-BERT has a stronger few-shot ability that can mitigate the data scarcity problem for task-oriented dialogue.

...read moreread less

206 citations

Journal Article•DOI•

Semantics-Aware BERT for Language Understanding

[...]

Zhuosheng Zhang¹, Yuwei Wu¹, Hai Zhao¹, Zuchao Li¹, Shuailiang Zhang¹, Xi Zhou, Xiang Zhou - Show less +3 more•Institutions (1)

Shanghai Jiao Tong University¹

03 Apr 2020

TL;DR: SemBERT as discussed by the authors incorporates explicit contextual semantics from pre-trained semantic role labeling, and introduces an improved language representation model, Semantics-aware BERT, which is capable of explicitly absorbing contextual semantics over a BERT backbone.

...read moreread less

Abstract: The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of success especially in various machine reading comprehension and natural language inference tasks. However, the existing language representation models including ELMo, GPT and BERT only exploit plain context-sensitive features such as character or word embeddings. They rarely consider incorporating structured semantic information which can provide rich semantics for language representation. To promote natural language understanding, we propose to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduce an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone. SemBERT keeps the convenient usability of its BERT precursor in a light fine-tuning way without substantial task-specific modifications. Compared with BERT, semantics-aware BERT is as simple in concept but more powerful. It obtains new state-of-the-art or substantially improves results on ten reading comprehension and language inference tasks.

...read moreread less

199 citations

Proceedings Article•DOI•

CLUE: A Chinese Language Understanding Evaluation Benchmark

[...]

Liang Xu, Hai Hu¹, Xuanwei Zhang, Lu Li², Chenjie Cao³, Yudong Li, Yechen Xu, Kai Sun⁴, Dian Yu⁵, Cong Yu, Yin Tian, Qianqian Dong, Weitang Liu, Bo Shi, Yiming Cui⁶, Junyi Li, Jun Zeng, Rongzhao Wang, Weijian Xie, Yanting Li⁷, Yina Patterson, Zuoyu Tian¹, Yiwen Zhang¹, He Zhou¹, Shaoweihua Liu, Zhe Zhao⁵, Qipeng Zhao, Cong Yue, Xinrui Zhang, Zhengliang Yang, Kyle Richardson⁸, Zhenzhong Lan⁹ - Show less +28 more•Institutions (9)

Indiana University¹, Central China Normal University², East China University of Science and Technology³, Cornell University⁴, Tencent⁵, Harbin Institute of Technology⁶, Northwestern University⁷, Allen Institute for Artificial Intelligence⁸, Google⁹

01 Jan 2020

TL;DR: The first large-scale Chinese Language Understanding Evaluation (CLUE) benchmark is introduced, an open-ended, community-driven project that brings together 9 tasks spanning several well-established single-sentence/sentence-pair classification tasks, as well as machine reading comprehension, all on original Chinese text.

...read moreread less

Abstract: The advent of natural language understanding (NLU) benchmarks for English, such as GLUE and SuperGLUE allows new NLU models to be evaluated across a diverse set of tasks. These comprehensive benchmarks have facilitated a broad range of research and applications in natural language processing (NLP). The problem, however, is that most such benchmarks are limited to English, which has made it difficult to replicate many of the successes in English NLU for other languages. To help remedy this issue, we introduce the first large-scale Chinese Language Understanding Evaluation (CLUE) benchmark. CLUE is an open-ended, community-driven project that brings together 9 tasks spanning several well-established single-sentence/sentence-pair classification tasks, as well as machine reading comprehension, all on original Chinese text. To establish results on these tasks, we report scores using an exhaustive set of current state-of-the-art pre-trained Chinese models (9 in total). We also introduce a number of supplementary datasets and additional tools to help facilitate further progress on Chinese NLU. Our benchmark is released at https://www.cluebenchmarks.com

...read moreread less

190 citations

Proceedings Article•DOI•

XGLUE: A New Benchmark Datasetfor Cross-lingual Pre-training, Understanding and Generation

[...]

Yaobo Liang¹, Nan Duan¹, Yeyun Gong¹, Ning Wu, Fenfei Guo, Weizhen Qi², Ming Gong¹, Linjun Shou¹, Daxin Jiang³, Guihong Cao¹, Xiaodong Fan, Ruofei Zhang¹, Rahul Agrawal¹, Edward Cui¹, Sining Wei, Taroon Bharti¹, Ying Qiao¹, Jiun-Hung Chen¹, Winnie Wu, Shuguang Liu, Fan Yang¹, Daniel Campos¹, Rangan Majumder¹, Ming Zhou¹ - Show less +20 more•Institutions (3)

Microsoft¹, University of Science and Technology of China², Search Technologies³

03 Apr 2020

TL;DR: A recent cross-lingual pre-trained model Unicoder is extended to cover both understanding and generation tasks, which is evaluated on XGLUE as a strong baseline and the base versions of Multilingual BERT, XLM and XLM-R are evaluated for comparison.

...read moreread less

Abstract: In this paper, we introduce XGLUE, a new benchmark dataset to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora, and evaluate their performance across a diverse set of cross-lingual tasks. Comparing to GLUE (Wang et al.,2019), which is labeled in English and includes natural language understanding tasks only, XGLUE has three main advantages: (1) it provides two corpora with different sizes for cross-lingual pre-training; (2) it provides 11 diversified tasks that cover both natural language understanding and generation scenarios; (3) for each task, it provides labeled data in multiple languages. We extend a recent cross-lingual pre-trained model Unicoder (Huang et al., 2019) to cover both understanding and generation tasks, which is evaluated on XGLUE as a strong baseline. We also evaluate the base versions (12-layer) of Multilingual BERT, XLM and XLM-R for comparison.

...read moreread less

158 citations

Posted Content•

Unsupervised Commonsense Question Answering with Self-Talk

[...]

Vered Shwartz¹, Peter West¹, Ronan Le Bras¹, Chandra Bhagavatula¹, Yejin Choi¹ - Show less +1 more•Institutions (1)

Allen Institute for Artificial Intelligence¹

11 Apr 2020-arXiv: Computation and Language

TL;DR: An unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks, inspired by inquiry-based discovery learning, which improves performance on several benchmarks and competes with models that obtain knowledge from external KBs.

...read moreread less

Abstract: Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained language models as the sole implicit source of world knowledge, or resort to external knowledge bases (KBs) to incorporate additional relevant knowledge. We propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks. Inspired by inquiry-based discovery learning (Bruner, 1961), our approach inquires language models with a number of information seeking questions such as "$\textit{what is the definition of ...}$" to discover additional background knowledge. Empirical results demonstrate that the self-talk procedure substantially improves the performance of zero-shot language model baselines on four out of six commonsense benchmarks, and competes with models that obtain knowledge from external KBs. While our approach improves performance on several benchmarks, the self-talk induced knowledge even when leading to correct answers is not always seen as useful by human judges, raising interesting questions about the inner-workings of pre-trained language models for commonsense reasoning.

...read moreread less

136 citations

Proceedings Article•

TabFact: A Large-scale Dataset for Table-based Fact Verification

[...]

Wenhu Chen¹, Hongmin Wang¹, Jianshu Chen¹, Yunkai Zhang, Hong Wang¹, Shiyang Li¹, Xiyou Zhou¹, William Yang Wang¹ - Show less +4 more•Institutions (1)

University of California, Santa Barbara¹

30 Apr 2020

TL;DR: Zhang et al. as mentioned in this paper designed two different models: Table-BERT and Latent Program Algorithm (LPA) to verify whether a textual hypothesis holds based on the given evidence, also known as fact verification.

...read moreread less

Abstract: The problem of verifying whether a textual hypothesis holds based on the given evidence, also known as fact verification, plays an important role in the study of natural language understanding and semantic representation. However, existing studies are mainly restricted to dealing with unstructured evidence (e.g., natural language sentences and documents, news, etc), while verification under structured evidence, such as tables, graphs, and databases, remains unexplored. This paper specifically aims to study the fact verification given semi-structured data as evidence. To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED. TabFact is challenging since it involves both soft linguistic reasoning and hard symbolic reasoning. To address these reasoning challenges, we design two different models: Table-BERT and Latent Program Algorithm (LPA). Table-BERT leverages the state-of-the-art pre-trained language model to encode the linearized tables and statements into continuous vectors for verification. LPA parses statements into LISP-like programs and executes them against the tables to obtain the returned binary value for verification. Both methods achieve similar accuracy but still lag far behind human performance. We also perform a comprehensive analysis to demonstrate great future opportunities.

...read moreread less

131 citations

Posted Content•

XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation

[...]

Yaobo Liang, Nan Duan, Yeyun Gong, Ning Wu, Fenfei Guo, Weizhen Qi, Ming Gong, Linjun Shou, Daxin Jiang, Guihong Cao, Xiaodong Fan, Ruofei Zhang, Rahul Agrawal, Edward Cui, Sining Wei, Taroon Bharti, Ying Qiao, Jiun-Hung Chen, Winnie Wu, Shuguang Liu, Fan Yang, Daniel Campos, Rangan Majumder, Ming Zhou - Show less +20 more

03 Apr 2020-arXiv: Computation and Language

TL;DR: XGLUE as mentioned in this paper provides 11 diversified tasks that cover both natural language understanding and generation scenarios, and for each task, it provides labeled data in multiple languages, which can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora.

...read moreread less

Abstract: In this paper, we introduce XGLUE, a new benchmark dataset that can be used to train large-scale cross-lingual pre-trained models using multilingual and bilingual corpora and evaluate their performance across a diverse set of cross-lingual tasks. Comparing to GLUE(Wang et al., 2019), which is labeled in English for natural language understanding tasks only, XGLUE has two main advantages: (1) it provides 11 diversified tasks that cover both natural language understanding and generation scenarios; (2) for each task, it provides labeled data in multiple languages. We extend a recent cross-lingual pre-trained model Unicoder(Huang et al., 2019) to cover both understanding and generation tasks, which is evaluated on XGLUE as a strong baseline. We also evaluate the base versions (12-layer) of Multilingual BERT, XLM and XLM-R for comparison.

...read moreread less

130 citations

Proceedings Article•DOI•

SenseBERT: Driving Some Sense into BERT

[...]

Yoav Levine¹, Barak Lenz, Or Dagan, Dan Padnos², Or Sharir¹, Shai Shalev-Shwartz¹, Amnon Shashua¹, Yoav Shoham³ - Show less +4 more•Institutions (3)

Hebrew University of Jerusalem¹, The Racah Institute of Physics², Stanford University³

01 Jul 2020

TL;DR: The authors proposed a method to employ weak-supervision directly at the word sense level, which achieved state-of-the-art results on the SemEval Word Sense Disambiguation task.

...read moreread less

Abstract: The ability to learn from large unlabeled corpora has allowed neural language models to advance the frontier in natural language understanding. However, existing self-supervision techniques operate at the word form level, which serves as a surrogate for the underlying semantic content. This paper proposes a method to employ weak-supervision directly at the word sense level. Our model, named SenseBERT, is pre-trained to predict not only the masked words but also their WordNet supersenses. Accordingly, we attain a lexical-semantic level language model, without the use of human annotation. SenseBERT achieves significantly improved lexical understanding, as we demonstrate by experimenting on SemEval Word Sense Disambiguation, and by attaining a state of the art result on the ‘Word in Context’ task.

...read moreread less

109 citations

Proceedings Article•DOI•

Curriculum Learning for Natural Language Understanding

[...]

Benfeng Xu¹, Licheng Zhang¹, Zhendong Mao¹, Quan Wang², Hongtao Xie¹, Yongdong Zhang¹ - Show less +2 more•Institutions (2)

University of Science and Technology of China¹, Baidu²

01 Jul 2020

TL;DR: By reviewing the trainset in a crossed way, this work is able to distinguish easy examples from difficult ones, and arrange a curriculum for language models, and obtains significant and universal performance improvements on a wide range of NLU tasks.

...read moreread less

Abstract: With the great success of pre-trained language models, the pretrain-finetune paradigm now becomes the undoubtedly dominant solution for natural language understanding (NLU) tasks. At the fine-tune stage, target task data is usually introduced in a completely random order and treated equally. However, examples in NLU tasks can vary greatly in difficulty, and similar to human learning procedure, language models can benefit from an easy-to-difficult curriculum. Based on this idea, we propose our Curriculum Learning approach. By reviewing the trainset in a crossed way, we are able to distinguish easy examples from difficult ones, and arrange a curriculum for language models. Without any manual model architecture design or use of external data, our Curriculum Learning approach obtains significant and universal performance improvements on a wide range of NLU tasks.

...read moreread less

Proceedings Article•DOI•

tBERT: Topic Models and BERT Joining Forces for Semantic Similarity Detection

[...]

Nicole Peinelt, Dong Nguyen¹, Maria Liakata²•Institutions (2)

The Turing Institute¹, University of Warwick²

01 Jul 2020

TL;DR: This work proposes a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and shows that the model improves performance over strong neural baselines across a variety of English language datasets.

...read moreread less

Abstract: Semantic similarity detection is a fundamental task in natural language understanding. Adding topic information has been useful for previous feature-engineered semantic similarity models as well as neural models for other tasks. There is currently no standard way of combining topics with pretrained contextual representations such as BERT. We propose a novel topic-informed BERT-based architecture for pairwise semantic similarity detection and show that our model improves performance over strong neural baselines across a variety of English language datasets. We find that the addition of topics to BERT helps particularly with resolving domain-specific cases.

...read moreread less

Proceedings Article•

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

[...]

Hangbo Bao¹, Li Dong¹, Furu Wei¹, Wenhui Wang¹, Nan Yang¹, Xiaodong Liu², Yu Wang³, Jianfeng Gao¹, Songhao Piao⁴, Ming Zhou⁵, Hsiao-Wuen Hon¹ - Show less +7 more•Institutions (5)

Microsoft¹, University of Colorado Boulder², Tsinghua University³, Harbin Institute of Technology⁴, North China Electric Power University⁵

21 Nov 2020

TL;DR: This article proposed a pseudo-masked language model (PMLM) to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks.

...read moreread less

Abstract: We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.

...read moreread less

Posted Content•

DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue

[...]

Shikib Mehri, Mihail Eric, Dilek Hakkani-Tur

28 Sep 2020-arXiv: Computation and Language

TL;DR: DialoGLUE (Dialogue Language Understanding Evaluation), a public benchmark consisting of 7 task-oriented dialogue datasets covering 4 distinct natural language understanding tasks, is introduced, designed to encourage dialogue research in representation-based transfer, domain adaptation, and sample-efficient task learning.

...read moreread less

Abstract: A long-standing goal of task-oriented dialogue research is the ability to flexibly adapt dialogue models to new domains. To progress research in this direction, we introduce DialoGLUE (Dialogue Language Understanding Evaluation), a public benchmark consisting of 7 task-oriented dialogue datasets covering 4 distinct natural language understanding tasks, designed to encourage dialogue research in representation-based transfer, domain adaptation, and sample-efficient task learning. We release several strong baseline models, demonstrating performance improvements over a vanilla BERT architecture and state-of-the-art results on 5 out of 7 tasks, by pre-training on a large open-domain dialogue corpus and task-adaptive self-supervised training. Through the DialoGLUE benchmark, the baseline methods, and our evaluation scripts, we hope to facilitate progress towards the goal of developing more general task-oriented dialogue models.

...read moreread less

Posted Content•

IndoNLU: Benchmark and Resources for Evaluating Indonesian Natural Language Understanding

[...]

Bryan Wilie¹, Karissa Vincentio, Genta Indra Winata², Samuel Cahyawijaya³, Xiaohong Li, Zhi Yuan Lim, Sidik Soleman⁴, Rahmad Mahendra⁵, Pascale Fung³, Syafri Bahar, Ayu Purwarianti¹ - Show less +7 more•Institutions (5)

Bandung Institute of Technology¹, Salesforce.com², Hong Kong University of Science and Technology³, Tokyo Institute of Technology⁴, University of Indonesia⁵

11 Sep 2020-arXiv: Computation and Language

TL;DR: The first-ever vast resource for training, evaluation, and benchmarking on Indonesian natural language understanding (IndoNLU) tasks is introduced, releasing baseline models for all twelve tasks, as well as the framework for benchmark evaluation, thus enabling everyone to benchmark their system performances.

...read moreread less

Abstract: Although Indonesian is known to be the fourth most frequently used language over the internet, the research progress on this language in the natural language processing (NLP) is slow-moving due to a lack of available resources. In response, we introduce the first-ever vast resource for the training, evaluating, and benchmarking on Indonesian natural language understanding (IndoNLU) tasks. IndoNLU includes twelve tasks, ranging from single sentence classification to pair-sentences sequence labeling with different levels of complexity. The datasets for the tasks lie in different domains and styles to ensure task diversity. We also provide a set of Indonesian pre-trained models (IndoBERT) trained from a large and clean Indonesian dataset Indo4B collected from publicly available sources such as social media texts, blogs, news, and websites. We release baseline models for all twelve tasks, as well as the framework for benchmark evaluation, and thus it enables everyone to benchmark their system performances.

...read moreread less

Proceedings Article•DOI•

Zero-Shot Cross-Lingual Transfer with Meta Learning

[...]

Farhad Nooralahzadeh¹, Giannis Bekoulis², Johannes Bjerva, Isabelle Augenstein³•Institutions (3)

University of Zurich¹, Katholieke Universiteit Leuven², University of Copenhagen³

01 Nov 2020

TL;DR: This work considers the setting of training models on multiple different languages at the same time, when little or no data is available for languages other than English, and demonstrates the consistent effectiveness of meta-learning for a total of 15 languages.

...read moreread less

Abstract: Learning what to share between tasks has become a topic of great importance, as strategic sharing of knowledge has been shown to improve downstream task performance. This is particularly important for multilingual applications, as most languages in the world are under-resourced. Here, we consider the setting of training models on multiple different languages at the same time, when little or no data is available for languages other than English. We show that this challenging setup can be approached using meta-learning: in addition to training a source language model, another model learns to select which training instances are the most beneficial to the first. We experiment using standard supervised, zero-shot cross-lingual, as well as few-shot cross-lingual settings for different natural language understanding tasks (natural language inference, question answering). Our extensive experimental setup demonstrates the consistent effectiveness of meta-learning for a total of 15 languages. We improve upon the state-of-the-art for zero-shot and few-shot NLI (on MultiNLI and XNLI) and QA (on the MLQA dataset). A comprehensive error analysis indicates that the correlation of typological features between languages can partly explain when parameter sharing learned via meta-learning is beneficial.

...read moreread less

Proceedings Article•

Graph Constrained Reinforcement Learning for Natural Language Action Spaces

[...]

Prithviraj Ammanabrolu¹, Matthew Hausknecht²•Institutions (2)

Georgia Institute of Technology¹, Microsoft²

30 Apr 2020

TL;DR: KG-A2C, an agent that builds a dynamic knowledge graph while exploring and generates actions using a template-based action space is presented, arguing that the dual uses of the knowledge graph to reason about game state and to constrain natural language generation are the keys to scalable exploration of combinatorially large natural language actions.

...read moreread less

Abstract: Interactive Fiction games are text-based simulations in which an agent interacts with the world purely through natural language. They are ideal environments for studying how to extend reinforcement learning agents to meet the challenges of natural language understanding, partial observability, and action generation in combinatorially-large text-based action spaces. We present KG-A2C, an agent that builds a dynamic knowledge graph while exploring and generates actions using a template-based action space. We contend that the dual uses of the knowledge graph to reason about game state and to constrain natural language generation are the keys to scalable exploration of combinatorially large natural language actions. Results across a wide variety of IF games show that KG-A2C outperforms current IF agents despite the exponential increase in action space size.

...read moreread less

Posted Content•

LogiQA: A Challenge Dataset for Machine Reading Comprehension with Logical Reasoning

[...]

Jian Liu¹, Leyang Cui², Hanmeng Liu², Dandan Huang², Yile Wang², Yue Zhang² - Show less +2 more•Institutions (2)

Fudan University¹, Westlake University²

16 Jul 2020-arXiv: Computation and Language

TL;DR: A comprehensive dataset, named LogiQA, is built, which is sourced from expert-written questions for testing human Logical reasoning, and shows that state-of-the-art neural models perform by far worse than human ceiling.

...read moreread less

Abstract: Machine reading is a fundamental task for testing the capability of natural language understanding, which is closely related to human cognition in many aspects. With the rising of deep learning techniques, algorithmic models rival human performances on simple QA, and thus increasingly challenging machine reading datasets have been proposed. Though various challenges such as evidence integration and commonsense knowledge have been integrated, one of the fundamental capabilities in human reading, namely logical reasoning, is not fully investigated. We build a comprehensive dataset, named LogiQA, which is sourced from expert-written questions for testing human Logical reasoning. It consists of 8,678 QA instances, covering multiple types of deductive reasoning. Results show that state-of-the-art neural models perform by far worse than human ceiling. Our dataset can also serve as a benchmark for reinvestigating logical AI under the deep learning NLP setting. The dataset is freely available at this https URL

...read moreread less

Proceedings Article•DOI•

Unsupervised Commonsense Question Answering with Self-Talk

[...]

Vered Shwartz¹, Peter West¹, Ronan Le Bras¹, Chandra Bhagavatula¹, Yejin Choi¹ - Show less +1 more•Institutions (1)

Allen Institute for Artificial Intelligence¹

11 Apr 2020

TL;DR: This paper propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks, inspired by inquiry-based discovery learning (Bruner, 1961).

...read moreread less

Abstract: Natural language understanding involves reading between the lines with implicit background knowledge. Current systems either rely on pre-trained language models as the sole implicit source of world knowledge, or resort to external knowledge bases (KBs) to incorporate additional relevant knowledge. We propose an unsupervised framework based on self-talk as a novel alternative to multiple-choice commonsense tasks. Inspired by inquiry-based discovery learning (Bruner, 1961), our approach inquires language models with a number of information seeking questions such as "what is the definition of..." to discover additional background knowledge. Empirical results demonstrate that the self-talk procedure substantially improves the performance of zero-shot language model baselines on four out of six commonsense benchmarks, and competes with models that obtain knowledge from external KBs. While our approach improves performance on several benchmarks, the self-talk induced knowledge even when leading to correct answers is not always seen as helpful by human judges, raising interesting questions about the inner-workings of pre-trained language models for commonsense reasoning.

...read moreread less

Journal Article•DOI•

KBot : a Knowledge graph based chatBot for natural language understanding over linked data

[...]

Addi Ait-Mlouk¹, Lili Jiang¹•Institutions (1)

Umeå University¹

12 Aug 2020-IEEE Access

TL;DR: This work designs and develops an architecture to provide an interactive user interface and proposes a machine learning approach based on intent classification and natural language understanding to understand user intents and generate SPARQL queries to extend the chatbot capabilities by understanding analytical queries.

...read moreread less

Abstract: With the rapid progress of the semantic web, a huge amount of structured data has become available on the web in the form of knowledge bases (KBs). Making these data accessible and useful for end-users is one of the main objectives of chatbots over linked data. Building a chatbot over linked data raises different challenges, including user queries understanding, multiple knowledge base support, and multilingual aspect. To address these challenges, we first design and develop an architecture to provide an interactive user interface. Secondly, we propose a machine learning approach based on intent classification and natural language understanding to understand user intents and generate SPARQL queries. We especially process a new social network dataset (i.e., myPersonality) and add it to the existing knowledge bases to extend the chatbot capabilities by understanding analytical queries. The system can be extended with a new domain on-demand, flexible, multiple knowledge base, multilingual, and allows intuitive creation and execution of different tasks for an extensive range of topics. Furthermore, evaluation and application cases in the chatbot are provided to show how it facilitates interactive semantic data towards different real application scenarios and showcase the proposed approach for a knowledge graph and data-driven chatbot.

...read moreread less

Posted Content•

End-to-End Slot Alignment and Recognition for Cross-Lingual NLU

[...]

Weijia Xu¹, Batool Haider², Saab Mansour²•Institutions (2)

University of Maryland, College Park¹, Amazon.com²

29 Apr 2020-arXiv: Computation and Language

TL;DR: The authors proposed an end-to-end model that learns to align and predict target slot labels jointly for cross-lingual transfer, which outperforms a simple label projection method using fast-align on most languages, and achieves competitive performance to the more complex, state-of-the-art projection method with only half of the training time.

...read moreread less

Abstract: Natural language understanding (NLU) in the context of goal-oriented dialog systems typically includes intent classification and slot labeling tasks. Existing methods to expand an NLU system to new languages use machine translation with slot label projection from source to the translated utterances, and thus are sensitive to projection errors. In this work, we propose a novel end-to-end model that learns to align and predict target slot labels jointly for cross-lingual transfer. We introduce MultiATIS++, a new multilingual NLU corpus that extends the Multilingual ATIS corpus to nine languages across four language families, and evaluate our method using the corpus. Results show that our method outperforms a simple label projection method using fast-align on most languages, and achieves competitive performance to the more complex, state-of-the-art projection method with only half of the training time. We release our MultiATIS++ corpus to the community to continue future research on cross-lingual NLU.

...read moreread less

Posted Content•

ToD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogues

[...]

Chien-Sheng Wu, Steven C. H. Hoi, Richard Socher, Caiming Xiong¹•Institutions (1)

Salesforce.com¹

15 Apr 2020-arXiv: Computation and Language

TL;DR: In this paper, a task-oriented dialogue BERT (ToD-BERT) model was proposed to improve the performance of task-orientated dialogues by combining nine English-based, human-human, multi-turn and publicly available taskoriented dialogue datasets.

...read moreread less

Abstract: The use of pre-trained language models has emerged as a promising direction for improving dialogue systems. However, the underlying difference of linguistic patterns between conversational data and general text makes the existing pre-trained language models not as effective as they have been shown to be. Recently, there are some pre-training approaches based on open-domain dialogues, leveraging large-scale social media data such as Twitter or Reddit. Pre-training for task-oriented dialogues, on the other hand, is rarely discussed because of the long-standing and crucial data scarcity problem. In this work, we combine nine English-based, human-human, multi-turn and publicly available task-oriented dialogue datasets to conduct language model pre-training. The experimental results show that our pre-trained task-oriented dialogue BERT (ToD-BERT) surpasses BERT and other strong baselines in four downstream task-oriented dialogue applications, including intention detection, dialogue state tracking, dialogue act prediction, and response selection. Moreover, in the simulated limited data experiments, we show that ToD-BERT has stronger few-shot capacity that can mitigate the data scarcity problem in task-oriented dialogues.

...read moreread less

Journal Article•DOI•

Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models.

[...]

James L. McClelland¹, Felix Hill, Maja Rudolph², Jason Baldridge³, Hinrich Schütze⁴ - Show less +1 more•Institutions (4)

Stanford University¹, Bosch², Google³, Ludwig Maximilian University of Munich⁴

28 Sep 2020-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: This work describes the organization of the brain’s distributed understanding system, which includes a fast learning system that addresses the memory problem and sketches a framework for future models of understanding drawing equally on cognitive neuroscience and artificial intelligence and exploiting query-based attention.

...read moreread less

Abstract: Language is crucial for human intelligence, but what exactly is its role? We take language to be a part of a system for understanding and communicating about situations. In humans, these abilities emerge gradually from experience and depend on domain-general principles of biological neural networks: connection-based learning, distributed representation, and context-sensitive, mutual constraint satisfaction-based processing. Current artificial language processing systems rely on the same domain general principles, embodied in artificial neural networks. Indeed, recent progress in this field depends on query-based attention, which extends the ability of these systems to exploit context and has contributed to remarkable breakthroughs. Nevertheless, most current models focus exclusively on language-internal tasks, limiting their ability to perform tasks that depend on understanding situations. These systems also lack memory for the contents of prior situations outside of a fixed contextual span. We describe the organization of the brain's distributed understanding system, which includes a fast learning system that addresses the memory problem. We sketch a framework for future models of understanding drawing equally on cognitive neuroscience and artificial intelligence and exploiting query-based attention. We highlight relevant current directions and consider further developments needed to fully capture human-level language understanding in a computational system.

...read moreread less

Proceedings Article•DOI•

Question Directed Graph Attention Network for Numerical Reasoning over Text.

[...]

Kunlong Chen, Weidi Xu¹, Xingyi Cheng, Zou Xiaochuan, Yuyu Zhang², Le Song², Taifeng Wang³, Yuan Qi, Wei Chu⁴ - Show less +5 more•Institutions (4)

Peking University¹, Georgia Institute of Technology², Alibaba Group³, Sichuan University⁴

01 Nov 2020

TL;DR: A heterogeneous graph representation is proposed for the context of the passage and question needed for numerical reasoning, and a question directed graph attention network is designed to drive multi-step numerical reasoning over this context graph.

...read moreread less

Abstract: Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this challenge, we propose a heterogeneous graph representation for the context of the passage and question needed for such reasoning, and design a question directed graph attention network to drive multi-step numerical reasoning over this context graph. Our model, which combines deep learning and graph reasoning, achieves remarkable results in benchmark datasets such as DROP.

...read moreread less

Posted Content•

It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners

[...]

Timo Schick¹, Hinrich Schütze¹•Institutions (1)

Ludwig Maximilian University of Munich¹

15 Sep 2020-arXiv: Computation and Language

TL;DR: This article showed that performance similar to GPT-3 can be achieved with language models that are much "greener" in that their parameter count is several orders of magnitude smaller, by converting textual inputs into cloze questions that contain a task description, combined with gradient-based optimization.

...read moreread less

Abstract: When scaled to hundreds of billions of parameters, pretrained language models such as GPT-3 (Brown et al., 2020) achieve remarkable few-shot performance. However, enormous amounts of compute are required for training and applying such big models, resulting in a large carbon footprint and making it difficult for researchers and practitioners to use them. We show that performance similar to GPT-3 can be obtained with language models that are much "greener" in that their parameter count is several orders of magnitude smaller. This is achieved by converting textual inputs into cloze questions that contain a task description, combined with gradient-based optimization; exploiting unlabeled data gives further improvements. We identify key factors required for successful natural language understanding with small language models.

...read moreread less

Posted Content•

Self-training Improves Pre-training for Natural Language Understanding

[...]

Jingfei Du¹, Edouard Grave¹, Beliz Gunel², Vishrav Chaudhary¹, Onur Celebi¹, Michael Auli¹, Veselin Stoyanov³, Alexis Conneau¹ - Show less +4 more•Institutions (3)

Facebook¹, Stanford University², Johns Hopkins University³

05 Oct 2020-arXiv: Computation and Language

TL;DR: This paper proposed SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data to retrieve sentences from a bank of billions of unlabeled sentences crawled from the web.

...read moreread less

Abstract: Unsupervised pre-training has led to much recent progress in natural language understanding. In this paper, we study self-training as another way to leverage unlabeled data through semi-supervised learning. To obtain additional data for a specific task, we introduce SentAugment, a data augmentation method which computes task-specific query embeddings from labeled data to retrieve sentences from a bank of billions of unlabeled sentences crawled from the web. Unlike previous semi-supervised methods, our approach does not require in-domain unlabeled data and is therefore more generally applicable. Experiments show that self-training is complementary to strong RoBERTa baselines on a variety of tasks. Our augmentation approach leads to scalable and effective self-training with improvements of up to 2.6% on standard text classification benchmarks. Finally, we also show strong gains on knowledge-distillation and few-shot learning.

...read moreread less

Proceedings Article•

Dialogue-AMR: Abstract Meaning Representation for Dialogue.

[...]

Claire Bonial¹, Lucia Donatelli², Mitchell Abrams³, Stephanie M. Lukin¹, Stephen Tratz¹, Matthew Marge¹, Ron Artstein⁴, David Traum⁴, Clare R. Voss¹ - Show less +5 more•Institutions (4)

United States Army Research Laboratory¹, Saarland University², Uppsala University³, University of Southern California⁴

01 May 2020

TL;DR: A schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems is described and an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect is presented.

...read moreread less

Abstract: This paper describes a schema that enriches Abstract Meaning Representation (AMR) in order to provide a semantic representation for facilitating Natural Language Understanding (NLU) in dialogue systems. AMR offers a valuable level of abstraction of the propositional content of an utterance; however, it does not capture the illocutionary force or speaker’s intended contribution in the broader dialogue context (e.g., make a request or ask a question), nor does it capture tense or aspect. We explore dialogue in the domain of human-robot interaction, where a conversational robot is engaged in search and navigation tasks with a human partner. To address the limitations of standard AMR, we develop an inventory of speech acts suitable for our domain, and present “Dialogue-AMR”, an enhanced AMR that represents not only the content of an utterance, but the illocutionary force behind it, as well as tense and aspect. To showcase the coverage of the schema, we use both manual and automatic methods to construct the “DialAMR” corpus—a corpus of human-robot dialogue annotated with standard AMR and our enriched Dialogue-AMR schema. Our automated methods can be used to incorporate AMR into a larger NLU pipeline supporting human-robot dialogue.

...read moreread less

Proceedings Article•DOI•

Probing Linguistic Systematicity

[...]

Emily Goodwin¹, Koustuv Sinha¹, Timothy J. O'Donnell¹•Institutions (1)

McGill University¹

01 Jul 2020

TL;DR: Evidence that current state-of-the-art NLU systems do not generalize systematically, despite overall high performance is provided.

...read moreread less

Abstract: Recently, there has been much interest in the question of whether deep natural language understanding (NLU) models exhibit systematicity, generalizing such that units like words make consistent contributions to the meaning of the sentences in which they appear. There is accumulating evidence that neural models do not learn systematically. We examine the notion of systematicity from a linguistic perspective, defining a set of probing tasks and a set of metrics to measure systematic behaviour. We also identify ways in which network architectures can generalize non-systematically, and discuss why such forms of generalization may be unsatisfying. As a case study, we perform a series of experiments in the setting of natural language inference (NLI). We provide evidence that current state-of-the-art NLU systems do not generalize systematically, despite overall high performance.

...read moreread less

Posted Content•

Pre-training Text-to-Text Transformers for Concept-centric Common Sense

[...]

Wangchunshu Zhou, Dong-Ho Lee, Ravi Kiran Selvam, Seyeon Lee, Bill Yuchen Lin, Xiang Ren - Show less +2 more

24 Oct 2020-arXiv: Computation and Language

TL;DR: It is shown that while only incrementally pre-trained on a relatively small corpus for a few steps, CALM outperforms baseline methods by a consistent margin and even comparable with some larger PTLMs, which suggests that CALM can serve as a general, plug-and-play method for improving the commonsense reasoning ability of a PTLM.

...read moreread less

Abstract: Pre-trained language models (PTLM) have achieved impressive results in a range of natural language understanding (NLU) and generation (NLG) tasks. However, current pre-training objectives such as masked token prediction (for BERT-style PTLMs) and masked span infilling (for T5-style PTLMs) do not explicitly model the relational commonsense knowledge about everyday concepts, which is crucial to many downstream tasks that need common sense to understand or generate. To augment PTLMs with concept-centric commonsense knowledge, in this paper, we propose both generative and contrastive objectives for learning common sense from the text, and use them as intermediate self-supervised learning tasks for incrementally pre-training PTLMs (before task-specific fine-tuning on downstream datasets). Furthermore, we develop a joint pre-training framework to unify generative and contrastive objectives so that they can mutually reinforce each other. Extensive experimental results show that our method, concept-aware language model (CALM), can pack more commonsense knowledge into the parameters of a pre-trained text-to-text transformer without relying on external knowledge graphs, yielding better performance on both NLU and NLG tasks. We show that while only incrementally pre-trained on a relatively small corpus for a few steps, CALM outperforms baseline methods by a consistent margin and even comparable with some larger PTLMs, which suggests that CALM can serve as a general, plug-and-play method for improving the commonsense reasoning ability of a PTLM.

...read moreread less

Proceedings Article•

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding

[...]

Wei Wang¹, Bin Bi¹, Ming Yan¹, Chen Wu¹, Jiangnan Xia¹, Zuyi Bao¹, Liwei Peng¹, Luo Si¹ - Show less +4 more•Institutions (1)

Alibaba Group¹

30 Apr 2020

TL;DR: StructBERT as mentioned in this paper extends BERT with two auxiliary tasks to make the most of the sequential order of words and sentences, which leverage language structures at the word and sentence levels, respectively.

...read moreread less

Abstract: Recently, the pre-trained language model, BERT (and its robustly optimized version RoBERTa), has attracted a lot of attention in natural language understanding (NLU), and achieved state-of-the-art accuracy in various NLU tasks, such as sentiment classification, natural language inference, semantic textual similarity and question answering. Inspired by the linearization exploration work of Elman, we extend BERT to a new model, StructBERT, by incorporating language structures into pre-training. Specifically, we pre-train StructBERT with two auxiliary tasks to make the most of the sequential order of words and sentences, which leverage language structures at the word and sentence levels, respectively. As a result, the new model is adapted to different levels of language understanding required by downstream tasks. The StructBERT with structural pre-training gives surprisingly good empirical results on a variety of downstream tasks, including pushing the state-of-the-art on the GLUE benchmark to 89.0 (outperforming all published models), the F1 score on SQuAD v1.1 question answering to 93.0, the accuracy on SNLI to 91.7.

...read moreread less

Collapse