Top 14 papers published by Luke Zettlemoyer from Facebook in 2017

Proceedings Article

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

Mandar Joshi¹, Eunsol Choi¹, Daniel S. Weld¹, Luke Zettlemoyer²

University of Washington¹, Allen Institute for Artificial Intelligence²

09 May 2017

TL;DR: It is shown that, in comparison to other recently introduced large-scale datasets, TriviaQA has relatively complex, compositional questions, has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and requires more cross sentence reasoning to find answers.

Abstract: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions. We show that, in comparison to other recently introduced large-scale datasets, TriviaQA (1) has relatively complex, compositional questions, (2) has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and (3) requires more cross sentence reasoning to find answers. We also present two baseline algorithms: a feature-based classifier and a state-of-the-art neural network, that performs well on SQuAD reading comprehension. Neither approach comes close to human performance (23% and 40% vs. 80%), suggesting that TriviaQA is a challenging testbed that is worth significant future study.

1,266 citations

Proceedings Article

End-to-end Neural Coreference Resolution

Kenton Lee¹, Luheng He², Michael Lewis³, Luke Zettlemoyer¹

University of Washington¹, Google², University of Pittsburgh³

21 Jul 2017

TL;DR: This work introduces the first end-to-end coreference resolution model, which is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions.

Abstract: We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector. The key idea is to directly consider all spans in a document as potential mentions and learn distributions over possible antecedents for each. The model computes span embeddings that combine context-dependent boundary representations with a head-finding attention mechanism. It is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions. Experiments demonstrate state-of-the-art performance, with a gain of 1.5 F1 on the OntoNotes benchmark and by 3.1 F1 using a 5-model ensemble, despite the fact that this is the first approach to be successfully trained with no external resources.
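
The training objective described in this abstract can be illustrated with a minimal sketch. It assumes each span already has a list of scores over its candidate antecedents; the scores, the function names, and the convention that index 0 stands for a "no antecedent" option are all hypothetical and are not taken from the paper's implementation. The sketch turns the scores into a distribution with a softmax and sums the probability mass on the gold antecedents before taking the log.

```python
import math


def antecedent_distribution(scores):
    """Softmax over the candidate antecedent scores of a single span."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_log_likelihood(scores, gold_indices):
    """Log of the total probability assigned to any gold antecedent."""
    probs = antecedent_distribution(scores)
    return math.log(sum(probs[i] for i in gold_indices))


# Hypothetical scores for one span over four candidates
# (index 0 is assumed here to mean "no antecedent").
example_scores = [0.0, 2.0, -1.0, 1.5]
example_gold = [1, 3]  # candidates in the same gold cluster as the span
print(marginal_log_likelihood(example_scores, example_gold))
```

Marginalizing over all gold antecedents means the model is not forced to pick one particular earlier mention from the cluster; any of them counts as correct.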

705 citations

Proceedings Article

Deep Semantic Role Labeling: What Works and What’s Next

Luheng He¹, Kenton Lee², Michael Lewis³, Luke Zettlemoyer²

Google¹, University of Washington², University of Pittsburgh³

01 Jul 2017

TL;DR: This work introduces a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses that reveal its strengths and limitations.

Abstract: We introduce a new deep learning model for semantic role labeling (SRL) that significantly improves the state of the art, along with detailed analyses to reveal its strengths and limitations. We use a deep highway BiLSTM architecture with constrained decoding, while observing a number of recent best practices for initialization and regularization. Our 8-layer ensemble model achieves 83.2 F1 on the CoNLL 2005 test set and 83.4 F1 on CoNLL 2012, roughly a 10% relative error reduction over the previous state of the art. Extensive empirical analysis of these gains show that (1) deep models excel at recovering long-distance dependencies but can still make surprisingly obvious errors, and (2) that there is still room for syntactic parsers to improve these results.

474 citations

Proceedings Article

Learning a Neural Semantic Parser from User Feedback

Srinivasan Iyer¹, Ioannis Konstas¹, Alvin Cheung¹, Jayant Krishnamurthy², Luke Zettlemoyer²

University of Washington¹, Allen Institute for Artificial Intelligence²

27 Apr 2017

TL;DR: This work presents an approach for rapidly and easily building natural language interfaces to databases for new domains; performance improves over time based on user feedback, and the approach requires minimal intervention.

Abstract: We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. To achieve this, we adapt neural sequence models to map utterances directly to SQL with its full expressivity, bypassing any intermediate meaning representations. These models are immediately deployed online to solicit feedback from real users to flag incorrect queries. Finally, the popularity of SQL facilitates gathering annotations for incorrect predictions using the crowd, which is directly used to improve our models. This complete feedback loop, without intermediate representations or database specific engineering, opens up new ways of building high quality semantic parsers. Experiments suggest that this approach can be deployed quickly for any new target domain, as we show by learning a semantic parser for an online academic database from scratch.

289 citations

Proceedings Article

Zero-Shot Relation Extraction via Reading Comprehension

Omer Levy¹, Minjoon Seo², Eunsol Choi², Luke Zettlemoyer²

Bar-Ilan University¹, University of Washington²

13 Jun 2017

TL;DR: The authors propose to reduce relation extraction to answering simple reading comprehension questions, by associating one or more natural language questions with each relation slot, and show that the approach can generalize to new questions for known relation types with high accuracy, and that zero-shot generalization to unseen relation types is possible, at lower accuracy levels.

Abstract: We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot. This reduction has several advantages: we can (1) learn relation-extraction models by extending recent neural reading-comprehension techniques, (2) build very large training sets for those models by combining relation-specific crowd-sourced questions with distant supervision, and even (3) do zero-shot learning by extracting new relation types that are only specified at test-time, for which we have no labeled training examples. Experiments on a Wikipedia slot-filling task demonstrate that the approach can generalize to new questions for known relation types with high accuracy, and that zero-shot generalization to unseen relation types is possible, at lower accuracy levels, setting the bar for future work on this task.
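
The reduction described in this abstract can be sketched as follows. This is an illustration only: the relation names, the question templates, and the `reader` callable are hypothetical stand-ins, and any real reading-comprehension model would take the place of the reader. Each relation slot is associated with one or more question templates, and filling the slot means asking those questions about a sentence.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical templates; each relation slot maps to one or more questions.
QUESTION_TEMPLATES: Dict[str, List[str]] = {
    "educated_at": [
        "Where did {entity} study?",
        "Which university did {entity} graduate from?",
    ],
    "occupation": ["What does {entity} do for a living?"],
}

# A reader maps (question, sentence) to an answer string, or None if it
# finds no answer. Its internals are outside the scope of this sketch.
Reader = Callable[[str, str], Optional[str]]


def questions_for(relation: str, entity: str,
                  templates: Dict[str, List[str]] = QUESTION_TEMPLATES) -> List[str]:
    """Instantiate every question template of a relation for one entity."""
    return [t.format(entity=entity) for t in templates[relation]]


def fill_slot(relation: str, entity: str, sentence: str, reader: Reader,
              templates: Dict[str, List[str]] = QUESTION_TEMPLATES) -> Optional[str]:
    """Return the first answer the reader gives to any of the questions."""
    for question in questions_for(relation, entity, templates):
        answer = reader(question, sentence)
        if answer is not None:
            return answer
    return None


print(questions_for("educated_at", "Alan Turing"))
```

Under this framing, a relation type that was never seen in training is specified at test time simply by adding a new entry of questions to the template dictionary.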

274 citations

Proceedings Article

Neural AMR: Sequence-to-Sequence Models for Parsing and Generation

Ioannis Konstas¹, Srinivasan Iyer¹, Mark Yatskar¹, Yejin Choi¹, Luke Zettlemoyer¹

University of Washington¹

01 Jul 2017

TL;DR: This work applies sequence-to-sequence models to parsing and generating text with Abstract Meaning Representation (AMR), reporting competitive parsing results of 62.1 SMATCH and a new state-of-the-art generation result of BLEU 33.8.

Abstract: Sequence-to-sequence models have shown strong performance across a broad range of applications. However, their application to parsing and generating text using Abstract Meaning Representation (AMR) has been limited, due to the relatively limited amount of labeled data and the non-sequential nature of the AMR graphs. We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. For AMR parsing, our model achieves competitive results of 62.1 SMATCH, the current best score reported without significant use of external semantic resources. For AMR generation, our model establishes a new state-of-the-art performance of BLEU 33.8. We present extensive ablative and qualitative analysis including strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.

219 citations

Posted Content

Neural AMR: Sequence-to-Sequence Models for Parsing and Generation

Ioannis Konstas¹, Srinivasan Iyer¹, Mark Yatskar¹, Yejin Choi¹, Luke Zettlemoyer¹

University of Washington¹

26 Apr 2017-arXiv: Computation and Language

TL;DR: This work presents a novel training procedure that uses millions of unlabeled sentences and careful preprocessing of the AMR graphs to overcome the limited amount of labeled data and the non-sequential nature of AMR graphs, and gives strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.

Abstract: Sequence-to-sequence models have shown strong performance across a broad range of applications. However, their application to parsing and generating text using Abstract Meaning Representation (AMR) has been limited, due to the relatively limited amount of labeled data and the non-sequential nature of the AMR graphs. We present a novel training procedure that can lift this limitation using millions of unlabeled sentences and careful preprocessing of the AMR graphs. For AMR parsing, our model achieves competitive results of 62.1 SMATCH, the current best score reported without significant use of external semantic resources. For AMR generation, our model establishes a new state-of-the-art performance of BLEU 33.8. We present extensive ablative and qualitative analysis including strong evidence that sequence-based AMR models are robust against ordering variations of graph-to-sequence conversions.

173 citations

Posted Content

TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension

Mandar Joshi¹, Eunsol Choi¹, Daniel S. Weld¹, Luke Zettlemoyer²

University of Washington¹, Allen Institute for Artificial Intelligence²

09 May 2017-arXiv: Computation and Language

TL;DR: TriviaQA is a large-scale reading comprehension dataset containing over 650K question-answer-evidence triples, including 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents.

Abstract: We present TriviaQA, a challenging reading comprehension dataset containing over 650K question-answer-evidence triples. TriviaQA includes 95K question-answer pairs authored by trivia enthusiasts and independently gathered evidence documents, six per question on average, that provide high quality distant supervision for answering the questions. We show that, in comparison to other recently introduced large-scale datasets, TriviaQA (1) has relatively complex, compositional questions, (2) has considerable syntactic and lexical variability between questions and corresponding answer-evidence sentences, and (3) requires more cross sentence reasoning to find answers. We also present two baseline algorithms: a feature-based classifier and a state-of-the-art neural network, that performs well on SQuAD reading comprehension. Neither approach comes close to human performance (23% and 40% vs. 80%), suggesting that TriviaQA is a challenging testbed that is worth significant future study. Data and code available at -- this http URL

172 citations

Posted Content

Zero-Shot Relation Extraction via Reading Comprehension

Omer Levy¹, Minjoon Seo², Eunsol Choi², Luke Zettlemoyer²

Bar-Ilan University¹, University of Washington²

13 Jun 2017-arXiv: Computation and Language

TL;DR: It is shown that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot, and that zero-shot generalization to unseen relation types is possible, at lower accuracy levels.

Abstract: We show that relation extraction can be reduced to answering simple reading comprehension questions, by associating one or more natural-language questions with each relation slot. This reduction has several advantages: we can (1) learn relation-extraction models by extending recent neural reading-comprehension techniques, (2) build very large training sets for those models by combining relation-specific crowd-sourced questions with distant supervision, and even (3) do zero-shot learning by extracting new relation types that are only specified at test-time, for which we have no labeled training examples. Experiments on a Wikipedia slot-filling task demonstrate that the approach can generalize to new questions for known relation types with high accuracy, and that zero-shot generalization to unseen relation types is possible, at lower accuracy levels, setting the bar for future work on this task.

152 citations

Posted Content

End-to-end Neural Coreference Resolution

Kenton Lee¹, Luheng He², Michael Lewis³, Luke Zettlemoyer¹

University of Washington¹, Google², University of Pittsburgh³

21 Jul 2017-arXiv: Computation and Language

TL;DR: This paper proposes an end-to-end coreference resolution model that directly considers all spans in a document as potential mentions and learns distributions over possible antecedents for each; it is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions.

Abstract: We introduce the first end-to-end coreference resolution model and show that it significantly outperforms all previous work without using a syntactic parser or hand-engineered mention detector. The key idea is to directly consider all spans in a document as potential mentions and learn distributions over possible antecedents for each. The model computes span embeddings that combine context-dependent boundary representations with a head-finding attention mechanism. It is trained to maximize the marginal likelihood of gold antecedent spans from coreference clusters and is factored to enable aggressive pruning of potential mentions. Experiments demonstrate state-of-the-art performance, with a gain of 1.5 F1 on the OntoNotes benchmark and by 3.1 F1 using a 5-model ensemble, despite the fact that this is the first approach to be successfully trained with no external resources.

83 citations

Posted Content

Learning a Neural Semantic Parser from User Feedback

Srinivasan Iyer¹, Ioannis Konstas¹, Alvin Cheung¹, Jayant Krishnamurthy², Luke Zettlemoyer²

University of Washington¹, Allen Institute for Artificial Intelligence²

27 Apr 2017-arXiv: Computation and Language

TL;DR: The authors adapt neural sequence models to map utterances directly to SQL with its full expressivity, bypassing any intermediate meaning representations, and then deploy these models online to solicit feedback from real users to flag incorrect queries.

Abstract: We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention. To achieve this, we adapt neural sequence models to map utterances directly to SQL with its full expressivity, bypassing any intermediate meaning representations. These models are immediately deployed online to solicit feedback from real users to flag incorrect queries. Finally, the popularity of SQL facilitates gathering annotations for incorrect predictions using the crowd, which is directly used to improve our models. This complete feedback loop, without intermediate representations or database specific engineering, opens up new ways of building high quality semantic parsers. Experiments suggest that this approach can be deployed quickly for any new target domain, as we show by learning a semantic parser for an online academic database from scratch.

Posted Content

Crowdsourcing Question-Answer Meaning Representations

Julian Michael¹, Gabriel Stanovsky¹, Luheng He², Ido Dagan³, Luke Zettlemoyer¹

University of Washington¹, Google², Bar-Ilan University³

16 Nov 2017-arXiv: Computation and Language

TL;DR: The authors proposed Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs, and developed a crowdsourcing scheme to show that QAMRs can be labeled with very little training.

Abstract: We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs. We also develop a crowdsourcing scheme to show that QAMRs can be labeled with very little training, and gather a dataset with over 5,000 sentences and 100,000 questions. A detailed qualitative analysis demonstrates that the crowd-generated question-answer pairs cover the vast majority of predicate-argument relationships in existing datasets (including PropBank, NomBank, QA-SRL, and AMR) along with many previously under-resourced ones, including implicit arguments and relations. The QAMR data and annotation code is made publicly available to enable future work on how best to model these complex phenomena.

Posted Content

Recurrent Additive Networks

Kenton Lee, Omer Levy, Luke Zettlemoyer

21 May 2017-arXiv: Computation and Language

TL;DR: Recurrent additive networks are introduced, a new gated RNN which is distinguished by the use of purely additive latent state updates, and it is formally shown that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums.

Abstract: We introduce recurrent additive networks (RANs), a new gated RNN which is distinguished by the use of purely additive latent state updates. At every time step, the new state is computed as a gated component-wise sum of the input and the previous state, without any of the non-linearities commonly used in RNN transition dynamics. We formally show that RAN states are weighted sums of the input vectors, and that the gates only contribute to computing the weights of these sums. Despite this relatively simple functional form, experiments demonstrate that RANs perform on par with LSTMs on benchmark language modeling problems. This result shows that many of the non-linear computations in LSTMs and related networks are not essential, at least for the problems we consider, and suggests that the gates are doing more of the computational work than previously understood.
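
The additive update and the weighted-sum property stated in this abstract can be checked with a small sketch. It is a simplification under stated assumptions: gate values are passed in as fixed numbers rather than computed from learned parameters, no learned projection is applied to the inputs, and all names are hypothetical. The recurrence is c_t = i_t * x_t + f_t * c_{t-1}, applied component-wise, and the second function rebuilds the same state directly as a weighted sum of the inputs.

```python
def ran_states(inputs, input_gates, forget_gates):
    """Run c_t = i_t * x_t + f_t * c_{t-1} component-wise from a zero state."""
    state = [0.0] * len(inputs[0])
    states = []
    for x, i_gate, f_gate in zip(inputs, input_gates, forget_gates):
        state = [i * xv + f * cv
                 for i, xv, f, cv in zip(i_gate, x, f_gate, state)]
        states.append(state)
    return states


def weighted_sum_form(inputs, input_gates, forget_gates, t):
    """Rebuild c_t as sum_j w_j * x_j, with w_j = i_j * prod_{k=j+1..t} f_k."""
    dim = len(inputs[0])
    total = [0.0] * dim
    for j in range(t + 1):
        for d in range(dim):
            weight = input_gates[j][d]
            for k in range(j + 1, t + 1):
                weight *= forget_gates[k][d]
            total[d] += weight * inputs[j][d]
    return total


# Hypothetical inputs and gate values in (0, 1) for three time steps.
xs = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
i_gates = [[0.9, 0.4], [0.2, 0.7], [0.5, 0.5]]
f_gates = [[0.1, 0.1], [0.8, 0.3], [0.6, 0.9]]

print(ran_states(xs, i_gates, f_gates)[-1])
print(weighted_sum_form(xs, i_gates, f_gates, t=2))
```

Both calls produce the same final state up to floating-point rounding, which is the sense in which the gates only determine the weights of a sum over the inputs.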

Proceedings Article

Commonly Uncommon: Semantic Sparsity in Situation Recognition

Mark Yatskar¹, Vicente Ordonez², Luke Zettlemoyer¹, Ali Farhadi¹

University of Washington¹, Allen Institute for Artificial Intelligence²

01 Jul 2017

TL;DR: This paper studies semantic sparsity in situation recognition, the task of producing structured summaries of what is happening in images, including activities, objects and the roles objects play within the activity.

Abstract: Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set. This paper studies semantic sparsity in situation recognition, the task of producing structured summaries of what is happening in images, including activities, objects and the roles objects play within the activity. For this problem, we find empirically that most substructures required for prediction are rare, and current state-of-the-art model performance dramatically decreases if even one such rare substructure exists in the target output. We avoid many such errors by (1) introducing a novel tensor composition function that learns to share examples across substructures more effectively and (2) semantically augmenting our training data with automatically gathered examples of rarely observed outputs using web data. When integrated within a complete CRF-based structured prediction model, the tensor-based approach outperforms existing state of the art by a relative improvement of 2.11% and 4.40% on top-5 verb and noun-role accuracy, respectively. Adding 5 million images with our semantic augmentation techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb and noun-role accuracy.
