Proceedings Article

Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers

01 Aug 2013, pp. 617-622
TL;DR: Fast, accurate, direct non-projective dependency parsers with third-order features, with parsing speeds competitive with projective parsers and state-of-the-art accuracies for the largest datasets (English, Czech, and German).
Abstract: We present fast, accurate, direct non-projective dependency parsers with third-order features. Our approach uses AD3, an accelerated dual decomposition algorithm which we extend to handle specialized head automata and sequential head bigram models. Experiments in fourteen languages yield parsing speeds competitive to projective parsers, with state-of-the-art accuracies for the largest datasets (English, Czech, and German).
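
To make the decomposition concrete, here is a minimal sketch of the plain projected-subgradient variant of dual decomposition for combining an arc-factored tree subproblem with a head-automaton subproblem. The paper itself uses AD3, which replaces the subgradient step with quadratically penalized subproblems; the solver interfaces (solve_tree, solve_automata) and the step-size schedule below are illustrative assumptions, not the authors' implementation.

# Hedged sketch: subgradient dual decomposition for non-projective parsing.
# The tree subproblem (e.g. Chu-Liu-Edmonds) and the head-automaton subproblem
# must agree on the 0/1 arc indicator variables; Lagrange multipliers enforce it.
import numpy as np

def dual_decomposition(arc_scores, solve_tree, solve_automata,
                       n_iters=100, eta0=1.0):
    """arc_scores: (n, n) arc score matrix.
    solve_tree(scores): exact argmax over trees, returns a 0/1 arc matrix.
    solve_automata(scores): argmax over head automata (their higher-order
    scores are internal), also returning a 0/1 arc matrix."""
    u = np.zeros_like(arc_scores)                  # Lagrange multipliers on arcs
    z_tree = None
    for t in range(n_iters):
        z_tree = solve_tree(arc_scores + u)        # tree side sees its scores plus u
        z_auto = solve_automata(-u)                # automaton side sees minus u
        if np.array_equal(z_tree, z_auto):         # agreement certifies optimality
            return z_tree
        u -= (eta0 / (t + 1)) * (z_tree - z_auto)  # subgradient step on the duals
    return z_tree                                  # no agreement: fall back to the tree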


Citations
Journal ArticleDOI
TL;DR: This paper proposes a simple and effective scheme for dependency parsing based on bidirectional LSTMs (BiLSTMs), in which feature vectors are constructed by concatenating a few BiLSTM vectors.
Abstract: We present a simple and effective scheme for dependency parsing which is based on bidirectional-LSTMs (BiLSTMs). Each sentence token is associated with a BiLSTM vector representing the token in its sentential context, and feature vectors are constructed by concatenating a few BiLSTM vectors. The BiLSTM is trained jointly with the parser objective, resulting in very effective feature extractors for parsing. We demonstrate the effectiveness of the approach by applying it to a greedy transition-based parser as well as to a globally optimized graph-based parser. The resulting parsers have very simple architectures, and match or surpass the state-of-the-art accuracies on English and Chinese.
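
As a concrete illustration of the scheme described above, the following PyTorch sketch scores every arc (head, modifier) from the concatenation of the two tokens' BiLSTM vectors. Layer sizes, class names, and the single-sentence batching are assumptions for illustration, not the paper's exact architecture.

# Hedged sketch: BiLSTM token vectors, arc features by concatenation, MLP scoring.
import torch
import torch.nn as nn

class BiLSTMArcScorer(nn.Module):
    def __init__(self, emb_dim=100, hidden=125, mlp=100):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, num_layers=2,
                              bidirectional=True, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(4 * hidden, mlp), nn.Tanh(),
                                 nn.Linear(mlp, 1))

    def forward(self, embeddings):                     # (1, n, emb_dim)
        v, _ = self.bilstm(embeddings)                 # (1, n, 2*hidden) per token
        n = v.size(1)
        heads = v.unsqueeze(2).expand(-1, n, n, -1)    # position [h, m] holds v[h]
        mods = v.unsqueeze(1).expand(-1, n, n, -1)     # position [h, m] holds v[m]
        feats = torch.cat([heads, mods], dim=-1)       # concatenated BiLSTM vectors
        return self.mlp(feats).squeeze(-1)             # (1, n, n) arc scores

# toy usage: a 5-token sentence with random embeddings
scorer = BiLSTMArcScorer()
print(scorer(torch.randn(1, 5, 100)).shape)            # torch.Size([1, 5, 5])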

702 citations

Proceedings ArticleDOI
12 Aug 2016
TL;DR: The results of the WMT16 shared tasks are presented, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), and an automatic post-editing task and bilingual document alignment task.
Abstract: This paper presents the results of the WMT16 shared tasks, which included five machine translation (MT) tasks (standard news, IT-domain, biomedical, multimodal, pronoun), three evaluation tasks (metrics, tuning, run-time estimation of MT quality), and an automatic post-editing task and bilingual document alignment task. This year, 102 MT systems from 24 institutions (plus 36 anonymized online systems) were submitted to the 12 translation directions in the news translation task. The IT-domain task received 31 submissions from 12 institutions in 7 directions and the Biomedical task received 15 submissions systems from 5 institutions. Evaluation was both automatic and manual (relative ranking and 100-point scale assessments). The quality estimation task had three subtasks, with a total of 14 teams, submitting 39 entries. The automatic post-editing task had a total of 6 teams, submitting 11 entries.

616 citations


Cites methods from "Turning on the Turbo: Fast Third-Or..."

  • ...The following external resources were used: part-of-speech tags and extra syntactic dependency information obtained with TurboTagger and TurboParser (Martins et al., 2013), trained on the Penn Treebank (for English) and on the version of the German TIGER corpus used in the SPMRL shared task (Seddah et al., 2014) for German....

    [...]

  • ...The syntactic dependencies are predicted with TurboParser trained on the TIGER German treebank....

    [...]

Proceedings ArticleDOI
01 Jun 2014
TL;DR: The first approach to parsing sentences into Abstract Meaning Representation (AMR), a semantic formalism for which a growing set of annotated examples is available, is introduced, providing a strong baseline for future improvement.
Abstract: Abstract Meaning Representation (AMR) is a semantic formalism for which a growing set of annotated examples is available. We introduce the first approach to parse sentences into this representation, providing a strong baseline for future improvement. The method is based on a novel algorithm for finding a maximum spanning, connected subgraph, embedded within a Lagrangian relaxation of an optimization problem that imposes linguistically inspired constraints. Our approach is described in the general framework of structured prediction, allowing future incorporation of additional features and constraints, and may extend to other formalisms as well. Our open-source system, JAMR, is available at: http://github.com/jflanigan/jamr
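
For intuition about the "maximum spanning, connected subgraph" step mentioned above, here is a rough greedy sketch under a simplified, undirected reading of the problem: keep every positive-weight edge, then bridge the remaining components Kruskal-style. This is illustrative only, not necessarily JAMR's exact procedure or its Lagrangian-relaxation wrapper.

# Hedged sketch: greedy maximum spanning connected subgraph (undirected, simplified).
def mscg(n_nodes, edges):
    """edges: list of (u, v, weight) tuples. Returns the selected edge list."""
    parent = list(range(n_nodes))          # union-find over connected components

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    selected = [(u, v, w) for u, v, w in edges if w > 0]   # keep all positive edges
    for u, v, _ in selected:
        parent[find(u)] = find(v)
    for u, v, w in sorted(edges, key=lambda e: -e[2]):     # best bridges first
        if find(u) != find(v):                             # connects two components
            selected.append((u, v, w))
            parent[find(u)] = find(v)
    return selected

# toy usage: 4 nodes, mixed-sign weights
print(mscg(4, [(0, 1, 2.0), (1, 2, -1.0), (2, 3, 3.0), (0, 3, -0.5)]))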

342 citations


Cites methods from "Turning on the Turbo: Fast Third-Or..."

  • ...TurboParser (Martins et al., 2013) uses AD3 (Martins et al., 2011), a type of augmented Lagrangian relaxation, to integrate third-order features into a CLE backbone....

    [...]

Proceedings ArticleDOI
01 Jan 2017
TL;DR: The task and evaluation methodology are defined, the preparation of the data sets is described, the main results are reported and analyzed, and a brief categorization of the approaches of the participating systems is provided.
Abstract: The Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their learning systems on the same data sets. In 2017, the task was devoted to learning dependency parsers for a large number of languages, in a real-world setting without any gold-standard annotation on input. All test sets followed a unified annotation scheme, namely that of Universal Dependencies. In this paper, we define the task and evaluation methodology, describe how the data sets were prepared, report and analyze the main results, and provide a brief categorization of the different approaches of the participating systems.

281 citations

Proceedings ArticleDOI
01 Oct 2014
TL;DR: TWEEBOPARSER, a new dependency parser for English tweets, builds on several contributions: new syntactic annotations for a corpus of tweets, with conventions informed by the domain; adaptations to a statistical parsing algorithm; and a new approach to exploiting out-of-domain Penn Treebank data.
Abstract: We describe a new dependency parser for English tweets, TWEEBOPARSER. The parser builds on several contributions: new syntactic annotations for a corpus of tweets (TWEEBANK), with conventions informed by the domain; adaptations to a statistical parsing algorithm; and a new approach to exploiting out-of-domain Penn Treebank data. Our experiments show that the parser achieves over 80% unlabeled attachment accuracy on our new, high-quality test set and measure the benefit of our contributions. Our dataset and parser can be found at http://www.ark.cs.cmu.edu/TweetNLP.

227 citations


Cites methods from "Turning on the Turbo: Fast Third-Or..."

  • ...For parsing, we start with TurboParser, which is open-source and has been found to perform well on a range of parsing problems in different languages (Martins et al., 2013; Kong and Smith, 2014)....

    [...]

References
Book
01 Nov 2008
TL;DR: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization, responding to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems.
Abstract: Numerical Optimization presents a comprehensive and up-to-date description of the most effective methods in continuous optimization. It responds to the growing interest in optimization in engineering, science, and business by focusing on the methods that are best suited to practical problems. For this new edition the book has been thoroughly updated throughout. There are new chapters on nonlinear interior methods and derivative-free methods for optimization, both of which are used widely in practice and the focus of much current research. Because of the emphasis on practical methods, as well as the extensive illustrations and exercises, the book is accessible to a wide audience. It can be used as a graduate text in engineering, operations research, mathematics, computer science, and business. It also serves as a handbook for researchers and practitioners in the field. The authors have strived to produce a text that is pleasant to read, informative, and rigorous - one that reveals both the beautiful nature of the discipline and its practical side.

17,420 citations

Journal Article
TL;DR: This work presents a unified view for online classification, regression, and uni-class problems, and proves worst case loss bounds for various algorithms for both the realizable case and the non-realizable case.
Abstract: We present a family of margin based online learning algorithms for various prediction tasks. In particular we derive and analyze algorithms for binary and multiclass categorization, regression, uniclass prediction and sequence prediction. The update steps of our different algorithms are all based on analytical solutions to simple constrained optimization problems. This unified view allows us to prove worst-case loss bounds for the different algorithms and for the various decision problems based on a single lemma. Our bounds on the cumulative loss of the algorithms are relative to the smallest loss that can be attained by any fixed hypothesis, and as such are applicable to both realizable and unrealizable settings. We demonstrate some of the merits of the proposed algorithms in a series of experiments with synthetic and real data sets.
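
The closed-form updates the abstract refers to are easy to state. Below is a hedged sketch of the binary-classification PA-I update, one instance of the "analytical solution to a simple constrained optimization problem" mentioned above; the aggressiveness parameter C and the dense feature encoding are assumptions for illustration.

# Hedged sketch: passive-aggressive (PA-I) update for binary classification.
import numpy as np

def pa_update(w, x, y, C=1.0):
    """w: weight vector, x: feature vector, y in {-1, +1}. Returns updated w."""
    loss = max(0.0, 1.0 - y * np.dot(w, x))        # hinge loss on this example
    if loss > 0.0:
        tau = min(C, loss / np.dot(x, x))          # closed-form step size (PA-I)
        w = w + tau * y * x                        # move just enough to fix the margin
    return w

# toy usage
w = pa_update(np.zeros(3), np.array([1.0, -2.0, 0.5]), y=+1)
print(w)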

1,690 citations


"Turning on the Turbo: Fast Third-Or..." refers methods in this paper

  • ...To this end, we converted the Penn Treebank to dependencies through (i) the head rules of Yamada and Matsumoto (2003) (PTB-YM) and (ii) basic dependencies from the Stanford parser 2.0.5 (PTB-S). We trained by running 10 epochs of cost-augmented MIRA (Crammer et al., 2006)....

    [...]

Proceedings Article
09 Dec 2003
TL;DR: In this article, a unified view for online classification, regression, and uni-class problems is presented, which leads to a single algorithmic framework for the three problems, and the authors prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case.
Abstract: We present a unified view for online classification, regression, and uni-class problems. This view leads to a single algorithmic framework for the three problems. We prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case. A conversion of our main online algorithm to the setting of batch learning is also discussed. The end result is new algorithms and accompanying loss bounds for the hinge-loss.

1,543 citations

Proceedings ArticleDOI
08 Jun 2006
TL;DR: How treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured are described, and general conclusions about multi-lingual parsing are drawn.
Abstract: Each year the Conference on Computational Natural Language Learning (CoNLL) features a shared task, in which participants train and test their systems on exactly the same data sets, in order to better compare systems. The tenth CoNLL (CoNLL-X) saw a shared task on Multilingual Dependency Parsing. In this paper, we describe how treebanks for 13 languages were converted into the same dependency format and how parsing performance was measured. We also give an overview of the parsing approaches that participants took and the results that they achieved. Finally, we try to draw general conclusions about multi-lingual parsing: What makes a particular language, treebank or annotation scheme easier or harder to parse and which phenomena are challenging for any dependency parser?

1,011 citations


"Turning on the Turbo: Fast Third-Or..." refers methods in this paper

  • ...3), we used 14 datasets, most of which are non-projective, from the CoNLL 2006 and 2008 shared tasks (Buchholz and Marsi, 2006; Surdeanu et al., 2008)....

    [...]

Proceedings ArticleDOI
06 Oct 2005
TL;DR: Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n³) time and is extended naturally to non-projective parsing using Chu-Liu-Edmonds (Chu and Liu, 1965; Edmonds, 1967) MST algorithm, yielding an O(n²) parsing algorithm.
Abstract: We formalize weighted dependency parsing as searching for maximum spanning trees (MSTs) in directed graphs. Using this representation, the parsing algorithm of Eisner (1996) is sufficient for searching over all projective trees in O(n³) time. More surprisingly, the representation is extended naturally to non-projective parsing using Chu-Liu-Edmonds (Chu and Liu, 1965; Edmonds, 1967) MST algorithm, yielding an O(n²) parsing algorithm. We evaluate these methods on the Prague Dependency Treebank using online large-margin learning techniques (Crammer et al., 2003; McDonald et al., 2005) and show that MST parsing increases efficiency and accuracy for languages with non-projective dependencies.
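
A minimal sketch of the arc-factored non-projective decoding described above, delegating the Chu-Liu-Edmonds step to networkx's maximum spanning arborescence rather than hand-rolling the contraction algorithm. The score-matrix layout, with index 0 as the artificial root, is an assumption for illustration.

# Hedged sketch: non-projective decoding as a maximum spanning arborescence.
import networkx as nx
import numpy as np

def decode_mst(arc_scores):
    """arc_scores[h, m]: score of arc h -> m; index 0 is the artificial root."""
    n = arc_scores.shape[0]
    g = nx.DiGraph()
    for h in range(n):
        for m in range(1, n):                      # no arcs into the root
            if h != m:
                g.add_edge(h, m, weight=arc_scores[h, m])
    tree = nx.maximum_spanning_arborescence(g)     # Chu-Liu-Edmonds / Edmonds
    heads = {m: h for h, m in tree.edges()}
    return [heads.get(m, 0) for m in range(1, n)]  # head index for each word

# toy usage: 3 words plus the root
print(decode_mst(np.random.rand(4, 4)))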

980 citations


"Turning on the Turbo: Fast Third-Or..." refers background or methods in this paper

  • ...We use an arc-factored score function (McDonald et al., 2005): f_TREE(z) = ∑_{m=1}^{L} σ_ARC(π(m), m), where π(m) is the parent of the mth word according to the parse tree z, and σ_ARC(h, m) is the score of an individual arc.... (a small computational sketch follows after this list)

    [...]

  • ...First-order models factor over arcs (Eisner, 1996; McDonald et al., 2005), and second-order models include also consecutive siblings and grandparents (Carreras, 2007)....

    [...]
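
The arc-factored score quoted in the first bullet above is straightforward to compute from an arc score matrix and a head assignment; the sketch below assumes arc_scores[h, m] holds the score of arc h -> m, with index 0 as the root and words numbered from 1.

# Hedged sketch of f_TREE(z) = sum over m of sigma_ARC(pi(m), m):
# arc_scores[h, m] plays the role of sigma_ARC(h, m), heads[m-1] is pi(m).
import numpy as np

def tree_score(arc_scores, heads):
    return sum(arc_scores[h, m] for m, h in enumerate(heads, start=1))

# toy usage: three words, all attached to the root
print(tree_score(np.random.rand(4, 4), heads=[0, 0, 0]))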