Open Access · Posted Content

Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs

TL;DR: Extensions to a continuous-state dependency parsing method make it applicable to morphologically rich languages by replacing lookup-based word representations with representations constructed from the orthographic representations of the words, also using LSTMs.
Abstract
We present extensions to a continuous-state dependency parsing method that make it applicable to morphologically rich languages. Starting with a high-performance transition-based parser that uses long short-term memory (LSTM) recurrent neural networks to learn representations of the parser state, we replace lookup-based word representations with representations constructed from the orthographic representations of the words, also using LSTMs. This allows statistical sharing across word forms that are similar on the surface. Experiments for morphologically rich languages show that the parsing model benefits from incorporating the character-based encodings of words.
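
The core idea can be sketched in a few lines: instead of looking a word up in an embedding table, run an LSTM over its characters and use the final states as the word's representation, so similar surface forms share parameters. Below is a minimal illustration assuming PyTorch; the class name, dimensions, and toy vocabulary are invented for the example, and the paper's parser combines such vectors with further features.

```python
# Minimal sketch of a character-based word encoder (assumption: PyTorch; names and sizes are illustrative).
import torch
import torch.nn as nn

class CharWordEncoder(nn.Module):
    """Builds a word representation from its characters with a bidirectional LSTM,
    instead of looking the word up in a word-level embedding table."""
    def __init__(self, n_chars, char_dim=32, hidden_dim=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.lstm = nn.LSTM(char_dim, hidden_dim, bidirectional=True, batch_first=True)

    def forward(self, char_ids):
        # char_ids: (1, word_length) tensor of character indices for one word
        embedded = self.char_emb(char_ids)          # (1, L, char_dim)
        _, (h_n, _) = self.lstm(embedded)           # h_n: (2, 1, hidden_dim)
        # Concatenate the final forward and backward states into one word vector.
        return torch.cat([h_n[0], h_n[1]], dim=-1)  # (1, 2 * hidden_dim)

# Usage: encode the word "parsing" with a toy character vocabulary.
vocab = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}
encoder = CharWordEncoder(n_chars=len(vocab))
ids = torch.tensor([[vocab[c] for c in "parsing"]])
word_vector = encoder(ids)  # shared parameters mean similar surface forms get related vectors
```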


Citations
Journal Article (DOI)

Enriching Word Vectors with Subword Information

TL;DR: This paper proposes a new approach based on the skip-gram model in which each word is represented as a bag of character n-grams and a word's vector is the sum of its n-gram representations, allowing models to be trained quickly on large corpora and word representations to be computed for words that did not appear in the training data.
Posted Content

Enriching Word Vectors with Subword Information

TL;DR: A new approach based on the skip-gram model in which each word is represented as a bag of character n-grams, with word vectors computed as the sum of these n-gram representations; the method achieves state-of-the-art performance on word similarity and analogy tasks.
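
Both versions describe the same subword scheme, which a short sketch can make concrete: a word's vector is the sum of vectors for its character n-grams (hashed into a fixed table), so even unseen words receive a representation. This is a toy illustration, not the fastText implementation; the n-gram range, bucket count, and dimensionality below are arbitrary assumptions.

```python
# Toy sketch of bag-of-character-n-gram word vectors (assumption: not the real fastText code).
import numpy as np

N_BUCKETS, DIM = 100_000, 100                           # far smaller than real fastText tables
ngram_table = np.random.randn(N_BUCKETS, DIM) * 0.01    # shared n-gram embedding table

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word, with boundary markers, plus the full word itself."""
    marked = f"<{word}>"
    grams = [marked[i:i + n] for n in range(n_min, n_max + 1)
             for i in range(len(marked) - n + 1)]
    return grams + [marked]

def word_vector(word):
    """Word representation = sum of its n-gram vectors, hashed into the fixed table."""
    rows = [hash(g) % N_BUCKETS for g in char_ngrams(word)]
    return ngram_table[rows].sum(axis=0)

# Even an unseen word form gets a vector from the n-grams it shares with known words.
vec = word_vector("unparseable")
```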
Proceedings Article

Character-aware neural language models

TL;DR: A simple neural language model that relies only on character-level inputs and is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
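
The character-level input of such a model can be sketched as a convolution over character embeddings followed by max-over-time pooling; in the paper the pooled features then pass through highway layers and feed a word-level LSTM language model. The sizes and names below are assumptions for illustration, not the paper's code.

```python
# Rough sketch of a character-CNN word feature extractor (assumed sizes; PyTorch).
import torch
import torch.nn as nn

class CharCNNWordFeatures(nn.Module):
    def __init__(self, n_chars, char_dim=16, n_filters=100, width=3):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.conv = nn.Conv1d(char_dim, n_filters, kernel_size=width)

    def forward(self, char_ids):
        # char_ids: (batch, word_length) character indices for a batch of words
        x = self.char_emb(char_ids).transpose(1, 2)  # (batch, char_dim, L)
        x = torch.tanh(self.conv(x))                 # (batch, n_filters, L - width + 1)
        return x.max(dim=2).values                   # max-over-time pooling -> (batch, n_filters)
```

In the paper, features of this kind replace word embeddings as the input to the recurrent language model.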
Posted Content

Exploring the limits of language modeling

TL;DR: This work explores recent advances in recurrent neural networks for large-scale language modeling and extends current models to address two key challenges of the task: corpus and vocabulary sizes, and the complex, long-term structure of language.
Book

Neural Network Methods in Natural Language Processing

TL;DR: Neural networks are a family of powerful machine learning models that have been widely used in natural language processing applications such as machine translation, syntactic parsing, and multi-task learning.
References
Journal Article (DOI)

Long short-term memory

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1,000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
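
For reference, the gating that enforces constant error flow can be written in the now-standard form below; this is the modern formulation with a forget gate, which postdates the original paper, and the notation is illustrative rather than the paper's own.

```latex
% Standard LSTM cell (modern form with a forget gate, added after the original 1997 paper).
\begin{align*}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i)        && \text{(input gate)} \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f)        && \text{(forget gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o)        && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate cell)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t  && \text{(constant error carousel)} \\
h_t &= o_t \odot \tanh(c_t)                       && \text{(hidden state)}
\end{align*}
```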
Report (DOI)

Building a large annotated corpus of English: the Penn Treebank

TL;DR: As a result of this grant, the researchers have published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, including a fully hand-parsed version of the classic Brown corpus.
Proceedings Article

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

TL;DR: A Sentiment Treebank that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality; the paper also introduces the Recursive Neural Tensor Network.
Journal ArticleDOI

LSTM: A Search Space Odyssey

TL;DR: This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling; it observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.
Posted Content

Generating Sequences With Recurrent Neural Networks

TL;DR: This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time.
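
Generation in this setting reduces to a loop that feeds each sample back in as the next input. A minimal sketch, assuming an already trained PyTorch model exposing a hypothetical step(x, state) method that returns next-step logits and the updated recurrent state:

```python
# Minimal ancestral-sampling loop (assumes a trained model; step() is a hypothetical interface).
import torch

def generate(model, start_token, length, state=None):
    samples, x = [], start_token
    for _ in range(length):
        logits, state = model.step(x, state)         # one prediction per time step
        probs = torch.softmax(logits, dim=-1)
        x = torch.multinomial(probs, num_samples=1)  # sample the next data point
        samples.append(x)
    return samples
```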