Open Access · Posted Content
Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs
TL;DR
Extensions to a continuous-state dependency parsing method make it applicable to morphologically rich languages by replacing lookup-based word representations with representations constructed from the orthographic forms of the words, also using LSTMs.
Abstract
We present extensions to a continuous-state dependency parsing method that make it applicable to morphologically rich languages. Starting with a high-performance transition-based parser that uses long short-term memory (LSTM) recurrent neural networks to learn representations of the parser state, we replace lookup-based word representations with representations constructed from the orthographic representations of the words, also using LSTMs. This allows statistical sharing across word forms that are similar on the surface. Experiments on morphologically rich languages show that the parsing model benefits from incorporating character-based encodings of words.
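The core idea of the abstract can be sketched in a few lines: instead of looking a word up in an embedding table, run an LSTM over its characters in both directions and concatenate the final states into a fixed-size word vector. The following is a minimal illustrative sketch, not the paper's implementation; all names, dimensions, and initializations are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
CHARS = "abcdefghijklmnopqrstuvwxyz"
char_emb = {c: rng.standard_normal(8) for c in CHARS}  # character embeddings

D_IN, D_H = 8, 16  # illustrative dimensions

def make_lstm_params():
    # One (weight, bias) pair per gate, acting on the [x; h] concatenation.
    return {g: (rng.standard_normal((D_H, D_IN + D_H)) * 0.1, np.zeros(D_H))
            for g in ("i", "f", "o", "g")}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_run(params, xs):
    h, c = np.zeros(D_H), np.zeros(D_H)
    for x in xs:
        z = np.concatenate([x, h])
        i = sigmoid(params["i"][0] @ z + params["i"][1])
        f = sigmoid(params["f"][0] @ z + params["f"][1])
        o = sigmoid(params["o"][0] @ z + params["o"][1])
        g = np.tanh(params["g"][0] @ z + params["g"][1])
        c = f * c + i * g          # constant-error-carousel cell update
        h = o * np.tanh(c)
    return h

fwd, bwd = make_lstm_params(), make_lstm_params()

def char_word_rep(word):
    xs = [char_emb[ch] for ch in word]
    # Concatenate the final states of a forward and a backward pass.
    return np.concatenate([lstm_run(fwd, xs), lstm_run(bwd, xs[::-1])])

vec = char_word_rep("parsing")
print(vec.shape)  # (32,) — one fixed-size vector per word form
```

Because the character LSTM parameters are shared across all word forms, surface-similar words (and out-of-vocabulary forms) share statistics, which is the source of the gains on morphologically rich languages.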
Citations
Journal Article (DOI)
Enriching Word Vectors with Subword Information
TL;DR: This paper proposes a new approach based on the skip-gram model in which each word is represented as a bag of character n-grams, with word vectors formed as the sum of these representations, allowing models to be trained quickly on large corpora and word representations to be computed for words that did not appear in the training data.
Posted Content
Enriching Word Vectors with Subword Information
TL;DR: A new approach based on the skip-gram model, in which each word is represented as a bag of character n-grams and word vectors are formed as the sum of these representations, achieving state-of-the-art performance on word similarity and analogy tasks.
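The subword idea summarized in the two entries above can be illustrated concretely: extract a word's character n-grams (with boundary markers, plus the full word form) and sum one embedding per n-gram. This is a hedged sketch; the n-gram range of 3 to 6 matches the fastText paper, but the names and dimensions here are assumptions.

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    w = f"<{word}>"  # boundary markers distinguish prefixes and suffixes
    grams = {w[i:i + n] for n in range(n_min, n_max + 1)
             for i in range(len(w) - n + 1)}
    grams.add(w)  # the full word form is kept as its own unit
    return grams

rng = np.random.default_rng(0)
DIM = 10
emb = {}  # lazily populated n-gram embedding table (illustrative only)

def word_vector(word):
    vec = np.zeros(DIM)
    for g in char_ngrams(word):
        if g not in emb:
            emb[g] = rng.standard_normal(DIM)
        vec += emb[g]
    return vec

# Morphologically related forms share many n-grams, so their vectors
# share parameters even when one form never occurred in training.
print(sorted(char_ngrams("parsing") & char_ngrams("parser")))
```

In contrast to the LSTM composition used in the parsing paper above, this bag-of-n-grams model is order-insensitive within each n-gram window, which keeps training fast on large corpora.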
Proceedings Article
Character-aware neural language models
TL;DR: A simple neural language model that relies only on character-level inputs is able to encode, from characters alone, both semantic and orthographic information, suggesting that for many languages character inputs are sufficient for language modeling.
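Character-aware language models of the kind described above typically build a word feature vector by convolving filters over the word's character embeddings and max-pooling over time. The sketch below illustrates that encoder only; the filter widths, counts, and names are assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
CHAR_DIM = 6
chars = {c: rng.standard_normal(CHAR_DIM) for c in "abcdefghijklmnopqrstuvwxyz"}

# One bank of 4 convolution filters per width; each filter sees a
# (width x CHAR_DIM) window of the word's character matrix.
FILTERS = {w: rng.standard_normal((4, w * CHAR_DIM)) * 0.1 for w in (2, 3, 4)}

def char_cnn(word):
    mat = np.stack([chars[c] for c in word])  # (word length, CHAR_DIM)
    feats = []
    for width, bank in FILTERS.items():
        windows = [mat[i:i + width].ravel()
                   for i in range(len(word) - width + 1)]
        conv = np.tanh(np.stack(windows) @ bank.T)  # (positions, 4)
        feats.append(conv.max(axis=0))  # max over time, one value per filter
    return np.concatenate(feats)  # fixed-size word feature vector

print(char_cnn("language").shape)  # (12,) = 3 widths x 4 filters
```

Unlike the recurrent character encoders above, each filter responds to a fixed-width character pattern (roughly, a learned n-gram detector), and max-pooling keeps the strongest match per filter.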
Posted Content
Exploring the limits of language modeling
TL;DR: This work explores recent advances in recurrent neural networks for large-scale language modeling and extends current models to deal with two key challenges of the task: corpus and vocabulary sizes, and the complex, long-term structure of language.
Book
Neural Network Methods in Natural Language Processing
Yoav Goldberg, Graeme Hirst, and 1 more
TL;DR: Neural networks are a family of powerful machine learning models that have been widely used in natural language processing applications such as machine translation, syntactic parsing, and multi-task learning.
References
Journal Article (DOI)
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge time lags in excess of 1000 discrete time steps by enforcing constant error flow through constant error carousels within special units.
Report (DOI)
Building a Large Annotated Corpus of English: The Penn Treebank
TL;DR: As a result of this grant, the researchers have published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Proceedings Article
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Richard Socher, Alex Perelygin, Jean Y. Wu, Jason Chuang, Christopher D. Manning, Andrew Y. Ng, Christopher Potts, and 6 more
TL;DR: A Sentiment Treebank with fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences presents new challenges for sentiment compositionality; the Recursive Neural Tensor Network is introduced to address them.
Journal Article (DOI)
LSTM: A Search Space Odyssey
TL;DR: This paper presents the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling; it observes that the studied hyperparameters are virtually independent and derives guidelines for their efficient adjustment.
Posted Content
Generating Sequences With Recurrent Neural Networks
TL;DR: This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time.