Open Access Proceedings Article
Better Word Representations with Recursive Neural Networks for Morphology
Thang Luong, Richard Socher, Christopher D. Manning
pp. 104–113
TL;DR
This paper combines recursive neural networks, where each morpheme is a basic unit, with neural language models to consider contextual information in learning morphologically-aware word representations, and proposes a novel model capable of building representations for morphologically complex words from their morphemes.
Abstract:
Vector-space word representations have been very successful in recent years at improving performance across a variety of NLP tasks. However, common to most existing work, words are regarded as independent entities without any explicit relationship among morphologically related words being modeled. As a result, rare and complex words are often poorly estimated, and all unknown words are represented in a rather crude way using only one or a few vectors. This paper addresses this shortcoming by proposing a novel model that is capable of building representations for morphologically complex words from their morphemes. We combine recursive neural networks (RNNs), where each morpheme is a basic unit, with neural language models (NLMs) to consider contextual information in learning morphologically-aware word representations. Our learned models outperform existing word representations by a good margin on word similarity tasks across many datasets, including a new dataset we introduce focused on rare words to complement existing ones in an interesting way.
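The abstract's core idea, composing a complex word's vector bottom-up from morpheme vectors with a recursive neural network, can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the morphemes, embedding dimension, and initialization are all hypothetical.

```python
# Minimal sketch of building a word vector from morpheme vectors with a
# recursive composition function (parent = tanh(W [stem; affix] + b)).
import numpy as np

rng = np.random.default_rng(0)
d = 5  # embedding dimension (hypothetical)

# Hypothetical morpheme embeddings.
morpheme_vecs = {
    "un": rng.standard_normal(d),
    "fortunate": rng.standard_normal(d),
    "ly": rng.standard_normal(d),
}

# Composition parameters: W maps a concatenated [stem; affix] pair
# back to a d-dimensional parent vector.
W = rng.standard_normal((d, 2 * d)) * 0.1
b = np.zeros(d)

def compose(stem, affix):
    """One recursive step: combine two morpheme-level vectors."""
    return np.tanh(W @ np.concatenate([stem, affix]) + b)

# "unfortunately" is built bottom-up: ("un" + "fortunate"), then "+ ly".
unfortunate = compose(morpheme_vecs["fortunate"], morpheme_vecs["un"])
unfortunately = compose(unfortunate, morpheme_vecs["ly"])
print(unfortunately.shape)  # (5,)
```

In the paper, W and b (and the morpheme embeddings) are trained jointly with a neural language model so that the composed word vectors also fit their sentence contexts; the sketch above shows only the forward composition.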
Citations
Proceedings Article
Multimodal Word Distributions
TL;DR: The authors introduce multimodal word distributions formed from Gaussian mixtures, which capture multiple word meanings, entailment, and rich uncertainty information, and propose an energy-based max-margin objective for training them.
Posted Content
Maybe Deep Neural Networks are the Best Choice for Modeling Source Code
TL;DR: This work presents a new open-vocabulary neural language model for code that is not limited to a fixed vocabulary of identifier names, and achieves best-in-class performance, outperforming even the state-of-the-art methods of Hellendoorn and Devanbu that are designed specifically to model code.
Proceedings Article
Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations
Aditi Chaudhary, Chunting Zhou, Lori Levin, Graham Neubig, David R. Mortensen, Jaime G. Carbonell
TL;DR: Presents two approaches for improving generalization to low-resource languages by adapting continuous word representations using linguistically motivated subword units: phonemes, morphemes, and graphemes.
Posted Content
The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling
Tu Anh Nguyen, Maureen de Seyssel, Patricia Rozé, Morgane Riviere, Evgeny Kharitonov, Alexei Baevski, Ewan Dunbar, Emmanuel Dupoux
TL;DR: Introduces a new unsupervised task, spoken language modeling (learning linguistic representations from raw audio signals without any labels), along with the Zero Resource Speech Benchmark 2021: a suite of four black-box, zero-shot metrics probing the quality of the learned models at four linguistic levels: phonetics, lexicon, syntax, and semantics.
Proceedings Article
Learning Word Representations by Jointly Modeling Syntagmatic and Paradigmatic Relations
TL;DR: This paper proposes two novel distributional models for word representation using both syntagmatic and paradigmatic relations via a joint training objective and demonstrates that the proposed models can perform significantly better than all the state-of-the-art baseline methods on both tasks.
References
Journal Article
WordNet: a lexical database for English
TL;DR: WordNet provides a more effective combination of traditional lexicographic information and modern computing, and is an online lexical database designed for use under program control.
Journal Article
A neural probabilistic language model
TL;DR: The authors propose to learn a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences, which can be expressed in terms of these representations.
Journal Article
Natural Language Processing (Almost) from Scratch
TL;DR: A unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling is proposed.
Proceedings Article
A unified architecture for natural language processing: deep neural networks with multitask learning
Ronan Collobert, Jason Weston
TL;DR: This work describes a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense using a language model.
Proceedings Article
Recurrent neural network based language model
TL;DR: Results indicate that it is possible to obtain around a 50% reduction in perplexity by using a mixture of several RNN LMs, compared to a state-of-the-art backoff language model.
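The mixture-of-models idea in that last result can be illustrated with a tiny, self-contained sketch (the per-token probabilities below are made up, not from the paper): interpolating several models' next-word probabilities and comparing perplexities.

```python
# Illustrative sketch: averaging two language models' per-token
# probabilities on the same test sequence and measuring perplexity.
import math

# Hypothetical probabilities each model assigns to the same four test tokens.
model_a = [0.20, 0.05, 0.10, 0.15]
model_b = [0.10, 0.15, 0.20, 0.05]

def perplexity(probs):
    """Perplexity = exp of the average negative log-likelihood."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))

# A uniform mixture averages the models' probabilities token by token.
mixture = [(a + b) / 2 for a, b in zip(model_a, model_b)]

print(perplexity(model_a))
print(perplexity(mixture))
```

On these toy numbers the mixture's perplexity is lower than either single model's, which is the effect the paper reports at much larger scale with several RNN LMs interpolated with a backoff model.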