Open Access · Posted Content
Enriching Word Vectors with Subword Information
TLDR
A new approach based on the skipgram model, in which each word is represented as a bag of character n-grams and word vectors are the sum of these n-gram representations, achieving state-of-the-art performance on word similarity and analogy tasks.

Abstract:
Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character $n$-grams. A vector representation is associated with each character $n$-gram; words are represented as the sum of these representations. Our method is fast, allowing us to train models on large corpora quickly, and lets us compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, on both word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.
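The approach described in the abstract can be sketched in a few lines. This is a minimal illustration, assuming n-grams of length 3–6, boundary markers `<` and `>`, and a hashed embedding table; the function names `char_ngrams` and `word_vector`, the use of Python's built-in `hash`, and the bucket count are illustrative choices, not the authors' implementation (fastText itself uses the FNV-1a hash):

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Return the bag of character n-grams for a word. The word is
    wrapped in boundary markers '<' and '>' so that prefixes and
    suffixes are distinct from inner substrings; the whole wrapped
    word is also included as a special unit."""
    wrapped = f"<{word}>"
    ngrams = {wrapped}
    for n in range(n_min, n_max + 1):
        for i in range(len(wrapped) - n + 1):
            ngrams.add(wrapped[i:i + n])
    return ngrams

def word_vector(word, table, buckets):
    """Represent a word as the sum of its n-gram vectors, with each
    n-gram hashed into one of `buckets` rows of the embedding table."""
    vec = np.zeros(table.shape[1])
    for g in char_ngrams(word):
        vec += table[hash(g) % buckets]
    return vec
```

For example, with n = 3 the word "where" yields the n-grams `<wh`, `whe`, `her`, `ere`, `re>` plus the special whole-word sequence `<where>`. An out-of-vocabulary word still receives a vector because its character n-grams are shared with words seen in training.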
Citations
Posted Content
Recent Trends in Deep Learning Based Natural Language Processing
TL;DR: Deep learning methods employ multiple processing layers to learn hierarchical representations of data and have produced state-of-the-art results in many domains, including natural language processing (NLP).
Posted Content
Poincaré Embeddings for Learning Hierarchical Representations
Maximilian Nickel, Douwe Kiela
TL;DR: The authors embed symbolic data into an n-dimensional Poincaré ball to learn parsimonious representations that simultaneously capture hierarchy and similarity, using Riemannian optimization to learn the embeddings.
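The geometry behind these embeddings can be made concrete with the distance function on the Poincaré ball, d(u, v) = arcosh(1 + 2‖u−v‖² / ((1−‖u‖²)(1−‖v‖²))), defined for points strictly inside the unit ball. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit
    Poincare ball. Distances grow rapidly near the boundary, which is
    what lets the ball embed tree-like hierarchies compactly."""
    sq_dist = np.sum((u - v) ** 2)
    arg = 1.0 + 2.0 * sq_dist / ((1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2)))
    return np.arccosh(arg)
```

Intuitively, points near the origin behave like roots of a hierarchy, while children sit closer to the boundary, where the available space grows exponentially.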
Proceedings ArticleDOI
Deep Learning for Entity Matching: A Design Space Exploration
Sidharth Mudgal, Han Li, Theodoros Rekatsinas, AnHai Doan, Youngchoon Park, Ganesh Krishnan, Rohit Deep, Esteban Arcaute, Vijay Raghavendra
TL;DR: The results show that DL does not outperform current solutions on structured EM, but it can significantly outperform them on textual and dirty EM, which suggests that practitioners should seriously consider using DL for textual and dirty EM problems.
Posted Content
Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks
Nils Reimers, Iryna Gurevych
TL;DR: This paper evaluates the importance of different network design choices and hyperparameters for five common linguistic sequence tagging tasks and finds that some parameters, like the pre-trained word embeddings or the last layer of the network, have a large impact on performance, while others, such as the number of LSTM layers or the number of recurrent units, are of minor importance.
Journal ArticleDOI
Deep Recurrent neural network vs. support vector machine for aspect-based sentiment analysis of Arabic hotels’ reviews
TL;DR: State-of-the-art approaches based on supervised machine learning are presented to address the challenges of aspect-based sentiment analysis (ABSA) of Arabic hotels' reviews; the SVM approach outperforms the deep RNN approach on the investigated tasks.
References
Posted Content
Efficient Estimation of Word Representations in Vector Space
TL;DR: Two novel model architectures for computing continuous vector representations of words from very large data sets are proposed; the quality of these representations is measured on a word similarity task, and the results are compared with the previously best-performing techniques based on different types of neural networks.
Journal ArticleDOI
Indexing by Latent Semantic Analysis
TL;DR: A new method for automatic indexing and retrieval to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Posted Content
Distributed Representations of Words and Phrases and their Compositionality
TL;DR: In this paper, the Skip-gram model is used to learn high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships and improve both the quality of the vectors and the training speed.
Proceedings ArticleDOI
Neural Machine Translation of Rare Words with Subword Units
TL;DR: This paper introduces a simpler and more effective approach, making the NMT model capable of open-vocabulary translation by encoding rare and unknown words as sequences of subword units, and empirically shows that subword models improve over a back-off dictionary baseline for the WMT 15 translation tasks English-German and English-Russian by 1.3 BLEU.
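The subword segmentation this summary refers to is byte-pair encoding (BPE): starting from characters, the most frequent adjacent symbol pair is repeatedly merged into a new symbol. A minimal sketch of the merge-learning step, assuming a word-frequency dictionary as input (the name `learn_bpe` and the `</w>` end-of-word marker follow the paper's description, but this is an illustrative reimplementation, not the authors' released code):

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge operations from a {word: frequency} dict.
    Each word starts as a tuple of characters plus an end-of-word
    marker; each iteration merges the most frequent adjacent pair."""
    vocab = {tuple(w) + ("</w>",): f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for syms, freq in vocab.items():
            for a, b in zip(syms, syms[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = {}
        for syms, freq in vocab.items():
            out, i = [], 0
            while i < len(syms):
                # Apply the chosen merge wherever the pair occurs.
                if i < len(syms) - 1 and (syms[i], syms[i + 1]) == best:
                    out.append(syms[i] + syms[i + 1])
                    i += 2
                else:
                    out.append(syms[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges
```

Frequent words end up encoded as single symbols while rare and unseen words decompose into smaller, known units, which is what makes open-vocabulary translation possible.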
Proceedings ArticleDOI
A unified architecture for natural language processing: deep neural networks with multitask learning
Ronan Collobert, Jason Weston
TL;DR: This work describes a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense using a language model.