scispace - formally typeset
Open AccessJournal ArticleDOI

Deep sentence embedding using long short-term memory networks: analysis and application to information retrieval

Reads0
Chats0
TLDR
In this article, the LSTM-RNN model was used for sentence embedding in a web search engine and the results showed that the proposed method significantly outperformed the Paragraph Vector method for web document retrieval task.
Abstract
This paper develops a model that addresses sentence embedding, a hot topic in current natural language processing research, using recurrent neural networks (RNN) with Long Short-Term Memory (LSTM) cells. The proposed LSTM-RNN model sequentially takes each word in a sentence, extracts its information, and embeds it into a semantic vector. Due to its ability to capture long term memory, the LSTM-RNN accumulates increasingly richer information as it goes through the sentence, and when it reaches the last word, the hidden layer of the network provides a semantic representation of the whole sentence. In this paper, the LSTM-RNN is trained in a weakly supervised manner on user click-through data logged by a commercial web search engine. Visualization and analysis are performed to understand how the embedding process works. The model is found to automatically attenuate the unimportant words and detect the salient keywords in the sentence. Furthermore, these detected keywords are found to automatically activate different cells of the LSTM-RNN, where words belonging to a similar topic activate the same cell. As a semantic representation of the sentence, the embedding vector can be used in many different applications. These automatic keyword detection and topic allocation abilities enabled by the LSTM-RNN allow the network to perform document retrieval, a difficult language processing task, where the similarity between the query and documents can be measured by the distance between their corresponding sentence embedding vectors computed by the LSTM-RNN. On a web search task, the LSTM-RNN embedding is shown to significantly outperform several existing state of the art methods. We emphasize that the proposed model generates sentence embedding vectors that are specially useful for web document retrieval tasks. A comparison with a well known general sentence embedding method, the Paragraph Vector, is performed. The results show that the proposed method in this paper significantly outperforms Paragraph Vector method for web document retrieval task.

read more

Content maybe subject to copyright    Report

Citations
More filters
Journal ArticleDOI

Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network

TL;DR: A Machine Learning practitioner seeking guidance for implementing the new augmented LSTM model in software for experimentation and research will find the insights and derivations in this treatise valuable as well.
Journal ArticleDOI

A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures

TL;DR: The LSTM cell and its variants are reviewed and their variants are explored to explore the learning capacity of the LSTm cell and the L STM networks are divided into two broad categories:LSTM-dominated networks and integrated LSTS networks.
Journal ArticleDOI

A deep learning framework for financial time series using stacked autoencoders and long-short term memory

TL;DR: A novel deep learning framework where wavelet transforms, stacked autoencoders and long-short term memory are combined for stock price forecasting and shows that the proposed model outperforms other similar models in both predictive accuracy and profitability performance.
Proceedings ArticleDOI

Deep code search

TL;DR: A novel deep neural network named CODEnn (Code-Description Embedding Neural Network) is proposed, which jointly embeds code snippets and natural language descriptions into a high-dimensional vector space, in such a way that code snippet and its corresponding description have similar vectors.
Proceedings ArticleDOI

Learning to Match using Local and Distributed Representations of Text for Web Search

TL;DR: This work proposes a novel document ranking model composed of two separate deep neural networks, one that matches the query and the document using a local representation, and another that Matching with distributed representations complements matching with traditional local representations.
References
More filters
Journal ArticleDOI

Long short-term memory

TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
Proceedings Article

Distributed Representations of Words and Phrases and their Compositionality

TL;DR: This paper presents a simple method for finding phrases in text, and shows that learning good vector representations for millions of phrases is possible and describes a simple alternative to the hierarchical softmax called negative sampling.
Proceedings Article

Neural Machine Translation by Jointly Learning to Align and Translate

TL;DR: It is conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and it is proposed to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Proceedings Article

Sequence to Sequence Learning with Neural Networks

TL;DR: The authors used a multilayered Long Short-Term Memory (LSTM) to map the input sequence to a vector of a fixed dimensionality, and then another deep LSTM to decode the target sequence from the vector.
Journal ArticleDOI

Finding Structure in Time

TL;DR: A proposal along these lines first described by Jordan (1986) which involves the use of recurrent links in order to provide networks with a dynamic memory and suggests a method for representing lexical categories and the type/token distinction is developed.
Related Papers (5)