Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Journal Article
TL;DR: This paper proposes a novel approach for text classification based on clustering word embeddings, inspired by the bag of visual words model, and reports results on two text mining tasks, namely text categorization by topic and polarity classification.
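The idea of clustering word embeddings, by analogy with the bag of visual words model, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 2-D toy vectors, the two hard-coded centroids (which in practice would come from a clustering step such as k-means), and the names `EMBEDDINGS`, `CENTROIDS`, and `bag_of_clusters`. It is not the paper's implementation.

```python
import math
from collections import Counter

# Toy 2-D "embeddings" (assumption); real vectors would come from a pretrained model.
EMBEDDINGS = {
    "goal": (0.9, 0.1),
    "match": (0.8, 0.2),
    "team": (0.85, 0.15),
    "vote": (0.1, 0.9),
    "senate": (0.2, 0.8),
    "law": (0.15, 0.85),
}

# Hypothetical cluster centroids, standing in for the output of k-means.
CENTROIDS = [(0.85, 0.15), (0.15, 0.85)]


def nearest_centroid(vec):
    """Return the index of the centroid closest to vec (Euclidean distance)."""
    return min(range(len(CENTROIDS)), key=lambda i: math.dist(vec, CENTROIDS[i]))


def bag_of_clusters(tokens):
    """Histogram of cluster assignments for a document, normalized to sum to 1."""
    counts = Counter(
        nearest_centroid(EMBEDDINGS[t]) for t in tokens if t in EMBEDDINGS
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(CENTROIDS)
    return [counts[i] / total for i in range(len(CENTROIDS))]


# Three tokens fall in cluster 0 and one in cluster 1.
features = bag_of_clusters(["team", "goal", "match", "vote"])
print(features)  # [0.75, 0.25]
```

The resulting fixed-length histogram could then be passed to any standard classifier, which is the role the visual-word histogram plays in the image setting.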

19 citations

01 Jan 2016
TL;DR: This study indicates that assessing the validity of a system-response given a human-utterance is subjective to a large extent and is thus a difficult task.
Abstract: This paper studies a corpus-based process for selecting a system-response usable either in a chatterbot or as a fallback strategy. It presents, evaluates, and compares two selection methods that retrieve and adapt a system-response from the OpenSubtitles2016 corpus given a human-utterance. A corpus of 800 annotated pairs is constituted. Evaluation consists of objective metrics and subjective annotation based on the validity schema proposed in the RE-WOCHAT shared task. Our study indicates that assessing the validity of a system-response given a human-utterance is subjective to a large extent and is thus a difficult task. Comparisons show that the selection method based on word embedding performs objectively better than the one based on TF-IDF in terms of response variety and response length.
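One common way to realize embedding-based response selection is to average the word vectors of each utterance and pick the stored response whose utterance is closest by cosine similarity. The sketch below assumes that approach; the abstract does not specify the method, so this should not be read as the authors' procedure. The toy vectors, the two stored pairs, and the function names are all hypothetical.

```python
import math

# Toy 3-D word vectors (assumption); a real system would load pretrained embeddings.
VECTORS = {
    "hello": (1.0, 0.0, 0.0),
    "hi": (0.9, 0.1, 0.0),
    "weather": (0.0, 1.0, 0.0),
    "rain": (0.1, 0.9, 0.0),
    "today": (0.0, 0.2, 0.8),
}

# Hypothetical (utterance tokens, response) pairs standing in for dialogue turns.
PAIRS = [
    (["hello"], "Hello, how are you?"),
    (["weather", "today"], "It looks cloudy."),
]


def sentence_vector(tokens):
    """Average the vectors of known tokens; return None if no token is known."""
    known = [VECTORS[t] for t in tokens if t in VECTORS]
    if not known:
        return None
    return tuple(sum(dim) / len(known) for dim in zip(*known))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_response(tokens):
    """Return the response whose stored utterance is most similar to the query."""
    query = sentence_vector(tokens)
    if query is None:
        return None
    best = max(PAIRS, key=lambda pair: cosine(query, sentence_vector(pair[0])))
    return best[1]


greeting_reply = select_response(["hi"])
weather_reply = select_response(["rain", "today"])
print(greeting_reply)
print(weather_reply)
```

Note that "hi" and "rain" never appear in the stored utterances; they still match because their vectors lie close to "hello" and "weather". A TF-IDF method, which relies on shared terms, would find no overlap for "hi" in this toy setup.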

19 citations

Proceedings Article
01 Feb 2020
TL;DR: This paper designs a multi-level embedding (i.e., word embedding and character embedding) approach to represent the semantics provided by code changes and reviews.
Abstract: Code review is a common process used by developers, in which a reviewer provides useful comments or points out defects in submitted source code changes via a pull request. Code review has been widely used in both industry and open-source projects due to its capacity for early defect identification, project maintenance, and code improvement. With rapid updates in project development, code review becomes a non-trivial and labor-intensive task for reviewers. Thus, an automated code review engine can be beneficial and useful for project development in practice. Although there exist prior studies on automating the code review process by adopting static analysis tools or deep learning techniques, they often require external sources such as partial or full source code for accurate review suggestion. In this paper, we aim at automating the code review process based only on code changes and the corresponding reviews, but with better performance. The hinge of accurate code review suggestion is to learn good representations for both code changes and reviews. To achieve this with limited source, we design a multi-level embedding (i.e., word embedding and character embedding) approach to represent the semantics provided by code changes and reviews. The embeddings are then well trained through a proposed attentional deep learning model, as a whole named CORE. We evaluate the effectiveness of CORE on code changes and reviews collected from 19 popular Java projects hosted on GitHub. Experimental results show that our model CORE can achieve significantly better performance than the state-of-the-art model (DeepMem), with an increase of 131.03% in terms of Recall@10 and 150.69% in terms of Mean Reciprocal Rank. Qualitative general word analysis among project developers also demonstrates the performance of CORE in automating code review.
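The two ranking metrics named in the abstract, Recall@10 and Mean Reciprocal Rank, can be sketched as follows. This assumes one ground-truth review per code change, so Recall@k reduces to the fraction of queries whose true review appears in the top k; the abstract does not give the exact definition used. The review identifiers and function names are hypothetical.

```python
def recall_at_k(rankings, truths, k):
    """Fraction of queries whose ground-truth item appears in the top-k results."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth in ranked[:k])
    return hits / len(truths)


def mean_reciprocal_rank(rankings, truths):
    """Mean of 1/rank of the ground-truth item; a missing item contributes 0."""
    total = 0.0
    for ranked, truth in zip(rankings, truths):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(truths)


# Hypothetical ranked review suggestions for three code changes.
rankings = [
    ["r1", "r2", "r3"],  # truth at rank 1
    ["r5", "r4", "r6"],  # truth at rank 2
    ["r7", "r8", "r9"],  # truth not retrieved
]
truths = ["r1", "r4", "r0"]

recall_2 = recall_at_k(rankings, truths, k=2)
mrr = mean_reciprocal_rank(rankings, truths)
print(recall_2, mrr)
```

Here two of the three queries have their true review in the top 2, and the reciprocal ranks are 1, 1/2, and 0, so the MRR is 0.5.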

19 citations

Journal Article
TL;DR: An alarm prediction method based on word embedding and recurrent neural networks to predict the next alarm in a process setting is presented, which represents both a novel approach to alarm management as well as a novel application of natural language processing and deep learning techniques to this problem.
Abstract: Industrial alarm systems play an essential role in the safe management of process operations. With the increase in automation and instrumentation of modern process plants, the number of alarms that operators manage has also increased significantly. Operators are expected to make critical decisions in the presence of alarm floods, poorly configured and maintained alarms, and many nuisance alarms. In this environment, if incoming alarms can be correctly predicted before they actually occur, operators may have a chance to address and possibly avoid abnormal behaviors by taking corrective actions in time. Inspired by the application of deep learning in natural language processing, this paper presents an alarm prediction method based on word embedding and recurrent neural networks to predict the next alarm in a process setting. This represents both a novel approach to alarm management and a novel application of natural language processing and deep learning techniques to this problem. The proposed method is applied to an actual case study to demonstrate its performance.
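Treating an alarm log like a sentence means each alarm tag plays the role of a word, and next-alarm prediction becomes next-token prediction. The sketch below shows only the data preparation such a setup would need: mapping tags to integer ids (the indices an embedding layer would look up) and slicing the log into (context, next alarm) pairs. The alarm tags, the window size, and the function names are assumptions; the embedding and recurrent model themselves are not shown.

```python
# Hypothetical alarm tags in chronological order.
alarm_log = ["PUMP_LOW", "TEMP_HIGH", "VALVE_FAULT", "TEMP_HIGH", "PUMP_LOW"]


def build_vocab(log):
    """Assign each distinct alarm tag an integer id in order of first appearance."""
    vocab = {}
    for tag in log:
        vocab.setdefault(tag, len(vocab))
    return vocab


def next_alarm_pairs(log, vocab, window):
    """Slide a fixed window over the log and pair each context with the next alarm id."""
    ids = [vocab[tag] for tag in log]
    return [(ids[i:i + window], ids[i + window]) for i in range(len(ids) - window)]


vocab = build_vocab(alarm_log)
pairs = next_alarm_pairs(alarm_log, vocab, window=2)
print(vocab)
print(pairs)
```

Each pair would then serve as one training example: the context ids are embedded and fed to the recurrent network, and the target id is the alarm the network should predict.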

19 citations

Journal Article
TL;DR: Results reveal that the proposed model, which combines word embeddings with an LSTM, performs better than existing state-of-the-art models and shows an accuracy of 97%, precision of 83%, recall of 71%, and F1-score of 76.53%.
Abstract: User reviews on social networks have rapidly gained interest as input for sentiment analysis, which serves as feedback to governments and to public and private companies. Text mining has a wide variety of applications, such as sentiment analysis, spam detection, sarcasm detection, and news classification. Classifying reviews by user sentiment is an important and collaborative task for many organizations. In recent years, text classification has mostly been studied with machine learning models and hand-crafted features, which do not give promising results on short-text classification. In this research, a deep neural network-based model, Long Short-Term Memory (LSTM) with word embedding features, is proposed. The proposed model is evaluated on a large dataset of hotel reviews in terms of accuracy, precision, recall, and F1-score. This research is a classification study of the sentiments expressed in hotel reviews written by guests. The results reveal that the proposed model, which combines word embeddings with an LSTM, performs better than existing state-of-the-art models and shows an accuracy of 97%, precision of 83%, recall of 71%, and F1-score of 76.53%. These promising results reveal the effectiveness of the proposed model on any type of review classification task.
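As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall. The short sketch below recomputes it from the stated precision (83%) and recall (71%); the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.83, 0.71)
print(f"{f1:.4f}")  # 0.7653
```

The result agrees with the reported F1-score of 76.53%. The large gap between the 97% accuracy and the 76.53% F1-score is the kind of difference that can arise when classes are imbalanced, although the abstract does not describe the class distribution.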

18 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 87% related
Unsupervised learning: 22.7K papers, 1M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Reinforcement learning: 46K papers, 1M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788