scispace - formally typeset

Word embedding

About: Word embedding is a research topic. Over the lifetime, 4683 publications have been published within this topic receiving 153378 citations. The topic is also known as: word embeddings.


Papers
Journal ArticleDOI
TL;DR: This study applied word embeddings as features for named entity recognition (NER) training, used CRF as the learning algorithm, and found that CCA exhibited the best performance.
Abstract: This study applied word embeddings as features for named entity recognition (NER) training and used CRF as the learning algorithm. Named entities are phrases that contain the names of persons, organizations and locations, and recognizing these entities in text is one of the important tasks of information extraction. Word embedding is helpful in many NLP learning algorithms: each word in a sentence is mapped to a real-valued vector in a low-dimensional space. We used GloVe, Word2Vec, and CCA as the embedding methods. The Reuters Corpus Volume 1 was used to create the word embeddings, and the CoNLL-2003 shared task corpus (English) was used for training and testing. Comparing the performance of the embedding techniques on NER, we found that CCA (85.96%) performed best on Test A and Word2Vec (80.72%) on Test B. Using word embeddings as NER features yields better results than a baseline that does not use them. To verify the quality of the embeddings, we also ran an additional experiment calculating the similarity between words.

39 citations
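The similarity check mentioned in the abstract above is typically done with cosine similarity between embedding vectors. A minimal sketch of that computation (the vocabulary and 4-dimensional vectors here are made-up illustrations, not the paper's actual GloVe/Word2Vec/CCA embeddings):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 4-dimensional embeddings (illustrative values only).
embeddings = {
    "london":  [0.9, 0.1, 0.3, 0.0],
    "paris":   [0.8, 0.2, 0.4, 0.1],
    "binding": [0.0, 0.9, 0.1, 0.8],
}

# A well-trained embedding should score related words higher than unrelated ones.
assert cosine_similarity(embeddings["london"], embeddings["paris"]) > \
       cosine_similarity(embeddings["london"], embeddings["binding"])
```

In practice the same check is run over real trained vectors: if nearest neighbours by cosine similarity are semantically related words, the embedding is judged to have trained well.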

Journal ArticleDOI
TL;DR: A novel hybrid extractive-abstractive model is proposed that combines BERT (Bidirectional Encoder Representations from Transformers) word embeddings with reinforcement learning, converting the human-written abstractive summaries into ground-truth labels.
Abstract: As a core task of natural language processing and information retrieval, automatic text summarization is widely applied in many fields. Two approaches to the task currently exist: abstractive and extractive. On this basis, we propose a novel hybrid extractive-abstractive model that combines BERT (Bidirectional Encoder Representations from Transformers) word embeddings with reinforcement learning. Firstly, we convert the human-written abstractive summaries into ground-truth labels. Secondly, we use BERT word embeddings as the text representation and pre-train the two sub-models separately. Finally, the extraction network and the abstraction network are bridged by reinforcement learning. To verify the performance of the model, we compare it with current popular automatic text summarization models on the CNN/Daily Mail dataset, using the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics as the evaluation method. Extensive experimental results show that the accuracy of the model improves markedly.

38 citations
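ROUGE, the evaluation metric named above, scores a candidate summary by its n-gram overlap with a reference summary. A minimal sketch of ROUGE-1 recall under simplified assumptions (whitespace tokenisation, no stemming or stopword handling, unlike full ROUGE implementations):

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams recovered by the candidate (clipped counts)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

reference = "the model improves summary accuracy"
candidate = "the model improves accuracy"
print(rouge1_recall(candidate, reference))  # 4 of 5 reference unigrams matched -> 0.8
```

Full ROUGE also reports precision and F1 and variants such as ROUGE-2 (bigrams) and ROUGE-L (longest common subsequence); the clipped-count recall above is the core idea.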

Proceedings ArticleDOI
01 Oct 2018
TL;DR: A sentiment analysis method that incorporates the Continuous Bag-of-Words model and a Stacked Bidirectional long short-term memory model to enhance sentiment prediction achieves better performance than other machine learning models.
Abstract: In this paper, we propose a sentiment analysis method incorporating the Continuous Bag-of-Words (CBOW) model and a Stacked Bidirectional long short-term memory (Stacked Bi-LSTM) model to enhance the performance of sentiment prediction. Firstly, a word embedding model, the CBOW model, is employed to capture the semantic features of words and transform them into word vectors. Secondly, we introduce the Stacked Bi-LSTM model to perform deep feature extraction over the sequential word vectors. Finally, a binary softmax classifier uses the semantic and contextual features to predict the sentiment orientation. Extensive experiments on a real dataset collected from Weibo (one of the most popular Chinese microblogs) show that the proposed approach achieves better performance than other machine learning models.

38 citations
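The core operation of CBOW, mentioned above, is to average the embeddings of the words surrounding a position and use that average to predict the centre word. A minimal sketch of just the context-averaging step (toy vectors and a hypothetical `context_vector` helper, not the trained model from the paper):

```python
def context_vector(sentence, center_idx, window, embeddings):
    """Average the embeddings of words within `window` of the centre position."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    count = 0
    for i, word in enumerate(sentence):
        if i != center_idx and abs(i - center_idx) <= window:
            total = [t + v for t, v in zip(total, embeddings[word])]
            count += 1
    return [t / count for t in total]

# Toy 2-dimensional embeddings (illustrative values only).
embeddings = {"i": [1.0, 0.0], "love": [0.0, 1.0], "this": [1.0, 1.0], "film": [0.0, 0.0]}
sentence = ["i", "love", "this", "film"]
# Context of "love" with window 1: the average of "i" and "this".
print(context_vector(sentence, 1, 1, embeddings))  # [1.0, 0.5]
```

During CBOW training this averaged context vector is fed to a softmax over the vocabulary, and the embeddings are adjusted so the true centre word becomes likely; the trained vectors are then reused downstream, here as input to the Stacked Bi-LSTM.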

Journal ArticleDOI
Yifan Nie1, Wenge Rong1, Yiyuan Zhang1, Yuanxin Ouyang1, Zhang Xiong1 
TL;DR: A word-embedding-assisted neural network prediction model is proposed for event trigger identification; the study could offer researchers insights into semantic-aware solutions for event trigger identification.
Abstract: Molecular events normally carry significant meaning, since they describe important biological interactions or alterations such as the binding of a protein. As a crucial step in biological event extraction, event trigger identification has attracted much attention, and many methods have been proposed. Traditionally these methods are categorised into rule-based and machine learning approaches; machine learning-based approaches have demonstrated their potential and outperformed rule-based approaches in many situations. However, machine learning-based approaches still face several challenges, a notable one being how to model the semantic and syntactic information of different words and incorporate it into the prediction model. There are many ways to model semantic and syntactic information, among which word embedding is an effective one. To address this challenge, this study proposes a word-embedding-assisted neural network prediction model for event trigger identification. An experimental study on a commonly used dataset has shown its potential. It is believed that this study could offer researchers insights into semantic-aware solutions for event trigger identification.

38 citations
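A common way to feed word embeddings into a trigger-identification network, as described above, is to build each token's input features by concatenating its own embedding with those of its neighbours. A minimal sketch of that feature construction (the `window_features` helper, toy vectors, and padding scheme are illustrative assumptions, not the paper's exact architecture):

```python
def window_features(tokens, idx, window, embeddings, pad):
    """Concatenate embeddings of the token and its neighbours into one feature vector."""
    feats = []
    for offset in range(-window, window + 1):
        j = idx + offset
        vec = embeddings.get(tokens[j], pad) if 0 <= j < len(tokens) else pad
        feats.extend(vec)
    return feats

pad = [0.0, 0.0]  # zero vector for positions outside the sentence
embeddings = {"protein": [0.2, 0.7], "binding": [0.9, 0.1], "of": [0.1, 0.1]}
tokens = ["binding", "of", "protein"]
print(window_features(tokens, 0, 1, embeddings, pad))
# [0.0, 0.0, 0.9, 0.1, 0.1, 0.1] -- left pad, "binding", "of"
```

The resulting fixed-length vector is then passed to the neural network, which classifies whether the centre token (here "binding") is an event trigger.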

Journal ArticleDOI
TL;DR: This work explores different machine learning techniques to process large-scale electronic medical records (EMR), reducing the effort needed to perform epidemiological studies of anaphylaxis, and employs a novel clustering-based undersampling technique to balance the dataset.

38 citations
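Clustering-based undersampling, as named in the TL;DR above, groups the majority-class samples into clusters and keeps only a few representatives per cluster, shrinking the class while preserving its diversity better than uniform random undersampling. A minimal sketch assuming the cluster labels have already been computed (e.g. by k-means; the `cluster_undersample` helper and the data are illustrative, not the paper's method):

```python
import random
from collections import defaultdict

def cluster_undersample(samples, cluster_labels, per_cluster, seed=0):
    """Keep up to `per_cluster` randomly chosen samples from each cluster."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for sample, label in zip(samples, cluster_labels):
        clusters[label].append(sample)
    kept = []
    for members in clusters.values():
        rng.shuffle(members)
        kept.extend(members[:per_cluster])
    return kept

majority = ["a", "b", "c", "d", "e", "f"]
labels   = [0, 0, 0, 1, 1, 2]          # hypothetical k-means assignments
reduced = cluster_undersample(majority, labels, per_cluster=1)
print(len(reduced))  # 3 -- one representative per cluster
```

The reduced majority class is then combined with the full minority class (here, the anaphylaxis cases) to give the classifier a balanced training set.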


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 87% related
Unsupervised learning: 22.7K papers, 1M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Reinforcement learning: 46K papers, 1M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788