Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Journal ArticleDOI
TL;DR: A comparative analysis of machine learning and deep learning models for identifying suicidal thoughts on the social media platform Twitter reveals that the RF model achieves the highest classification score among the machine learning algorithms, while deep learning classifiers trained with word embeddings perform better still, led by the BiLSTM model.
Abstract: Social networks are essential resources for learning about people's opinions and feelings on various issues, as users share their views with friends and family. Suicidal ideation detection via online social network analysis has emerged as an important and difficult research topic in the fields of NLP and psychology in recent years. With proper exploitation of the information in social media, the complicated early symptoms of suicidal ideation can be discovered, and many lives can thus be saved. This study offers a comparative analysis of multiple machine learning and deep learning models for identifying suicidal thoughts on the social media platform Twitter. The principal purpose of our research is to achieve better model performance than prior work in recognizing early indications with high accuracy and averting suicide attempts. To this end, we applied text pre-processing and feature extraction approaches such as CountVectorizer and word embedding, and trained several machine learning and deep learning models. Experiments were conducted on a dataset of 49,178 instances retrieved from live tweets matching 18 suicidal and non-suicidal keywords via the Python Tweepy API. Our experimental findings reveal that the RF model achieves the highest classification score among the machine learning algorithms, with an accuracy of 93% and an F1 score of 0.92. However, the deep learning classifiers trained with word embeddings outperform the machine learning models, with the BiLSTM model reaching an accuracy of 93.6% and an F1 score of 0.93.
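
As an illustration of the classical ML side of such a pipeline, here is a minimal sketch pairing CountVectorizer features with a random forest in scikit-learn; the file name and column names (tweets.csv, text, label) are hypothetical placeholders, not the authors' data or exact settings.

```python
# Minimal sketch: bag-of-words features + random forest, as in the
# paper's classical ML baseline. File and column names are hypothetical.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("tweets.csv")  # hypothetical file: tweet text + binary label
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42
)

# Token-count features (the CountVectorizer step described above)
vectorizer = CountVectorizer(max_features=20000, stop_words="english")
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

clf = RandomForestClassifier(n_estimators=200, random_state=42)
clf.fit(X_train_vec, y_train)
print(classification_report(y_test, clf.predict(X_test_vec)))
```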

19 citations

Journal ArticleDOI
Meng Wang
03 Dec 2020 - PLOS ONE
TL;DR: The results enrich and develop the theory of the tourism service supply chain, providing a reference for constructing a personalized tourism service system.
Abstract: Recently, more personalized travel modes, such as individual travel and self-guided travel, have emerged in the tourism industry. The service models of traditional tourism limit the diversity of service options and can no longer fully meet the individual needs of tourists. The aim of this work is to integrate the sparse tourism information on the Internet, thereby providing more convenient, faster, and more personalized tourism services. To address the shortcomings of traditional tourism recommendation systems, a deep learning-based method for classifying tourism product information is proposed. The method uses word embedding in the data pre-processing stage. A Convolutional Neural Network (CNN) processes the review information of users and tourism service items, while a Deep Neural Network (DNN) processes their essential attributes. Factorization machine technology is then used to learn the interactions between the extracted features and improve the prediction model. The results show that the proposed model maintains a precision of 64.2% when generating personalized recommendation lists for users, and the sensitivity and accuracy of the recommendation lists are better than those of other algorithms. Adding the DNN, the word embedding method, and the factorization machine model improves precision by 30%, 33.3%, and 40%, respectively. Model accuracy is highest with 40 hidden factors, 100 convolution kernels, and a 100+50 hidden-layer combination. Compared with traditional methods, the proposed algorithm provides users with personalized travel products more accurately in personalized travel recommendation. The results enrich and develop the theory of the tourism service supply chain, providing a reference for constructing a personalized tourism service system.
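
The factorization machine component can be made concrete with its standard second-order interaction identity. Below is a minimal NumPy sketch under assumed dimensions; it stands in for the paper's learned interaction layer rather than reproducing it, and all sizes and values are made up.

```python
# Minimal sketch of the second-order factorization machine (FM) term used
# to model interactions between extracted features. NumPy only; the
# feature vector here is a random stand-in for the CNN/DNN outputs.
import numpy as np

rng = np.random.default_rng(0)
n_features, k = 64, 8                  # k latent factors per feature (assumed)
x = rng.random(n_features)             # stand-in for concatenated CNN/DNN features

w0 = 0.0                               # global bias
w = rng.normal(size=n_features)        # first-order (linear) weights
V = rng.normal(size=(n_features, k))   # latent factor matrix

# FM identity: sum_{i<j} <v_i, v_j> x_i x_j
#            = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]
linear = w0 + w @ x
interactions = 0.5 * np.sum((V.T @ x) ** 2 - (V.T ** 2) @ (x ** 2))
score = linear + interactions          # predicted user-item affinity
print(score)
```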

19 citations

Proceedings ArticleDOI
01 Nov 2020
TL;DR: Two novel numeral embedding methods that handle the out-of-vocabulary (OOV) problem for numerals are proposed and shown to be effective on four intrinsic and extrinsic tasks: word similarity, embedding numeracy, numeral prediction, and sequence labeling.
Abstract: Word embedding is an essential building block of deep learning methods for natural language processing. Although word embedding has been extensively studied over the years, the problem of how to effectively embed numerals, a special subset of words, is still under-explored. Existing word embedding methods do not learn numeral embeddings well because there is an infinite number of numerals and their individual appearances in training corpora are highly scarce. In this paper, we propose two novel numeral embedding methods that handle the out-of-vocabulary (OOV) problem for numerals. We first induce a finite set of prototype numerals using either a self-organizing map or a Gaussian mixture model. We then represent the embedding of a numeral as a weighted average of the prototype numeral embeddings. Numeral embeddings represented in this manner can be plugged into existing word embedding learning approaches such as skip-gram for training. We evaluated our methods on four intrinsic and extrinsic tasks: word similarity, embedding numeracy, numeral prediction, and sequence labeling, and showed their effectiveness.
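
A minimal sketch of the prototype-based idea, assuming a Gaussian mixture over log-scaled numeral values (the paper also tries a self-organizing map) and randomly initialized prototype embeddings; in the actual method these are trained jointly with skip-gram, and all component counts and dimensions below are made up.

```python
# Minimal sketch: embed any numeral (including OOV ones) as a weighted
# average of a finite set of prototype embeddings, with weights given by
# soft assignment under a Gaussian mixture model.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
train_numerals = rng.lognormal(mean=2.0, sigma=1.5, size=5000)  # toy corpus numerals

# 1. Induce prototype numerals via a GMM over signed-log-scaled values.
signed_log = np.sign(train_numerals) * np.log1p(np.abs(train_numerals))
gmm = GaussianMixture(n_components=16, random_state=0).fit(signed_log.reshape(-1, 1))

dim = 100
# Random stand-ins; trained jointly with skip-gram in the paper.
prototype_embeddings = rng.normal(size=(16, dim))

def embed_numeral(value: float) -> np.ndarray:
    """Posterior-weighted average of prototype embeddings."""
    z = np.array([[np.sign(value) * np.log1p(abs(value))]])
    weights = gmm.predict_proba(z)[0]      # soft assignment to prototypes
    return weights @ prototype_embeddings

print(embed_numeral(1234.5).shape)         # (100,)
```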

19 citations

Book ChapterDOI
24 Sep 2018
TL;DR: Results suggest that word embedding models slightly outperform the alternatives under consideration, with the advantage of not requiring any language-specific lexical resources.
Abstract: This work presents a study in the Natural Language Processing field aiming to recognise personality traits in Portuguese written text. To this end, we first built a corpus of Facebook status updates labelled with the personality traits of their authors, from which we trained a number of computational models of personality recognition. The models range from a standard approach relying on lexical knowledge from the LIWC dictionary, among other resources, to purely text-based methods such as bag-of-words and word embeddings. Results suggest that the word embedding models slightly outperform the alternatives under consideration, with the advantage of not requiring any language-specific lexical resources.
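
To make the embedding-based setup concrete, here is a minimal sketch that averages pre-trained word vectors per status update and feeds them to a linear classifier; the vector file, toy texts, and labels are hypothetical placeholders, and the authors' actual features and classifier may differ.

```python
# Minimal sketch: mean of pre-trained word vectors as a document feature,
# then a linear classifier over those features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def load_vectors(path):
    """Read word2vec-style text vectors: 'word v1 v2 ... vd' per line."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

vectors = load_vectors("pt_vectors.txt")   # hypothetical Portuguese embeddings
dim = len(next(iter(vectors.values())))

def doc_vector(text):
    words = [vectors[w] for w in text.lower().split() if w in vectors]
    return np.mean(words, axis=0) if words else np.zeros(dim, dtype=np.float32)

texts = ["adoro meu trabalho", "hoje foi um dia dificil"]  # toy status updates
y = [1, 0]                                 # e.g. high/low score on one trait
X = np.stack([doc_vector(t) for t in texts])
clf = LogisticRegression().fit(X, y)
```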

19 citations

Posted Content
TL;DR: Wang et al. investigate a convolutional attention network called CAN for Chinese NER, which consists of a character-based CNN with a local-attention layer and a gated recurrent unit (GRU) with a global self-attention layer to capture information from adjacent characters and sentence contexts.
Abstract: Named entity recognition (NER) in Chinese is essential but difficult because of the lack of natural delimiters. Chinese Word Segmentation (CWS) is therefore usually considered the first step for Chinese NER. However, models based on word-level embeddings and lexicon features often suffer from segmentation errors and out-of-vocabulary (OOV) words. In this paper, we investigate a Convolutional Attention Network, called CAN, for Chinese NER, which consists of a character-based convolutional neural network (CNN) with a local-attention layer and a gated recurrent unit (GRU) with a global self-attention layer to capture information from adjacent characters and sentence contexts. Moreover, unlike other models, ours depends on no external resources such as lexicons and employs small character embeddings, which makes it more practical. Extensive experimental results show that our approach outperforms state-of-the-art methods without word embeddings or external lexicon resources on datasets from different domains, including the Weibo, MSRA, and Chinese Resume NER datasets.
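
A drastically simplified PyTorch sketch of this architecture's shape: character embeddings feed a CNN over adjacent characters, then a bidirectional GRU, then a global self-attention layer, then per-character tag scores. The local-attention component is omitted and every size below is an assumption, so this illustrates the idea rather than reproducing the authors' model.

```python
# Simplified CAN-style tagger: char embeddings -> CNN (local context) ->
# BiGRU (sentence context) -> global self-attention -> per-char tag scores.
import torch
import torch.nn as nn

class TinyCAN(nn.Module):
    def __init__(self, n_chars=4000, emb=50, hidden=100, n_tags=9):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb)            # small char embeddings
        self.cnn = nn.Conv1d(emb, hidden, kernel_size=3, padding=1)
        self.gru = nn.GRU(hidden, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_tags)

    def forward(self, chars):                            # chars: (batch, seq)
        x = self.emb(chars)                              # (B, T, E)
        x = torch.relu(self.cnn(x.transpose(1, 2))).transpose(1, 2)  # adjacent chars
        h, _ = self.gru(x)                               # sentence context
        h, _ = self.attn(h, h, h)                        # global self-attention
        return self.out(h)                               # (B, T, n_tags)

scores = TinyCAN()(torch.randint(0, 4000, (2, 20)))
print(scores.shape)  # torch.Size([2, 20, 9])
```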

19 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations, 87% related
Unsupervised learning: 22.7K papers, 1M citations, 86% related
Deep learning: 79.8K papers, 2.1M citations, 85% related
Reinforcement learning: 46K papers, 1M citations, 84% related
Graph (abstract data type): 69.9K papers, 1.2M citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788