Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Proceedings ArticleDOI
08 Mar 2021
TL;DR: In this paper, a heterogeneous graph convolutional network (HeteGCN) is proposed to learn efficient and inductive graph convolutional networks for text classification with a large number of examples and features.
Abstract: We consider the problem of learning efficient and inductive graph convolutional networks for text classification with a large number of examples and features. Existing state-of-the-art graph embedding based methods such as predictive text embedding (PTE) and TextGCN have shortcomings in terms of predictive performance, scalability, and inductive capability. To address these limitations, we propose a heterogeneous graph convolutional network (HeteGCN) modeling approach that unites the best aspects of PTE and TextGCN. The main idea is to learn feature embeddings and derive document embeddings using a HeteGCN architecture with different graphs used across layers. We simplify TextGCN by dissecting it into several HeteGCN models, which (a) helps to study the usefulness of individual models and (b) offers flexibility in fusing learned embeddings from different models. In effect, the number of model parameters is reduced significantly, enabling faster training and improving performance in small labeled training set scenarios. Our detailed experimental studies demonstrate the efficacy of the proposed approach.
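The layering described above (feature embeddings learned over one graph, document embeddings derived over another) can be sketched as a stack of graph convolutions that takes a different adjacency matrix at each layer. This is a minimal PyTorch reading of the abstract, not the authors' implementation; all class names, graph choices, and shapes here are assumptions.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, activate=True):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.activate = activate

    def forward(self, adj, h):
        # Propagation rule: aggregate over the given graph, then project.
        h = self.linear(adj @ h)
        return torch.relu(h) if self.activate else h

class HeteGCNSketch(nn.Module):
    # "Different graphs across layers": a feature-feature graph first,
    # then a document-feature graph to derive document embeddings.
    def __init__(self, n_features, hidden_dim, n_classes):
        super().__init__()
        self.feat_layer = GCNLayer(n_features, hidden_dim)
        self.doc_layer = GCNLayer(hidden_dim, n_classes, activate=False)

    def forward(self, feat_adj, doc_feat_adj, feat_inputs):
        # Hypothetical shapes: feat_adj (F x F), doc_feat_adj (D x F),
        # feat_inputs (F x n_features).
        feat_emb = self.feat_layer(feat_adj, feat_inputs)  # feature embeddings
        return self.doc_layer(doc_feat_adj, feat_emb)      # document logits
```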

29 citations

Book ChapterDOI
01 Sep 2018
TL;DR: This paper compared two word embedding models for aspect-based sentiment analysis (ABSA) of Arabic tweets and found that fastText Arabic Wikipedia word embeddings performed slightly better than AraVec-Web.
Abstract: Recently, the use of word embeddings has become one of the most significant advancements in natural language processing (NLP). In this paper, we compared two word embedding models for aspect-based sentiment analysis (ABSA) of Arabic tweets. The ABSA problem was formulated as a two-step process of aspect detection followed by sentiment polarity classification of the detected aspects. The compared embedding models include fastText Arabic Wikipedia and AraVec-Web, both available as pre-trained models. Our corpus consisted of 5K airline-service-related tweets in Arabic, manually labeled for ABSA with imbalanced aspect categories. For classification, we used a support vector machine classifier for both aspect detection and sentiment polarity classification. Our results indicated that fastText Arabic Wikipedia word embeddings performed slightly better than AraVec-Web.
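The described pipeline maps naturally to a small scikit-learn sketch: represent each tweet by the mean of its pretrained word vectors, then train one SVM per step. The averaging scheme, the embedding loader, and the variable names are assumptions; the abstract does not specify them.

```python
import numpy as np
from sklearn.svm import LinearSVC

def tweet_vector(tokens, vectors, dim=300):
    # Average the embeddings of in-vocabulary tokens; zero vector if none hit.
    hits = [vectors[t] for t in tokens if t in vectors]
    return np.mean(hits, axis=0) if hits else np.zeros(dim)

# vectors = ...  # pre-trained model as a {token: np.ndarray} dict,
#                # e.g. fastText Arabic Wikipedia or AraVec-Web vectors
# X = np.vstack([tweet_vector(t, vectors) for t in tokenized_tweets])
# aspect_clf = LinearSVC().fit(X, aspect_labels)      # step 1: aspect detection
# polarity_clf = LinearSVC().fit(X, polarity_labels)  # step 2: polarity of detected aspects
```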

28 citations

Proceedings ArticleDOI
06 Nov 2017
TL;DR: Wang et al. propose a joint topic-semantic-aware social matrix factorization (JTS-MF) model for voting recommendation, which calculates similarity among users and votings by combining their TEWE representations with the structural information of social networks.
Abstract: Online voting is an emerging feature in social networks, in which users can express their attitudes toward various issues and show their unique interests. Online voting imposes new challenges on recommendation, because the propagation of votings heavily depends on the structure of social networks as well as the content of votings. In this paper, we investigate how to utilize these two factors in a comprehensive manner when doing voting recommendation. First, because existing text mining methods such as topic models and semantic models cannot effectively process the content of votings, which is typically short and ambiguous, we propose a novel Topic-Enhanced Word Embedding (TEWE) method to learn word and document representations by jointly considering their topics and semantics. Then we propose our Joint Topic-Semantic-aware social Matrix Factorization (JTS-MF) model for voting recommendation. The JTS-MF model calculates similarity among users and votings by combining their TEWE representations and the structural information of social networks, and preserves this topic-semantic-social similarity during matrix factorization. To evaluate the performance of the TEWE representation and the JTS-MF model, we conduct extensive experiments on a real online voting dataset. The results prove the efficacy of our approach against several state-of-the-art baselines.
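The "preserve similarity during factorization" idea can be illustrated with a generic socially regularized matrix factorization, where a user-user similarity matrix S (which in JTS-MF would be built from TEWE representations plus social links) pulls each user's latent factors toward those of similar users. This is a sketch of the general mechanism under a simplified squared loss, not the paper's actual objective or update rules; all names and hyperparameters are assumptions.

```python
import numpy as np

def mf_with_similarity(R, S, k=32, lam=0.01, beta=0.1, lr=0.01, epochs=50):
    # R: user-voting interaction matrix (n_users x n_votings).
    # S: user-user similarity matrix, e.g. derived from TEWE + social links.
    n_users, n_items = R.shape
    rng = np.random.default_rng(0)
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    for _ in range(epochs):
        E = R - U @ V.T                    # reconstruction error
        # Pull each user's factors toward the similarity-weighted
        # average of their neighbors' factors.
        social_pull = S @ U - S.sum(axis=1, keepdims=True) * U
        U += lr * (E @ V - lam * U + beta * social_pull)
        V += lr * (E.T @ U - lam * V)
    return U, V
```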

28 citations

Posted Content
Oren Barkan
TL;DR: Experimental results on word analogy and similarity tasks across six datasets demonstrate that the proposed scalable Bayesian neural word embedding algorithm is competitive with the original Skip-Gram method.
Abstract: Recently, several works in the domain of natural language processing presented successful methods for word embedding. Among them, Skip-Gram with negative sampling, also known as word2vec, advanced the state of the art on various linguistic tasks. In this paper, we propose a scalable Bayesian neural word embedding algorithm. The algorithm relies on a Variational Bayes solution for the Skip-Gram objective, and a detailed step-by-step description is provided. We present experimental results that demonstrate the performance of the proposed algorithm on word analogy and similarity tasks across six different datasets and show that it is competitive with the original Skip-Gram method.
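The core representational shift in a Bayesian word embedding can be illustrated as follows: each word carries a Gaussian variational posterior (mean and diagonal variance) instead of a point vector, and the Skip-Gram negative-sampling likelihood is evaluated under that posterior. The sketch below only shows reparameterized sampling against that likelihood; the paper's actual Variational Bayes updates are not reproduced, and all names and constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 10_000, 100                       # vocabulary size, embedding dim
mu = rng.normal(scale=0.1, size=(V, D))  # variational means
log_var = np.full((V, D), -4.0)          # variational log-variances

def sample_embedding(w):
    # Reparameterized draw from word w's Gaussian posterior: mu + sigma * eps.
    eps = rng.normal(size=D)
    return mu[w] + np.exp(0.5 * log_var[w]) * eps

def sg_ns_logp(center, context, negatives):
    # Skip-Gram negative-sampling log-likelihood for one (center, context)
    # pair, evaluated on posterior samples; a VB solution optimizes a bound
    # on the expectation of terms like this.
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    u = sample_embedding(center)
    logp = np.log(sigmoid(u @ sample_embedding(context)))
    for n in negatives:
        logp += np.log(sigmoid(-u @ sample_embedding(n)))
    return logp

print(sg_ns_logp(0, 1, [2, 3, 4]))
```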

28 citations

Posted Content
TL;DR: This paper proposes a general, robust method for detecting personal health events in social media, which combines lexical, syntactic, word embedding-based, and context-based features.
Abstract: Millions of users share their experiences on social media sites, such as Twitter, which in turn generate valuable data for public health monitoring, digital epidemiology, and other analyses of population health at global scale. The first, critical task for these applications is classifying whether a personal health event was mentioned, which we call the personal health mention (PHM) problem. This task is challenging for many reasons, including the typically short length of social media posts, inventive spelling and lexicons, and figurative language, including hyperbole using diseases like "heart attack" or "cancer" for emphasis rather than as a health self-report. The problem is even more challenging for rarely reported, or frequent but ambiguously expressed, conditions, such as "stroke". To address this problem, we propose a general, robust method for detecting PHMs in social media, which we call WESPAD, that combines lexical, syntactic, word embedding-based, and context-based features. WESPAD is able to generalize from few examples by automatically distorting the word embedding space to most effectively detect the true health mentions. Unlike previously proposed state-of-the-art supervised and deep-learning techniques, WESPAD requires relatively little training data, which makes it possible to adapt, with minimal effort, to each new disease and condition. We evaluate WESPAD both on an established, publicly available flu detection benchmark and on a new dataset that we have constructed with mentions of multiple health conditions. The experiments show that WESPAD outperforms the baselines and state-of-the-art methods, especially when the number and proportion of true health mentions in the training data are small.
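The feature-combination part of the description can be sketched as concatenating per-post feature blocks into one vector for a linear classifier. The lexical features below are toy stand-ins, the syntactic and context blocks are stubbed, and WESPAD's embedding-space distortion step is not modeled; a generic logistic regression stands in for the paper's classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def featurize(tokens, text, vectors, dim=300):
    # Toy lexical block: post length and exclamation count stand in for the
    # richer lexical/syntactic features described in the paper.
    lexical = np.array([len(tokens), text.count("!")], dtype=float)
    # Embedding block: mean of pretrained word vectors (zeros for OOV tokens).
    vecs = [vectors.get(t, np.zeros(dim)) for t in tokens] or [np.zeros(dim)]
    embedding = np.mean(vecs, axis=0)
    # Syntactic and context blocks would be concatenated here as well (stubbed).
    return np.concatenate([lexical, embedding])

# vectors = ...  # pretrained embeddings as a {token: np.ndarray} dict
# X = np.vstack([featurize(p.split(), p, vectors) for p in posts])
# clf = LogisticRegression(max_iter=1000).fit(X, labels)  # 1 = true health mention
```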

28 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations (87% related)
Unsupervised learning: 22.7K papers, 1M citations (86% related)
Deep learning: 79.8K papers, 2.1M citations (85% related)
Reinforcement learning: 46K papers, 1M citations (84% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788