Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published on this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Journal ArticleDOI
Xu Yuan, Mingyang Sun, Zhikui Chen, Jing Gao, Peng Li
TL;DR: A word embedding clustering-based deep hypergraph model (ECDHG) is proposed for the sentiment analysis of online reviews; experiments demonstrate that the model outperforms the compared methods in classification accuracy.
Abstract: Sentiment classification of online reviews is playing an increasingly important role for both consumers and businesses in cyber-physical-social systems. However, existing works ignore the semantic correlation among different reviews, which limits the effectiveness of sentiment classification. In this paper, a word embedding clustering-based deep hypergraph model (ECDHG) is proposed for the sentiment analysis of online reviews. The ECDHG introduces external knowledge by employing pre-trained word embeddings to represent reviews. Then, semantic units are detected under the supervision of semantic cliques discovered by an improved hierarchical fast clustering algorithm. Convolutional neural networks are connected to extract the high-order textual and semantic features of reviews. Finally, a hypergraph is constructed from the high-order relations among samples for the sentiment classification of reviews. Experiments are performed on data sets from five domains (movie, book, DVD, kitchen, and electronics) to assess the performance of the proposed model against seven other models. The results show that our model outperforms the compared methods in classification accuracy.

16 citations
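The semantic-clique discovery step in the paper above can be illustrated with a short sketch: hierarchical clustering over pre-trained word vectors groups semantically related words, which ECDHG then uses to supervise semantic-unit detection. The toy vectors and the cluster count below are illustrative assumptions, not the paper's improved hierarchical fast clustering algorithm.

```python
# A minimal sketch of semantic-clique discovery via hierarchical clustering
# of word embeddings. The toy 4-d vectors and the choice of 3 clusters are
# assumptions for illustration only.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

embeddings = {  # stand-in for pre-trained word embeddings
    "good":  np.array([0.9, 0.1, 0.0, 0.2]),
    "great": np.array([0.8, 0.2, 0.1, 0.1]),
    "bad":   np.array([-0.7, 0.1, 0.9, 0.0]),
    "awful": np.array([-0.8, 0.0, 0.8, 0.1]),
    "movie": np.array([0.1, 0.9, 0.1, 0.7]),
    "film":  np.array([0.0, 0.8, 0.2, 0.8]),
}
words = list(embeddings)
X = np.stack([embeddings[w] for w in words])

# Average-linkage hierarchical clustering with cosine distance.
Z = linkage(X, method="average", metric="cosine")
labels = fcluster(Z, t=3, criterion="maxclust")

cliques = {}
for word, label in zip(words, labels):
    cliques.setdefault(label, []).append(word)
print(cliques)  # expected grouping: {good, great}, {bad, awful}, {movie, film}
```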

Proceedings ArticleDOI
TL;DR: This article proposes a domain-specific semantic similarity measure created by the synergistic union of word2vec, a word embedding method used for semantic similarity calculation, and lexicon-based (lexical) semantic similarity methods.
Abstract: Semantic similarity measures are an important part of natural language processing tasks. However, semantic similarity measures built for general use do not perform well within specific domains. Therefore, in this study we introduce a domain-specific semantic similarity measure created by the synergistic union of word2vec, a word embedding method used for semantic similarity calculation, and lexicon-based (lexical) semantic similarity methods. We show that the proposed methodology outperforms word embedding methods trained on a generic corpus, as well as methods trained on a domain-specific corpus that do not use lexical semantic similarity methods to augment their results. Further, we show that text lemmatization can improve the performance of word embedding methods.

16 citations
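The hybrid measure described above can be sketched as a weighted blend of embedding cosine similarity and a lexicon-based score. The toy vectors, the toy lexicon, and the blending weight alpha are illustrative assumptions; the paper's actual domain corpus and lexical resources are not reproduced here.

```python
# A minimal sketch of a word2vec + lexicon hybrid similarity measure.
import numpy as np

vectors = {  # stand-in for word2vec vectors trained on a domain corpus
    "fracture": np.array([0.8, 0.1, 0.3]),
    "break":    np.array([0.7, 0.2, 0.4]),
    "tumor":    np.array([0.1, 0.9, 0.2]),
}
# Stand-in for a lexical resource listing known synonym pairs.
lexicon = {("fracture", "break"): 1.0}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def lexical_sim(w1, w2):
    return lexicon.get((w1, w2), lexicon.get((w2, w1), 0.0))

def hybrid_sim(w1, w2, alpha=0.5):
    # Weighted blend of embedding similarity and lexicon-based similarity.
    return alpha * cosine(vectors[w1], vectors[w2]) + (1 - alpha) * lexical_sim(w1, w2)

print(hybrid_sim("fracture", "break"))  # high: both signals agree
print(hybrid_sim("fracture", "tumor"))  # low: no lexicon entry, weak cosine
```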

Journal ArticleDOI
TL;DR: This article proposes a new model, called HASHET (HAshtag recommendation using Sentence-to-Hashtag Embedding Translation), aimed at suggesting a relevant set of hashtags for a given post.
Abstract: The growing use of microblogging platforms is generating a huge amount of posts that need effective methods to be classified and searched. On Twitter and other social media platforms, hashtags are exploited by users to facilitate the search, categorization, and spread of posts. Choosing the appropriate hashtags for a post is not always easy for users, so posts are often published without hashtags or with poorly chosen ones. To deal with this issue, we propose a new model, called HASHET (HAshtag recommendation using Sentence-to-Hashtag Embedding Translation), aimed at suggesting a relevant set of hashtags for a given post. HASHET is based on two independent latent spaces for embedding the text of a post and the hashtags it contains. A mapping process based on a multi-layer perceptron is then used to learn a translation from the semantic features of the text to the latent representation of its hashtags. We evaluated the effectiveness of two language representation models for sentence embedding and tested different search strategies for semantic expansion, finding that the combined use of BERT (Bidirectional Encoder Representations from Transformers) and a global expansion strategy leads to the best recommendation results. HASHET has been evaluated on two real-world case studies related to the 2016 United States presidential election and the COVID-19 pandemic. The results reveal the effectiveness of HASHET in predicting one or more correct hashtags, with an average F-score of up to 0.82 and a recommendation hit rate of up to 0.92. Our approach has been compared with the most relevant techniques in the literature (generative models, unsupervised models, and attention-based supervised models), achieving up to a 15% improvement in F-score for the hashtag recommendation task and 9% for the topic discovery task.

16 citations
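The translation step at the heart of HASHET can be sketched as follows: a multi-layer perceptron learns to map sentence embeddings into the hashtag latent space, and the hashtags nearest to the mapped point are recommended. The random embeddings, dimensions, and the sklearn MLP below are illustrative assumptions, not the paper's BERT-based setup.

```python
# A minimal sketch of a sentence-to-hashtag embedding translation.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_posts, sent_dim, tag_dim, n_tags = 200, 32, 16, 10

X_sent = rng.normal(size=(n_posts, sent_dim))       # sentence embeddings of posts
hashtag_vecs = rng.normal(size=(n_tags, tag_dim))   # latent vectors of the hashtags
y = hashtag_vecs[rng.integers(0, n_tags, n_posts)]  # target hashtag embedding per post

# Learn the sentence-space -> hashtag-space translation.
mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
mlp.fit(X_sent, y)

def recommend(sentence_vec, k=3):
    # Map the post into hashtag space, then rank hashtags by cosine similarity.
    mapped = mlp.predict(sentence_vec.reshape(1, -1))[0]
    sims = hashtag_vecs @ mapped / (
        np.linalg.norm(hashtag_vecs, axis=1) * np.linalg.norm(mapped))
    return np.argsort(-sims)[:k]  # indices of the k nearest hashtags

print(recommend(X_sent[0]))
```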

Journal ArticleDOI
TL;DR: This work provides a computational framework for measuring the novelty, feasibility, and diversity of design concepts, and shows that these metrics can be used to roughly filter a large number of design concepts before expert-based methods are applied.

16 citations
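The TL;DR above gives no formal definitions, but embedding-based versions of such metrics are easy to sketch: for instance, novelty as distance from existing concepts and diversity as mean pairwise distance within a concept set. The toy vectors and formulas below are assumptions for illustration, not the paper's actual framework.

```python
# A minimal sketch of embedding-based concept metrics (assumed definitions).
import numpy as np

existing = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]])  # embeddings of known concepts
candidates = np.array([[0.85, 0.15], [0.1, 0.9]])          # embeddings of new concepts

def novelty(concept):
    # Distance to the nearest existing concept: farther = more novel.
    return float(np.min(np.linalg.norm(existing - concept, axis=1)))

def diversity(concepts):
    # Mean pairwise distance within a set of concepts.
    dists = [np.linalg.norm(a - b)
             for i, a in enumerate(concepts) for b in concepts[i + 1:]]
    return float(np.mean(dists))

for c in candidates:
    print(novelty(c))  # low for the near-duplicate, high for the outlier
print(diversity(candidates))
```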

Book ChapterDOI
21 Oct 2017
TL;DR: This article presents a multilingual QALD pipeline that induces a model from training data to map a natural language question into logical form via probabilistic inference; the logical form is then mapped to a SPARQL query in a deterministic second step.
Abstract: The task of answering natural language questions over RDF data has received wide interest in recent years, in particular in the context of the series of QALD benchmarks. The task consists of mapping a natural language question to an executable form, e.g. SPARQL, so that answers from a given KB can be extracted. So far, most systems proposed are (i) monolingual and (ii) rely on a set of hard-coded rules to interpret questions and map them into a SPARQL query. We present the first multilingual QALD pipeline that induces a model from training data to map a natural language question into logical form via probabilistic inference. In particular, our approach learns to map universal syntactic dependency representations to a language-independent logical form based on DUDES (Dependency-based Underspecified Discourse Representation Structures), which is then mapped to a SPARQL query in a deterministic second step. Our model builds on factor graphs that rely on features extracted from the dependency graph and corresponding semantic representations. We rely on approximate inference techniques, Markov chain Monte Carlo methods in particular, as well as SampleRank to update parameters using a ranking objective. Our focus lies on developing methods that overcome the lexical gap, and we present a novel combination of machine translation and word embedding approaches for this purpose. As a proof of concept, we evaluate our approach on the QALD-6 datasets for English, German, and Spanish.

16 citations
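One way to picture the lexical-gap step mentioned above: match a question word to KB predicate labels by word-embedding similarity and slot the winning predicate into a SPARQL template. The toy vectors, DBpedia-style URIs, and template below are illustrative assumptions, not the paper's DUDES-based pipeline.

```python
# A minimal sketch of bridging the lexical gap with word embeddings.
import numpy as np

embeddings = {  # stand-in for (multilingual) word embeddings
    "wrote":  np.array([0.9, 0.2, 0.1]),
    "author": np.array([0.8, 0.3, 0.2]),
    "spouse": np.array([0.1, 0.9, 0.3]),
}
predicates = {"author": "dbo:author", "spouse": "dbo:spouse"}  # label -> KB predicate

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def best_predicate(question_word):
    # Rank predicate labels by embedding similarity to the question word.
    return max(predicates, key=lambda label: cosine(embeddings[question_word], embeddings[label]))

label = best_predicate("wrote")  # "wrote" matches no predicate label exactly
query = f"SELECT ?x WHERE {{ dbr:Moby-Dick {predicates[label]} ?x . }}"
print(query)  # SELECT ?x WHERE { dbr:Moby-Dick dbo:author ?x . }
```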


Network Information
Related Topics (5)
- Recurrent neural network: 29.2K papers, 890K citations (87% related)
- Unsupervised learning: 22.7K papers, 1M citations (86% related)
- Deep learning: 79.8K papers, 2.1M citations (85% related)
- Reinforcement learning: 46K papers, 1M citations (84% related)
- Graph (abstract data type): 69.9K papers, 1.2M citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year   Papers
2023   317
2022   716
2021   736
2020   1,025
2019   1,078
2018   788