Topic
Word embedding
About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.
Papers
••
01 Nov 2018
TL;DR: This study demonstrates that existing bilingual embedding techniques are not ideal for code-mixed text processing and that there is a need to learn multilingual word embeddings from code-mixed text.
Abstract: We compare three existing bilingual word embedding approaches, and a novel approach of training skip-grams on synthetic code-mixed text generated through linguistic models of code-mixing, on two tasks - sentiment analysis and POS tagging for code-mixed text. Our results show that while CVM and CCA based embeddings perform as well as the proposed embedding technique on semantic and syntactic tasks respectively, the proposed approach provides the best performance for both tasks overall. Thus, this study demonstrates that existing bilingual embedding techniques are not ideal for code-mixed text processing and there is a need for learning multilingual word embedding from the code-mixed text.
55 citations
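The skip-gram training described in the abstract above starts from (target, context) pairs extracted over a sliding window. A minimal sketch of that pair extraction, using toy code-mixed tokens and a window size that are purely illustrative (not the paper's actual corpus or hyperparameters):

```python
def skipgram_pairs(sentence, window=2):
    """Generate (target, context) training pairs as in skip-gram:
    each word is paired with every neighbor within the window."""
    pairs = []
    for i, target in enumerate(sentence):
        lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, sentence[j]))
    return pairs

# A toy synthetic code-mixed (Hindi-English) sentence; tokens are illustrative.
sentence = ["movie", "bahut", "achhi", "thi"]
pairs = skipgram_pairs(sentence, window=1)
print(pairs)
```

In a full pipeline these pairs would feed a skip-gram model (e.g., word2vec with negative sampling); generating them from synthetic code-mixed sentences is what lets a single embedding space cover both languages.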
••
TL;DR: Comparison with the previous state-of-the-art segment convolutional neural network (Seg-CNN) suggests that adding syntactic dependency information helps refine medical word embedding and improves concept relation classification without manual feature engineering.
55 citations
••
01 May 2020
TL;DR: The proposed model, which blends convolutional and recurrent neural network architectures, achieves benchmark results in fake news prediction, with word embeddings complementing the model throughout.
Abstract: Fake news can affect many different entities, from a citizen’s lifestyle to a country’s global relations. Although there is much related work on collecting and detecting fake news, no reliable system is commercially available. This study proposes a deep learning model that predicts the nature of an article given as input. It relies solely on text processing and is insensitive to the history and credibility of the author or the source. The authors experiment with word embeddings (GloVe) for text pre-processing in order to construct a vector space of words and establish lingual relationships. The proposed model, which blends convolutional and recurrent neural network architectures, achieves benchmark results in fake news prediction, with word embeddings complementing the model throughout. To ensure prediction quality, various model parameters were tuned and recorded for the best possible results. Among other variations, adding a dropout layer reduces overfitting in the model, generating significantly higher accuracy. For this problem, the model achieves a precision of 97.21% while considering more input features, making it a better solution than existing alternatives such as gated recurrent units, recurrent neural networks, and feed-forward networks.
55 citations
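Pipelines like the one above typically begin by loading pre-trained GloVe vectors into a word-to-vector lookup before building the embedding layer. A hedged sketch of that loading step, using GloVe's plain-text format (one word per line followed by its vector components); the in-memory file and 4-dimensional vectors stand in for a real download such as glove.6B:

```python
import io

def load_glove(handle):
    """Parse GloVe's plain-text format into a {word: vector} dict."""
    embeddings = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings

# Two toy 4-dimensional vectors standing in for a real GloVe file.
fake_glove = io.StringIO("news 0.1 -0.2 0.3 0.4\nfake -0.5 0.6 -0.7 0.8\n")
emb = load_glove(fake_glove)
print(len(emb["news"]))  # 4
```

The resulting dict would then be arranged into an embedding matrix indexed by the tokenizer's vocabulary and used to initialize the model's embedding layer.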
••
03 Apr 2017
TL;DR: An innovative word embedding-based system that computes semantic similarity between Arabic sentences by exploiting vectors as word representations in a multidimensional space, capturing the semantic and syntactic properties of words.
Abstract: Semantic textual similarity underlies countless applications and plays an important role in diverse areas such as information retrieval, plagiarism detection, information extraction, and machine translation. This article proposes an innovative word embedding-based system for computing semantic similarity between Arabic sentences. The main idea is to exploit vectors as word representations in a multidimensional space in order to capture the semantic and syntactic properties of words. IDF weighting and part-of-speech tagging are applied to the examined sentences to help identify the words that are most descriptive in each sentence. The performance of the proposed system is confirmed through the Pearson correlation between its semantic similarity scores and human judgments.
55 citations
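The core idea above, IDF-weighted word vectors compared by cosine similarity, can be sketched as follows. The toy 2-dimensional embeddings and three-sentence corpus are illustrative assumptions, not the paper's Arabic data or actual weighting scheme:

```python
import math
from collections import Counter

def idf_weights(corpus):
    """IDF over a corpus of tokenized sentences: rare words get higher weight."""
    n = len(corpus)
    df = Counter(w for sent in corpus for w in set(sent))
    return {w: math.log(n / df[w]) for w in df}

def sentence_vector(sent, emb, idf):
    """IDF-weighted average of the word vectors in a sentence."""
    dim = len(next(iter(emb.values())))
    vec, total = [0.0] * dim, 0.0
    for w in sent:
        if w in emb:
            weight = idf.get(w, 1.0)
            total += weight
            for k in range(dim):
                vec[k] += weight * emb[w][k]
    return [v / total for v in vec] if total else vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Toy embeddings: "cat"/"kitten" point one way, "car"/"drives" another.
emb = {"cat": [1.0, 0.0], "kitten": [0.9, 0.1], "car": [0.0, 1.0],
       "drives": [0.1, 0.9], "sleeps": [0.8, 0.2]}
corpus = [["cat", "sleeps"], ["kitten", "sleeps"], ["car", "drives"]]
idf = idf_weights(corpus)
s1 = sentence_vector(["cat", "sleeps"], emb, idf)
s2 = sentence_vector(["kitten", "sleeps"], emb, idf)
s3 = sentence_vector(["car", "drives"], emb, idf)
print(cosine(s1, s2) > cosine(s1, s3))  # semantically close pair scores higher
```

The IDF weights play the role the abstract describes: down-weighting words that appear in many sentences so that the highly descriptive words dominate the sentence vector.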
••
19 Apr 2015
TL;DR: Experimental results show that word embedding can significantly improve the performance of BLSTM-RNN based TTS synthesis without using TOBI or part-of-speech (POS) features.
Abstract: Current state-of-the-art TTS synthesis can produce synthesized speech of high quality when rich segmental and suprasegmental information is given. However, some suprasegmental features, e.g., Tone and Break (TOBI), are time-consuming to obtain because they are labeled manually, with high inconsistency among annotators. In this paper, we investigate the use of word embedding, which represents each word as a low-dimensional continuous-valued vector assumed to carry certain syntactic and semantic information, for bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) based TTS synthesis. Experimental results show that word embedding can significantly improve the performance of BLSTM-RNN based TTS synthesis without using TOBI or part-of-speech (POS) features.
54 citations