Topic

Word embedding

About: Word embedding is a research topic. Over its lifetime, 4,683 publications have been published within this topic, receiving 153,378 citations. The topic is also known as: word embeddings.


Papers
Proceedings ArticleDOI
06 Jul 2019
TL;DR: The main goal was to increase the efficiency and reduce the training loss of a sequence-to-sequence model in order to build a better abstractive text summarizer; training loss was successfully reduced to a value of 0.036.
Abstract: Text summarization has been one of the well-known problems in natural language processing and deep learning in recent years. Generally, a text summary is a short note on a large text document. Our main purpose is to create a short, fluent, and understandable abstractive summary of a text document. To build a good summarizer we used the Amazon Fine Food Reviews dataset, which is available on Kaggle. We used the review text descriptions as our input data and generated a simple summary of each review description as our output. To help produce a comprehensive summary, we used a bi-directional RNN with LSTMs in the encoding layer and an attention model in the decoding layer, and we applied the sequence-to-sequence model to generate a short summary of the food descriptions. There are several challenges when working with an abstractive text summarizer, such as text processing, vocabulary counting, missing-word counting, word embedding, model efficiency (reducing the loss value), and producing a fluent machine summary. In this paper, the main goal was to increase the efficiency and reduce the training loss of the sequence-to-sequence model so as to build a better abstractive text summarizer. In our experiment, we successfully reduced the training loss to a value of 0.036, and our abstractive text summarizer was able to create short English-to-English summaries.
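
The encoder-decoder shape described above is straightforward to sketch. Below is a minimal, illustrative PyTorch rendering (assumed library); the class name, layer sizes, and the specific attention form are assumptions for exposition, not the authors' code.

```python
# A minimal sketch of the architecture described above (PyTorch assumed; the
# class name, layer sizes, and attention form are illustrative, not the
# authors' code).
import torch
import torch.nn as nn

class Seq2SeqSummarizer(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bi-directional LSTM encoder, as in the paper's encoding layer.
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                               bidirectional=True)
        # Decoder runs over the target sequence (teacher forcing); its hidden
        # size matches the concatenated encoder directions.
        self.decoder = nn.LSTM(emb_dim, 2 * hid_dim, batch_first=True)
        self.attn = nn.Linear(2 * hid_dim, 2 * hid_dim)  # Luong-style "general" score
        self.out = nn.Linear(4 * hid_dim, vocab_size)

    def forward(self, src, tgt):
        enc_out, _ = self.encoder(self.embed(src))             # (B, S, 2H)
        dec_out, _ = self.decoder(self.embed(tgt))             # (B, T, 2H)
        scores = self.attn(dec_out) @ enc_out.transpose(1, 2)  # (B, T, S)
        context = torch.softmax(scores, -1) @ enc_out          # (B, T, 2H)
        return self.out(torch.cat([dec_out, context], -1))     # (B, T, vocab)
```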

27 citations

Proceedings Article
01 Jan 2017
TL;DR: Hash embeddings, as discussed by the authors, are an efficient method for representing words in a continuous vector form, where each token is represented by $k$ $d$-dimensional embedding vectors and one $k$-dimensional weight vector.
Abstract: We present hash embeddings, an efficient method for representing words in a continuous vector form. A hash embedding may be seen as an interpolation between a standard word embedding and a word embedding created using a random hash function (the hashing trick). In hash embeddings, each token is represented by $k$ $d$-dimensional embedding vectors and one $k$-dimensional weight vector. The final $d$-dimensional representation of the token is the product of the two. Rather than fitting an embedding vector for each token, these are selected by the hashing trick from a shared pool of $B$ embedding vectors. Our experiments show that hash embeddings can easily deal with huge vocabularies consisting of millions of tokens. When using a hash embedding there is no need to create a dictionary before training, nor to perform any kind of vocabulary pruning after training. We show that models trained using hash embeddings exhibit at least the same level of performance as models trained using regular embeddings across a wide range of tasks. Furthermore, the number of parameters needed by such an embedding is only a fraction of what is required by a regular embedding. Since standard embeddings and embeddings constructed using the hashing trick are actually just special cases of a hash embedding, hash embeddings can be considered an extension of and improvement over the existing regular embedding types.
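
Since the abstract fully specifies the construction, a compact sketch is possible. The following is a minimal PyTorch rendering of that description (assumed library, illustrative names and sizes). Note that the paper's construction hashes tokens directly so no vocabulary-sized table is needed; this sketch keeps per-token lookup tables purely for clarity.

```python
# A minimal sketch of hash embeddings as described above: each token id is
# mapped by k fixed hash functions into a shared pool of B component vectors,
# and a trainable per-token k-dimensional weight vector mixes the k components
# into one d-dimensional embedding. (PyTorch assumed; sizes illustrative.)
import torch
import torch.nn as nn

class HashEmbedding(nn.Module):
    def __init__(self, vocab_size, B=10_000, d=64, k=2):
        super().__init__()
        self.pool = nn.Embedding(B, d)               # shared pool of B vectors
        self.weights = nn.Embedding(vocab_size, k)   # per-token importance weights
        # Fixed random "hash functions": one bucket index per (token, hash fn).
        self.register_buffer("buckets", torch.randint(0, B, (vocab_size, k)))

    def forward(self, token_ids):                        # (...,) long tensor
        comps = self.pool(self.buckets[token_ids])       # (..., k, d)
        w = self.weights(token_ids).unsqueeze(-1)        # (..., k, 1)
        return (w * comps).sum(dim=-2)                   # (..., d)
```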

27 citations

Book ChapterDOI
22 Sep 2019
TL;DR: This study proposes a four-phase framework for Twitter Sentiment Analysis based on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as an encoder for generating sentence representations to enhance sentiment classification.
Abstract: Sentiment analysis has long been a topic of discussion in language-understanding research, yet the neural networks deployed for it remain deficient to some extent. Currently, the majority of studies identify sentiment by focusing on vocabulary and syntax. The task is well recognised in Natural Language Processing (NLP), and Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) have been employed to achieve noteworthy results. In this study, we propose a four-phase framework for Twitter Sentiment Analysis. This setup is based on the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model as an encoder for generating sentence representations. For more effective utilisation of this model, we deploy various classification models. Additionally, we concatenate pre-trained word-embedding representations with the BERT representations to enhance sentiment classification. Experimental results show improved performance over the baseline framework on all datasets; for example, our best model attains an F1-score of 71.82% on the SemEval 2017 dataset. A comparative analysis of the experimental results offers some recommendations on choosing pre-training steps to obtain improved results. The outcomes of the experiment confirm the effectiveness of our system.
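
The concatenation step is easy to illustrate. Below is a hedged sketch using the Hugging Face transformers library (assumed): a mean-pooled BERT sentence encoding joined with mean-pooled static word vectors. The model name, the pooling choice, and the hypothetical `static_lookup` dict of pre-trained vectors (e.g. GloVe) are illustrative choices, not the paper's exact setup.

```python
# A minimal sketch of concatenating static word embeddings with a BERT
# sentence encoding (transformers assumed; model name, mean pooling, and the
# hypothetical `static_lookup` word->vector dict are illustrative).
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def sentence_features(text, static_lookup, static_dim=300):
    """Mean-pooled BERT encoding concatenated with mean-pooled static vectors."""
    enc = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        bert_vec = bert(**enc).last_hidden_state.mean(dim=1).squeeze(0)  # (768,)
    vecs = [static_lookup[w] for w in text.lower().split() if w in static_lookup]
    static_vec = torch.stack(vecs).mean(dim=0) if vecs else torch.zeros(static_dim)
    return torch.cat([bert_vec, static_vec])  # input to a downstream classifier
```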

27 citations

Proceedings ArticleDOI
01 Dec 2018
TL;DR: The study showed that the word2vec features performed better than the BOW-1-gram features; however, when 2-grams were added to BOW, the comparison results were mixed.
Abstract: Word embeddings motivated by deep learning have shown promising results over traditional bag-of-words features for natural language processing. When trained on large text corpora, word embedding methods such as word2vec and doc2vec have the advantage of learning from unlabeled data and of reducing the dimension of the feature space. In this study, we experimented with word2vec and doc2vec features on a set of clinical text classification tasks and compared the results with those obtained using traditional bag-of-words (BOW) features. The study showed that the word2vec features performed better than the BOW-1-gram features. However, when 2-grams were added to BOW, the comparison results were mixed.
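
To make the comparison concrete, here is a small sketch of both feature pipelines using gensim and scikit-learn (assumed libraries; the two-document corpus and labels are placeholders, not the study's clinical data): a BOW count matrix versus averaged word2vec vectors, each feeding the same classifier.

```python
# A minimal sketch contrasting BOW and averaged-word2vec features
# (gensim and scikit-learn assumed; corpus and labels are placeholders).
import numpy as np
from gensim.models import Word2Vec
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["patient denies chest pain", "chest pain radiating to left arm"]
labels = [0, 1]

# Bag-of-words features (1-grams; pass ngram_range=(1, 2) to add 2-grams).
bow = CountVectorizer().fit_transform(docs)

# word2vec features: average the word vectors of each document.
tokenized = [d.split() for d in docs]
w2v = Word2Vec(tokenized, vector_size=50, min_count=1, seed=0)
avg = np.array([np.mean([w2v.wv[w] for w in toks], axis=0) for toks in tokenized])

# Either feature matrix can feed the same classifier for a fair comparison.
clf_bow = LogisticRegression().fit(bow, labels)
clf_w2v = LogisticRegression().fit(avg, labels)
```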

27 citations

Posted Content
TL;DR: This work presents a novel deep learning-based approach for automated personality detection from text which outperforms the previous state of the art by 1.04% and is significantly more computationally efficient to train.
Abstract: Recently, the automatic prediction of personality traits has received increasing attention and has emerged as a hot topic within the field of affective computing. In this work, we present a novel deep learning-based approach for automated personality detection from text. We leverage state-of-the-art advances in natural language understanding, namely the BERT language model, to extract contextualized word embeddings from textual data for automated author personality detection. Our primary goal is to develop a computationally efficient, high-performance personality prediction model which can be easily used by a large number of people without access to huge computation resources. Our extensive experiments with this goal in mind led us to develop a novel model which feeds contextualized embeddings along with psycholinguistic features to a Bagged-SVM classifier for personality trait prediction. Our model outperforms the previous state of the art by 1.04% and, at the same time, is significantly more computationally efficient to train. We report our results on the well-known gold-standard Essays dataset for personality detection.
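
The final classifier stage is easy to sketch. Below is an illustrative scikit-learn version (assumed library) of a Bagged-SVM over concatenated features; the random matrices stand in for the BERT embeddings and psycholinguistic features (e.g. LIWC-style counts), and all dimensions are assumptions rather than the paper's configuration.

```python
# A minimal sketch of the classifier stage described above (scikit-learn
# assumed; feature extraction is stubbed out with placeholder random data).
import numpy as np
from sklearn.svm import SVC
from sklearn.ensemble import BaggingClassifier

# Suppose each document yields a 768-dim contextual embedding plus an n-dim
# vector of psycholinguistic features; concatenate them per document.
bert_feats = np.random.rand(40, 768)    # placeholder contextual embeddings
psych_feats = np.random.rand(40, 10)    # placeholder psycholinguistic features
X = np.hstack([bert_feats, psych_feats])
y = np.random.randint(0, 2, size=40)    # placeholder binary trait labels

# Bagged SVM: an ensemble of SVMs trained on bootstrap samples of the data.
clf = BaggingClassifier(SVC(kernel="rbf"), n_estimators=10,
                        random_state=0).fit(X, y)
print(clf.predict(X[:5]))
```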

27 citations


Network Information
Related Topics (5)
Recurrent neural network: 29.2K papers, 890K citations (87% related)
Unsupervised learning: 22.7K papers, 1M citations (86% related)
Deep learning: 79.8K papers, 2.1M citations (85% related)
Reinforcement learning: 46K papers, 1M citations (84% related)
Graph (abstract data type): 69.9K papers, 1.2M citations (84% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    317
2022    716
2021    736
2020    1,025
2019    1,078
2018    788