Proceedings Article

Topic Modelling Twitter Data with Latent Dirichlet Allocation Method

TLDR
The LDA method was used to produce topic models, topic similarities, and a visualization of topic clusters for Indonesian-language tweet data covering four topics (Economic, Military, Sports, Technology).
Abstract
Twitter is a popular social media platform where users express thoughts and emotions as tweets, short text messages limited to 140 characters. It is also a source of up-to-date information, and tweets can be treated as big data because they can serve as a data source for research. Latent Dirichlet Allocation (LDA) is an algorithm that can process large text collections. This study uses LDA to produce topic models, measure the similarity between topics, and visualize topic clusters for Indonesian-language tweet data covering four topics (Economic, Military, Sports, Technology), where each topic has a different number of tweets. LDA was applied successfully at each stage of processing the tweet data: topic extraction, topic modeling, generation of the index words in each topic cluster, and visualization of the topics. The LDA output performs best at word indexing on the Sports topic (1,260 tweets), with an accuracy of 98%, which is better than the LSI method in topic modeling.
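
As a rough illustration of the topic-modeling step described in the abstract, the following is a minimal sketch of LDA fitted with collapsed Gibbs sampling in plain Python. It is not the authors' implementation, and the abstract does not say which inference method or library they used. The toy English documents, the function names fit_lda and top_words, and the hyperparameter values (alpha, beta, iterations, seed) are all assumptions chosen for illustration.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling.

    Returns the vocabulary, per-document topic counts, and per-topic word counts.
    alpha and beta are symmetric Dirichlet hyperparameters (illustrative values).
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                              # tokens assigned to topic k overall
    assignments = []                                            # current topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                w = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return vocab, doc_topic, topic_word


def top_words(vocab, topic_word, n=3):
    """Return the n most frequently assigned words of each topic."""
    result = []
    for counts in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda i: counts[i], reverse=True)
        result.append([vocab[i] for i in ranked[:n]])
    return result


# Hypothetical, already-tokenized toy "tweets" (not the paper's data).
docs = [
    ["match", "goal", "team", "goal"],
    ["team", "coach", "match"],
    ["phone", "chip", "software"],
    ["software", "phone", "update", "chip"],
]
vocab, doc_topic, topic_word = fit_lda(docs, num_topics=2)
for topic_index, words in enumerate(top_words(vocab, topic_word)):
    print(topic_index, words)
```

On real tweets, the input would first need tokenization and stop-word removal for Indonesian. The number of topics is fixed in advance, as with the four topics reported in the abstract.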

Citations
Journal Article

Twitter-based analysis reveals differential COVID-19 concerns across areas with socioeconomic disparities.

TL;DR: In this article, the authors mined coronavirus-related tweets from January 23rd to March 25th, 2020 and applied topic modeling to identify and monitor topics of concern over time.
Journal Article

Topic based Sentiment Analysis for COVID-19 Tweets

TL;DR: The research findings revealed conflicting topics across the two Coronavirus pandemic periods, and the expectations and interests of individuals regarding the various topics were well represented.
Journal Article

Citizen Science on Twitter: Using Data Analytics to Understand Conversations and Networks

TL;DR: A long-term study of how the public engages with discussions around citizen science and crowdsourcing topics on Twitter, particularly outside the scope of individual projects, with recommendations for stakeholders on engaging with citizen science topics.
Journal Article

Thematic analysis of sustainable ultra-precision machining by using text mining and unsupervised learning method

TL;DR: In this article, the main themes of sustainable ultra-precision machining (UPM) were identified by utilizing the latent Dirichlet allocation (LDA) method to analyze the abstracts of the relevant publications.
Journal Article

Sustainability disclosure for container shipping: A text-mining approach

TL;DR: In this article, a hierarchical unsupervised text-mining method was used to explore the latent information of major listed container shipping companies' sustainability reports and a unified framework was produced comprising three primary dimensions: employee training and management, sustainable business management, and sustainable shipping operation.
References
Journal Article

Latent Dirichlet Allocation

TL;DR: This work proposes a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model.
Proceedings Article

Latent Dirichlet Allocation

TL;DR: This paper proposed a generative model for text and other collections of discrete data that generalizes or improves on several previous models including naive Bayes/unigram, mixture of unigrams, and Hofmann's aspect model, also known as probabilistic latent semantic indexing (pLSI).
Journal Article

Indexing by Latent Semantic Analysis

TL;DR: A new method for automatic indexing and retrieval to take advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Journal Article

Probabilistic topic models

TL;DR: Surveying a suite of algorithms that offer a solution to managing large document archives suggests they are well-suited to handle large amounts of data.
Journal Article

An introduction to latent semantic analysis

TL;DR: The adequacy of LSA's reflection of human knowledge has been established in a variety of ways, for example, its scores overlap those of humans on standard vocabulary and subject matter tests; it mimics human word sorting and category judgments; it simulates word‐word and passage‐word lexical priming data.