Open Access Journal Article

SocialTERM-Extractor: Identifying and Predicting Social-Problem-Specific Key Noun Terms from a Large Number of Online News Articles Using Text Mining and Machine Learning Techniques

Jong Hwan Suh
02 Jan 2019, Vol. 11, Iss. 1, p. 196
Abstract
In the digital age, the abundant unstructured data on the Internet, particularly online news articles, provide opportunities to identify social problems and to understand social systems for sustainability. However, previous work has not paid attention to the social-problem-specific perspective of such big data, and it remains unclear how information technologies can use these data to identify and manage ongoing social problems. In this context, this paper introduces social-problem-specific key noun terms, called SocialTERMs, which can be used both to search the Internet for social-problem-related data and to monitor ongoing and future events related to social problems. To reduce the time-consuming human effort needed to identify SocialTERMs, the paper designs and examines the SocialTERM-Extractor, an automatic approach that identifies the key noun terms of social-problem-related topics (SPRTs) in a large number of online news articles and then predicts which of the identified key noun terms are SocialTERMs. The paper is novel as the first attempt to identify and predict SocialTERMs from a large number of online news articles. It contributes to the literature by proposing three types of text-mining-based features (temporal weight, sentiment, and complex network structural features) and by comparing the performance of these features across various machine learning techniques, including deep learning. When the approach was applied to a large number of online news articles published in South Korea over a 12-month period, mostly written in Korean, Boosting Decision Tree gave the best performance with the full feature set. The experimental results also showed that the proposed SocialTERM-Extractor can predict SocialTERMs with high performance.
Ultimately, this paper can benefit individuals and organizations who want to explore and use social-problem-related data systematically to understand and manage social problems, even if they are unfamiliar with ongoing social problems.
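As a rough illustration of what a complex network structural feature for a noun term might look like, the sketch below builds a term co-occurrence network from tokenized articles and computes each term's degree centrality. This is an assumption made for illustration only: the abstract does not specify how the paper constructs its network or which structural measures it uses, and the function name `cooccurrence_degree_centrality` and the toy corpus are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_degree_centrality(articles):
    """Return the degree centrality of each term in a co-occurrence network.

    `articles` is a list of lists of noun terms (one inner list per article).
    Two terms are linked if they appear together in at least one article.
    NOTE: illustrative assumption, not the paper's actual feature definition.
    """
    neighbors = defaultdict(set)
    for terms in articles:
        unique_terms = sorted(set(terms))
        for term in unique_terms:
            neighbors[term]  # register the term as a node even if isolated
        for a, b in combinations(unique_terms, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {term: 0.0 for term in neighbors}
    # Degree centrality: number of neighbors divided by the maximum possible.
    return {term: len(adj) / (n - 1) for term, adj in neighbors.items()}


# Hypothetical toy corpus: each inner list stands for the noun terms of one article.
toy_articles = [
    ["unemployment", "youth", "policy"],
    ["unemployment", "housing"],
    ["housing", "policy"],
]
centrality = cooccurrence_degree_centrality(toy_articles)
```

In a pipeline like the one the abstract describes, a value such as this could be one column of a per-term feature vector, alongside temporal weight and sentiment features, that is then passed to a classifier.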

