Topic Modeling on Online News Extraction

doi:10.1007/978-981-10-7245-1_60

Book Chapter

Aashka Sahni, +1 more

- pp 611-622

TLDR

A word co-occurrence network-based topic model named WNTM is presented. It handles both long and short news texts, overcoming the sparsity and irregularity that hamper earlier approaches such as LDA on short texts. The authors also intend to build a news recommendation system that recommends news according to user preference.

Abstract:

News media includes print media, broadcast news, and the Internet (online newspapers, news blogs, etc.). The proposed system intends to collect news data from such diverse sources, capture the varied perceptions, and summarize and present the news. It involves identifying topics from real-time news extractions and then clustering the news documents based on those topics. Previous approaches, such as LDA, identify topics efficiently for long news texts but fail to do so for short news texts, where acute sparsity and irregularity are prevalent. In this paper, we present a solution for topic modeling, namely a word co-occurrence network-based model named WNTM, which works for both long and short news by overcoming these shortcomings. It works effectively without incurring much time or space overhead. Further, we intend to create a news recommendation system that recommends news to the user according to user preference.
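The abstract does not spell out how WNTM operates. As a rough illustration only, the sketch below follows the general idea behind word co-occurrence network topic models: count word pairs that appear within a sliding window, then give each word a "pseudo-document" made of its network neighbours, so that a standard topic model can be trained on these denser pseudo-documents instead of on the sparse short texts. The function names, the window size, and the toy headlines are assumptions made for this example and are not taken from the chapter.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence_network(docs, window=3):
    """Count co-occurring word pairs inside a sliding window (hypothetical helper).

    docs is a list of token lists. Each edge key is an alphabetically
    sorted (word, word) pair; the value is the number of windows in
    which both words appear.
    """
    edges = Counter()
    for tokens in docs:
        # Texts shorter than the window are treated as a single window.
        last_start = max(1, len(tokens) - window + 1)
        for start in range(last_start):
            span = tokens[start:start + window]
            for a, b in combinations(sorted(set(span)), 2):
                edges[(a, b)] += 1
    return edges


def pseudo_documents(edges):
    """Build one pseudo-document per word from its neighbours in the network.

    Each neighbour is repeated as many times as the edge weight.
    """
    pseudo = defaultdict(list)
    for (a, b), weight in edges.items():
        pseudo[a].extend([b] * weight)
        pseudo[b].extend([a] * weight)
    return dict(pseudo)


if __name__ == "__main__":
    # Toy, already tokenized headlines (illustrative data only).
    headlines = [
        ["stock", "market", "falls"],
        ["market", "rally", "continues"],
    ]
    network = build_cooccurrence_network(headlines, window=3)
    for word, neighbours in sorted(pseudo_documents(network).items()):
        print(word, sorted(neighbours))
```

Under these assumptions, the pseudo-documents could then be passed to any standard LDA implementation, with topics for the original news items inferred from the topics of the words they contain.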

Citations

Journal Article

An Efficient Topic Modeling Approach for Text Mining and Information Retrieval through K-means Clustering

Junaid Rashid, +2 more

- 01 Jan 2020 -

Mehran University Research Journal of En...

TL;DR: The proposed k-means topic modeling (KTM) approach is applicable to classification and clustering tasks in text mining and achieves higher performance than its competitors LDA and LSA.

Proceedings Article

Automatic Text summarization in Gujarati language

TL;DR: In this article, statistical text summarization is performed on text in Gujarati, one of the resource-poor South Asian languages, using TF-IDF, LSA, and LDA methods on a custom dataset.

Automatic Text summarization in Gujarati language

Harsh Mehta, +2 more

TL;DR: In this paper, statistical text summarization is performed on text in Gujarati, one of the resource-poor South Asian languages, using TF-IDF, LSA, and LDA methods on a custom dataset.

Proceedings Article

Text summarization using Secretary problem

TL;DR: In this article, a mathematical model known as the secretary problem was proposed to generate a summary that does not include some important sentences; the approach falls under extractive text summarization.

Journal Article

A novel centroid based sentence classification approach for extractive summarization of COVID-19 news reports

Sumanta Banerjee, +2 more

- 24 Mar 2023 -

International journal of information tec...

TL;DR: In this paper, a vector space model (VSM) is used to extract a centroid that captures the lexical pattern of the sentences on those subtopics from the words frequently used in them; the centroid is then used as a query in the VSM for sentence classification and extraction.

References

Book Chapter

Comparing twitter and traditional media using topic models

Wayne Xin Zhao, +6 more

TL;DR: This paper empirically compares the content of Twitter with that of a traditional news medium, the New York Times, using unsupervised topic modeling, and reports interesting and useful findings for downstream IR or DM applications.

Proceedings Article

Learning to classify short and sparse text & web with hidden topics from large-scale data collections

Xuan-Hieu Phan, +2 more

TL;DR: A general framework is presented for building classifiers that deal with short and sparse text & Web segments by making the most of hidden topics discovered from large-scale data collections; it is general enough to be applied to different data domains and genres, ranging from Web search results to medical text.

Proceedings Article

A web-based kernel function for measuring the similarity of short text snippets

Mehran Sahami, +1 more

TL;DR: This paper defines a similarity kernel function, mathematically analyzes some of its properties, provides examples of its efficacy, and shows the use of this kernel function in a large-scale system for suggesting related queries to search engine users.

Proceedings Article

Characterizing Microblogs with Topic Models

Daniel Ramage, +2 more

TL;DR: A scalable implementation of a partially supervised learning model (Labeled LDA) that maps the content of the Twitter feed into dimensions that correspond roughly to substance, style, status, and social characteristics of posts is presented.

Journal Article

BTM: Topic Modeling over Short Texts

Xueqi Cheng, +3 more

- 01 Dec 2014 -

IEEE Transactions on Knowledge and Data ...

TL;DR: This paper proposes a novel approach to short text topic modeling, referred to as the biterm topic model (BTM), which learns topics by directly modeling the generation of word co-occurrence patterns in the corpus, making the inference effective with the rich corpus-level information.

Related Papers (5)

Matrix-based news aggregation: exploring different news perspectives

NPA: Neural News Recommendation with Personalized Attention

Neural News Recommendation with Multi-Head Self-Attention.

Application of News Features in News Recommendation Methods: A Survey

Ranking a stream of news