Journal Article

A review of recent advances in text mining of Indian languages

TLDR
This paper reviews the research work done so far, the availability of language resources, and the various challenges of text mining tasks in Indian languages, and notes that tools, techniques and resources developed for English text cannot be applied directly to Indian-language text.
Abstract
Text mining in the English language has been researched extensively in the past, and a significant number of resources, tools and techniques have been developed. India is a country of high linguistic diversity, and a large amount of textual data is available in Indian languages. Knowledge can be discovered from this text by applying text-mining techniques. However, owing to the characteristics of Indian languages, the tools, techniques and resources available for mining English text cannot be applied directly to text in Indian languages. We could not find any comprehensive literature describing research on mining text written in Indian languages. In this paper, we review the research done so far, the availability of language resources, and the various challenges of text mining tasks in Indian languages.


Citations
Journal Article

A comparative analysis of emerging scientific themes in business analytics

TL;DR: The purpose of this research is to investigate the emerging scientific themes in business analytics through the utilisation of burst detection, text-clustering and word occurrence analysis in top information systems journals in order to provide an insight about the future scientific trends of business analytics for scholars and practitioners in the field.
Book Chapter

Introduction to Sentiment Analysis Covering Basics, Tools, Evaluation Metrics, Challenges, and Applications

TL;DR: In this paper, the authors discuss the basics of sentiment analysis and its methodology, including data collection, data pre-processing, and feature extraction methods. They also discuss enhancement techniques for sentiment analysis, including text categorization, feature selection, data integration, and ontology-based approaches.
Journal Article

Text structuring methods based on complex network: a systematic review

TL;DR: This article presents a systematic review of papers that use complex network approaches to analyze text, and concludes that complex network topological properties, measures and modeling can capture and identify text structures for purposes such as text analysis, classification, topic and keyword extraction, and summarization.
References

A Fall-back Strategy for Sentiment Analysis in Hindi: a Case Study

TL;DR: This paper proposes a fall-back strategy for sentiment analysis of Hindi documents, a problem on which, to the best of the authors' knowledge, no work had been done before.

SentiWordNet for Indian Languages

TL;DR: Multiple computational techniques, such as WordNet-based, dictionary-based, corpus-based and generative approaches, are proposed for generating SentiWordNet(s) for Indian languages.
Journal Article

A Survey on Sentiment Analysis and Opinion Mining Techniques

TL;DR: This survey describes the main approaches to sentiment extraction in natural language processing applications, covering sentiment analysis for various Indian languages such as Bengali, Hindi, Telugu and Malayalam.
Journal Article

Automatic classification of Tamil documents using vector space model and artificial neural network

TL;DR: In this paper, the authors propose applying the vector space model (VSM) and an artificial neural network (ANN) to the classification of Tamil language documents. The experimental results show that the ANN model achieves 93.33%, which is better than the VSM, which yields 90.33%.
Journal Article

Unsupervised morphological parsing of Bengali

TL;DR: This paper introduces a simple, yet highly effective algorithm for unsupervised morphological learning for Bengali, an Indo–Aryan language that is highly inflectional in nature.
Trending Questions (1)
What are the challenges in text mining for Indian languages?

The challenges in text mining for Indian languages include tokenization, POS tagging, feature selection, subjective lexicon creation, translation, and lack of training data.
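
As a rough illustration of why tokenization appears in that list, the sketch below compares a regular-expression word pattern of the kind often used for English with a tokenizer that keeps Unicode combining marks attached to their base letters. This is not taken from the reviewed paper: the function name `tokenize_indic` and the Hindi sample string are assumptions made only for this example, and the approach is a simplification rather than a complete tokenizer for any Indian language.

```python
import re
import unicodedata


def tokenize_indic(text):
    """Split text into tokens, keeping combining marks with their base letters.

    Hypothetical helper for illustration only. A character stays inside a
    token when its Unicode general category starts with L (letter),
    M (mark, e.g. Devanagari vowel signs) or N (number). Everything else,
    including spaces and the danda, ends the current token.
    """
    tokens, current = [], []
    for ch in text:
        if unicodedata.category(ch)[0] in "LMN":
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


sample = "हिंदी भाषा"  # assumed sample text: two Hindi words

# Python's built-in \w does not match combining marks, so a pattern that
# works for English breaks these words apart at every vowel sign.
naive_tokens = re.findall(r"\w+", sample)

# The category-based tokenizer keeps each word whole.
mark_aware_tokens = tokenize_indic(sample)

print(naive_tokens)
print(mark_aware_tokens)
```

The sketch treats zero-width joiners and non-joiners as separators, which would split some valid words, so a practical tokenizer would need language-specific handling beyond this.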