Author

Pramod B. Patil

Bio: Pramod B. Patil is an academic researcher. The author has contributed to research in topics: The Internet & Knowledge extraction. The author has an h-index of 1, co-authored 2 publications receiving 76 citations.

Papers
Journal ArticleDOI
TL;DR: This survey paper discusses successful techniques and methods for effective information retrieval in text mining, along with the types of situations in which each technology may be useful to users.
Abstract: In recent years, the volume of digital data has grown rapidly, and knowledge discovery and data mining have attracted great attention because of the need to turn such data into useful information and knowledge. The information and knowledge extracted from large amounts of data benefit many applications, such as market analysis and business management. In many applications, databases store information in text form, so text mining is one of the most recent areas of research. Extracting the information a user requires is a challenging issue. Text mining is an important step in the knowledge discovery process. It extracts hidden information from unstructured to semi-structured data. Text mining is the discovery, by computer, of new, previously unknown information by automatically extracting it from different written resources. This survey paper tries to cover the text mining techniques and methods that address these challenges. In this survey paper we discuss such successful techniques and methods for effective information retrieval in text mining. The types of situations where each technology may be useful to users are also discussed.

99 citations

Journal ArticleDOI
TL;DR: The performance of the two versions of the Internet Protocol, IPv4 and IPv6, is tested and compared on CentOS and Windows 2007 operating systems for different voice samples, DNS traffic, data traffic, and the traffic characteristics of Internet games such as Counter-Strike and Quake III.
Abstract: Today’s era of packet-switched networks demands larger bandwidth to meet the need to integrate multimedia applications such as Internet gaming and voice transmission. It becomes necessary to judge network performance against the allocated bandwidth. Network performance depends mainly on the efficiency of the protocol used, in addition to the load on the network, the type of transmission system, and the capabilities of the connected hardware. The performance of the two versions of the Internet Protocol, IPv4 and IPv6, is tested and compared on CentOS and Windows 2007 operating systems for different voice samples, DNS traffic, data traffic, and the traffic characteristics of Internet games such as Counter-Strike and Quake III. The transport layer data traffic and the application layer DNS and voice traffic were generated using the latest version of the Distributed Internet Traffic Generator tool, D-ITG 2.8.0 rc1. The effect of transmitting voice over IP with compressed RTP, with and without voice activity detection, is also observed.

Cited by
Journal ArticleDOI
TL;DR: This survey analyzes text mining studies related to Facebook and Twitter, the two dominant social media platforms in the world, and describes how social media studies have used text analytics and text mining techniques to identify the key themes in the data.
Abstract: Text mining has become one of the trendy fields that has been incorporated in several research fields such as computational linguistics, Information Retrieval (IR), and data mining. Natural Language Processing (NLP) techniques were used to extract knowledge from text written by human beings. Text mining reads an unstructured form of data to provide meaningful information patterns in the shortest time period. Social networking sites are a great source of communication, as most people in today’s world use these sites in their daily lives to keep connected to each other. It has become common practice not to write sentences with correct grammar and spelling. This practice may lead to different kinds of ambiguities, such as lexical, syntactic, and semantic, and due to this type of unclear data, it is hard to find out the actual data order. Accordingly, we are conducting an investigation with the aim of looking for different text mining methods to get various textual orders on social media websites. This survey aims to describe how studies in social media have used text analytics and text mining techniques for the purpose of identifying the key themes in the data. This survey focused on analyzing the text mining studies related to Facebook and Twitter, the two dominant social media platforms in the world. Results of this survey can serve as baselines for future text mining research.

158 citations

Book ChapterDOI
01 Jan 2018
TL;DR: A comprehensive overview about text mining and its current research status is demonstrated and experimental results indicated that Springer database represents the main source for research articles in the field of mobile education for the medical domain.
Abstract: Nowadays, research in text mining has become one of the widespread fields in analyzing natural language documents. The present study demonstrates a comprehensive overview about text mining and its current research status. As indicated in the literature, there is a limitation in addressing Information Extraction from research articles using Data Mining techniques. The synergy between them helps to discover different interesting text patterns in the retrieved articles. In our study, we collected, and textually analyzed through various text mining techniques, three hundred refereed journal articles in the field of mobile learning from six scientific databases, namely: Springer, Wiley, Science Direct, SAGE, IEEE, and Cambridge. The selection of the collected articles was based on the criteria that all these articles should incorporate mobile learning as the main component in the higher educational context. Experimental results indicated that Springer database represents the main source for research articles in the field of mobile education for the medical domain. Moreover, results where the similarity among topics could not be detected were due to either their interrelations or ambiguity in their meaning. Furthermore, findings showed that there was a booming increase in the number of published articles during the years 2015 through 2016. In addition, other implications and future perspectives are presented in the study.

125 citations

Journal ArticleDOI
TL;DR: In this paper, the authors investigated what are the key attributes and the structural relationship of those key attributes in hotel reviews and applied semantic network analysis, factor analysis and regression analysis to understand the experience and satisfaction of the hotel customer.
Abstract: With the development of social media, customers are sharing their experiences, and it is rapidly spreading as a form of online review. That is why the online review has become a significant information source affecting customers’ purchase intention and behavior. Therefore, it is important to understand the customer’s experience shown in the online review in order to maintain sustainable customer satisfaction and loyalty. The purpose of this study is to investigate what are the key attributes and the structural relationship of those key attributes. To accomplish this purpose, a total of 6596 hotel reviews were collected from Google (google.com). A frequency analysis using text mining was performed to figure out the most frequently mentioned attributes. In addition, semantic network analysis, factor analysis, and regression analysis were applied to understand the experience and satisfaction of the hotel customer. As a result, the top 99 keywords were divided into four groups such as “Intangible Service”, “Physical Environment”, “Purpose”, and “Location”. The factor analysis reduced the dimension of the original 64 keywords to 22 keywords, and grouped them into five factors, which are “Access”, “F&B (Food and Beverage)”, “Purpose”, “Tangibles”, and “Empathy”. Based on these results, theoretical and practical implications for sustainable hotel marketing strategies are suggested.

50 citations

Journal ArticleDOI
TL;DR: A comprehensive review of text analytics finds that the ontology- and rule-based approach has been dominant; at the same time, recent research has attempted to apply state-of-the-art machine learning methods.

30 citations

Journal ArticleDOI
TL;DR: This paper presents a comparison study on classifying scientific unstructured text documents (e-books) based on their full text, applying the most popular topic modeling approaches (LDA, LSA) to cluster the words into a set of topics that serve as important keywords for classification.
Abstract: With the rapid growth of information technology, the amount of unstructured text data in digital libraries is rapidly increasing, and analyzing, organizing, and automatically classifying the text in e-research repositories so as to benefit from it has become a major challenge. The manual categorization of text documents requires substantial financial and human resources. For this reason, topic modeling is used to classify documents. This paper presents a comparison study on classifying scientific unstructured text documents (e-books) based on their full text, applying the most popular topic modeling approaches (LDA, LSA) to cluster the words into a set of topics that serve as important keywords for classification. Our dataset consists of 300 books containing about 23 million words of full text. In the topic models used (LSA, LDA), each word in the corpus vocabulary is connected with one or more topics with a probability, as estimated by the model. Many LDA and LSA models were built, yielding different coherence values, and the one producing the highest coherence value was picked. The results showed that LDA performs better than LSA: the best LDA coherence value was 0.592179 with 20 topics, while the LSA coherence value was 0.5773026 with 10 topics.

30 citations