Author

G. Poornalatha

Bio: G. Poornalatha is an academic researcher from Manipal University. The author has contributed to research on the topics Web page and Web mining. The author has an h-index of 4 and has co-authored 11 publications receiving 52 citations. Previous affiliations of G. Poornalatha include National Institute of Technology, Karnataka and Manipal Institute of Technology.

Papers
Book Chapter
22 Jul 2011
TL;DR: This paper proposes an effective clustering technique to group users’ sessions by modifying the K-means algorithm and suggests a method to compute the distance between sessions based on the similarity of their web access paths, which handles user sessions of variable length.
Abstract: The proliferation of the internet, along with the attractiveness of the web in recent years, has made web mining a research area of great importance. Web mining has many advantages that make it attractive to researchers. The analysis of web users’ navigational patterns within a web site can provide useful information for applications such as server performance enhancement, web site restructuring, and direct marketing in e-commerce. The navigation paths may be explored based on some similarity criteria in order to draw useful inferences about web usage. The objective of this paper is to propose an effective clustering technique to group users’ sessions by modifying the K-means algorithm and to suggest a method to compute the distance between sessions based on the similarity of their web access paths, which handles user sessions of variable length.
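
The abstract does not spell out the distance measure, so the following is only a minimal sketch of the general idea: a distance that works on sessions of different lengths, plus the assignment step of a K-means-style loop. The Jaccard distance over visited pages is a stand-in chosen for illustration, not the paper's measure, and the names `session_distance` and `assign_to_nearest` are hypothetical.

```python
def session_distance(a, b):
    """Jaccard distance between the sets of pages visited in two sessions.

    Assumption: this is a stand-in measure, not the one from the paper.
    It is defined for sessions of any length, including unequal lengths.
    """
    pages_a, pages_b = set(a), set(b)
    if not pages_a and not pages_b:
        return 0.0  # two empty sessions are treated as identical
    return 1.0 - len(pages_a & pages_b) / len(pages_a | pages_b)


def assign_to_nearest(sessions, centers):
    """Return, for each session, the index of the closest representative session."""
    return [
        min(range(len(centers)), key=lambda i: session_distance(s, centers[i]))
        for s in sessions
    ]


sessions = [
    ["home", "news"],
    ["shop", "cart", "pay"],
    ["home", "news", "sport"],
]
centers = [["home", "news"], ["shop", "cart"]]
labels = assign_to_nearest(sessions, centers)
```

Because an arithmetic mean of page sequences is not defined, this sketch uses whole sessions as cluster representatives; how the paper updates its cluster centers is not stated in the abstract.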

22 citations

Proceedings Article
26 Aug 2012
TL;DR: The present paper attempts to solve the problem of predicting the next page to be accessed by the user by mining web server logs that record information about the users who access the web site.
Abstract: The tremendous progress of the internet and the World Wide Web in recent years has emphasized the need to reduce latency at the client, or user, end. In general, caching and prefetching techniques are used to reduce the delay experienced by the user while waiting for a web page from the remote web server. The present paper attempts to solve the problem of predicting the next page to be accessed by the user by mining web server logs that record information about the users who access the web site. The page predicted to be visited next may be prefetched by the browser, which in turn reduces latency for the user. Thus, analyzing a user's past behavior to predict the web pages they will navigate to next is of great importance. The proposed model yields good prediction accuracy compared to existing methods such as the Markov model, association rules, and ANN.
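
The abstract names the Markov model as one of the existing baselines. As a rough illustration of that baseline only (not the paper's proposed model), a first-order Markov predictor can be sketched as below; the function names and the toy sessions are hypothetical.

```python
from collections import Counter, defaultdict


def train_transitions(sessions):
    """Count page-to-page transitions observed in logged sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current_page, next_page in zip(session, session[1:]):
            counts[current_page][next_page] += 1
    return counts


def predict_next(counts, page):
    """Return the most frequently observed successor of `page`, or None if unseen."""
    if page not in counts:
        return None
    return counts[page].most_common(1)[0][0]


# Toy sessions standing in for sessions extracted from a web server log.
sessions = [
    ["home", "news", "sport"],
    ["home", "news", "weather"],
    ["home", "shop"],
    ["home", "news", "sport"],
]
counts = train_transitions(sessions)
```

A browser-side prefetcher would call `predict_next` with the page currently being viewed and fetch the returned page in the background.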

11 citations

Journal Article
TL;DR: This paper proposes a technique to measure the similarity between any two user sessions based on a sequence alignment technique that uses dynamic programming.
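
The summary does not give the scoring scheme, so this is a minimal sketch of a standard global alignment score (Needleman-Wunsch style) computed by dynamic programming over two sessions. The scores `match=1`, `mismatch=-1`, and `gap=-1` and the name `alignment_score` are assumptions made for illustration.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of two page sequences via dynamic programming.

    Assumption: the scoring values are illustrative defaults, not the paper's.
    Only the previous row of the DP table is kept, so memory is O(len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]  # aligning "" with b[:j]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] with ""
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned with a gap
            left = curr[j - 1] + gap  # b[j-1] aligned with a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


same = alignment_score(["A", "B", "C"], ["A", "B", "C"])
skipped = alignment_score(["A", "B", "C"], ["A", "C"])
```

Here `same` is 3 (three matches) and `skipped` is 1 (two matches and one gap), so a higher score indicates more similar sessions even when their lengths differ.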

8 citations

Proceedings Article
30 Nov 2018
TL;DR: This paper works on improving the algorithms so that the sentiment conveyed can be classified into the appropriate class, and aims at developing a system that perceives the opinion of people about a specific product or person.
Abstract: Customer satisfaction has become a part of many businesses. Unlike in the past, companies do not rely solely on advertisement to make their products more desirable. Their prime concern has now turned towards customer satisfaction. Similarly, people are more curious to know the current popular opinion on events happening around the world and to get information about their favorite celebrities, favorite products, etc. People have turned to social media to share their experiences and views about products as well as other people. The current work aims at using this as a base for developing a system that perceives the opinion of people about a specific product or person. A lot of research has already been done on this topic. Various papers have shown different strategies to enhance sentiment analysis. In this paper, we have worked on improving the algorithms so that the sentiment conveyed can be classified into the appropriate class.

8 citations

Book Chapter
01 Jan 2021
TL;DR: The regular kNN classifier is compared conceptually with various classifiers, and ARSkNN, which uses mass estimation, is shown to be commensurate with kNN in accuracy while drastically reducing computation time on the datasets chosen for this analysis.
Abstract: We are living in a data age, and with the expansion of the ‘Internet of Things’ platform, there is an upsurge in devices connected to the Internet. Everything from smart sensors to smartphones and tablets, and systems installed in manufacturing units, hospitals, vehicles, etc., is generating data. Such developments in the technological world have escalated the generation of data and require analysis of the raw data to identify patterns. Data mining techniques are deployed extensively to extract information, and they have far-reaching effects on trade and on the lives of the people concerned. The accuracy and effectiveness of data mining techniques in providing better outcomes and cost-effective methods in various domains have been established. Usually, in supervised learning, density estimation is used by instance-based learning classifiers like k-nearest neighbor (kNN). In this paper, the regular kNN classifier is compared conceptually with various classifiers, and ARSkNN, which uses mass estimation, is shown to be commensurate with kNN in accuracy while drastically reducing computation time on the datasets chosen for this analysis. Tenfold cross-validation is used for testing.
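
For orientation, the regular kNN baseline and the tenfold cross-validation protocol mentioned in the abstract can be sketched as follows. This covers plain kNN only, not ARSkNN or its mass estimation; the interleaved fold assignment, the function names, and the toy data are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points (Euclidean)."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def cross_val_accuracy(X, y, k=3, folds=10):
    """Estimate accuracy with `folds`-fold cross-validation.

    Assumption: sample i goes to fold i % folds (no shuffling or stratification).
    """
    n = len(X)
    correct = 0
    for f in range(folds):
        test_idx = list(range(f, n, folds))
        held_out = set(test_idx)
        train_X = [X[i] for i in range(n) if i not in held_out]
        train_y = [y[i] for i in range(n) if i not in held_out]
        correct += sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
    return correct / n


# Toy data: two well-separated groups of points on a line.
X = [(i * 0.1, 0.0) for i in range(10)] + [(10 + i * 0.1, 0.0) for i in range(10)]
y = ["a"] * 10 + ["b"] * 10
accuracy = cross_val_accuracy(X, y, k=3, folds=10)
```

Each prediction scans the whole training set, which is the per-query cost that faster kNN variants aim to reduce.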

6 citations


Cited by
Journal Article
TL;DR: This research work presents a novel web-based recommender system based on the sequential information of a user’s navigation on web pages, and a comparison between the existing model and the proposed model shows that the accuracy of the proposed system is almost three times better than that of some existing systems.
Abstract: With the exponential growth in the number of users browsing the internet, an important factor that the developer community is now focusing on is the user experience. Recommender systems are platforms that make personalized recommendations for a particular user by predicting the ratings for various items. Recommender systems largely ignore sequential information and focus instead on content information, but sequential information also reveals much about the behavior of the user. In this research work, we have presented a novel web-based recommender system which is based on the sequential information of a user’s navigation on web pages. We received top-N clusters when Fuzzy C-mean (FCM) clustering was employed. We determined the similar users for the target user and also evaluated the weight for each web page. We have tried to solve that problem of recommender systems, as we offered a system to forecast a user’s next Web page visit. In our work, we proposed a system which generates recommendations for users by considering the sequential information that exists in their usage patterns of Web pages. We employed fuzzy clustering to give the recommender system a sequential approach. We calculated weights for each page category considered in our system and predicted the top page recommendation for the target user. The real-world MSNBC dataset is used in the experiments. The dataset consists of 5000 user entries with 6, entries per user. When we compared the existing model with our proposed model, it clearly showed that the accuracy of the proposed model is almost three times better than that of some existing systems. The accuracy of our proposed model is nearly 33%.

47 citations

Proceedings Article
24 Oct 2019
TL;DR: This study proposes an algorithm that weights the sentiment score in terms of the weight of the hashtag and the cleaned text to obtain the sentiment, and an algorithm to train Support Vector Machine, Deep Learning, and Naïve Bayes classifiers to process Twitter data.
Abstract: In the big data era, data is generated in real time or near real time. Thus, businesses can utilize this ever-growing volume of data for data-driven or information-driven decision-making to improve their businesses. Social media, like Twitter, generates an enormous amount of such data. However, social media data are often unstructured and difficult to manage. Hence, this study proposes an effective text data preprocessing technique and develops an algorithm to train the Support Vector Machine (SVM), Deep Learning (DL) and Naive Bayes (NB) classifiers to process Twitter data. We develop an algorithm that weights the sentiment score in terms of the weight of the hashtag and the cleaned text. In this study, we (i) compare different preprocessing techniques (stemming, lemmatization and spelling correction) on the data collected from Twitter to find the most efficient method, and (ii) develop an algorithm to weight the scores of the hashtag and cleaned text to obtain the sentiment. We retrieved N=1,314,000 Twitter data, and we compared the popularity of two products, Google Now and Amazon Alexa. Using our data preprocessing algorithm and sentiment weight score algorithm, we train SVM, DL, and NB models. The results show that the stemming technique performed best in terms of computational speed. Additionally, the accuracy of the algorithm was tested against manually sorted sentiments and sentiments produced before text data preprocessing. The results demonstrated that the output of the algorithm was close to the manually annotated sentiments. In terms of model performance, the SVM performed best, with an accuracy of 90.3%, perhaps due to the unstructured nature of Twitter data. Previous studies used conventional techniques; hence, no precise methods were utilized for cleaning the text. Therefore, our approach confirms that a proper text data preprocessing technique plays a significant role in the prediction accuracy and computational time of the classifier when using unstructured Twitter data.

28 citations

Journal Article
TL;DR: A novel detection method robust against evasion strategies based on mimicry, demonstrating great precision against conventional masqueraders and a success rate of 80.2% when identifying mimicry attacks, hence outperforming the best contributions in the literature.
Abstract: A framework for online detection of masquerade attacks is proposed. At the analysis stage, local alignment algorithms are introduced. At the verification stage, a validation scheme based on the U-test is implemented. For mimicry recognition, the parallel analysis of monitored actions is performed. For evaluating the approach, the SEA dataset is applied. Masquerade attackers are internal intruders who act by impersonating legitimate users of the victim system. Most proposals for their detection suggest recognition methods based on comparing usage models of the protected environment. However, recent studies have shown their vulnerability to adversarial attacks based on imitating the behavior of legitimate users. In order to contribute to their identification, this article introduces a novel detection method robust against evasion strategies based on mimicry. The proposal describes two levels of information processing: analysis and verification. At the analysis stage, local alignment algorithms are implemented. In this way it is possible to score the similarity between action sequences performed by users, bearing in mind their regions of greatest resemblance. At the verification stage, a novel validation scheme based on the statistical non-parametric U-test is implemented. Through this it is possible to refine the labeling of sequences to avoid making hasty decisions when their nature is not sufficiently clear. In order to strengthen effectiveness against mimicry attacks, the analysis of the monitored sequences is performed concurrently. This involves partitioning long sequences with two purposes: making subsequences of small intrusions more visible and analyzing new sequences when suspicious situations occur, such as the execution of never-before-seen commands or the discovery of potentially harmful activities. The proposal has been evaluated on the standard SEA dataset and on mimicry attacks. Promising experimental results have been shown, demonstrating great precision against conventional masqueraders (TPR=98.3%, FPR=0.77%) and a success rate of 80.2% when identifying mimicry attacks, hence outperforming the best contributions in the literature.

22 citations

Proceedings Article
01 Sep 2014
TL;DR: This paper intends to implement intelligent web mining in the form of a Web Page Ranking Tool so as to improve the web page ranking process through the incorporation of Back Propagation neural networks.
Abstract: In a short span of time, less than 15 years, the web search process has changed enormously because of the remarkable growth in web-based information resources. The speedy expansion of the web is welcome because of the increase in information resources, but at the same time its huge size and the interference of SEOs in the search process have made it harder to extract relevant information from the web. Personalized web search may be the solution to the relevancy problem, but users are reluctant to give their personal information because of privacy concerns [1]. Moreover, most existing web mining algorithms do not have attractive time and space complexities and hence cause difficulties for novice users. This paper addresses the above-mentioned issues of the search engine domain and intends to implement intelligent web mining in the form of a Web Page Ranking Tool so as to improve the web page ranking process through the incorporation of Back Propagation neural networks [2][3][4].

18 citations

Journal Article
TL;DR: A novel technique is proposed to pre-process web log data to extract the sequence of occurrence and navigation patterns helpful for prediction, and it is tested on web log files of NASA and enggresources.

15 citations