Author

R. SeethaLakshmi

Bio: R. SeethaLakshmi is an academic researcher. The author has contributed to research in topics: Knowledge extraction & Association rule learning. The author has an h-index of 3, co-authored 3 publications receiving 34 citations.

Papers
01 Jan 2010
TL;DR: This paper considers the problem of building privacy preserving algorithms for one category of data mining techniques, the association rule mining, which aims to find patterns in data.
Abstract: Data mining is the process of extracting hidden information from a database. Data mining is emerging as one of the key features of many business organizations. The current trend in business collaboration shares the data and mined results to gain mutual benefit. The problem of privacy-preserving data mining has become more important in recent years because of the increasing ability to store personal data about users, and the increasing sophistication of data mining algorithms to leverage this information. Apart from classification and regression, one of the most important tasks of data mining is to find patterns in data. In particular, new advances in data mining and knowledge discovery that allow for the extraction of hidden knowledge from enormous amounts of data impose new threats on the seamless integration of information. In this paper, we consider the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule mining.

16 citations

01 Dec 2010
TL;DR: This paper considers the problem of building privacy preserving algorithms for one category of data mining techniques, the association rule mining, which aims to find patterns in data.
Abstract: Data mining is the process of extracting hidden information from a database. Data mining is emerging as one of the key features of many business organizations. The current trend in business collaboration shares the data and mined results to gain mutual benefit. The problem of privacy-preserving data mining has become more important in recent years because of the increasing ability to store personal data about users, and the increasing sophistication of data mining algorithms to leverage this information. Apart from classification and regression, one of the most important tasks of data mining is to find patterns in data. In particular, new advances in data mining and knowledge discovery that allow for the extraction of hidden knowledge from enormous amounts of data impose new threats on the seamless integration of information. In this paper, we consider the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule mining.

10 citations

Journal Article
TL;DR: This research work uses tabu search optimization technique for modifying the sensitive items for hiding the sensitive association rules, which is an important research problem in privacy preserving data mining.
Abstract: Data mining algorithms are used for extracting hidden knowledge from large databases. Privacy preserving data mining is a new research area in the field of data mining which mainly deals with the side effects of data mining techniques. The term privacy denotes that an individual's information should be protected. Nowadays, privacy protection has turned out to be an essential issue in data mining research. A primary constraint of privacy-preserving data mining is to prevent the extraction of sensitive knowledge by protecting the input data, yet still allow data miners to pull out useful knowledge models. Hiding sensitive association rules is an important research problem in privacy preserving data mining. Sensitive association rules are protected by modifying the sensitive items in the original data set. In this research work, the tabu search optimization technique is used for modifying the sensitive items in order to hide the sensitive association rules.

8 citations
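
The idea in the abstract above, hiding a sensitive association rule by modifying sensitive items in the data set, can be illustrated with a minimal sketch. This is not the paper's method: the paper uses tabu search to choose the modifications, while the sketch below swaps in a simple greedy pass for brevity. The function names (`support`, `confidence`, `hide_rule`), the threshold `min_conf`, and the sample transactions are all hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(A and C together) / support(A); 0.0 if A never occurs."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


def hide_rule(transactions, antecedent, consequent, min_conf):
    """Greedy stand-in for the paper's tabu search (an assumption).

    Removes the consequent items from supporting transactions, one
    transaction at a time, until the rule's confidence drops below
    min_conf. Returns a sanitized copy; the input is left unchanged.
    """
    sanitized = [set(t) for t in transactions]
    full = set(antecedent) | set(consequent)
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break
        if full <= t:
            t -= set(consequent)  # modify the sensitive item in place
    return sanitized


# Hypothetical market-basket data; the sensitive rule is bread -> milk.
baskets = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"bread", "eggs"},
    {"milk", "eggs"},
]
released = hide_rule(baskets, {"bread"}, {"milk"}, min_conf=0.5)
print(confidence(baskets, {"bread"}, {"milk"}))
print(confidence(released, {"bread"}, {"milk"}))
```

A greedy pass like this ignores side effects on non-sensitive rules, which is the kind of trade-off an optimization technique such as tabu search is meant to manage.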


Cited by
Journal ArticleDOI
TL;DR: A panoramic overview of new perspectives and a systematic interpretation of a list of published literature, meticulously organized into subcategories, is provided, which reveals the past development, present research challenges, future trends, and the gaps and weaknesses.
Abstract: Preservation of privacy in data mining has emerged as an absolute prerequisite for exchanging confidential information in terms of data analysis, validation, and publishing. Ever-escalating internet phishing has posed a severe threat to the widespread propagation of sensitive information over the web. Conversely, the doubts and contentions of various information providers, and their resulting unwillingness to trust that data will be reliably protected from disclosure, often result in utter rejection of data sharing or in the sharing of incorrect information. This article provides a panoramic overview of new perspectives and a systematic interpretation of a list of published literature, meticulously organized into subcategories. The fundamental notions of the existing privacy preserving data mining methods, their merits, and shortcomings are presented. The current privacy preserving data mining techniques are classified based on distortion, association rule, hide association rule, taxonomy, clustering, associative classification, outsourced data mining, distributed, and k-anonymity, where their notable advantages and disadvantages are emphasized. This careful scrutiny reveals the past development, present research challenges, future trends, and the gaps and weaknesses. Further significant enhancements for more robust privacy protection and preservation are affirmed to be mandatory.

92 citations

Journal ArticleDOI
TL;DR: This work presents protocols based on the use of homomorphic encryption and different hashing schemes for both the semi-honest and malicious environments, while the protocol for the malicious environment is secure in the random oracle model.
Abstract: We consider the problem of computing the intersection of private datasets of two parties, where the datasets contain lists of elements taken from a large domain. This problem has many applications for online collaboration. In this work, we present protocols based on the use of homomorphic encryption and different hashing schemes for both the semi-honest and malicious environments. The protocol for the semi-honest environment is secure in the standard model, while the protocol for the malicious environment is secure in the random oracle model. Our protocols obtain linear communication and computation overhead. We further implement different variants of our semi-honest protocol. Our experiments show that the asymptotic overhead of the protocol is affected by different constants. (In particular, the degree of the polynomials evaluated by the protocol matters less than the number of polynomials that are evaluated.) As a result, the protocol variant with the best asymptotic overhead is not necessarily preferable for inputs of reasonable size.

87 citations

Journal ArticleDOI
TL;DR: A review of the state-of-the-art methods for privacy preservation is presented and analyzes the techniques for privacy preserving association rule mining and points out their merits and demerits.
Abstract: Businesses share data, outsourcing for specific business problems. Large companies stake a large part of their business on analysis of private data. Consulting firms often handle sensitive third party data as part of client projects. Organizations face great risks while sharing their data. Most of this sharing takes place with little secrecy. It also increases the legal responsibility of the parties involved in the process. So, it is crucial to reliably protect their data due to legal and customer concerns. In this paper, a review of the state-of-the-art methods for privacy preservation is presented. It also analyzes the techniques for privacy preserving association rule mining and points out their merits and demerits. Finally the challenges and directions for future research are discussed.

34 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: Among the compared models, as expected, Recurrent Neural Network is best suited for the paraphrase identification task and it is proposed that Plagiarism detection is one of the areas where Paraphrase Identification can be effectively implemented.
Abstract: Paraphrase Identification or Natural Language Sentence Matching (NLSM) is one of the important and challenging tasks in Natural Language Processing where the task is to identify if a sentence is a paraphrase of another sentence in a given pair of sentences. A paraphrase of a sentence conveys the same meaning, but its structure and the sequence of words vary. It is a challenging task as it is difficult to infer the proper context of a sentence given its short length. Also, coming up with similarity metrics for the inferred context of a pair of sentences is not straightforward either. Its applications, however, are numerous. This work explores various machine learning algorithms to model the task and also applies different input encoding schemes. Specifically, we created the models using Logistic Regression, Support Vector Machines, and different architectures of Neural Networks. Among the compared models, as expected, Recurrent Neural Network (RNN) is best suited for our paraphrase identification task. Also, we propose that Plagiarism detection is one of the areas where Paraphrase Identification can be effectively implemented.

32 citations