Author

S. Vijayarani

Bio: S. Vijayarani is an academic researcher from Bharathiar University. The author has contributed to research in topics: Knowledge extraction & Association rule learning. The author has an h-index of 2 and has co-authored 2 publications receiving 26 citations.

Papers
01 Jan 2010
TL;DR: This paper considers the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule mining, which aims to find patterns in data.
Abstract: Data mining is the process of extracting hidden information from a database. Data mining is emerging as one of the key features of many business organizations. The current trend in business collaboration is to share data and mined results for mutual benefit. The problem of privacy-preserving data mining has become more important in recent years because of the increasing ability to store personal data about users and the increasing sophistication of data mining algorithms that leverage this information. Apart from classification and regression, one of the most important tasks of data mining is to find patterns in data. In particular, new advances in data mining and knowledge discovery that allow the extraction of hidden knowledge from enormous amounts of data impose new threats on the seamless integration of information. In this paper, we consider the problem of building privacy-preserving algorithms for one category of data mining techniques: association rule mining.
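For readers unfamiliar with association rule mining, the two standard measures behind a rule such as "bread implies milk" can be sketched as follows. This is a minimal illustration of support and confidence only, not the privacy-preserving algorithm of the paper; the function names and the toy transactions are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # rule is undefined when the antecedent never occurs
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
```

A rule is usually reported only when both values exceed user-chosen thresholds; hiding a sensitive rule typically means pushing one of these values below its threshold.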

16 citations

01 Dec 2010
TL;DR: This paper considers the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule mining, which aims to find patterns in data.
Abstract: Data mining is the process of extracting hidden information from a database. Data mining is emerging as one of the key features of many business organizations. The current trend in business collaboration is to share data and mined results for mutual benefit. The problem of privacy-preserving data mining has become more important in recent years because of the increasing ability to store personal data about users and the increasing sophistication of data mining algorithms that leverage this information. Apart from classification and regression, one of the most important tasks of data mining is to find patterns in data. In particular, new advances in data mining and knowledge discovery that allow the extraction of hidden knowledge from enormous amounts of data impose new threats on the seamless integration of information. In this paper, we consider the problem of building privacy-preserving algorithms for one category of data mining techniques: association rule mining.

10 citations


Cited by
Journal ArticleDOI
TL;DR: A panoramic overview offering a new perspective and a systematic interpretation of the published literature, organized meticulously into subcategories, is provided; it reveals past development, present research challenges, future trends, gaps, and weaknesses.
Abstract: Preservation of privacy in data mining has emerged as an absolute prerequisite for exchanging confidential information in terms of data analysis, validation, and publishing. Ever-escalating internet phishing poses a severe threat to the widespread propagation of sensitive information over the web. Conversely, the doubts and contentions that make various information providers unwilling to trust that their data will be protected from disclosure often result in outright refusal to share data or in the sharing of incorrect information. This article provides a panoramic overview, offering a new perspective and a systematic interpretation of the published literature through its meticulous organization into subcategories. The fundamental notions of the existing privacy-preserving data mining methods, their merits, and their shortcomings are presented. The current privacy-preserving data mining techniques are classified based on distortion, association rule, hide association rule, taxonomy, clustering, associative classification, outsourced data mining, distributed, and k-anonymity, and their notable advantages and disadvantages are emphasized. This careful scrutiny reveals past development, present research challenges, future trends, gaps, and weaknesses. Further significant enhancements for more robust privacy protection and preservation are affirmed to be mandatory.

92 citations

Journal ArticleDOI
TL;DR: This work presents protocols based on the use of homomorphic encryption and different hashing schemes for both the semi-honest and malicious environments; the protocol for the semi-honest environment is secure in the standard model, while the protocol for the malicious environment is secure in the random oracle model.
Abstract: We consider the problem of computing the intersection of private datasets of two parties, where the datasets contain lists of elements taken from a large domain. This problem has many applications for online collaboration. In this work, we present protocols based on the use of homomorphic encryption and different hashing schemes for both the semi-honest and malicious environments. The protocol for the semi-honest environment is secure in the standard model, while the protocol for the malicious environment is secure in the random oracle model. Our protocols obtain linear communication and computation overhead. We further implement different variants of our semi-honest protocol. Our experiments show that the asymptotic overhead of the protocol is affected by different constants. (In particular, the degree of the polynomials evaluated by the protocol matters less than the number of polynomials that are evaluated.) As a result, the protocol variant with the best asymptotic overhead is not necessarily preferable for inputs of reasonable size.

87 citations

Proceedings ArticleDOI
01 Nov 2019
TL;DR: Among the compared models, as expected, Recurrent Neural Network is best suited for the paraphrase identification task and it is proposed that Plagiarism detection is one of the areas where Paraphrase Identification can be effectively implemented.
Abstract: Paraphrase Identification, or Natural Language Sentence Matching (NLSM), is one of the important and challenging tasks in Natural Language Processing, where the task is to identify whether one sentence in a given pair is a paraphrase of the other. A paraphrase of a sentence conveys the same meaning, but its structure and word order vary. The task is challenging because it is difficult to infer the proper context of a sentence given its short length. Coming up with similarity metrics for the inferred context of a pair of sentences is also not straightforward. Its applications, however, are numerous. This work explores various machine learning algorithms to model the task and also applies different input encoding schemes. Specifically, we created the models using Logistic Regression, Support Vector Machines, and different architectures of Neural Networks. Among the compared models, as expected, the Recurrent Neural Network (RNN) is best suited for our paraphrase identification task. We also propose that plagiarism detection is one of the areas where Paraphrase Identification can be effectively implemented.

32 citations

Journal ArticleDOI
TL;DR: A hybrid algorithm that combines sampling, perturbation, and generalization to protect data privacy from composition attacks is proposed; experiments demonstrate that the proposed anonymization technique significantly reduces the risk of composition attacks while preserving good data utility.

26 citations

Proceedings ArticleDOI
01 Aug 2016
TL;DR: A diffusion strategy of the LMS type is derived to solve distributed inference problems in the case where agents are also interested in preserving the privacy of the local measurements.
Abstract: Distributed optimization makes it possible to address inference problems in a decentralized manner over networks, where agents can exchange information with their neighbors to improve their local estimates. Privacy preservation has become an important issue in many data mining applications. It aims at protecting the privacy of individual data in order to prevent the disclosure of sensitive information during the learning process. In this paper, we derive a diffusion strategy of the LMS type to solve distributed inference problems in the case where agents are also interested in preserving the privacy of the local measurements. We carry out a detailed mean and mean-square error analysis of the algorithm. Simulations are provided to check the theoretical findings.
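As background for the abstract above, a standard adapt-then-combine diffusion LMS iteration can be sketched as follows. This is the textbook, non-private form; the abstract does not describe the paper's privacy mechanism, so none is included here. The function name, the network layout, the step size, and the synthetic data model are all assumptions made for illustration.

```python
import random


def diffusion_lms(neighbors, weights, w_true, mu=0.05, steps=2000,
                  noise_std=0.0, seed=0):
    """Adapt-then-combine diffusion LMS on a fixed network (illustrative).

    neighbors[k]: agents (including k itself) whose estimates agent k combines.
    weights[k]:   combination weights aligned with neighbors[k], summing to 1.
    """
    rng = random.Random(seed)
    n_agents, dim = len(neighbors), len(w_true)
    w = [[0.0] * dim for _ in range(n_agents)]
    for _ in range(steps):
        # Adapt: each agent takes one LMS step on its own fresh measurement.
        psi = []
        for k in range(n_agents):
            u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            d = sum(ui * wi for ui, wi in zip(u, w_true)) + rng.gauss(0.0, noise_std)
            err = d - sum(ui * wi for ui, wi in zip(u, w[k]))
            psi.append([wi + mu * err * ui for wi, ui in zip(w[k], u)])
        # Combine: each agent averages the intermediate estimates of its neighbors.
        w = [
            [sum(a * psi[l][i] for a, l in zip(weights[k], neighbors[k]))
             for i in range(dim)]
            for k in range(n_agents)
        ]
    return w


# Hypothetical fully connected 3-agent network with uniform combination weights.
neighbors = [[0, 1, 2], [1, 0, 2], [2, 0, 1]]
weights = [[1 / 3] * 3 for _ in range(3)]
w_true = [1.0, -2.0]
estimates = diffusion_lms(neighbors, weights, w_true)
```

In this form the agents exchange their intermediate estimates in the combine step and keep their raw measurements local.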

13 citations