Proceedings ArticleDOI

A Privacy-Preserving Mining Algorithm of Association Rules in Distributed Databases

20 Apr 2006-Vol. 2, pp 746-750
TL;DR: This paper presents a secure algorithm for mining association rules that builds a global hash table to prune itemsets and incorporates cryptographic techniques to minimize the information shared.
Abstract: Association rule mining is one of the most important and fundamental problems in data mining. Recently, driven by security needs, a growing number of researchers have been studying privacy-preserving association rule mining in distributed databases. This paper presents a secure algorithm for mining association rules that builds a global hash table to prune itemsets and incorporates cryptographic techniques to minimize the information shared.
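
The abstract does not spell out how the hash table is built, so the following is only a minimal sketch of hash-based candidate pruning in general, not the paper's actual construction. It assumes each site hashes the k-item subsets of its own transactions into a fixed number of buckets and that the per-site tables are summed into a global table; the names NUM_BUCKETS, bucket, local_hash_table, global_hash_table and prune are hypothetical. In a privacy-preserving setting the summation would itself be done with a cryptographic protocol rather than in the clear.

```python
from itertools import combinations

NUM_BUCKETS = 7  # assumed table size, chosen only for illustration


def bucket(itemset):
    """Deterministic bucket index for a sorted tuple of item names (assumed hash)."""
    return sum(ord(ch) for item in itemset for ch in item) % NUM_BUCKETS


def local_hash_table(transactions, k):
    """Count how many k-item subsets of one site's transactions land in each bucket."""
    table = [0] * NUM_BUCKETS
    for t in transactions:
        for subset in combinations(sorted(t), k):
            table[bucket(subset)] += 1
    return table


def global_hash_table(local_tables):
    """Element-wise sum of the per-site tables (done in the clear in this sketch)."""
    return [sum(col) for col in zip(*local_tables)]


def prune(candidates, table, min_count):
    """Keep only candidates whose bucket count could still reach min_count."""
    return [c for c in candidates if table[bucket(c)] >= min_count]
```

Because a bucket count is an upper bound on the support of every itemset hashed into that bucket, dropping candidates whose bucket falls below the minimum count never discards a frequent itemset; collisions can only let some infrequent candidates through to the exact counting step.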
Citations
Journal ArticleDOI
TL;DR: This work proposes two protocols that securely generate global association rules in horizontally distributed databases; the second protocol incorporates Shamir’s secret sharing scheme, which preserves privacy against colluding sites and external adversaries.
Abstract: Distributed data mining has played a vital role in numerous application domains. However, it is widely observed that data mining may pose a privacy threat to individuals’ sensitive information. To address the privacy problem in distributed association rule mining (a data mining technique), we propose two protocols that securely generate global association rules in horizontally distributed databases. The first protocol uses an elliptic-curve-based Paillier cryptosystem, which helps achieve the integrity and authenticity of the messages exchanged among the participating sites over an insecure communication channel. It protects each site’s information against the other participating sites and an external adversary. However, the collusion of two sites may affect the privacy of individuals. To address this problem, we incorporate Shamir’s secret sharing scheme in the second protocol. It provides privacy against colluding sites and external adversary attacks. We analyse both protocols in terms of how well they fulfil the requirements of privacy-preserving distributed association rule mining.
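
As a rough illustration of why Shamir's secret sharing suits this setting, the sketch below sums per-site support counts without any site revealing its own count. It is not the protocol from the paper above: the field modulus PRIME, the function names and the use of Python's non-cryptographic random module are all assumptions made for the example (a real deployment would use a cryptographically secure source such as the secrets module). It needs Python 3.8 or later for the modular inverse via pow.

```python
import random

PRIME = 2_147_483_647  # field modulus 2^31 - 1, assumed larger than any total count


def make_shares(secret, threshold, num_shares, rng):
    """Evaluate a random polynomial of degree threshold-1 with constant term `secret`."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    return [
        (x, sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME)
        for x in range(1, num_shares + 1)
    ]


def reconstruct(shares):
    """Recover the constant term by Lagrange interpolation at x = 0."""
    total = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for m, (xm, _) in enumerate(shares):
            if m != j:
                num = num * (-xm) % PRIME
                den = den * (xj - xm) % PRIME
        total = (total + yj * num * pow(den, -1, PRIME)) % PRIME
    return total


def secure_sum(local_counts, threshold, rng=None):
    """Sum the sites' counts using only shares; no raw count is ever combined."""
    rng = rng or random.Random()  # NOT cryptographically secure; illustration only
    n = len(local_counts)
    all_shares = [make_shares(c, threshold, n, rng) for c in local_counts]
    # Site i receives the i-th share from every site and adds them locally.
    summed = [
        (i + 1, sum(shares[i][1] for shares in all_shares) % PRIME)
        for i in range(n)
    ]
    # Any `threshold` of the summed shares reconstruct the total count.
    return reconstruct(summed[:threshold])
```

The sketch relies on the fact that adding shares point-wise yields shares of the sum, so fewer than `threshold` cooperating sites learn nothing about an individual count from the shares they hold. For example, secure_sum([12, 7, 30], threshold=3) returns 49.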

14 citations


Cites background from "A Privacy-Preserving Mining Algorit..."

  • ...Liu et al. [21] have designed a privacy-preserving association rules mining algorithm in distributed environment, wherein a global hash table is built to prune candidate itemsets in early iteration of mining operation, which increases the efficiency of algorithm....


Posted Content
TL;DR: A new transaction randomization method is proposed that combines the fake transaction randomization method with a new per-transaction randomization method, ensuring a higher level of data privacy than previous approaches.
Abstract: Privacy preserving association rule mining has triggered the development of many privacy preserving data mining techniques. A large fraction of them use randomized data distortion techniques to mask the data for privacy preservation. This paper proposes a new transaction randomization method which is a combination of the fake transaction randomization method and a new per-transaction randomization method. This method distorts the items within each transaction and ensures a higher level of data privacy in comparison to the previous approaches. The per-transaction randomization method uses a randomization function to replace each item with a random number, which also guarantees privacy within the transaction. A tool has also been developed to implement the proposed approach and to mine frequent itemsets and association rules from the data while guaranteeing the anti-monotonic property.

12 citations


Additional excerpts

  • ...An algorithm for privacy preserving mining of association rules in distributed databases that builds a global hashing table Hi in every iteration, is proposed by Liu [18]....


  • ...Figure 7 shows the comparison between the new approach and the approach of Lin and Liu....


Proceedings ArticleDOI
26 Jul 2012
TL;DR: This paper proposes an algorithm that mines association rules over horizontally partitioned data using elliptic curve cryptography; it provides security against the participating parties and intruders, and authentication between the participating parties.
Abstract: In this paper, we propose an algorithm to mine association rules using elliptic curve cryptography over horizontally partitioned data. Here we consider an unsecured distributed environment. Our proposed algorithm provides security against the participating parties and intruders, and also provides authentication between the participating parties. Finally, we analyze the privacy and security provided by our proposed algorithm.

10 citations


Cites background from "A Privacy-Preserving Mining Algorit..."

  • ...[5] In terms of association rule mining, the rule XY ⇒ Z will be discovered as long as it satisfies these conditions: $\sum_{i=1}^{sites} \mathrm{SupportCount}_{XYZ}(i)$....


Book ChapterDOI
01 Jan 2016
TL;DR: A security protocol that derives association rules securely from horizontally distributed databases without a trusted third party (TTP) and ensures the owners’ privacy and security with the help of elliptic-curve-based Diffie-Hellman and the Digital Signature Algorithm.
Abstract: In this research work, we design a security protocol that derives association rules securely from horizontally distributed databases without a trusted third party (TTP), even when the communication channel between the participating sites is unsecured. It ensures the owners’ privacy and security with the help of elliptic-curve-based Diffie-Hellman and the Digital Signature Algorithm.

6 citations

DOI
04 Oct 2010
TL;DR: An elliptic curve cryptography based algorithm is proposed to mine privacy-preserving association rules on horizontally partitioned data; it provides privacy and security against the participating parties and against other parties (adversaries) who could learn information by reading the unsecured channel between them.
Abstract: Distributed data mining techniques are often used for various applications. Recent investigations of their privacy and security have concluded that these techniques reveal data or information to the other parties involved in order to find globally valid results. Because of privacy concerns, however, the participating parties do not want to reveal such data. Recently, many cryptographic techniques have been proposed to address privacy problems in distributed mining. In this paper, we propose an elliptic curve cryptography based algorithm to mine privacy-preserving association rules on horizontally partitioned data. Moreover, we also consider unsecured communication channels in the distributed environment. The proposed algorithm provides privacy and security against the participating parties and against other parties (adversaries) who could learn information by reading the unsecured channel between them. Finally, we analyze the privacy and security provided by the proposed algorithm and also discuss its communication and computation cost. Keywords: Data Mining; Association Rules; Distributed Databases; Privacy; Security; Elliptic Curve Cryptography

5 citations


Cites background from "A Privacy-Preserving Mining Algorit..."

  • ...[15] proposed an efficient and secure algorithm for mining privacy preserved association rules in distributed environment, which uses hashing of candidate itemsets and filter out the itemset whose support exceeds the minimum thresholds....


References
Proceedings ArticleDOI
01 Jun 1993
TL;DR: An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.
Abstract: We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which shows the effectiveness of the algorithm.

15,645 citations


"A Privacy-Preserving Mining Algorit..." refers methods in this paper

  • ...CD [10] algorithm, which is an adaptation of the Apriori’s, has a simple communication scheme for count exchange, but it has problems of higher number of candidate sets and larger amount of communication....

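
To make the excerpt's point about CD concrete, here is a minimal sketch of one Count Distribution pass, assuming every site holds the same candidate set, counts it against its local transactions, and the counts are then summed. The function names and the in-process "exchange" are hypothetical simplifications; a real deployment would send the counts over the network, and a privacy-preserving variant would protect them.

```python
from collections import Counter


def local_counts(transactions, candidates):
    """One site's support counts for the shared candidate set."""
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if set(c) <= t:
                counts[c] += 1
    return counts


def count_distribution_pass(sites, candidates, min_count):
    """One CD iteration: count locally, sum the exchanged counts, keep frequent sets."""
    global_counts = Counter()
    for transactions in sites:
        # Stands in for the count exchange between sites.
        global_counts.update(local_counts(transactions, candidates))
    return {c: n for c, n in global_counts.items() if n >= min_count}
```

Every site counts the full candidate set and sends one count per candidate in each pass, which is where the larger candidate sets and communication volume mentioned in the excerpt come from.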

Proceedings Article
01 Jul 1998
TL;DR: Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.
Abstract: We consider the problem of discovering association rules between items in a large database of sales transactions. We present two new algorithms for solving this problem that are fundamentally different from the known algorithms. Empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems. We also show how the best features of the two proposed algorithms can be combined into a hybrid algorithm, called AprioriHybrid. Scale-up experiments show that AprioriHybrid scales linearly with the number of transactions. AprioriHybrid also has excellent scale-up properties with respect to the transaction size and the number of items in the database.
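
The abstract above does not describe the algorithms' internals. As background, the sketch below shows the candidate-generation idea commonly associated with Apriori: join frequent (k-1)-itemsets that share a prefix, then prune any candidate that has an infrequent (k-1)-subset. The function name apriori_gen and the representation of itemsets as sorted tuples are assumptions made for this example.

```python
from itertools import combinations


def apriori_gen(frequent_prev):
    """Build size-k candidates from frequent (k-1)-itemsets given as sorted tuples."""
    prev = set(frequent_prev)
    candidates = set()
    for a in prev:
        for b in prev:
            # Join step: same prefix, with a's last item ordered before b's.
            if a[:-1] == b[:-1] and a[-1] < b[-1]:
                cand = a + (b[-1],)
                # Prune step: every (k-1)-subset must itself be frequent.
                if all(s in prev for s in combinations(cand, len(cand) - 1)):
                    candidates.add(cand)
    return sorted(candidates)
```

Given the frequent pairs (a, b), (a, c), (b, c) and (b, d), the join step proposes (a, b, c) and (b, c, d), and the prune step drops (b, c, d) because (c, d) is not frequent, leaving only (a, b, c).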

10,863 citations

Journal ArticleDOI
TL;DR: An efficient algorithm is presented that generates all significant association rules between items in a large database of customer transactions, where each transaction consists of the items purchased by a customer in a visit.
Abstract: We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant assoc...

3,198 citations

Journal ArticleDOI
TL;DR: In this paper, a survey of the available data mining techniques is provided and a comparative study of such techniques is presented, based on a database researcher's point-of-view.
Abstract: Mining information and knowledge from large databases has been recognized by many researchers as a key research topic in database systems and machine learning, and by many industrial companies as an important area with an opportunity of major revenues. Researchers in many different fields have shown great interest in data mining. Several emerging applications in information-providing services, such as data warehousing and online services over the Internet, also call for various data mining techniques to better understand user behavior, to improve the service provided and to increase business opportunities. In response to such a demand, this article provides a survey, from a database researcher's point of view, on the data mining techniques developed recently. A classification of the available data mining techniques is provided and a comparative study of such techniques is presented.

2,327 citations

Proceedings Article
11 Sep 1995
TL;DR: This paper presents an efficient algorithm for mining association rules that is fundamentally different from known algorithms and not only reduces the I/O overhead significantly but also has lower CPU overhead for most cases.
Abstract: Mining for association rules between items in a large database of sales transactions has been described as an important database mining problem. In this paper we present an efficient algorithm for mining association rules that is fundamentally different from known algorithms. Compared to previous algorithms, our algorithm not only reduces the I/O overhead significantly but also has lower CPU overhead for most cases. We have performed extensive experiments and compared the performance of our algorithm with one of the best existing algorithms. It was found that for large databases, the CPU overhead was reduced by as much as a factor of four and I/O was reduced by almost an order of magnitude. Hence this algorithm is especially suitable for very large size databases.

1,822 citations