Privacy-Preserving Data Mining in Spatiotemporal Databases Based on Mining Negative Association Rules

doi:10.1007/978-981-15-0135-7_32

Book Chapter•DOI•

K. S. Ranjith¹, A. Geetha Mary¹

VIT University¹

01 Jan 2020-pp 329-339

TL;DR: The authors present a mathematical calculation and conclude that their approach, which combines negative association rule mining with cryptography, is the best for mining association rules in spatiotemporal databases at low storage and communication cost.

Abstract: Most real-world entities are associated with space and time, moving from a starting point to an end point in space. Conventional data mining has therefore been extended to extract knowledge from spatiotemporal databases. A central task is mining association rules in these databases, for which traditional approaches are not sufficient, and privacy is the main concern while mining them. This paper proposes a privacy-preserving data mining technique for spatiotemporal databases based on mining negative association rules and cryptography, with low storage and communication cost. In the proposed approach, the partial support is first calculated at each of the distributed sites, and the actual support is then calculated to achieve privacy-preserving data mining. The authors present a mathematical calculation and conclude that this approach is the best for mining association rules in spatiotemporal databases.
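
The two-stage computation described in the abstract (partial support per site, then actual support) can be illustrated with a minimal Python sketch. The abstract does not describe the cryptographic protocol, so the masked running sum below is a common stand-in and not the authors' scheme; the names `local_count`, `masked_sum`, `global_support`, `negative_rule_stats`, the constant `MODULUS`, and the sample data are all hypothetical. The sketch relies only on the standard identity supp(A ∪ ¬B) = supp(A) − supp(A ∪ B) for a negative rule A ⇒ ¬B.

```python
import random

MODULUS = 2**61 - 1  # assumed ring size, larger than any possible total count


def local_count(transactions, itemset):
    """Partial support count: transactions at one site containing every item."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))


def masked_sum(local_values, rng=None):
    """Stand-in secure sum (assumption, not the paper's protocol): the initiator
    adds a random mask, each site adds its value, the initiator removes the mask."""
    rng = rng or random.Random()
    mask = rng.randrange(MODULUS)
    running = mask
    for value in local_values:
        running = (running + value) % MODULUS
    return (running - mask) % MODULUS


def global_support(sites, itemset, rng=None):
    """Actual support: summed partial counts divided by total transactions."""
    counts = [local_count(site, itemset) for site in sites]
    sizes = [len(site) for site in sites]
    return masked_sum(counts, rng) / masked_sum(sizes, rng)


def negative_rule_stats(sites, antecedent, consequent, rng=None):
    """Support and confidence of the negative rule antecedent => NOT consequent."""
    supp_a = global_support(sites, antecedent, rng)
    supp_ab = global_support(sites, set(antecedent) | set(consequent), rng)
    supp_neg = supp_a - supp_ab
    conf_neg = supp_neg / supp_a if supp_a else 0.0
    return supp_neg, conf_neg


# Hypothetical data: two sites, three transactions each.
sites = [
    [{"a", "b"}, {"a"}, {"c"}],
    [{"a"}, {"a", "c"}, {"b"}],
]
supp_neg, conf_neg = negative_rule_stats(sites, {"a"}, {"b"})
```

In this toy data, item `a` appears in four of six transactions and `a` together with `b` in one, so the rule a ⇒ ¬b has support of about 0.5 and confidence of about 0.75. Only the masked totals cross site boundaries in this sketch; a real deployment would need a vetted protocol.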

Citations

Journal Article•DOI•

Privacy Preserving Data Mining Framework for Negative Association Rules: An Application to Healthcare Informatics

Saad M. Darwish, Reham Moustafa Essa, Mohamed A. Osman, Ahmed A. Ismail

01 Jan 2022-IEEE Access

TL;DR: Experiments with benchmark healthcare datasets show that the suggested privacy preserving data mining (PPDM) method outperforms existing algorithms in terms of Hiding Failure (HF), Artificial Rule Generation (AR), and Lost Rules (LR).

Abstract: Protecting the privacy of healthcare information is an important part of encouraging data custodians to give accurate records so that mining may proceed with confidence. The application of association rule mining in healthcare data has been widespread to this point in time. Most applications focus on positive association rules, ignoring the negative consequences of particular diagnostic techniques. When it comes to bridging divergent diseases and drugs, negative association rules may give more helpful information than positive ones. This is especially true when it comes to physicians and social organizations (e.g., a certain symptom will not arise when certain symptoms exist). Data mining in healthcare must be done in a way that protects the identity of patients, especially when dealing with sensitive information. However, revealing this information puts it at risk of attack. Healthcare data privacy protection has lately been addressed by technologies that disrupt data (data sanitization) and reconstruct aggregate distributions in the interest of doing research in data mining. In this study, metaheuristic-based data sanitization for healthcare data mining is investigated in order to keep patient privacy protected. It is hoped that by using the Tabu-genetic algorithm as an optimization tool, the suggested technique chooses item sets to be sanitized (modified) from transactions that satisfy sensitive negative criteria with the goal of minimizing changes to the original database. Experiments with benchmark healthcare datasets show that the suggested privacy preserving data mining (PPDM) method outperforms existing algorithms in terms of Hiding Failure (HF), Artificial Rule Generation (AR), and Lost Rules (LR).
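
The three side-effect measures named in this abstract are not defined there. The Python sketch below uses definitions that are common in the rule-hiding literature, and these are an assumption here: Hiding Failure is the fraction of sensitive rules still minable after sanitization, Lost Rules is the fraction of non-sensitive rules that disappear, and Artificial Rules is the fraction of rules in the sanitized result that were not in the original. The function name and the rule labels are hypothetical.

```python
def sanitization_side_effects(original_rules, sanitized_rules, sensitive_rules):
    """Return (HF, LR, AR) under the assumed set-based definitions."""
    original = set(original_rules)
    sanitized = set(sanitized_rules)
    sensitive = set(sensitive_rules) & original
    non_sensitive = original - sensitive

    # Hiding Failure: sensitive rules that survive sanitization.
    hf = len(sensitive & sanitized) / len(sensitive) if sensitive else 0.0
    # Lost Rules: non-sensitive rules no longer found after sanitization.
    lr = len(non_sensitive - sanitized) / len(non_sensitive) if non_sensitive else 0.0
    # Artificial Rules: rules that appear only after sanitization.
    ar = len(sanitized - original) / len(sanitized) if sanitized else 0.0
    return hf, lr, ar


# Hypothetical rule labels for illustration only.
hf, lr, ar = sanitization_side_effects(
    original_rules={"r1", "r2", "r3", "r4"},
    sanitized_rules={"r2", "r3", "r5"},
    sensitive_rules={"r1", "r2"},
)
```

In this toy case one of two sensitive rules survives (HF = 0.5), one of two non-sensitive rules is lost (LR = 0.5), and one of three mined rules is artificial (AR ≈ 0.33). Lower is better for all three.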

10 citations

Journal Article•DOI•

An Adaptive Privacy Preserving Framework for Distributed Association Rule Mining in Healthcare Databases

Hasanien K. Kuba, Mustafa A. Azzawi, Saad M. Darwish, Oday A. Hassen, Ansam A. Abdulhussein, and 1 more

01 Jan 2023-CMC-Computers, Materials & Continua

TL;DR: In this paper, the Tabu-genetic optimization paradigm is used for privacy-respecting negative association rule mining in vertically partitioned healthcare datasets, and the approach dynamically determines the transactions to be interrupted for information hiding instead of predefining them.

Abstract: It is crucial, while using healthcare data, to assess the advantages of data privacy against the possible drawbacks. Data from several sources must be combined for use in many data mining applications. The medical practitioner may use the results of association rule mining performed on this aggregated data to better personalize patient care and implement preventive measures. Historically, numerous heuristics (e.g., greedy search) and metaheuristics-based techniques (e.g., evolutionary algorithm) have been created for the positive association rule in privacy preserving data mining (PPDM). When it comes to connecting seemingly unrelated diseases and drugs, negative association rules may be more informative than their positive counterparts. It is well-known that during negative association rules mining, a large number of uninteresting rules are formed, making this a difficult problem to tackle. In this research, we offer an adaptive method for negative association rule mining in vertically partitioned healthcare datasets that respects users’ privacy. The applied approach dynamically determines the transactions to be interrupted for information hiding, as opposed to predefining them. This study introduces a novel method for addressing the problem of negative association rules in healthcare data mining, one that is based on the Tabu-genetic optimization paradigm. Tabu search is advantageous since it removes a huge number of unnecessary rules and item sets. Experiments using benchmark healthcare datasets prove that the discussed scheme outperforms state-of-the-art solutions in terms of decreasing side effects and data distortions, as measured by the indicator of hiding failure.

References

Journal Article•DOI•

Efficient mining of association rules in distributed databases

David W. Cheung¹, Vincent To Yee Ng², Ada Wai-Chee Fu³, Yongjian Fu⁴

University of Hong Kong¹, Hong Kong Polytechnic University², The Chinese University of Hong Kong³, Simon Fraser University⁴

01 Dec 1996-IEEE Transactions on Knowledge and Data Engineering

TL;DR: Proposes DMA (Distributed Mining of Association rules), an efficient algorithm for distributed databases that generates a small number of candidate sets and requires only O(n) messages for support-count exchange per candidate set.

Abstract: Many sequential algorithms have been proposed for the mining of association rules. However, very little work has been done in mining association rules in distributed databases. A direct application of sequential algorithms to distributed databases is not effective, because it requires a large amount of communication overhead. In this study, an efficient algorithm called DMA (Distributed Mining of Association rules), is proposed. It generates a small number of candidate sets and requires only O(n) messages for support-count exchange for each candidate set, where n is the number of sites in a distributed database. The algorithm has been implemented on an experimental testbed, and its performance is studied. The results show that DMA has superior performance, when compared with the direct application of a popular sequential algorithm, in distributed databases.
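
To make the O(n) message bound concrete, the Python sketch below compares two ways of exchanging support counts for a single candidate set among n sites. It is an illustration of the message counts only, not an implementation of DMA: the function names are hypothetical, and the assumption is that one designated site collects the other sites' counts and returns the total, whereas the naive alternative has every site send its count to every other site.

```python
def broadcast_exchange(local_counts):
    """Naive exchange: every site sends its count to every other site."""
    n = len(local_counts)
    messages = 0
    for sender in range(n):
        for receiver in range(n):
            if sender != receiver:
                messages += 1  # one count message per ordered pair
    return sum(local_counts), messages


def polling_exchange(local_counts):
    """Coordinator-style exchange (assumed): one site gathers the other
    sites' counts and sends the global count back to each of them."""
    n = len(local_counts)
    total = local_counts[0]  # the coordinator's own count needs no message
    messages = 0
    for count in local_counts[1:]:
        total += count
        messages += 1  # remote site -> coordinator
    messages += n - 1  # coordinator -> each remote site with the total
    return total, messages


# Hypothetical local support counts for one candidate set at four sites.
counts = [3, 5, 2, 7]
naive_total, naive_messages = broadcast_exchange(counts)
polled_total, polled_messages = polling_exchange(counts)
```

With four sites, both functions return the same global count of 17, but the broadcast uses 12 messages (n(n − 1)) and the coordinator-style exchange uses 6 (2(n − 1)), which grows linearly in n.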

365 citations

Journal Article•DOI•

Survey of Spatio-Temporal Databases

Tamas Abraham¹, John F. Roddick¹

University of South Australia¹

01 Mar 1999-Geoinformatica

TL;DR: An overview of previous achievements within the spatio-temporal data field is provided and areas currently receiving or requiring further investigation are highlighted.

Abstract: Spatio-temporal databases aim to support extensions to existing models of Spatial Information Systems (SIS) to include time in order to better describe our dynamic environment. Although interest in this area has increased in the past decade, a number of important issues remain to be investigated. With the advances made in temporal database research, we can expect a more unified approach towards aspatial temporal data in SIS and a wider discussion on spatio-temporal data models. This paper provides an overview of previous achievements within the field and highlights areas currently receiving or requiring further investigation.

241 citations

Journal Article•DOI•

A Framework for Evaluating Privacy Preserving Data Mining Algorithms

Elisa Bertino¹, Igor Nai Fovino², Loredana Parasiliti Provenza²

Purdue University¹, University of Milan²

01 Sep 2005-Data Mining and Knowledge Discovery

TL;DR: Presents a first evaluation framework for estimating and comparing different kinds of PPDM algorithms, applies its criteria to a specific set of algorithms, and discusses the evaluation results.

Abstract: Recently, a new class of data mining methods, known as privacy preserving data mining (PPDM) algorithms, has been developed by the research community working on security and knowledge discovery. The aim of these algorithms is to extract relevant knowledge from large amounts of data while protecting sensitive information. Several data mining techniques incorporating privacy protection mechanisms have been developed that allow one to hide sensitive itemsets or patterns before the data mining process is executed. Privacy preserving classification methods, instead, prevent a miner from building a classifier that is able to predict sensitive data. Additionally, privacy preserving clustering techniques have recently been proposed, which distort sensitive numerical attributes while preserving general features for clustering analysis. A crucial issue is to determine which of these privacy-preserving techniques better protect sensitive information. However, this is not the only criterion with respect to which these algorithms can be evaluated. It is also important to assess the quality of the data resulting from the modifications applied by each algorithm, as well as the performance of the algorithms. There is thus a need to identify a comprehensive set of criteria with which to assess the existing PPDM algorithms and determine which algorithm meets specific requirements. In this paper, we present a first evaluation framework for estimating and comparing different kinds of PPDM algorithms. Then, we apply our criteria to a specific set of algorithms and discuss the evaluation results we obtain. Finally, some considerations about future work and promising directions in the context of privacy preservation in data mining are discussed.

203 citations

Proceedings Article•DOI•

A Survey on Privacy Preserving Data Mining

Jian Wang¹, Yongcheng Luo¹, Yan Zhao¹, Jiajin Le¹

Donghua University¹

25 Apr 2009

TL;DR: This paper intends to reiterate several privacy preserving data mining technologies clearly and then proceeds to analyze the merits and shortcomings of these technologies.

Abstract: Privacy preservation has become an important issue in the development of data mining techniques. Privacy preserving data mining has become increasingly popular because it allows sharing of privacy-sensitive data for analysis purposes. At the same time, people have become increasingly unwilling to share their data, frequently resulting in individuals either refusing to share their data or providing incorrect data. In turn, such problems in data collection can affect the success of data mining, which relies on sufficient amounts of accurate data in order to produce meaningful results. In recent years, the wide availability of personal data has made the problem of privacy preserving data mining an important one. A number of methods have recently been proposed for privacy preserving data mining of multidimensional data records. This paper intends to reiterate several privacy preserving data mining technologies clearly and then proceeds to analyze the merits and shortcomings of these technologies.

65 citations

Journal Article•DOI•

Mining spatio-temporal data

Gennady Andrienko, Donato Malerba¹, Michael May, Maguelonne Teisseire²

University of Bari¹, University of Montpellier²

01 Nov 2006

TL;DR: Despite much formalization of space and time relations available in spatio-temporal reasoning, the extraction of spatial/temporal relations implicitly defined in the data introduces some degree of fuzziness that may have a large impact on the results of the data mining process.

Abstract: Both the temporal and spatial dimensions add substantial complexity to data mining tasks. First of all, the spatial relations, both metric (such as distance) and non-metric (such as topology, direction, shape, etc.), and the temporal relations (such as before and after) are information bearing and therefore need to be considered in the data mining methods. Secondly, some spatial and temporal relations are implicitly defined, that is, they are not explicitly encoded in a database. These relations must be extracted from the data, and there is a trade-off between precomputing them before the actual mining process starts (eager approach) and computing them on-the-fly when they are actually needed (lazy approach). Moreover, despite much formalization of space and time relations available in spatio-temporal reasoning, the extraction of spatial/temporal relations implicitly defined in the data introduces some degree of fuzziness that may have a large impact on the results of the data mining process.

J Intell Inf Syst (2006) 27: 187–190, DOI 10.1007/s10844-006-9949-3

48 citations