
Showing papers on "Information privacy" published in 2008


Journal ArticleDOI
TL;DR: The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes that is stable, extensible, and freely available to all researchers.
Abstract: Random community genomes (metagenomes) are now commonly used to study microbes in different environments. Over the past few years, the major challenge associated with metagenomics shifted from generating to analyzing sequences. High-throughput, low-cost next-generation sequencing has provided access to metagenomics to a wide range of researchers. A high-throughput pipeline has been constructed to provide high-performance computing to all researchers interested in using metagenomics. The pipeline produces automated functional assignments of sequences in the metagenome by comparing both protein and nucleotide databases. Phylogenetic and functional summaries of the metagenomes are generated, and tools for comparative metagenomics are incorporated into the standard views. User access is controlled to ensure data privacy, but the collaborative environment underpinning the service provides a framework for sharing datasets between multiple users. In the metagenomics RAST, all users retain full control of their data, and everything is available for download in a variety of formats. The open-source metagenomics RAST service provides a new paradigm for the annotation and analysis of metagenomes. With built-in support for multiple data sources and a back end that houses abstract data types, the metagenomics RAST is stable, extensible, and freely available to all researchers. This service has removed one of the primary bottlenecks in metagenome sequence analysis – the availability of high-performance computing for annotating the data. http://metagenomics.nmpdr.org

3,322 citations


Journal ArticleDOI
TL;DR: A scalable architecture for protecting location privacy from the various threats that result from uncontrolled usage of LBSs is described, including the development of a personalized location anonymization model and a suite of location perturbation algorithms.
Abstract: Continued advances in mobile networks and positioning technologies have created a strong market push for location-based applications. Examples include location-aware emergency response, location-based advertisement, and location-based entertainment. An important challenge in the wide deployment of location-based services (LBSs) is the privacy-aware management of location information, providing safeguards for location privacy of mobile clients against vulnerabilities for abuse. This paper describes a scalable architecture for protecting location privacy from the various threats that result from uncontrolled usage of LBSs. This architecture includes the development of a personalized location anonymization model and a suite of location perturbation algorithms. A unique characteristic of our location privacy architecture is the use of a flexible privacy personalization framework to support location k-anonymity for a wide range of mobile clients with context-sensitive privacy requirements. This framework enables each mobile client to specify the minimum level of anonymity that it desires and the maximum temporal and spatial tolerances that it is willing to accept when requesting k-anonymity-preserving LBSs. We devise an efficient message perturbation engine to implement the proposed location privacy framework. The prototype that we develop is designed to be run by the anonymity server on a trusted platform and performs location anonymization on LBS request messages of mobile clients such as identity removal and spatio-temporal cloaking of the location information. We study the effectiveness of our location cloaking algorithms under various conditions by using realistic location data that is synthetically generated from real road maps and traffic volume data. Our experiments show that the personalized location k-anonymity model, together with our location perturbation engine, can achieve high resilience to location privacy threats without introducing any significant performance penalty.

883 citations
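
The cloaking step can be pictured with a minimal sketch (Python; an illustration of spatial cloaking in general, not the paper's personalized cloaking algorithms, whose parameters and data structures differ): grow a region around the requester until it contains at least k users, giving up once the requester's maximum spatial tolerance is exceeded.

```python
import random

def cloak_location(requester, others, k, max_radius, step=0.01):
    """Grow a square region around the requester until it covers at least
    k users (including the requester) or the spatial tolerance is exceeded.
    Returns the cloaked bounding box, or None if k-anonymity is not achievable."""
    x, y = requester
    r = step
    while r <= max_radius:
        box = (x - r, y - r, x + r, y + r)
        inside = [p for p in others
                  if box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]]
        if len(inside) + 1 >= k:       # the requester counts toward k
            return box
        r += step
    return None                        # tolerance exceeded: drop the request

# Toy usage: 200 random users; the requester asks for 5-anonymity within 0.1 units.
random.seed(0)
users = [(random.random(), random.random()) for _ in range(200)]
print(cloak_location((0.5, 0.5), users, k=5, max_radius=0.1))
```

The returned box, rather than the exact coordinates, is what would be forwarded to the LBS; a None result corresponds to a request that cannot be served within the client's stated tolerances.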


Proceedings ArticleDOI
07 Apr 2008
TL;DR: A practical solution to battle neighborhood attacks is presented, and the empirical study indicates that anonymized social networks generated by the method can still be used to answer aggregate network queries with high accuracy.
Abstract: Recently, as more and more social network data has been published in one way or another, preserving privacy in publishing social network data becomes an important concern. With some local knowledge about individuals in a social network, an adversary may attack the privacy of some victims easily. Unfortunately, most of the previous studies on privacy preservation can deal with relational data only, and cannot be applied to social network data. In this paper, we take an initiative towards preserving privacy in social network data. We identify an essential type of privacy attacks: neighborhood attacks. If an adversary has some knowledge about the neighbors of a target victim and the relationship among the neighbors, the victim may be re-identified from a social network even if the victim's identity is preserved using the conventional anonymization techniques. We show that the problem is challenging, and present a practical solution to battle neighborhood attacks. The empirical study indicates that anonymized social networks generated by our method can still be used to answer aggregate network queries with high accuracy.

764 citations
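
Why a neighborhood acts as a quasi-identifier can be seen in a small sketch (Python; an illustration of the attack model, not the paper's anonymization method): even after identities are removed, the structure of a victim's 1-neighborhood, summarized here as the sorted neighbor degrees plus the number of edges among the neighbors, may match only a single vertex in the published graph.

```python
from collections import defaultdict

def neighborhood_signature(adj, v):
    """Structural signature of v's 1-neighborhood: sorted neighbor degrees
    plus the number of edges among the neighbors themselves."""
    nbrs = adj[v]
    degrees = sorted(len(adj[u]) for u in nbrs)
    internal = sum(1 for u in nbrs for w in nbrs if u < w and w in adj[u])
    return (tuple(degrees), internal)

def reidentification_candidates(adj, signature):
    """Vertices in the anonymized graph consistent with the adversary's
    background knowledge (the neighborhood signature)."""
    return [v for v in adj if neighborhood_signature(adj, v) == signature]

# Toy anonymized graph: symmetric adjacency sets keyed by anonymized labels.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (4, 6), (6, 7)]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

# An adversary who knows the victim's neighborhood checks how many anonymized
# nodes are consistent with it; a single candidate means re-identification.
target_sig = neighborhood_signature(adj, 3)
print(reidentification_candidates(adj, target_sig))   # only node 3 matches
```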


Proceedings ArticleDOI
13 Apr 2008
TL;DR: An efficient conditional privacy preservation protocol in vehicular ad hoc networks (VANETs) is introduced to address the issue of anonymous authentication for safety messages with authority traceability, and can provide fast anonymous authentication and privacy tracking while minimizing the required storage for short-time anonymous keys.
Abstract: In this paper, we introduce an efficient conditional privacy preservation (ECPP) protocol in vehicular ad hoc networks (VANETs) to address the issue of anonymous authentication for safety messages with authority traceability. The proposed protocol is characterized by the generation of on-the-fly short-time anonymous keys between on-board units (OBUs) and roadside units (RSUs), which can provide fast anonymous authentication and privacy tracking while minimizing the required storage for short-time anonymous keys. We demonstrate the merits gained by the proposed protocol through extensive analysis.

698 citations


Journal ArticleDOI
TL;DR: The results indicate that some discernible patterns emerge in the relationships between the antecedents and the three groups of IPPR, which could enable researchers to analyze a variety of behavioral responses to information privacy threats in a fairly systematic manner.
Abstract: Although Internet users are expected to respond in various ways to privacy threats from online companies, little attention has been paid so far to the complex nature of how users respond to these threats. This paper has two specific goals in its effort to fill this gap in the literature. The first, so that these outcomes can be systematically investigated, is to develop a taxonomy of information privacy-protective responses (IPPR). This taxonomy consists of six types of behavioral responses (refusal, misrepresentation, removal, negative word-of-mouth, complaining directly to online companies, and complaining indirectly to third-party organizations) that are classified into three categories: information provision, private action, and public action. Our second goal is to develop a nomological model of IPPR with several salient antecedents (concerns for information privacy, perceived justice, and societal benefits from complaining), and to show how the antecedents differentially affect the six types of IPPR. The nomological model is tested with data collected from 523 Internet users. The results indicate that some discernible patterns emerge in the relationships between the antecedents and the three groups of IPPR. These patterns enable researchers to better understand why a certain type of IPPR is similar to or distinct from other types of IPPR. Such an understanding could enable researchers to analyze a variety of behavioral responses to information privacy threats in a fairly systematic manner. Overall, this paper contributes to researchers' theory-building efforts in the area of information privacy by breaking new ground for the study of individuals' responses to information privacy threats.

540 citations


Proceedings ArticleDOI
17 May 2008
TL;DR: In this paper, a new notion of data privacy, called distributional privacy, which is strictly stronger than the prevailing privacy notion, differential privacy, is introduced, and a new lower bound for releasing databases that are useful for halfspace queries over a continuous domain is shown.
Abstract: We demonstrate that, ignoring computational constraints, it is possible to release privacy-preserving databases that are useful for all queries over a discretized domain from any given concept class with polynomial VC-dimension. We show a new lower bound for releasing databases that are useful for halfspace queries over a continuous domain. Despite this, we give a privacy-preserving polynomial time algorithm that releases information useful for all halfspace queries, for a slightly relaxed definition of usefulness. Inspired by learning theory, we introduce a new notion of data privacy, which we call distributional privacy, and show that it is strictly stronger than the prevailing privacy notion, differential privacy.

516 citations
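
For readers unfamiliar with the baseline notion being strengthened, a minimal sketch of epsilon-differential privacy via the Laplace mechanism for a counting query follows (Python). This is standard background, not the paper's release algorithm, which outputs a synthetic database useful for a whole concept class.

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling from a Laplace(0, scale) distribution.
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def private_count(records, predicate, epsilon):
    """Release a counting query answer with epsilon-differential privacy.
    A counting query has sensitivity 1 (adding or removing one record changes
    the true count by at most 1), so Laplace noise with scale 1/epsilon suffices."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Toy usage: how many records have age >= 40, released with epsilon = 0.5.
db = [{"age": a} for a in (23, 35, 41, 52, 67, 29, 44)]
print(private_count(db, lambda r: r["age"] >= 40, epsilon=0.5))
```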


Journal ArticleDOI
TL;DR: This essay examines the privacy concerns voiced following the September 2006 launch of the 'News Feeds' feature and concludes that the 'privacy trainwreck' that people experienced was the cost of social convergence.
Abstract: Not all Facebook users appreciated the September 2006 launch of the 'News Feeds' feature. Concerned about privacy implications, thousands of users vocalized their discontent through the site itself, forcing the company to implement privacy tools. This essay examines the privacy concerns voiced following these events. Because the data made easily visible were already accessible with effort, what disturbed people was primarily the sense of exposure and invasion. In essence, the 'privacy trainwreck' that people experienced was the cost of social convergence.

464 citations


Journal ArticleDOI
TL;DR: AlarmNet is presented, a novel system for assisted living and residential monitoring that uses a two-way flow of data and analysis between the front- and back-ends to enable context-aware protocols that are tailored to residents' individual patterns of living.
Abstract: Improving the quality of healthcare and the prospects of "aging in place" using wireless sensor technology requires solving difficult problems in scale, energy management, data access, security, and privacy. We present AlarmNet, a novel system for assisted living and residential monitoring that uses a two-way flow of data and analysis between the front- and back-ends to enable context-aware protocols that are tailored to residents' individual patterns of living. AlarmNet integrates environmental, physiological, and activity sensors in a scalable heterogeneous architecture. The SenQ query protocol provides real-time access to data and lightweight in-network processing. Circadian activity rhythm analysis learns resident activity patterns and feeds them back into the network to aid context-aware power management and dynamic privacy policies.

439 citations


Journal ArticleDOI
TL;DR: The new challenges in privacy preserving publishing of social network data compared to the extensively studied relational case are identified, and the possible problem formulations in three important dimensions (privacy, background knowledge, and data utility) are examined.
Abstract: Nowadays, partly driven by many Web 2.0 applications, more and more social network data has been made publicly available and analyzed in one way or another. Privacy preserving publishing of social network data becomes a more and more important concern. In this paper, we present a brief yet systematic review of the existing anonymization techniques for privacy preserving publishing of social network data. We identify the new challenges in privacy preserving publishing of social network data compared to the extensively studied relational case, and examine the possible problem formulation in three important dimensions: privacy, background knowledge, and data utility. We survey the existing anonymization methods for privacy preservation in two categories: clustering-based approaches and graph modification approaches.

427 citations


Proceedings ArticleDOI
24 Aug 2008
TL;DR: This paper investigates composition attacks, in which an adversary uses independent anonymized releases to breach privacy; it also gives a precise formulation of resistance to such attacks and proves that an important class of relaxations of differential privacy satisfies this property.
Abstract: Privacy is an increasingly important aspect of data publishing. Reasoning about privacy, however, is fraught with pitfalls. One of the most significant is the auxiliary information (also called external knowledge, background knowledge, or side information) that an adversary gleans from other channels such as the web, public records, or domain knowledge. This paper explores how one can reason about privacy in the face of rich, realistic sources of auxiliary information. Specifically, we investigate the effectiveness of current anonymization schemes in preserving privacy when multiple organizations independently release anonymized data about overlapping populations. 1. We investigate composition attacks, in which an adversary uses independent anonymized releases to breach privacy. We explain why recently proposed models of limited auxiliary information fail to capture composition attacks. Our experiments demonstrate that even a simple instance of a composition attack can breach privacy in practice, for a large class of currently proposed techniques. The class includes k-anonymity and several recent variants. 2. On a more positive note, certain randomization-based notions of privacy (such as differential privacy) provably resist composition attacks and, in fact, the use of arbitrary side information. This resistance enables "stand-alone" design of anonymization schemes, without the need for explicitly keeping track of other releases. We provide a precise formulation of this property, and prove that an important class of relaxations of differential privacy also satisfy the property. This significantly enlarges the class of protocols known to enable modular design.

382 citations
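
A toy version of a composition attack follows (Python; the releases, attributes, and 2-anonymous groups below are invented for illustration): each anonymized release alone leaves the victim's sensitive value ambiguous within its group, but intersecting the candidate sets from two independent releases about overlapping populations can resolve it.

```python
def candidate_values(release, quasi_identifier):
    """Sensitive values consistent with the victim's quasi-identifier in one
    k-anonymized release (i.e., the values appearing in the matching group)."""
    return {sensitive for qi, sensitive in release if qi == quasi_identifier}

# Two hospitals independently 2-anonymize (generalized QI, diagnosis) and publish.
# The victim (age 28, ZIP 13053, generalized to "2*,130**") appears in both.
release_a = [("2*,130**", "flu"), ("2*,130**", "hepatitis"),
             ("3*,148**", "asthma"), ("3*,148**", "flu")]
release_b = [("2*,130**", "hepatitis"), ("2*,130**", "cancer"),
             ("4*,902**", "flu"), ("4*,902**", "diabetes")]

a = candidate_values(release_a, "2*,130**")
b = candidate_values(release_b, "2*,130**")
print(a, b, a & b)   # each set alone is ambiguous; the intersection is {'hepatitis'}
```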


Proceedings ArticleDOI
18 Aug 2008
TL;DR: It is demonstrated that NOYB is practical and incrementally deployable, requires no changes to or cooperation from an existing online service, and indeed can be non-trivial for the online service to detect.
Abstract: Increasingly, Internet users trade privacy for service. Facebook, Google, and others mine personal information to target advertising. This paper presents a preliminary and partial answer to the general question "Can users retain their privacy while still benefiting from these web services?". We propose NOYB, a novel approach that provides privacy while preserving some of the functionality provided by online services. We apply our approach to the Facebook online social networking website. Through a proof-of-concept implementation we demonstrate that NOYB is practical and incrementally deployable, requires no changes to or cooperation from an existing online service, and indeed can be non-trivial for the online service to detect.
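
The substitution idea behind NOYB can be sketched roughly as follows (Python; a simplified toy, not NOYB's actual cipher, which encrypts indices into per-attribute dictionaries of "atoms"; the dictionary, key, and helper names here are illustrative): a keyed, invertible offset maps the true value to another plausible value, so the service stores something realistic-looking while key holders can recover the original.

```python
import hashlib
import hmac

ZODIAC = ["Aries", "Taurus", "Gemini", "Cancer", "Leo", "Virgo",
          "Libra", "Scorpio", "Sagittarius", "Capricorn", "Aquarius", "Pisces"]

def _offset(key, attribute, n):
    # Keyed pseudorandom offset for this attribute, reduced modulo the dictionary size.
    digest = hmac.new(key, attribute.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % n

def encode(key, attribute, value, dictionary):
    """Replace the true value with another plausible value from the same dictionary."""
    i = dictionary.index(value)
    return dictionary[(i + _offset(key, attribute, len(dictionary))) % len(dictionary)]

def decode(key, attribute, stored, dictionary):
    """Friends holding the key invert the substitution; the service only sees 'stored'."""
    i = dictionary.index(stored)
    return dictionary[(i - _offset(key, attribute, len(dictionary))) % len(dictionary)]

key = b"shared-with-friends-only"                     # illustrative shared secret
posted = encode(key, "zodiac", "Virgo", ZODIAC)       # what the service stores
print(posted, decode(key, "zodiac", posted, ZODIAC))  # friends recover "Virgo"
```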

Proceedings Article
01 Jan 2008
TL;DR: An integrative model suggesting that privacy concerns form because of an individual’s disposition to privacy or situational cues that enable one person to assess the consequences of information disclosure is developed.
Abstract: Numerous public opinion polls reveal that individuals are quite concerned about threats to their information privacy. However, the current understanding of privacy that emerges is fragmented and usually discipline-dependent. A systematic understanding of individuals’ privacy concerns is of increasing importance as information technologies increasingly expand the ability for organizations to store, process, and exploit personal data. Drawing on information boundary theory, we developed an integrative model suggesting that privacy concerns form because of an individual’s disposition to privacy or situational cues that enable one person to assess the consequences of information disclosure. Furthermore, a cognitive process, comprising perceived privacy risk, privacy control and privacy intrusion, is proposed to shape an individual’s privacy concerns toward a specific Web site’s privacy practices. We empirically tested the research model through a survey (n=823) that was administered to users of four different types of web sites: 1) electronic commerce sites, 2) social networking sites, 3) financial sites, and 4) healthcare sites. The study reported here is novel to the extent that existing empirical research has not examined this complex set of privacy issues. Implications for theory and practice are discussed, and suggestions for future research along the directions of this study are provided.

Proceedings ArticleDOI
18 Aug 2008
TL;DR: This study examines popular OSNs from a viewpoint of characterizing potential privacy leakage, and identifies what bits of information are currently being shared, how widely, and what users can do to prevent such sharing.
Abstract: Online social networks (OSNs) with half a billion users have dramatically raised concerns about privacy leakage. Users, often willingly, share personal identifying information about themselves, but do not have a clear idea of who accesses their private information or what portion of it really needs to be accessed. In this study we examine popular OSNs from a viewpoint of characterizing potential privacy leakage. Our study identifies what bits of information are currently being shared, how widely, and what users can do to prevent such sharing. We also examine the role of third-party sites that track OSN users and compare with privacy leakage on popular traditional Web sites. Our long term goal is to identify the narrow set of private information that users really need to share to accomplish specific interactions on OSNs.

Proceedings ArticleDOI
27 Apr 2008
TL;DR: It is shown that one can use partial trajectory knowledge as a quasi-identifier for the remaining locations in the sequence, and a data suppression technique is devised which prevents this type of breach while keeping the posted data as accurate as possible.
Abstract: We study the problem of protecting privacy in the publication of location sequences. Consider a database of trajectories, corresponding to movements of people, captured by their transactions when they use credit or RFID debit cards. We show that, if such trajectories are published exactly (by only hiding the identities of persons that followed them), there is a high risk of privacy breach by adversaries who hold partial information about them (e.g., shop owners). In particular, we show that one can use partial trajectory knowledge as a quasi-identifier for the remaining locations in the sequence. We devise a data suppression technique, which prevents this type of breach, while keeping the posted data as accurate as possible.
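
A minimal sketch of the quasi-identifier risk the paper targets follows (Python; the suppression algorithm itself is more involved than this illustration): if an adversary's partial projection of a trajectory is consistent with fewer than k published trajectories, the victim's remaining locations are exposed, which is exactly what the suppression step must prevent.

```python
def is_subsequence(part, traj):
    """True if 'part' appears in 'traj' in order (not necessarily contiguously)."""
    it = iter(traj)
    return all(loc in it for loc in part)

def matching_trajectories(published, partial):
    return [t for t in published if is_subsequence(partial, t)]

# Published (de-identified) trajectories over shop/location IDs.
published = [
    ["A", "B", "C", "D"],
    ["A", "C", "E"],
    ["B", "C", "D"],
    ["A", "B", "E"],
]

# A shop owner knows the victim visited A and later D. Only one published
# trajectory matches, so the victim's remaining locations (B, C) are exposed;
# a k=2 guarantee would require suppressing points until at least 2 trajectories
# remain consistent with any such projection.
print(matching_trajectories(published, ["A", "D"]))
```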

Journal ArticleDOI
TL;DR: The current standardization process is reviewed, which covers the methods of providing security services and preserving driver privacy for wireless access in vehicular environments (WAVE) applications, and two fundamental issues, certificate revocation and conditional privacy preservation, are addressed.
Abstract: Vehicular communication networking is a promising approach to facilitating road safety, traffic management, and infotainment dissemination for drivers and passengers. One of the ultimate goals in the design of such networking is to resist various malicious abuses and security attacks. In this article we first review the current standardization process, which covers the methods of providing security services and preserving driver privacy for wireless access in vehicular environments (WAVE) applications. We then address two fundamental issues, certificate revocation and conditional privacy preservation, for making the standards practical. In addition, a suite of novel security mechanisms are introduced for achieving secure certificate revocation and conditional privacy preservation, which are considered among the most challenging design objectives in vehicular ad hoc networks.

Journal ArticleDOI
TL;DR: This research examines how personalization and context can affect customers’ perceived benefits and privacy concerns, and how this aggregated effect in turn affects u-commerce adoption intention.
Abstract: U-commerce represents “anytime, anywhere” commerce. U-commerce can provide a high level of personalization, which can bring significant benefits to customers. However, customers’ privacy is a major concern and obstacle to the adoption of u-commerce. As customers’ intention to adopt u-commerce is based on the aggregate effect of perceived benefits and risk exposure (e.g., privacy concerns), this research examines how personalization and context can affect customers’ perceived benefits and privacy concerns, and how this aggregated effect in turn affects u-commerce adoption intention.

Proceedings ArticleDOI
19 May 2008
TL;DR: This paper introduces a novel RSU-aided message authentication scheme, called RAISE, which adopts the k-anonymity approach to protect user identity privacy, where an adversary cannot associate a message with a particular vehicle.
Abstract: Addressing security and privacy issues is a prerequisite for a market-ready vehicular communication network. Although recent related studies have already addressed most of these issues, few of them have taken scalability issues into consideration. When the traffic density becomes larger, a vehicle cannot verify all signatures of the messages sent by its neighbors in a timely manner, which results in message loss. Communication overhead, as another issue, has also not been well addressed in previously reported studies. To deal with these issues, this paper introduces a novel RSU-aided message authentication scheme, called RAISE. With RAISE, roadside units (RSUs) are responsible for verifying the authenticity of the messages sent from vehicles and for notifying the results back to vehicles. In addition, our scheme adopts the k-anonymity approach to protect user identity privacy, where an adversary cannot associate a message with a particular vehicle. Extensive simulations are conducted to verify the proposed scheme, which demonstrates that RAISE yields much better performance than any of the previously reported counterparts in terms of message loss ratio and delay.
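
The division of labor can be sketched roughly as follows (Python; a toy using symmetric MACs in which the key establishment, hash aggregation, and k-anonymity mechanism of the real protocol are omitted, and all names are illustrative): the vehicle authenticates its message to the RSU, and the RSU, after verifying, broadcasts a short digest that lets neighboring vehicles accept the message without verifying it themselves.

```python
import hashlib
import hmac

def vehicle_send(shared_key, message):
    """Vehicle -> RSU: message plus a MAC under the key shared with the RSU."""
    tag = hmac.new(shared_key, message, hashlib.sha256).digest()
    return message, tag

def rsu_verify_and_notify(shared_key, message, tag):
    """RSU checks the MAC; if valid, it broadcasts a short digest of the message
    so nearby vehicles can accept it without doing the verification themselves."""
    expected = hmac.new(shared_key, message, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None
    return hashlib.sha256(message).digest()[:8]   # truncated digest in the notification

def neighbor_accept(message, notified_digests):
    """Neighboring vehicle: accept a heard message iff the RSU vouched for it."""
    return hashlib.sha256(message).digest()[:8] in notified_digests

key = b"vehicle-RSU session key"             # illustrative; set up during association
msg = b"safety: hard braking at (x, y, t)"
m, t = vehicle_send(key, msg)
digest = rsu_verify_and_notify(key, m, t)
print(neighbor_accept(msg, {digest}))         # True: the RSU has vouched for the message
```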

Journal ArticleDOI
TL;DR: The design of a VC security system that has emerged as a result of the European SeVeCom project is discussed and an outlook on open security research issues that will arise as VC systems develop from today's simple prototypes to full-fledged systems is provided.
Abstract: Vehicular communication systems are on the verge of practical deployment. Nonetheless, their security and privacy protection is one of the problems that have been addressed only recently. In order to show the feasibility of secure VC, certain implementations are required. We discuss the design of a VC security system that has emerged as a result of the European SeVeCom project. In this second article we discuss various issues related to the implementation and deployment aspects of secure VC systems. Moreover, we provide an outlook on open security research issues that will arise as VC systems develop from today's simple prototypes to full-fledged systems.

14 Apr 2008
TL;DR: This paper reports on the first iterative prototype, where presenting an audience-oriented view of profile information significantly improved the understanding of privacy settings.
Abstract: Users of online social networking communities are disclosing large amounts of personal information, putting themselves at a variety of risks. Our ongoing research investigates mechanisms for socially appropriate privacy management in online social networking communities. As a first step, we are examining the role of interface usability in current privacy settings. In this paper we report on our first iterative prototype, where presenting an audience-oriented view of profile information significantly improved the understanding of privacy settings.

Journal ArticleDOI
01 Jul 2008
TL;DR: A comprehensive approach for privacy preserving access control based on the notion of purpose, which allows multiple purposes to be associated with each data element and also supports explicit prohibitions, thus allowing privacy officers to specify that some data should not be used for certain purposes.
Abstract: In this article, we present a comprehensive approach for privacy preserving access control based on the notion of purpose. In our model, purpose information associated with a given data element specifies the intended use of the data element. A key feature of our model is that it allows multiple purposes to be associated with each data element and also supports explicit prohibitions, thus allowing privacy officers to specify that some data should not be used for certain purposes. An important issue addressed in this article is the granularity of data labeling, i.e., the units of data with which purposes can be associated. We address this issue in the context of relational databases and propose four different labeling schemes, each providing a different granularity. We also propose an approach to represent purpose information, which results in low storage overhead, and we exploit query modification techniques to support access control based on purpose information. Another contribution of our work is that we address the problem of how to determine the purpose for which certain data are accessed by a given user. Our proposed solution relies on role-based access control (RBAC) models as well as the notion of conditional role which is based on the notions of role attribute and system attribute.
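
The labeling model can be pictured with a small sketch (Python; the paper's approach additionally covers purpose hierarchies, conditional roles, and query modification inside a relational DBMS, and the purposes and attributes below are illustrative): each data element carries allowed and prohibited intended purposes, and an access purpose is honored only if it is allowed and not explicitly prohibited.

```python
from dataclasses import dataclass, field

@dataclass
class PurposeLabel:
    allowed: set = field(default_factory=set)      # intended purposes
    prohibited: set = field(default_factory=set)   # explicit prohibitions win

def access_permitted(label: PurposeLabel, access_purpose: str) -> bool:
    """Grant access only if the access purpose is allowed and not prohibited."""
    return access_purpose in label.allowed and access_purpose not in label.prohibited

# Element-level labeling of a customer record (the finest of the four granularities).
labels = {
    "email":   PurposeLabel(allowed={"shipping", "billing"}, prohibited={"marketing"}),
    "address": PurposeLabel(allowed={"shipping"}),
    "age":     PurposeLabel(allowed={"analytics", "marketing"}),
}

def filter_record(record, labels, access_purpose):
    """Query-modification style filtering: return only the elements usable for this purpose."""
    return {k: v for k, v in record.items()
            if access_permitted(labels[k], access_purpose)}

record = {"email": "a@example.com", "address": "1 Main St", "age": 42}
print(filter_record(record, labels, "marketing"))   # {'age': 42}
```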

Proceedings ArticleDOI
25 Oct 2008
TL;DR: This work investigates learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in the contexts where aggregate information is released about a database containing sensitive information about individuals.
Abstract: Learning problems form an important category of computational tasks that generalizes many of the computations researchers apply to large real-life data sets. We ask: what concept classes can be learned privately, namely, by an algorithm whose output does not depend too heavily on any one input or specific training example? More precisely, we investigate learning algorithms that satisfy differential privacy, a notion that provides strong confidentiality guarantees in the contexts where aggregate information is released about a database containing sensitive information about individuals. We present several basic results that demonstrate general feasibility of private learning and relate several models previously studied separately in the contexts of privacy and standard learning.
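
One standard construction used in this line of work is private hypothesis selection with the exponential mechanism, scoring hypotheses by empirical error; the sketch below (Python) illustrates that idea on a tiny threshold-classifier class and is not presented as the paper's exact algorithm.

```python
import math
import random

def exponential_mechanism(hypotheses, score, epsilon, sensitivity=1.0):
    """Pick a hypothesis with probability proportional to exp(eps * score / (2 * sensitivity)).
    With score = -(number of mistakes), changing one training example changes any
    score by at most 1, which is what makes the selection differentially private."""
    weights = [math.exp(epsilon * score(h) / (2.0 * sensitivity)) for h in hypotheses]
    return random.choices(hypotheses, weights=weights, k=1)[0]

# Tiny example: privately learn a threshold classifier 1[x >= t] for t in {0..10}.
data = [(x, 1 if x >= 6 else 0) for x in (1, 2, 3, 5, 6, 7, 8, 9)]
hypotheses = list(range(11))

def neg_error(t):
    return -sum(1 for x, y in data if (1 if x >= t else 0) != y)

print(exponential_mechanism(hypotheses, neg_error, epsilon=1.0))
```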

Journal ArticleDOI
TL;DR: This work proposes a lightweight authenticated key establishment scheme with privacy preservation to secure the communications between mobile vehicles and roadside infrastructure in a VANET, called SECSPP, and integrates blind signature techniques into the scheme in allowing mobile vehicles to anonymously interact with the services of roadside infrastructure.

Proceedings ArticleDOI
01 Sep 2008
TL;DR: Examining users' current strategies for maintaining their privacy, and where those strategies fail, on the online social network site Facebook demonstrates the need for mechanisms that provide awareness of the privacy impact of users' daily interactions.
Abstract: Online social networking communities such as Facebook and MySpace are extremely popular. These sites have changed how many people develop and maintain relationships through posting and sharing personal information. The amount and depth of these personal disclosures have raised concerns regarding online privacy. We expand upon previous research on users' under-utilization of available privacy options by examining users' current strategies for maintaining their privacy, and where those strategies fail, on the online social network site Facebook. Our results demonstrate the need for mechanisms that provide awareness of the privacy impact of users' daily interactions.

Journal ArticleDOI
TL;DR: PriS is described, a security requirements engineering method, which incorporates privacy requirements early in the system development process and provides a holistic approach from ‘high-level’ goals to ‘privacy-compliant’ IT systems.
Abstract: A major challenge in the field of software engineering is to make users trust the software that they use in their everyday activities for professional or recreational reasons. Trusting software depends on various elements, one of which is the protection of user privacy. Protecting privacy is about complying with users’ desires when it comes to handling personal information. Users’ privacy can also be defined as the right to determine when, how and to what extent information about them is communicated to others. Current research stresses the need for addressing privacy issues during the system design rather than during the system implementation phase. To this end, this paper describes PriS, a security requirements engineering method, which incorporates privacy requirements early in the system development process. PriS considers privacy requirements as organisational goals that need to be satisfied and adopts the use of privacy-process patterns as a way to: (1) describe the effect of privacy requirements on business processes; and (2) facilitate the identification of the system architecture that best supports the privacy-related business processes. In this way, PriS provides a holistic approach from ‘high-level’ goals to ‘privacy-compliant’ IT systems. The PriS way-of-working is formally defined, thus enabling the development of automated tools for assisting its application.

Journal ArticleDOI
01 Mar 2008
TL;DR: Federated identity management lets users dynamically distribute identity information across security domains, increasing the portability of their digital identities and raising new architectural challenges and significant security and privacy issues.
Abstract: Federated identity management lets users dynamically distribute identity information across security domains, increasing the portability of their digital identities. It also raises new architectural challenges and significant security and privacy issues.

Proceedings ArticleDOI
04 Mar 2008
TL;DR: The aim of this paper is to show security measures for NFC (Near Field Communication) use cases and devices; different attacks against the operation modes are applied to show how applications and devices could be protected against such attacks.
Abstract: The aim of this paper is to show security measures for NFC (Near Field Communication) use cases and devices. We give a brief overview over NFC technology and evaluate the implementation of NFC in devices. Out of this technology review we derive different use cases and applications based on NFC technology. Based on the use cases we show assets and interfaces of an NFC device that could be a possible target of an attacker. In the following we apply different attacks against the operation modes to show how applications and devices could be protected against such attacks. The information collected is consolidated in a set of threats giving guidelines on how to improve security and overcome privacy issues. This allows integrating NFC technology in a secure way for the end consumer.

Journal ArticleDOI
01 Jul 2008
TL;DR: This paper brings privacy-preservation to that baseline, presenting protocols to develop a Naïve Bayes classifier on both vertically as well as horizontally partitioned data.
Abstract: Privacy-preserving data mining (developing models without seeing the data) is receiving growing attention. This paper assumes a privacy-preserving distributed data mining scenario: data sources collaborate to develop a global model, but must not disclose their data to others. The problem of secure distributed classification is an important one. In many situations, data is split between multiple organizations. These organizations may want to utilize all of the data to create more accurate predictive models while revealing neither their training data/databases nor the instances to be classified. Naive Bayes is often used as a baseline classifier, consistently providing reasonable classification performance. This paper brings privacy-preservation to that baseline, presenting protocols to develop a Naive Bayes classifier on both vertically as well as horizontally partitioned data.
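
For the horizontally partitioned case, the essential building block is a secure sum of local counts; the sketch below (Python) illustrates additive secret sharing among semi-honest parties, assuming at least three participants, and is a simplification rather than the paper's full protocol.

```python
import random

M = 2**61 - 1   # large modulus so individual masked shares reveal nothing

def share(value, n_parties):
    """Split a local count into n additive shares modulo M."""
    shares = [random.randrange(M) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % M)
    return shares

def secure_sum(local_values):
    """Each party shares its value among all parties; summing the shares each
    party holds, and then combining those partial sums, yields only the total."""
    n = len(local_values)
    all_shares = [share(v, n) for v in local_values]          # row i: party i's shares
    partial = [sum(all_shares[i][j] for i in range(n)) % M    # what party j can compute
               for j in range(n)]
    return sum(partial) % M

# Three hospitals hold horizontally partitioned records and want the global count
# of (class = "flu", symptom = "fever") without revealing their local counts.
local_counts = [17, 4, 9]
print(secure_sum(local_counts))   # 30; from such joint counts the Naive Bayes
                                  # conditional probabilities follow as in the
                                  # centralized case
```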

Proceedings ArticleDOI
27 Oct 2008
TL;DR: This work presents a new architecture that protects information published through the social networking website Facebook by means of encryption; it dramatically raises the cost of potential compromises by the service provider and places them within a framework for legal privacy protection, because they would violate a user's reasonable expectation of privacy.
Abstract: Social networking websites are enormously popular, but they present a number of privacy risks to their users, one of the foremost of which being that social network service providers are able to observe and accumulate the information that users transmit through the network. We aim to mitigate this risk by presenting a new architecture for protecting information published through the social networking website, Facebook, through encryption. Our architecture makes a trade-off between security and usability in the interests of minimally affecting users' workflow and maintaining universal accessibility. While active attacks by Facebook could compromise users' privacy, our architecture dramatically raises the cost of such potential compromises and, importantly, places them within a framework for legal privacy protection because they would violate a user's reasonable expectation of privacy. We have built a prototype Facebook application implementing our architecture, addressing some of the limitations of the Facebook platform through proxy cryptography.
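
The publish-through-encryption flow can be sketched with ordinary symmetric encryption (Python, using the `cryptography` package's Fernet API; the paper's actual design relies on proxy cryptography and key management layered on the Facebook platform, none of which is shown here):

```python
# pip install cryptography
from cryptography.fernet import Fernet

def make_group_key():
    """Key shared out-of-band with the friends allowed to read the content."""
    return Fernet.generate_key()

def publish(key, plaintext):
    """What gets stored on the social network: ciphertext only."""
    return Fernet(key).encrypt(plaintext.encode())

def read(key, stored):
    """Friends holding the key recover the plaintext client-side."""
    return Fernet(key).decrypt(stored).decode()

key = make_group_key()
token = publish(key, "Status: at the dentist until 3pm")
print(token[:16], b"...")            # the provider observes only this token
print(read(key, token))              # friends see the original status
```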

Journal ArticleDOI
TL;DR: Using LISREL, it is found that privacy concerns have an important influence on the willingness to disclose personal information required to transact online.
Abstract: This U.S.-based research attempts to understand the relationships between users' perceptions about Internet privacy concerns, the need for government surveillance, government intrusion concerns, and the willingness to disclose personal information required to complete online transactions. We test a theoretical model based on a privacy calculus framework and Asymmetric Information Theory using data collected from 422 respondents. Using LISREL, we found that privacy concerns have an important influence on the willingness to disclose personal information required to transact online. The perceived need for government surveillance was negatively related to privacy concerns and positively related to willingness to disclose personal information. On the other hand, concerns about government intrusion were positively related to privacy concerns. The theoretical framework of our study can be applied across other countries.