Showing papers by "Raluca Ada Popa" published in 2020


Proceedings Article
01 Jan 2020
TL;DR: This work designs, implements, and evaluates DELPHI, a secure prediction system that allows two parties to execute neural network inference without revealing either party’s data, and develops a hybrid cryptographic protocol that improves upon the communication and computation costs over prior work.
Abstract: Many companies provide neural network prediction services to users for a wide range of applications. However, current prediction systems compromise one party’s privacy: either the user has to send sensitive inputs to the service provider for classification, or the service provider must store its proprietary neural networks on the user’s device. The former harms the personal privacy of the user, while the latter reveals the service provider’s proprietary model. We design, implement, and evaluate DELPHI, a secure prediction system that allows two parties to execute neural network inference without revealing either party’s data. DELPHI approaches the problem by simultaneously co-designing cryptography and machine learning. We first design a hybrid cryptographic protocol that improves upon the communication and computation costs over prior work. Second, we develop a planner that automatically generates neural network architecture configurations that navigate the performance-accuracy trade-offs of our hybrid protocol. Together, these techniques allow us to achieve a 22× improvement in online prediction latency compared to the state-of-the-art prior work.

234 citations


Journal ArticleDOI
25 Feb 2020
TL;DR: This report summarizes the discussion and conclusions of the 9th self-assessment meeting of database researchers, held during October 9-10, 2018 in Seattle.
Abstract: Approximately every five years, a group of database researchers meet to do a self-assessment of our community, including reflections on our impact on the industry as well as challenges facing our research community. This report summarizes the discussion and conclusions of the 9th such meeting, held during October 9-10, 2018 in Seattle.

61 citations


Proceedings ArticleDOI
09 Nov 2020
TL;DR: This work designs and implements Delphi, a secure prediction system that allows two parties to execute neural network inference without revealing either party's data, and develops a planner that automatically generates neural network architecture configurations that navigate the performance-accuracy trade-offs of the hybrid protocol.
Abstract: Many companies provide neural network prediction services to users for a wide range of applications. However, current prediction systems compromise one party's privacy: either the user has to send sensitive inputs to the service provider for classification, or the service provider must store its proprietary neural networks on the user's device. The former harms the personal privacy of the user, while the latter reveals the service provider's proprietary model. We design, implement, and evaluate Delphi, a secure prediction system that allows two parties to execute neural network inference without revealing either party's data. Delphi approaches the problem by simultaneously co-designing cryptography and machine learning. We first design a hybrid cryptographic protocol that improves upon the communication and computation costs over prior work. Second, we develop a planner that automatically generates neural network architecture configurations that navigate the performance-accuracy trade-offs of our hybrid protocol. Together, these techniques allow us to achieve a 22x improvement in online prediction latency compared to the state-of-the-art prior work.

46 citations


Proceedings Article
12 Aug 2020
TL;DR: Visor is a system that provides confidentiality for the user's video stream as well as the ML models in the presence of a compromised cloud platform and untrusted co-tenants.
Abstract: Video-analytics-as-a-service is becoming an important offering for cloud providers. A key concern in such services is privacy of the videos being analyzed. While trusted execution environments (TEEs) are promising options for preventing the direct leakage of private video content, they remain vulnerable to side-channel attacks. We present Visor, a system that provides confidentiality for the user’s video stream as well as the ML models in the presence of a compromised cloud platform and untrusted co-tenants. Visor executes video pipelines in a hybrid TEE that spans both the CPU and GPU. It protects the pipeline against side-channel attacks induced by data-dependent access patterns of video modules, and also addresses leakage in the CPU-GPU communication channel. Visor is up to 1000× faster than naive oblivious solutions, and its overheads relative to a non-oblivious baseline are limited to 2×–6×.

30 citations


Posted Content
TL;DR: This work designs and builds DORY, an encrypted search system that addresses real-world requirements, protects search access patterns, and performs orders of magnitude better than a baseline built on ORAM.
Abstract: Efficient, leakage-free search on encrypted data has remained an unsolved problem for the last two decades; efficient schemes are vulnerable to leakage-abuse attacks, and schemes that eliminate leakage are impractical to deploy. To overcome this tradeoff, we reexamine the system model. We surveyed five companies providing end-to-end encrypted filesharing to better understand what they require from an encrypted search system. Based on our findings, we design and build DORY, an encrypted search system that addresses real-world requirements and protects search access patterns; namely, when a user searches for a keyword over the files within a folder, the server learns only that a search happens in that folder, but does not learn which documents match the search, the number of documents that match, or other information about the keyword. DORY splits trust between multiple servers to protect against a malicious attacker who controls all but one of the servers. We develop new cryptographic and systems techniques to meet the efficiency and trust model requirements outlined by the companies we surveyed. We implement DORY and show that it performs orders of magnitude better than a baseline built on ORAM. Parallelized across 8 servers, each with 16 CPUs, DORY takes 116ms to search roughly 50K documents and 862ms to search over 1M documents.

28 citations


Proceedings ArticleDOI
01 Jan 2020
TL;DR: Metal is the first file-sharing system that hides metadata from malicious users and that has a latency of only a few seconds; it is 500× faster (in terms of amortized latency) or 10× faster (in terms of worst-case latency) than PIR-MCORAM, which does not hide user identities.

26 citations


Proceedings ArticleDOI
15 Apr 2020
TL;DR: Oblivious Coopetitive Queries (OCQ), an efficient, general framework for oblivious coopetitive analytics using hardware enclaves, is proposed and implemented as an extension to Apache Spark SQL, finding that OCQ is up to 9.9x faster than Opaque, a state-of-the-art secure analytics framework which outsources all data and computation to an enclave-enabled cloud.
Abstract: Coopetitive analytics refers to cooperation among competing parties to run queries over their joint data. Regulatory, business, and liability concerns prevent these organizations from sharing their sensitive data in plaintext. We propose Oblivious Coopetitive Queries (OCQ), an efficient, general framework for oblivious coopetitive analytics using hardware enclaves. OCQ builds on Opaque, a Spark-based framework for secure distributed analytics, to execute coopetitive queries using hardware enclaves in a decentralized manner. Its query planner chooses how and where to execute each relational operator to prevent data leakage through side channels such as memory access patterns, network traffic statistics, and cardinality, while minimizing overhead. We implemented OCQ as an extension to Apache Spark SQL. We find that OCQ is up to 9.9x faster than Opaque, a state-of-the-art secure analytics framework which outsources all data and computation to an enclave-enabled cloud; and is up to 219x faster than implementing analytics using AgMPC, a state-of-the-art secure multi-party computation framework.

25 citations


Proceedings Article
01 Jan 2020
TL;DR: Civet is a framework for partitioning Java applications into enclaves that reduces the number of lines of code in the enclave and uses language-level defenses, including deep type checks and dynamic taint-tracking, to harden the enclave interface.
Abstract: Hardware enclaves are designed to execute small pieces of sensitive code or to operate on sensitive data, in isolation from larger, less trusted systems. Partitioning a large, legacy application requires significant effort. Partitioning an application written in a managed language, such as Java, is more challenging because of mutable language characteristics, extensive code reachability in class libraries, and the inevitability of using a heavyweight runtime. Civet is a framework for partitioning Java applications into enclaves. Civet reduces the number of lines of code in the enclave and uses language-level defenses, including deep type checks and dynamic taint-tracking, to harden the enclave interface. Civet also contributes a partitioned Java runtime design, including a garbage collection design optimized for the peculiarities of enclaves. Civet is efficient for data-intensive workloads; partitioning a Hadoop mapper reduces the enclave overhead from 10× to 16–22% without taint-tracking or 70–80% with taint-tracking.

25 citations


Proceedings ArticleDOI
09 Nov 2020
TL;DR: This work proposes Secure XGBoost, a privacy-preserving system that enables multiparty training and inference of XGBoost models and augments the security of the enclaves using novel data-oblivious algorithms that prevent access side-channel attacks on enclaves induced via access pattern leakage.
Abstract: In recent years, gradient boosted decision tree learning has proven to be an effective method of training robust models. Moreover, collaborative learning among multiple parties has the potential to greatly benefit all parties involved, but organizations have also encountered obstacles in sharing sensitive data due to business, regulatory, and liability concerns. We propose Secure XGBoost, a privacy-preserving system that enables multiparty training and inference of XGBoost models. Secure XGBoost protects the privacy of each party's data as well as the integrity of the computation with the help of hardware enclaves. Crucially, Secure XGBoost augments the security of the enclaves using novel data-oblivious algorithms that prevent access side-channel attacks on enclaves induced via access pattern leakage.

24 citations


Proceedings Article
01 Jan 2020
TL;DR: This work proposes Ghostor, a data-sharing system that, using only decentralized trust, hides user identities from the server, and allows users to detect server-side integrity violations, and develops a technique called verifiable anonymous history.
Abstract: Data-sharing systems are often used to store sensitive data. Both academia and industry have proposed numerous solutions to protect the user privacy and data integrity from a compromised server. Practical state-of-the-art solutions, however, use weak threat models based on centralized trust: they assume that part of the server will remain uncompromised, or that the adversary will not perform active attacks. We propose Ghostor, a data-sharing system that, using only decentralized trust, (1) hides user identities from the server, and (2) allows users to detect server-side integrity violations. To achieve (1), Ghostor avoids keeping any per-user state at the server, requiring us to redesign the system to avoid common paradigms like per-user authentication and user-specific mailboxes. To achieve (2), Ghostor develops a technique called verifiable anonymous history. Ghostor leverages a blockchain rarely, publishing only a single hash to the blockchain for the entire system once every epoch. We measured that Ghostor incurs a 4–5x throughput overhead compared to an insecure baseline. Although significant, Ghostor’s overhead may be worth it for security- and privacy-sensitive applications.

21 citations


Posted Content
TL;DR: In this article, the authors present new attacks for recovering the content of individual user queries, assuming no leakage from the system except the number of results and avoiding the limiting assumptions that are unrealistic in practice, such as requiring a large number of queries to be issued by the user, or assuming certain distributions on the queries or underlying data.
Abstract: Recent years have seen an increased interest towards strong security primitives for encrypted databases (such as oblivious protocols), that hide the access patterns of query execution, and reveal only the volume of results. However, recent work has shown that even volume leakage can enable the reconstruction of entire columns in the database. Yet, existing attacks rely on a set of assumptions that are unrealistic in practice: for example, they (i) require a large number of queries to be issued by the user, or (ii) assume certain distributions on the queries or underlying data (e.g., that the queries are distributed uniformly at random, or that the database does not contain missing values). In this work, we present new attacks for recovering the content of individual user queries, assuming no leakage from the system except the number of results and avoiding the limiting assumptions above. Unlike prior attacks, our attacks require only a single query to be issued by the user for recovering the keyword. Furthermore, our attacks make no assumptions about the distribution of issued queries or the underlying data. Instead, our key insight is to exploit the behavior of real-world applications. We start by surveying 11 applications to identify two key characteristics that can be exploited by attackers: (i) file injection, and (ii) automatic query replay. We present attacks that leverage these two properties in concert with volume leakage, independent of the details of any encrypted database system. Subsequently, we perform an attack on the real Gmail web client by simulating a server-side adversary. Our attack on Gmail completes within a matter of minutes, demonstrating the feasibility of our techniques. We also present three ancillary attacks for situations when certain mitigation strategies are employed.

Proceedings ArticleDOI
01 Sep 2020
TL;DR: In this paper, the authors present new attacks for recovering the content of individual user queries, assuming no leakage from the system except the number of results, and avoiding the limiting assumptions of prior attacks, such as requiring a large number of queries to be issued by the user or assuming certain distributions on the queries or underlying data.
Abstract: Recent years have seen an increased interest towards strong security primitives for encrypted databases (such as oblivious protocols) that hide the access patterns of query execution and reveal only the volume of results. However, recent work has shown that even volume leakage can enable the reconstruction of entire columns in the database. Yet, existing attacks rely on a set of assumptions that are unrealistic in practice: for example, they (i) require a large number of queries to be issued by the user, or (ii) assume certain distributions on the queries or underlying data (e.g., that the queries are distributed uniformly at random, or that the database does not contain missing values). In this work, we present new attacks for recovering the content of individual user queries, assuming no leakage from the system except the number of results and avoiding the limiting assumptions above. Unlike prior attacks, our attacks require only a single query to be issued by the user for recovering the keyword. Furthermore, our attacks make no assumptions about the distribution of issued queries or the underlying data. Instead, our key insight is to exploit the behavior of real-world applications. We start by surveying 11 applications to identify two key characteristics that can be exploited by attackers: (i) file injection, and (ii) automatic query replay. We present attacks that leverage these two properties in concert with volume leakage, independent of the details of any encrypted database system. Subsequently, we perform an attack on the real Gmail web client by simulating a server-side adversary. Our attack on Gmail completes within a matter of minutes, demonstrating the feasibility of our techniques. We also present three ancillary attacks for situations when certain mitigation strategies are employed.

Posted Content
TL;DR: Visor is a system that provides confidentiality for the user's video stream as well as the ML models in the presence of a compromised cloud platform and untrusted co-tenants and protects the pipeline against side-channel attacks induced by data-dependent access patterns of video modules, and also addresses leakage in the CPU-GPU communication channel.
Abstract: Video-analytics-as-a-service is becoming an important offering for cloud providers. A key concern in such services is privacy of the videos being analyzed. While trusted execution environments (TEEs) are promising options for preventing the direct leakage of private video content, they remain vulnerable to side-channel attacks. We present Visor, a system that provides confidentiality for the user's video stream as well as the ML models in the presence of a compromised cloud platform and untrusted co-tenants. Visor executes video pipelines in a hybrid TEE that spans both the CPU and GPU. It protects the pipeline against side-channel attacks induced by data-dependent access patterns of video modules, and also addresses leakage in the CPU-GPU communication channel. Visor is up to 1000× faster than naive oblivious solutions, and its overheads relative to a non-oblivious baseline are limited to 2×–6×.

Journal Article
TL;DR: Metal is the first file-sharing system that hides metadata from malicious users and has a latency of only a few seconds; it consists of a new two-server multi-user oblivious RAM (ORAM) scheme, a metadata-hiding access control protocol, and a capability sharing protocol.
Abstract: File-sharing systems like Dropbox offer insufficient privacy because a compromised server can see the file contents in the clear. Although encryption can hide such contents from the servers, metadata leakage remains significant. The goal of our work is to develop a file-sharing system that hides metadata, including user identities and file access patterns. Metal is the first file-sharing system that hides such metadata from malicious users and that has a latency of only a few seconds. The core of Metal consists of a new two-server multi-user oblivious RAM (ORAM) scheme, which is secure against malicious users, a metadata-hiding access control protocol, and a capability sharing protocol. Compared with the state-of-the-art malicious-user filesharing scheme PIR-MCORAM (Maffei et al.’17), which does not hide user identities, Metal hides the user identities and is 500× faster (in terms of amortized latency) or 10× faster (in terms of worst-case latency).


Posted Content
TL;DR: Secure XGBoost protects the privacy of each party's data as well as the integrity of the computation with the help of hardware enclaves and augments the security of the enclaves using novel data-oblivious algorithms that prevent access side-channel attacks on enclaves induced via access pattern leakage.
Abstract: In recent years, gradient boosted decision tree learning has proven to be an effective method of training robust models. Moreover, collaborative learning among multiple parties has the potential to greatly benefit all parties involved, but organizations have also encountered obstacles in sharing sensitive data due to business, regulatory, and liability concerns. We propose Secure XGBoost, a privacy-preserving system that enables multiparty training and inference of XGBoost models. Secure XGBoost protects the privacy of each party's data as well as the integrity of the computation with the help of hardware enclaves. Crucially, Secure XGBoost augments the security of the enclaves using novel data-oblivious algorithms that prevent access side-channel attacks on enclaves induced via access pattern leakage.



Posted Content
TL;DR: In this article, the authors present Senate, a system that allows multiple parties to collaboratively run analytical SQL queries without revealing their individual data to each other, even in the presence of malicious adversaries.
Abstract: Many organizations stand to benefit from pooling their data together in order to draw mutually beneficial insights, e.g., for fraud detection across banks, better medical studies across hospitals, etc. However, such organizations are often prevented from sharing their data with each other by privacy concerns, regulatory hurdles, or business competition. We present Senate, a system that allows multiple parties to collaboratively run analytical SQL queries without revealing their individual data to each other. Unlike prior works on secure multi-party computation (MPC) that assume that all parties are semi-honest, Senate protects the data even in the presence of malicious adversaries. At the heart of Senate lies a new MPC decomposition protocol that decomposes the cryptographic MPC computation into smaller units, some of which can be executed by subsets of parties and in parallel, while preserving its security guarantees. Senate then provides a new query planning algorithm that decomposes and plans the cryptographic computation effectively, achieving a performance of up to 145× faster than the state-of-the-art.