Author

Rosario Cammarota

Other affiliations: University of California, Yahoo!, Qualcomm
Bio: Rosario Cammarota is an academic researcher from Intel. The author has contributed to research in topics: Encryption & Compiler. The author has an h-index of 14 and has co-authored 86 publications receiving 911 citations. Previous affiliations of Rosario Cammarota include University of California & Yahoo!.

Papers published on a yearly basis

Papers
Posted Content
TL;DR: This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems.
Abstract: With the recent wave of progress in artificial intelligence (AI) has come a growing awareness of the large-scale impacts of AI systems, and recognition that existing regulations and norms in industry and academia are insufficient to ensure responsible AI development. In order for AI developers to earn trust from system users, customers, civil society, governments, and other stakeholders that they are building AI responsibly, they will need to make verifiable claims to which they can be held accountable. Those outside of a given organization also need effective means of scrutinizing such claims. This report suggests various steps that different stakeholders can take to improve the verifiability of claims made about AI systems and their associated development processes, with a focus on providing evidence about the safety, security, fairness, and privacy protection of AI systems. We analyze ten mechanisms for this purpose--spanning institutions, software, and hardware--and make recommendations aimed at implementing, exploring, or improving those mechanisms.

191 citations

Proceedings ArticleDOI
01 Dec 2012
TL;DR: A new way to use dynamic reuse distances to further improve cache management policies is proposed: a replacement policy that prevents a cache line from being replaced until a certain number of accesses to its cache set, called the Protecting Distance (PD), has occurred.
Abstract: Cache management policies such as replacement, bypass, or shared cache partitioning have been relying on data reuse behavior to predict the future. This paper proposes a new way to use dynamic reuse distances to further improve such policies. A new replacement policy is proposed which prevents replacing a cache line until a certain number of accesses to its cache set, called a Protecting Distance (PD). The policy protects a cache line long enough for it to be reused, but not beyond that to avoid cache pollution. This can be combined with a bypass mechanism that also relies on dynamic reuse analysis to bypass lines with less expected reuse. A miss fetch is bypassed if there are no unprotected lines. A hit rate model based on dynamic reuse history is proposed and the PD that maximizes the hit rate is dynamically computed. The PD is recomputed periodically to track a program's memory access behavior and phases. Next, a new multi-core cache partitioning policy is proposed using the concept of protection. It manages lifetimes of lines from different cores (threads) in such a way that the overall hit rate is maximized. The average per-thread lifetime is reduced by decreasing the thread's PD. The single-core PD-based replacement policy with bypass achieves an average speedup of 4.2% over the DIP policy, while the average speedups over DIP are 1.5% for dynamic RRIP (DRRIP) and 1.6% for sampling dead-block prediction (SDP). The 16-core PD-based partitioning policy improves the average weighted IPC by 5.2%, throughput by 6.4% and fairness by 9.9% over thread-aware DRRIP (TA-DRRIP). The required hardware is evaluated and the overhead is shown to be manageable.
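
As a rough illustration of the mechanism described above, the following Python sketch manages a single cache set with a fixed protecting distance. The class and method names are invented for this sketch, and the paper's dynamic computation of the PD from a hit-rate model (as well as the multi-core partitioning policy) is omitted.

```python
class PDCacheSet:
    def __init__(self, num_ways, protecting_distance):
        self.ways = [None] * num_ways       # tag stored in each way, or None
        self.pd_left = [0] * num_ways       # remaining protection per way
        self.pd = protecting_distance

    def access(self, tag):
        """Return 'hit', 'fill', or 'bypass' for one access to this set."""
        # Every access to the set ages all protected lines by one.
        for i in range(len(self.ways)):
            if self.ways[i] is not None and self.pd_left[i] > 0:
                self.pd_left[i] -= 1

        # Hit: re-protect the line for another PD accesses.
        for i, t in enumerate(self.ways):
            if t == tag:
                self.pd_left[i] = self.pd
                return "hit"

        # Miss: fill an empty or unprotected way if one exists...
        for i, t in enumerate(self.ways):
            if t is None or self.pd_left[i] == 0:
                self.ways[i] = tag
                self.pd_left[i] = self.pd
                return "fill"

        # ...otherwise bypass, so protected lines keep their chance to be reused.
        return "bypass"


cache_set = PDCacheSet(num_ways=4, protecting_distance=8)
print([cache_set.access(t) for t in ["a", "b", "a", "c", "d", "e"]])
```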

159 citations

Proceedings ArticleDOI
TL;DR: The proposed nGraph-HE2 framework leverages the CKKS scheme, whose support for real numbers is friendly to data science, together with a client-aided model that uses a two-party approach to compute activation functions, enabling privacy-preserving inference on standard, pre-trained models using their native activation functions and number fields.
Abstract: In previous work, Boemer et al. introduced nGraph-HE, an extension to the Intel nGraph deep learning (DL) compiler, that enables data scientists to deploy models with popular frameworks such as TensorFlow and PyTorch with minimal code changes. However, the class of supported models was limited to relatively shallow networks with polynomial activations. Here, we introduce nGraph-HE2, which extends nGraph-HE to enable privacy-preserving inference on standard, pre-trained models using their native activation functions and number fields (typically real numbers). The proposed framework leverages the CKKS scheme, whose support for real numbers is friendly to data science, and a client-aided model using a two-party approach to compute activation functions. We first present CKKS-specific optimizations, enabling a 3x-88x runtime speedup for scalar encoding, and doubling the throughput through a novel use of CKKS plaintext packing into complex numbers. Second, we optimize ciphertext-plaintext addition and multiplication, yielding 2.6x-4.2x runtime speedup. Third, we exploit two graph-level optimizations: lazy-rescaling and depth-aware encoding, which allow us to significantly improve performance. Together, these optimizations enable state-of-the-art throughput of 1,998 images/s on the CryptoNets network. Using the client-aided model, we also present homomorphic evaluation of (to our knowledge) the largest network to date, namely, pre-trained MobileNetV2 models on the ImageNet dataset, with 60.4%/82.7% top-1/top-5 accuracy and an amortized runtime of 381 ms/image.
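
The client-aided activation step can be sketched as follows. This is a simplified illustration, not the nGraph-HE2 API: the ToyCKKS class below stands in for a real CKKS context from an HE library, and "encryption" is just the identity so the data flow is visible.

```python
import numpy as np

class ToyCKKS:
    """Stand-in for a CKKS context; 'encryption' is the identity here."""
    def encrypt(self, values):
        return np.asarray(values, dtype=float).copy()
    def decrypt(self, ciphertext):
        return ciphertext.copy()

def server_linear_layer(ct, weights, bias):
    # Ciphertext-plaintext matrix multiply and add: supported homomorphically under CKKS.
    return ct @ weights + bias

def client_activation(ctx, ct):
    # Client decrypts, applies the native (non-polynomial) activation, and re-encrypts.
    return ctx.encrypt(np.maximum(ctx.decrypt(ct), 0.0))

ctx = ToyCKKS()
x_enc = ctx.encrypt([0.5, -1.2, 2.0])                       # client sends encrypted input
w = np.array([[1.0, 0.0, 0.5], [0.2, -0.3, 0.1], [0.0, 1.0, -1.0]])
h_enc = server_linear_layer(x_enc, w, bias=0.1)             # server computes under encryption
h_enc = client_activation(ctx, h_enc)                       # one client round-trip per activation layer
```

In the real system the ciphertexts remain encrypted end to end on the server, and each non-polynomial activation costs one client round-trip of decrypt, evaluate, re-encrypt.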

121 citations

Proceedings ArticleDOI
30 Apr 2019
TL;DR: nGraph-HE as mentioned in this paper is an extension of nGraph, Intel's DL graph compiler, which enables deployment of trained models with popular frameworks such as TensorFlow while simply treating HE as another hardware target.
Abstract: Homomorphic encryption (HE)---the ability to perform computation on encrypted data---is an attractive remedy to increasing concerns about data privacy in deep learning (DL). However, building DL models that operate on ciphertext is currently labor-intensive and requires simultaneous expertise in DL, cryptography, and software engineering. DL frameworks and recent advances in graph compilers have greatly accelerated the training and deployment of DL models to various computing platforms. We introduce nGraph-HE, an extension of nGraph, Intel's DL graph compiler, which enables deployment of trained models with popular frameworks such as TensorFlow while simply treating HE as another hardware target. Our graph-compiler approach enables HE-aware optimizations implemented at compile time, such as constant folding and HE-SIMD packing, and at run time, such as special value plaintext bypass. Furthermore, nGraph-HE integrates with DL frameworks such as TensorFlow, enabling data scientists to benchmark DL models with minimal overhead.
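
Two of the optimizations named above can be sketched in a few lines. The function names and the use of plain NumPy arrays as stand-ins for ciphertext slots are assumptions of this sketch, not the nGraph-HE implementation.

```python
import numpy as np

def he_simd_pack(batch):
    """Batch-axis (HE-SIMD) packing: slot k of each 'ciphertext' holds sample k's
    value for one tensor element, so a single homomorphic op covers the whole batch."""
    batch = np.asarray(batch, dtype=float)                       # shape (batch, features)
    return [batch[:, j].copy() for j in range(batch.shape[1])]   # one slot vector per feature

def mul_plain_with_bypass(ct_slots, plain_scalar):
    """Special value plaintext bypass: multiplying by 0 or 1 needs no homomorphic multiply."""
    if plain_scalar == 0:
        return np.zeros_like(ct_slots)      # result is a known plaintext zero
    if plain_scalar == 1:
        return ct_slots                     # identity: skip the expensive multiply
    return ct_slots * plain_scalar          # otherwise pay for the real ciphertext-plaintext op

packed = he_simd_pack([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])      # 3 samples, 2 features
scaled = mul_plain_with_bypass(packed[0], 1)                     # bypassed: no HE multiply issued
```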

94 citations

Journal ArticleDOI
TL;DR: This work surveys trends in lattice-based cryptographic schemes, some recent fundamental proposals for the use of lattices in computer security, challenges for their implementation in software and hardware, and emerging needs for their adoption.
Abstract: The advent of quantum computing threatens to break many classical cryptographic schemes, leading to innovations in public key cryptography that focus on post-quantum cryptography primitives and protocols resistant to quantum computing threats. Lattice-based cryptography is a promising post-quantum cryptography family, both in terms of foundational properties as well as in its application to both traditional and emerging security problems such as encryption, digital signature, key exchange, and homomorphic encryption. While such techniques provide guarantees, in theory, their realization on contemporary computing platforms requires careful design choices and tradeoffs to manage both the diversity of computing platforms (e.g., high-performance to resource constrained), as well as the agility for deployment in the face of emerging and changing standards. In this work, we survey trends in lattice-based cryptographic schemes, some recent fundamental proposals for the use of lattices in computer security, challenges for their implementation in software and hardware, and emerging needs for their adoption. The survey aims to be informative about the math to allow the reader to focus on the mechanics of the computation ultimately needed for mapping schemes onto existing hardware or synthesizing part or all of a scheme on special-purpose hardware.
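
To make the underlying mechanics concrete, here is a toy Regev-style LWE encryption of a single bit. The parameters are illustrative only and offer no security; practical lattice schemes use structured rings, careful noise distributions, and encodings of full messages.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, q = 16, 64, 3329          # dimension, number of samples, modulus; toy-sized

# Key generation: secret s and public key (A, b = A*s + e mod q) with small noise e.
s = rng.integers(0, q, size=n)
A = rng.integers(0, q, size=(m, n))
e = rng.integers(-2, 3, size=m)
b = (A @ s + e) % q

def encrypt(bit):
    r = rng.integers(0, 2, size=m)              # random 0/1 combination of LWE samples
    c1 = (r @ A) % q
    c2 = (r @ b + bit * (q // 2)) % q           # embed the bit near 0 or q/2
    return c1, c2

def decrypt(c1, c2):
    d = (c2 - c1 @ s) % q                       # = r@e + bit*(q//2) mod q, with r@e small
    return 1 if q // 4 < d < 3 * q // 4 else 0  # round to the nearer of 0 and q/2

c1, c2 = encrypt(1)
assert decrypt(c1, c2) == 1
```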

83 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, applied linear regression models are used for linear regression in the context of quality control, and the results show that linear regression is effective in many applications.
Abstract: (1991). Applied Linear Regression Models. Journal of Quality Technology: Vol. 23, No. 1, pp. 76-77.

1,811 citations

Journal ArticleDOI
TL;DR: A review of the interpretabilities suggested by different research works is provided and they are categorized, in the hope that insight into interpretability will emerge with more consideration for medical practice and that initiatives to push forward data-based, mathematically grounded, and technically grounded medical education will be encouraged.
Abstract: Recently, artificial intelligence and machine learning in general have demonstrated remarkable performances in many tasks, from image processing to natural language processing, especially with the advent of deep learning (DL). Along with research progress, they have encroached upon many different fields and disciplines. Some of them require high level of accountability and thus transparency, for example, the medical sector. Explanations for machine decisions and predictions are thus needed to justify their reliability. This requires greater interpretability, which often means we need to understand the mechanism underlying the algorithms. Unfortunately, the blackbox nature of the DL is still unresolved, and many machine decisions are still poorly understood. We provide a review on interpretabilities suggested by different research works and categorize them. The different categories show different dimensions in interpretability research, from approaches that provide “obviously” interpretable information to the studies of complex patterns. By applying the same categorization to interpretability in medical research, it is hoped that: 1) clinicians and practitioners can subsequently approach these methods with caution; 2) insight into interpretability will be born with more considerations for medical practices; and 3) initiatives to push forward data-based, mathematically grounded, and technically grounded medical education are encouraged.

810 citations

Journal Article
TL;DR: This conversion is the first generic transformation from an arbitrary one-way asymmetric encryption scheme to a chosen-ciphertext secure asymmetric encryption scheme in the random oracle model.
Abstract: This paper shows a generic and simple conversion from weak asymmetric and symmetric encryption schemes into an asymmetric encryption scheme which is secure in a very strong sense: indistinguishability against adaptive chosen-ciphertext attacks in the random oracle model. In particular, this conversion can be applied efficiently to an asymmetric encryption scheme that provides a large enough coin space and, for every message, sufficiently many variants of the encryption, like the ElGamal encryption scheme.
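
A minimal sketch of this style of conversion, assuming textbook ElGamal as the weak asymmetric scheme, an XOR stream derived from SHAKE-256 as the weak symmetric scheme, and SHA-256/SHAKE-256 playing the random oracles H and G. The parameters and helper names are illustrative rather than the paper's construction verbatim; the point is deriving the encryption coins from H(sigma, m) and re-encrypting at decryption to reject tampered ciphertexts.

```python
import hashlib, secrets

# Toy group for textbook ElGamal; parameters are illustrative, not secure choices.
p = 2**255 - 19
g = 2
x = secrets.randbelow(p - 2) + 1                # secret key
y = pow(g, x, p)                                # public key

def H(sigma: bytes, m: bytes) -> int:           # random oracle deriving the ElGamal coins
    return int.from_bytes(hashlib.sha256(sigma + b"|" + m).digest(), "big") % (p - 1)

def G(sigma: bytes, n: int) -> bytes:           # random oracle keying the symmetric part
    return hashlib.shake_256(sigma).digest(n)

def xor_stream(sigma: bytes, msg: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(G(sigma, len(msg)), msg))

def encrypt_with(sigma: bytes, m: bytes):
    r = H(sigma, m)                             # coins fixed by H(sigma, m)
    c1 = (pow(g, r, p), int.from_bytes(sigma, "big") * pow(y, r, p) % p)
    c2 = xor_stream(sigma, m)                   # symmetric part keyed by G(sigma)
    return c1, c2

def encrypt(m: bytes):
    return encrypt_with(secrets.token_bytes(16), m)

def decrypt(c1, c2):
    a, b = c1
    sigma = (b * pow(a, p - 1 - x, p) % p).to_bytes(16, "big")
    m = xor_stream(sigma, c2)
    if encrypt_with(sigma, m) != (c1, c2):      # re-encryption check
        raise ValueError("invalid ciphertext")  # rejects adaptively modified ciphertexts
    return m

assert decrypt(*encrypt(b"hello")) == b"hello"
```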

457 citations

Journal ArticleDOI
TL;DR: GPT-4, as mentioned in this paper, is a Transformer-based model pre-trained to predict the next token in a document; it can accept image and text inputs and produce text outputs.
Abstract: We report the development of GPT-4, a large-scale, multimodal model which can accept image and text inputs and produce text outputs. While less capable than humans in many real-world scenarios, GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers. GPT-4 is a Transformer-based model pre-trained to predict the next token in a document. The post-training alignment process results in improved performance on measures of factuality and adherence to desired behavior. A core component of this project was developing infrastructure and optimization methods that behave predictably across a wide range of scales. This allowed us to accurately predict some aspects of GPT-4's performance based on models trained with no more than 1/1,000th the compute of GPT-4.
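
The "predict performance from much smaller runs" idea amounts to fitting a scaling law and extrapolating. Below is a minimal sketch with synthetic numbers; the actual metrics, loss floors, and fitting procedure are not given in this abstract.

```python
import numpy as np

# Synthetic small-run results: training compute (FLOPs) and final loss.
compute = np.array([1e18, 3e18, 1e19, 3e19, 1e20])
loss = 2.0 + 5.0 * compute ** -0.12             # stand-in for measured losses

# Fit log(loss - floor) = log(a) + b*log(C), assuming a known irreducible-loss floor.
floor = 2.0
b, log_a = np.polyfit(np.log(compute), np.log(loss - floor), 1)
predict = lambda C: floor + np.exp(log_a) * C ** b

print(predict(1e23))                            # extrapolation ~1000x past the largest fitted run
```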

454 citations

Journal ArticleDOI
TL;DR: A definition and a proposed concept for informed machine learning are provided, illustrating its building blocks and distinguishing it from conventional machine learning, and a taxonomy is introduced that serves as a classification framework for informed machine learning approaches.
Abstract: Despite its great success, machine learning can have its limits when dealing with insufficient training data. A potential solution is the additional integration of prior knowledge into the training process which leads to the notion of informed machine learning. In this paper, we present a structured overview of various approaches in this field. We provide a definition and propose a concept for informed machine learning which illustrates its building blocks and distinguishes it from conventional machine learning. We introduce a taxonomy that serves as a classification framework for informed machine learning approaches. It considers the source of knowledge, its representation, and its integration into the machine learning pipeline. Based on this taxonomy, we survey related research and describe how different knowledge representations such as algebraic equations, logic rules, or simulation results can be used in learning systems. This evaluation of numerous papers on the basis of our taxonomy uncovers key methods in the field of informed machine learning.
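
One recurring integration pattern in this taxonomy is to encode prior knowledge as an extra term in the training loss. Below is a minimal sketch with a synthetic dataset and an invented constraint (predictions must be non-negative on the known domain); it is not drawn from any specific paper in the survey.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=50)
y = 2.0 * x + 0.1 * rng.normal(size=50)           # noisy observations

w, b = 0.0, 0.0                                   # simple linear model: pred = w*x + b
lr, lam = 0.1, 1.0                                # learning rate, weight of the knowledge term

for _ in range(500):
    pred = w * x + b
    # Data term: gradient of the mean squared error against observations.
    grad_w = np.mean(2 * (pred - y) * x)
    grad_b = np.mean(2 * (pred - y))
    # Knowledge term: penalize negative predictions on the known domain [0, 1].
    viol = np.minimum(pred, 0.0)                  # nonzero only where the rule is violated
    grad_w += lam * np.mean(2 * viol * x)
    grad_b += lam * np.mean(2 * viol)
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)                                       # fitted parameters respecting the constraint
```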

297 citations