Showing papers by "Srinivas Devadas" published in 2014

Journal Article•DOI•

Physical Unclonable Functions and Applications: A Tutorial

Charles Herder¹, Meng-Day (Mandel) Yu, Farinaz Koushanfar², Srinivas Devadas¹•Institutions (2)

Massachusetts Institute of Technology¹, Rice University²

30 May 2014

TL;DR: This paper motivates the use of PUFs versus conventional secure nonvolatile memories, defines the two primary PUF types, and describes strong and weak PUF implementations and their use for low-cost authentication and key generation applications.

Abstract: This paper describes the use of physical unclonable functions (PUFs) in low-cost authentication and key generation applications. First, it motivates the use of PUFs versus conventional secure nonvolatile memories and defines the two primary PUF types: “strong PUFs” and “weak PUFs.” It describes strong PUF implementations and their use for low-cost authentication. After this description, the paper covers both attacks and protocols to address errors. Next, the paper covers weak PUF implementations and their use in key generation applications. It covers error-correction schemes such as pattern matching and index-based coding. Finally, this paper reviews several emerging concepts in PUF technologies such as public model PUFs and new PUF implementation technologies.

977 citations

Journal Article•DOI•

Staring into the abyss: an evaluation of concurrency control with one thousand cores

Xiangyao Yu¹, George Bezerra¹, Andrew Pavlo², Srinivas Devadas¹, Michael Stonebraker¹•Institutions (2)

Massachusetts Institute of Technology¹, Carnegie Mellon University²

01 Nov 2014

TL;DR: In this article, the authors evaluate concurrency control for on-line transaction processing (OLTP) workloads on many-core chips and show that the complexity of coordinating competing accesses to data will likely diminish the gains from increased core counts.

Abstract: Computer architectures are moving towards an era dominated by many-core machines with dozens or even hundreds of cores on a single chip. This unprecedented level of on-chip parallelism introduces a new dimension to scalability that current database management systems (DBMSs) were not designed for. In particular, as the number of cores increases, the problem of concurrency control becomes extremely challenging. With hundreds of threads running in parallel, the complexity of coordinating competing accesses to data will likely diminish the gains from increased core counts. To better understand just how unprepared current DBMSs are for future CPU architectures, we performed an evaluation of concurrency control for on-line transaction processing (OLTP) workloads on many-core chips. We implemented seven concurrency control algorithms on a main-memory DBMS and, using computer simulations, scaled our system to 1024 cores. Our analysis shows that all algorithms fail to scale to this magnitude but for different reasons. In each case, we identify fundamental bottlenecks that are independent of the particular database implementation and argue that even state-of-the-art DBMSs suffer from these limitations. We conclude that rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture that is built from the ground up and is tightly coupled with the hardware.

239 citations

Journal Article•DOI•

Robust and Reverse-Engineering Resilient PUF Authentication and Key-Exchange by Substring Matching

Masoud Rostami¹, Mehrdad Majzoobi¹, Farinaz Koushanfar¹, Dan S. Wallach¹, Srinivas Devadas²•Institutions (2)

Rice University¹, Massachusetts Institute of Technology²

16 Jan 2014-IEEE Transactions on Emerging Topics in Computing

TL;DR: Novel robust and low-overhead physical unclonable function (PUF) authentication and key exchange protocols that are resilient against reverse-engineering attacks are proposed, and their low overhead and practicality are confirmed by hardware implementation.

Abstract: This paper proposes novel robust and low-overhead physical unclonable function (PUF) authentication and key exchange protocols that are resilient against reverse-engineering attacks. The protocols are executed between a party with access to a physical PUF (prover) and a trusted party who has access to the PUF compact model (verifier). The proposed protocols do not follow the classic paradigm of exposing the full PUF responses or a transformation of them. Instead, random subsets of the PUF response strings are sent to the verifier so the exact position of the subset is obfuscated for the third-party channel observers. Authentication of the responses at the verifier side is done by matching the substring to the available full response string; the index of the matching point is the actual obfuscated secret (or key) and not the response substring itself. We perform a thorough analysis of resiliency of the protocols against various adversarial acts, including machine learning and statistical attacks. The attack analysis guides us in tuning the parameters of the protocol for an efficient and secure implementation. The low overhead and practicality of the protocols are evaluated and confirmed by hardware implementation.

160 citations

Proceedings Article•DOI•

Suppressing the Oblivious RAM timing channel while making information leakage and program efficiency trade-offs

Christopher W. Fletcher¹, Ling Ren¹, Xiangyao Yu¹, Marten van Dijk², Omer Khan², Srinivas Devadas¹•Institutions (2)

Massachusetts Institute of Technology¹, University of Connecticut²

19 Jun 2014

TL;DR: This paper shows how a secure processor can bound ORAM timing channel leakage to a user-controllable leakage limit, and presents a dynamic scheme that leaks at most 32 bits through the ORAM timing channel and introduces only 20% performance overhead and 12% power overhead relative to a baseline ORAM that has no timing channel protection.

Abstract: Oblivious RAM (ORAM) is an established cryptographic technique to hide a program's address pattern to an untrusted storage system. More recently, ORAM schemes have been proposed to replace conventional memory controllers in secure processor settings to protect against information leakage in external memory and the processor I/O bus.

98 citations

Proceedings Article•DOI•

A noise bifurcation architecture for linear additive physical functions

Meng-Day (Mandel) Yu, David M'Raihi¹, Ingrid Verbauwhede², Srinivas Devadas²•Institutions (2)

Katholieke Universiteit Leuven¹, Massachusetts Institute of Technology²

06 May 2014

TL;DR: This work presents the first architecture for linear additive physical functions where the noise seen by the adversary and the noise seen by the verifier are bifurcated by using a randomized decimation technique and a novel response recovery method at an authentication verification server.

Abstract: Physical Unclonable Functions (PUFs) allow a silicon device to be authenticated based on its manufacturing variations using challenge/response evaluations. Popular realizations use linear additive functions as building blocks. Security is scaled up using non-linear mixing (e.g., adding XORs). Because the responses are physically derived and thus noisy, the resulting explosion in noise impacts both the adversary (which is desirable) as well as the verifier (which is undesirable). We present the first architecture for linear additive physical functions where the noise seen by the adversary and the noise seen by the verifier are bifurcated by using a randomized decimation technique and a novel response recovery method at an authentication verification server. We allow the adversary's noise η_a → 0.50 while keeping the verifier's noise η_v constant, using a parameter-based authentication modality that does not require explicit challenge/response pair storage at the server. We present supporting data using 28nm FPGA PUF noise results as well as machine learning attack results. We demonstrate that our architecture can also withstand recent side-channel attacks that filter the noise (to clean up training challenge/response labels) prior to machine learning.

72 citations

Posted Content•

Ring ORAM: Closing the Gap Between Small and Large Client Storage Oblivious RAM.

Ling Ren, Christopher W. Fletcher, Albert Kwon, Emil Stefanov, Elaine Shi, Marten van Dijk, Srinivas Devadas

01 Jan 2014-IACR Cryptology ePrint Archive

38 citations

Proceedings Article•DOI•

Locality-aware data replication in the Last-Level Cache

George Kurian¹, Srinivas Devadas¹, Omer Khan²•Institutions (2)

Massachusetts Institute of Technology¹, University of Connecticut²

19 Jun 2014

TL;DR: This work proposes a locality-aware selective data replication protocol for the last-level cache (LLC) that aims to lower memory access latency and energy by replicating only high locality cache lines in the LLC slice of the requesting core, while simultaneously keeping the off-chip miss rate low.

Abstract: Next-generation multicores will process massive data with varying degrees of locality. Harnessing on-chip data locality to optimize the utilization of cache and network resources is of fundamental importance. We propose a locality-aware selective data replication protocol for the last-level cache (LLC). Our goal is to lower memory access latency and energy by replicating only high-locality cache lines in the LLC slice of the requesting core, while simultaneously keeping the off-chip miss rate low. Our approach relies on low-overhead yet highly accurate in-hardware run-time classification of data locality at the cache line granularity, and only allows replication for cache lines with high reuse. Furthermore, our classifier captures the LLC pressure at the existing replica locations and adapts its replication decision accordingly. The locality tracking mechanism is decoupled from the sharer tracking structures that cause scalability concerns in traditional coherence protocols. Moreover, the complexity of our protocol is low since no additional coherence states are created. On a set of parallel benchmarks, our protocol reduces the overall energy by 16%, 14%, 13% and 21% and the completion time by 4%, 9%, 6% and 13% when compared to the previously proposed Victim Replication, Adaptive Selective Replication, Reactive-NUCA and Static-NUCA LLC management schemes, respectively.

36 citations

Patent•

PUF Authentication and Key-Exchange by Substring Matching

Masoud Rostami¹, Mehrdad Majzoobi², Farinaz Koushanfar², Dan S. Wallach², Srinivas Devadas²•Institutions (2)

Rice University¹, Massachusetts Institute of Technology²

03 Jan 2014

TL;DR: In this article, the authors propose a verifier to verify the authenticity of a prover device using a probabilistic model of a physical unclonable function (PUF).

Abstract: Mechanisms for operating a prover device and a verifier device so that the verifier device can verify the authenticity of the prover device. The prover device generates a data string by: (a) submitting a challenge to a physical unclonable function (PUF) to obtain a response string, (b) selecting a substring from the response string, (c) injecting the selected substring into the data string, and (d) injecting random bits into bit positions of the data string not assigned to the selected substring. The verifier: (e) generates an estimated response string by evaluating a computational model of the PUF based on the challenge; (f) performs a search process to identify the selected substring within the data string using the estimated response string; and (g) determines whether the prover device is authentic based on a measure of similarity between the identified substring and a corresponding substring of the estimated response string.
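
Steps (b) through (g) above can be illustrated with a minimal Python sketch. It is not the patented mechanism itself: the PUF output is replaced by random bits, the verifier's model error is represented by four flipped bits, Hamming distance stands in for the unspecified "measure of similarity", and all function names, lengths, and the threshold are assumptions chosen for illustration.

```python
import random

def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))

def prover_data_string(response, sub_len, total_len, rng):
    """Steps (b)-(d): hide a random substring of the response among random bits."""
    start = rng.randrange(len(response) - sub_len + 1)
    offset = rng.randrange(total_len - sub_len + 1)
    data = [rng.randrange(2) for _ in range(total_len)]
    data[offset:offset + sub_len] = response[start:start + sub_len]
    return data

def verifier_accepts(data, estimated, sub_len, max_dist):
    """Steps (f)-(g): search both strings for the best-matching window."""
    best = min(
        hamming(data[i:i + sub_len], estimated[j:j + sub_len])
        for i in range(len(data) - sub_len + 1)
        for j in range(len(estimated) - sub_len + 1)
    )
    return best <= max_dist

rng = random.Random(0)
response = [rng.randrange(2) for _ in range(128)]  # stand-in for step (a)
estimated = list(response)                         # stand-in for step (e)
for pos in (3, 40, 77, 101):                       # assumed model error: 4 bits
    estimated[pos] ^= 1

data = prover_data_string(response, sub_len=32, total_len=64, rng=rng)
authentic = verifier_accepts(data, estimated, sub_len=32, max_dist=4)
```

Because the true window differs from the estimate in at most four positions, the authentic prover is always accepted in this setup. The exhaustive two-dimensional search is quadratic and only meant to make the idea concrete.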

29 citations

Posted Content•

RAW Path ORAM: A Low-Latency, Low-Area Hardware ORAM Controller with Integrity Verification.

Christopher W. Fletcher, Ling Ren, Albert Kwon, Marten van Dijk, Emil Stefanov, Srinivas Devadas

01 Jan 2014-IACR Cryptology ePrint Archive

26 citations

Posted Content•

Automated Design, Implementation, and Evaluation of Arbiter-based PUF on FPGA using Programmable Delay Lines

Srinivas Devadas, Akshat Kharaya, Farinaz Koushanfar, Mehrdad Majzoobi

18 Aug 2014-IACR Cryptology ePrint Archive

Abstract: This paper proposes a novel approach for automated implementation of an arbiter-based physical unclonable function (PUF) on field programmable gate arrays (FPGAs). We introduce a high resolution programmable delay logic (PDL) that is implemented by harnessing the FPGA lookup-table (LUT) internal structure. PDL allows automatic fine tuning of delays that can mitigate the timing skews caused by asymmetries in interconnect routing and systematic variations. To thwart the arbiter metastability problem, we present and analyze methods for majority voting of responses. A method to classify and group challenges into different robustness sets is introduced that enhances the corresponding responses’ stability in the face of operational variations. The trade-off between response stability and response entropy (uniqueness) is investigated through comprehensive measurements. We exploit the correlation between the impact of temperature and power supply on responses and perform less costly power measurements to predict the temperature impact on the PUF. The measurements are performed on 12 identical Virtex 5 FPGAs across 9 different accurately controlled operating temperature and voltage supply points. A database of challenge response pairs (CRPs) is collected and made openly available for the research community.
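
The majority-voting idea mentioned above can be sketched in a few lines of Python. This is a generic illustration, not the paper's specific method: `noisy_arbiter` is a hypothetical stand-in for one hardware evaluation of the same challenge, and the flip probability and repetition count are assumed values.

```python
import random

def majority_vote(samples):
    """Return the bit seen in more than half of the samples (ties give 0)."""
    return 1 if 2 * sum(samples) > len(samples) else 0

def noisy_arbiter(true_bit, flip_prob, rng):
    """Hypothetical stand-in for one arbiter evaluation that may flip its output."""
    return true_bit ^ (rng.random() < flip_prob)

rng = random.Random(42)
# Evaluate the same challenge an odd number of times, then vote.
samples = [noisy_arbiter(1, flip_prob=0.2, rng=rng) for _ in range(11)]
stable_bit = majority_vote(samples)
```

Using an odd number of repetitions avoids ties; more repetitions reduce the residual error rate at the cost of evaluation time.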

26 citations

Proceedings Article•DOI•

A self-aware processor SoC using energy monitors integrated into power converters for self-adaptation

Yildiz Sinangil¹, Sabrina M. Neuman¹, Mahmut E. Sinangil², Nathan Ickes¹, George Bezerra¹, Eric Lau¹, Jason E. Miller¹, Henry Hoffmann³, Srinivas Devadas¹, Anantha P. Chandrakasan¹•Institutions (3)

Massachusetts Institute of Technology¹, Nvidia², University of Chicago³

10 Jun 2014

TL;DR: A self-aware processor with energy monitors embedded in its on-chip DC/DC converters enables a software self-aware computation engine (SEEC) to dynamically adapt the processor to meet performance and energy goals; measurement results show that up to 8.4× energy savings can be achieved with DVFS and self-adaptation.

Abstract: This paper presents a self-aware processor with energy monitoring circuits that can measure actual energy consumption of the key blocks. The monitors are embedded into on-chip DC/DC converters and generate results within 10% of accuracy with minimal power (<0.1%) and area (<1%) overhead. Our system, which is implemented in 0.18μm technology, is designed to be voltage scalable from 1.8V down to 0.6V. Low-voltage SRAM operation is made possible through the use of 8T bit-cells and write-assists. The d-caches are designed to be re-configurable in associativity and size to adapt to compute- versus cache-bound phases of applications. Cache configuration is performed in <3 clock cycles including tag invalidation. These hardware features enable a software self-aware computation engine (SEEC) to dynamically adapt the processor to meet performance and energy goals. Measurement results show that up to 8.4× energy savings can be achieved with DVFS and self-adaptation.

Posted Content•

Unified Oblivious-RAM: Improving Recursive ORAM with Locality and Pseudorandomness.

Ling Ren, Christopher W. Fletcher, Xiangyao Yu, Albert Kwon, Marten van Dijk, Srinivas Devadas

01 Jan 2014-IACR Cryptology ePrint Archive

TL;DR: Unified ORAM improves performance both asymptotically and empirically and reduces data movement from ORAM by half and improves benchmark performance by 61% as compared to recursive Path ORAM.

Abstract: Oblivious RAM (ORAM) is a cryptographic primitive that hides memory access patterns to untrusted storage. ORAM may be used in secure processors for encrypted computation and/or software protection. While recursive Path ORAM is currently the most practical ORAM for secure processors, it still incurs large performance and energy overhead and is the performance bottleneck of recently proposed secure processors. In this paper, we propose two optimizations to recursive Path ORAM. First, we identify a type of program locality in its operations to improve performance. Second, we use a pseudorandom function to compress the position map. However, applying these two techniques in recursive Path ORAM breaks ORAM security. To securely take advantage of the two ideas, we propose unified ORAM. Unified ORAM improves performance both asymptotically and empirically. Empirically, our experiments show that unified ORAM reduces data movement from ORAM by half and improves benchmark performance by 61% as compared to recursive Path ORAM.
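
One way to picture position-map compression with a pseudorandom function is to store a small per-block counter and derive the leaf label from it, rather than storing the label itself. The sketch below assumes that reading; the abstract does not spell out the construction, so the counter scheme, the use of HMAC-SHA256 as the PRF, and the names `prf_leaf` and `CompressedPositionMap` are all illustrative assumptions. It shows only the derivation, not the security argument or the unified ORAM itself.

```python
import hashlib
import hmac

def prf_leaf(key, block_addr, counter, num_leaves):
    """Derive a leaf label as PRF_key(block_addr || counter) mod num_leaves."""
    msg = block_addr.to_bytes(8, "big") + counter.to_bytes(8, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "big") % num_leaves

class CompressedPositionMap:
    """Stores one counter per block; leaf labels are recomputed on demand."""

    def __init__(self, key, num_leaves):
        self.key = key
        self.num_leaves = num_leaves
        self.counters = {}

    def leaf(self, addr):
        return prf_leaf(self.key, addr, self.counters.get(addr, 0), self.num_leaves)

    def remap(self, addr):
        """Bump the counter so the block moves to a fresh pseudorandom leaf."""
        self.counters[addr] = self.counters.get(addr, 0) + 1
        return self.leaf(addr)

pmap = CompressedPositionMap(key=b"\x00" * 16, num_leaves=1024)
old_leaf = pmap.leaf(5)
new_leaf = pmap.remap(5)
```

A counter is typically much smaller than a full leaf label, which is where the compression would come from under this assumption.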

Journal Article•DOI•

Thread Migration Prediction for Distributed Shared Caches

Keun Sup Shim¹, Mieszko Lis¹, Omer Khan², Srinivas Devadas¹•Institutions (2)

Massachusetts Institute of Technology¹, University of Connecticut²

14 Jan 2014-IEEE Computer Architecture Letters

TL;DR: This approach can better exploit shared data locality for NUCA designs by effectively replacing multiple round-trip remote cache accesses with a smaller number of migrations, and improves the performance by 24% on average over the shared-NUCA design that only uses remote accesses.

Abstract: Chip-multiprocessors (CMPs) have become the mainstream parallel architecture in recent years; for scalability reasons, designs with high core counts tend towards tiled CMPs with physically distributed shared caches. This naturally leads to a Non-Uniform Cache Access (NUCA) design, where on-chip access latencies depend on the physical distances between requesting cores and home cores where the data is cached. Improving data locality is thus key to performance, and several studies have addressed this problem using data replication and data migration. In this paper, we consider another mechanism, hardware-level thread migration. This approach, we argue, can better exploit shared data locality for NUCA designs by effectively replacing multiple round-trip remote cache accesses with a smaller number of migrations. High migration costs, however, make it crucial to use thread migrations judiciously; we therefore propose a novel, on-line prediction scheme which decides whether to perform a remote access (as in traditional NUCA designs) or to perform a thread migration at the instruction level. For a set of parallel benchmarks, our thread migration predictor improves the performance by 24% on average over the shared-NUCA design that only uses remote accesses.
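
The remote-access-versus-migration decision can be illustrated with a toy predictor. The abstract does not describe the predictor's internals, so everything below is an assumption made for illustration: the predictor marks an instruction address (`pc`) as "migratory" once it has started a run of at least `threshold` consecutive accesses to the same home core, and later chooses migration for marked instructions.

```python
class MigrationPredictor:
    """Toy instruction-level predictor: migrate only for instructions that
    previously started a long run of accesses to a single home core."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.migratory_pcs = set()
        self._run_pc = None    # instruction that started the current run
        self._run_home = None  # home core of the current run
        self._run_len = 0

    def decide(self, pc, home_core, current_core):
        if home_core == current_core:
            return "local"
        return "migrate" if pc in self.migratory_pcs else "remote"

    def observe(self, pc, home_core):
        """Train on the access stream by tracking runs to one home core."""
        if home_core == self._run_home:
            self._run_len += 1
        else:
            self._run_pc, self._run_home, self._run_len = pc, home_core, 1
        if self._run_len >= self.threshold:
            self.migratory_pcs.add(self._run_pc)

predictor = MigrationPredictor(threshold=2)
first = predictor.decide(0x10, home_core=3, current_core=0)
predictor.observe(0x10, home_core=3)
predictor.observe(0x14, home_core=3)
second = predictor.decide(0x10, home_core=3, current_core=0)
```

The threshold captures the trade-off described above: a single migration only pays off when it replaces enough round-trip remote accesses.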

Posted Content•

Enhancing Oblivious RAM Performance Using Dynamic Prefetching.

Xiangyao Yu, Ling Ren, Christopher W. Fletcher, Albert Kwon, Marten van Dijk, Srinivas Devadas

01 Jan 2014-IACR Cryptology ePrint Archive

TL;DR: In this paper, the authors propose an ORAM prefetching technique called dynamic super block scheme and comprehensively explore its design space, which detects data locality in the program working set at runtime, and exploits the locality in a data-independent way.

Abstract: Oblivious RAM (ORAM) is an established technique to hide the access pattern to an untrusted storage system. With ORAM, a curious adversary cannot tell what data address the user is accessing when observing the bits moving between the user and the storage system. All existing ORAM schemes achieve obliviousness by adding redundancy to the storage system, i.e., each access is turned into multiple random accesses. Such redundancy incurs a large performance overhead. Though traditional data prefetching techniques successfully hide memory latency in DRAM based systems, it turns out that they do not work well for ORAM. In this paper, we exploit ORAM locality by taking advantage of the ORAM internal structures. Though it might seem apparent that obliviousness and locality are two contradictory concepts, we challenge this intuition by exploiting data locality in ORAM without sacrificing provable security. In particular, we propose an ORAM prefetching technique called dynamic super block scheme and comprehensively explore its design space. The dynamic super block scheme detects data locality in the program’s working set at runtime, and exploits the locality in a data-independent way. Our simulation results show that with dynamic super block scheme, ORAM performance without super blocks can be significantly improved. After adding timing protection to ORAM, the average performance gain is 25.5% (up to 49.4%) over the baseline ORAM and 16.6% (up to 30.1%) over the best ORAM prefetching technique proposed previously.

Proceedings Article•DOI•

Power modeling and other new features in the Graphite simulator

George Kurian¹, Sabrina M. Neuman¹, George Bezerra¹, Anthony Giovinazzo¹, Srinivas Devadas¹, Jason E. Miller¹•Institutions (1)

Massachusetts Institute of Technology¹

23 Mar 2014

TL;DR: This paper describes improvements to the Graphite simulator that make it well suited to exploring both power and performance in future multicore and manycore processors, especially those incorporating dynamic runtime monitoring and adaptation.

Abstract: This paper describes recent improvements to the Graphite simulator designed to help explore current and emerging research topics. With these improvements, Graphite is ideally suited to explore both power and performance in future multicore and manycore processors, especially those incorporating dynamic runtime monitoring and adaptation. Separate validation of Graphite has shown performance results within about 6% on average (18% worst case) of a cycle-level simulator, and normalized power trends are predicted to within 10%. This makes Graphite accurate enough for medium- to long-term studies while maintaining very high performance. Graphite is freely available for anyone to use: http://graphite.csail.mit.edu.

Posted Content•

Trapdoor Computational Fuzzy Extractors.

Charles Herder, Ling Ren, Marten van Dijk, Meng-Day (Mandel) Yu, Srinivas Devadas

01 Jan 2014-IACR Cryptology ePrint Archive

TL;DR: A method of cryptographically-secure key extraction from a noisy biometric source using a fuzzy commitment scheme and shows how keys can be extracted securely and efficiently even under extreme environmental variation.

Proceedings Article•DOI•

Author retrospective AEGIS: architecture for tamper-evident and tamper-resistant processing

G. Edward Suh¹, Christopher W. Fletcher², Dwaine Clarke², Blaise Gassend², Marten van Dijk³, Srinivas Devadas²•Institutions (3)

Cornell University¹, Massachusetts Institute of Technology², University of Connecticut³

10 Jun 2014

TL;DR: AEGIS is a single-chip secure processor that can be used to protect the integrity and confidentiality of an application program from both physical and software attacks.

Abstract: AEGIS is a single-chip secure processor that can be used to protect the integrity and confidentiality of an application program from both physical and software attacks. We briefly describe the history behind this architecture and its key features, discuss main observations and lessons from the project, and list limitations of AEGIS and how recent research addresses them.

Proceedings Article•DOI•

Algorithms for scheduling task-based applications onto heterogeneous many-core architectures

Michel A. Kinsy¹, Srinivas Devadas²•Institutions (2)

University of Oregon¹, Massachusetts Institute of Technology²

01 Sep 2014

TL;DR: This paper presents an ILP formulation and two non-iterative heuristics for scheduling a task-based application onto a heterogeneous many-core architecture; the heuristics target large problems where the ILP convergence time may be too long.

Abstract: In this paper we present an Integer Linear Programming (ILP) formulation and two non-iterative heuristics for scheduling a task-based application onto a heterogeneous many-core architecture. Our ILP formulation is able to handle different application performance targets, e.g., low execution time, low memory miss rate, and different architectural features, e.g., cache sizes. For large problems where the ILP convergence time may be too long, we propose a simple mapping algorithm which tries to spread tasks onto as many processing units as possible, and a more elaborate heuristic that shows good mapping performance when compared to the ILP formulation. We use two realistic power electronics applications to evaluate our mapping techniques on full RTL many-core systems consisting of eight different types of processor cores.
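
The "spread tasks onto as many processing units as possible" idea can be pictured as a round-robin assignment. The sketch below is only that generic reading; the paper's actual mapping algorithm, its cost model, and its handling of heterogeneous core types are not given in the abstract, and the function name and task/unit labels are assumptions.

```python
def spread_tasks(tasks, units):
    """Assign tasks to processing units in round-robin order so that no unit
    receives a second task before every unit has received one."""
    if not units:
        raise ValueError("at least one processing unit is required")
    return {task: units[i % len(units)] for i, task in enumerate(tasks)}

mapping = spread_tasks(["t0", "t1", "t2", "t3", "t4"], ["core0", "core1", "core2"])
```

A spreading heuristic like this ignores task affinity and core capabilities, which is presumably why the abstract also proposes a more elaborate heuristic.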

Algorithms for scheduling task-based applications onto heterogeneous many-core architectures

Michel A. Kinsy¹, Srinivas Devadas²•Institutions (2)

University of Oregon¹, Massachusetts Institute of Technology²

01 Sep 2014

TL;DR: This paper proposes a simple mapping algorithm which tries to spread tasks onto as many processing units as possible, and a more elaborate heuristic that shows good mapping performance when compared to the ILP formulation.

Proceedings Article•DOI•

Low-overhead hard real-time aware interconnect network router

Michel A. Kinsy¹, Srinivas Devadas²•Institutions (2)

University of Oregon¹, Massachusetts Institute of Technology²

01 Sep 2014

TL;DR: In this paper, the authors proposed a network-on-chip router that provides predictable and deterministic communication latency for hard real-time data traffic while maintaining high concurrency and throughput for best-effort/general-purpose traffic with minimal hardware overhead.

Abstract: The increasing complexity of embedded systems is accelerating the use of multicore processors in these systems. This trend gives rise to new problems such as the sharing of on-chip network resources among hard real-time and normal best effort data traffic. We propose a network-on-chip router that provides predictable and deterministic communication latency for hard real-time data traffic while maintaining high concurrency and throughput for best-effort/general-purpose traffic with minimal hardware overhead. The proposed router requires less area than non-interfering networks, and provides better Quality of Service (QoS) in terms of predictability and determinism to hard real-time traffic than priority-based routers. We present a deadlock-free algorithm for decoupled routing of the two types of traffic. We compare the area and power estimates of three different router architectures with various QoS schemes using the IBM 45-nm SOI CMOS technology cell library. Performance evaluations are done using three realistic benchmark applications: a hybrid electric vehicle application, a utility grid connected photovoltaic converter system, and a variable speed induction motor drive application.

Physical Unclonable Functions and Applications: A Tutorial

This paper is a tutorial on ongoing work in physical-disorder-based security, security analysis, and implementation choices.

Charles Herder, Farinaz Koushanfar, Srinivas Devadas

01 Jan 2014

TL;DR: This paper describes the use of physical unclonable functions (PUFs) in low-cost authentication and key generation applications and defines the two primary PUF types: “strong PUFs” and “weak PUFs.”

Abstract: This paper describes the use of physical unclonable functions (PUFs) in low-cost authentication and key generation applications. First, it motivates the use of PUFs versus conventional secure nonvolatile memories and defines the two primary PUF types: “strong PUFs” and “weak PUFs.” It describes strong PUF implementations and their use for low-cost authentication. After this description, the paper covers both attacks and protocols to address errors. Next, the paper covers weak PUF implementations and their use in key generation applications. It covers error-correction schemes such as pattern matching and index-based coding. Finally, this paper reviews several emerging concepts in PUF technologies such as public model PUFs and new PUF implementation technologies.

Proceedings Article•DOI•

Keynote addresses: Can EDA solve the problems of electronics designs for the cars of the future?

Peter van Staa¹, Srinivas Devadas²•Institutions (2)

Bosch¹, Massachusetts Institute of Technology²

01 Nov 2014

TL;DR: An EDA system is essential for completing an electronics design in due time, given continuously shorter design cycles, larger product spectra, and high pressure on development costs due to increasing competition on the world market.

Abstract: Electronic systems in modern cars contribute more than 80% of the innovation in the automotive industry and are probably already the most complex systems in today's products. This complexity is due not to the sheer number of components in each device but to the number of devices and their heterogeneous nature, combining analogue and digital circuits with sensors, actuators and software. In addition, the very high demand on robustness and reliability, to assure safety and availability at any time and everywhere under rough working conditions, requires specific effort in the quality management of the electronics. While in the past a car was more or less a closed system, today the use of any kind of multimedia and communication with the internet and, increasingly, with all parts of the surrounding traffic has become a key asset in the development of modern cars. All these aspects have to be addressed by an EDA system, which is essential for completing an electronics design in due time given continuously shorter design cycles, larger product spectra, and high pressure on development costs due to increasing competition on the world market. A further challenge for an EDA environment in the automotive design chain is the management of a large number of players with different backgrounds over a broad spectrum of abstraction levels.

Journal Article•DOI•

Simultaneous Alignment and Folding of Protein Sequences

Jérôme Waldispühl¹, Charles W. O'Donnell, Sebastian Will, Srinivas Devadas, Rolf Backofen, Bonnie Berger•Institutions (1)

McGill University¹

03 Jul 2014-Journal of Computational Biology

TL;DR: partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding.

Abstract: Accurate comparative analysis tools for low-homology proteins remain a difficult challenge in computational biology, especially for sequence alignment and consensus folding problems. We present partiFold-Align, the first algorithm for simultaneous alignment and consensus folding of unaligned protein sequences; the algorithm's complexity is polynomial in time and space. Algorithmically, partiFold-Align exploits sparsity in the set of super-secondary structure pairings and alignment candidates to achieve an effectively cubic running time for simultaneous pairwise alignment and folding. We demonstrate the efficacy of these techniques on transmembrane β-barrel proteins, an important yet difficult class of proteins with few known three-dimensional structures. Testing against structurally derived sequence alignments, partiFold-Align significantly outperforms state-of-the-art pairwise and multiple sequence alignment tools in the most difficult low-sequence homology case. It also improves secondary structur...

Posted Content•

Constants Count: Practical Improvements to Oblivious RAM

Ling Ren¹, Christopher W. Fletcher¹, Albert Kwon¹, Emil Stefanov², Elaine Shi³, Marten van Dijk⁴, Srinivas Devadas¹•Institutions (4)

Massachusetts Institute of Technology¹, University of California, Berkeley², Cornell University³, University of Connecticut⁴

01 Jan 2014-IACR Cryptology ePrint Archive

TL;DR: Ring ORAM is the first tree-based ORAM whose bandwidth is independent of the ORAM bucket size, a property that unlocks multiple performance improvements, including overall bandwidth that is 2.3× to 4× better than Path ORAM, the prior-art scheme for small client storage.

Abstract: Oblivious RAM (ORAM) is a cryptographic primitive that hides memory access patterns as seen by untrusted storage. This paper proposes Ring ORAM, the most bandwidth-efficient ORAM scheme for the small client storage setting in both theory and practice. Ring ORAM is the first tree-based ORAM whose bandwidth is independent of the ORAM bucket size, a property that unlocks multiple performance improvements. First, Ring ORAM's overall bandwidth is 2.3× to 4× better than Path ORAM, the prior-art scheme for small client storage. Second, if memory can perform simple untrusted computation, Ring ORAM achieves constant online bandwidth (∼ 60× improvement over Path ORAM for practical parameters). As a case study, we show Ring ORAM speeds up program completion time in a secure processor by 1.5× relative to Path ORAM. On the theory side, Ring ORAM features a tighter and significantly simpler analysis than Path ORAM.

Proceedings Article•DOI•

Author retrospective for analytical cache models with applications to cache partitioning

G. Edward Suh¹, George Kurian², Srinivas Devadas², Larry Rudolph²•Institutions (2)

Cornell University¹, Massachusetts Institute of Technology²

10 Jun 2014

TL;DR: The history of the work, primary observations and lessons that were learned from the modeling effort, and follow-up work to show how the research direction evolved over time are summarized.

Abstract: This paper presents the author retrospective on the analytical cache modeling work published in the 2001 International Conference on Supercomputing (ICS). We summarize the history of the work, revisit primary observations and lessons that we learned from the modeling effort, and also briefly describe follow-up work to show how the research direction evolved over time. Original paper: http://dx.doi.org/10.1145/377792.377797
