Showing papers by "Hewlett-Packard published in 2009"

PDF

Open Access

Proceedings Article•DOI•

McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures

[...]

Sheng Li¹, Jung Ho Ahn², Richard Strong³, Jay B. Brockman¹, Dean M. Tullsen³, Norman P. Jouppi⁴ - Show less +2 more•Institutions (4)

University of Notre Dame¹, Seoul National University², University of California, San Diego³, Hewlett-Packard⁴

12 Dec 2009

TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account clustering 8 cores together gives the best energy-delay product, whereas when cost is taking into account configuring clusters with 4 cores gives thebest EDA2P and EDAP.

...read moreread less

Abstract: This paper introduces McPAT, an integrated power, area, and timing modeling framework that supports comprehensive design space exploration for multicore and manycore processor configurations ranging from 90nm to 22nm and beyond. At the microarchitectural level, McPAT includes models for the fundamental components of a chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, integrated memory controllers, and multiple-domain clocking. At the circuit and technology levels, McPAT supports critical-path timing modeling, area modeling, and dynamic, short-circuit, and leakage power modeling for each of the device types forecast in the ITRS roadmap including bulk CMOS, SOI, and double-gate transistors. McPAT has a flexible XML interface to facilitate its use with many performance simulators. Combined with a performance simulator, McPAT enables architects to consistently quantify the cost of new ideas and assess tradeoffs of different architectures using new metrics like energy-delay-area2 product (EDA2P) and energy-delay-area product (EDAP). This paper explores the interconnect options of future manycore processors by varying the degree of clustering over generations of process technologies. Clustering will bring interesting tradeoffs between area and performance because the interconnects needed to group cores into clusters incur area overhead, but many applications can make good use of them due to synergies of cache sharing. Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account configuring clusters with 4 cores gives the best EDA2P and EDAP.

...read moreread less

2,487 citations

Journal Article•DOI•

The mechanism of electroforming of metal oxide memristive switches

[...]

Jianhua Yang¹, Feng Miao², Matthew D. Pickett¹, Douglas A. A. Ohlberg¹, Duncan Stewart¹, Chun Ning Lau², R. Stanley Williams¹ - Show less +3 more•Institutions (2)

Hewlett-Packard¹, University of California, Riverside²

05 May 2009-Nanotechnology

TL;DR: The nature of the oxide electroforming as an electro-reduction and vacancy creation process caused by high electric fields and enhanced by electrical Joule heating is explained with direct experimental evidence.

...read moreread less

Abstract: Metal and semiconductor oxides are ubiquitous electronic materials. Normally insulating, oxides can change behavior under high electric fields—through ‘electroforming’ or ‘breakdown’—critically affecting CMOS (complementary metal‐oxide‐semiconductor) logic, DRAM (dynamic random access memory) and flash memory, and tunnel barrier oxides. An initial irreversible electroforming process has been invariably required for obtaining metal oxide resistance switches, which may open urgently needed new avenues for advanced computer memory and logic circuits including ultra-dense non-volatile random access memory (NVRAM) and adaptive neuromorphic logic circuits. This electrical switching arises from the coupled motion of electrons and ions within the oxide material, as one of the first recognized examples of a memristor (memory‐resistor) device, the fourth fundamental passive circuit element originally predicted in 1971 by Chua. A lack of device repeatability has limited technological implementation of oxide switches, however. Here we explain the nature of the oxide electroforming as an electro-reduction and vacancy creation process caused by high electric fields and enhanced by electrical Joule heating with direct experimental evidence. Oxygen vacancies are created and drift towards the cathode, forming localized conducting channels in the oxide. Simultaneously, O 2− ions drift towards the anode where they evolve O2 gas, causing physical deformation of the junction. The problematic gas eruption and physical deformation are mitigated by shrinking to the nanoscale and controlling the electroforming voltage polarity. Better yet, electroforming problems can be largely eliminated by engineering the device structure to remove ‘bulk’ oxide effects in favor of interface-controlled electronic switching.

...read moreread less

787 citations

Journal Article•DOI•

Cloud Computing: Distributed Internet Computing for IT and Scientific Research

[...]

Marios D. Dikaiakos¹, Dimitrios Katsaros², Pankaj Mehra³, George Pallis¹, Athena Vakali⁴ - Show less +1 more•Institutions (4)

University of Cyprus¹, University of Thessaly², Hewlett-Packard³, Aristotle University of Thessaloniki⁴

01 Sep 2009-IEEE Internet Computing

TL;DR: This issue's articles tackle topics including architecture and management of cloud computing infrastructures, SaaS and IaaS applications, discovery of services and data in cloud computing infrastructure, and cross-platform interoperability.

...read moreread less

Abstract: Cloud computing is a disruptive technology with profound implications not only for Internet services but also for the IT sector as a whole. Its emergence promises to streamline the on-demand provisioning of software, hardware, and data as a service, achieving economies of scale in IT solutions' deployment and operation. This issue's articles tackle topics including architecture and management of cloud computing infrastructures, SaaS and IaaS applications, discovery of services and data in cloud computing infrastructures, and cross-platform interoperability. Still, several outstanding issues exist, particularly related to SLAs, security and privacy, and power efficiency. Other open issues include ownership, data transfer bottlenecks, performance unpredictability, reliability, and software licensing issues. Finally, hosted applications' business models must show a clear pathway to monetizing cloud computing. Several companies have already built Internet consumer services such as search, social networking, Web email, and online commerce that use cloud computing infrastructure. Above all, cloud computing's still unknown "killer application" will determine many of the challenges and the solutions we must develop to make this technology work in practice.

...read moreread less

786 citations

Journal Article•DOI•

Memristor―CMOS Hybrid Integrated Circuits for Reconfigurable Logic

[...]

Qiangfei Xia¹, Warren Robinett¹, Cumbie Michael W¹, Neel Banerjee¹, Thomas J. Cardinali¹, Jianhua Yang¹, Wei Wu¹, Xuema Li¹, William M. Tong¹, Dmitri B. Strukov¹, Gregory S. Snider¹, Gilberto Medeiros-Ribeiro¹, R. Stanley Williams¹ - Show less +9 more•Institutions (1)

Hewlett-Packard¹

01 Sep 2009-Nano Letters

TL;DR: Hybrid reconfigurable logic circuits were fabricated by integrating memristor-based crossbars onto a foundry-built CMOS (complementary metal-oxide-semiconductor) platform using nanoimprint lithography, as well as materials and processes that were compatible with the CMOS.

...read moreread less

Abstract: Hybrid reconfigurable logic circuits were fabricated by integrating memristor-based crossbars onto a foundry-built CMOS (complementary metal-oxide-semiconductor) platform using nanoimprint lithography, as well as materials and processes that were compatible with the CMOS Titanium dioxide thin-film memristors served as the configuration bits and switches in a data routing network and were connected to gate-level CMOS components that acted as logic elements, in a manner similar to a field programmable gate array We analyzed the chips using a purpose-built testing system, and demonstrated the ability to configure individual devices, use them to wire up various logic gates and a flip-flop, and then reconfigure devices

...read moreread less

612 citations

Proceedings Article•DOI•

Taking account of privacy when designing cloud computing services

[...]

Siani Pearson¹•Institutions (1)

Hewlett-Packard¹

23 May 2009

TL;DR: The privacy challenges that software engineers face when targeting the cloud as their production environment to offer services are assessed, and key design principles to address these are suggested.

...read moreread less

Abstract: Privacy is an important issue for cloud computing, both in terms of legal compliance and user trust, and needs to be considered at every phase of design. In this paper the privacy challenges that software engineers face when targeting the cloud as their production environment to offer services are assessed, and key design principles to address these are suggested.

...read moreread less

600 citations

Proceedings Article•DOI•

Automated control of multiple virtualized resources

[...]

Pradeep Padala¹, Kai-Yuan Hou¹, Kang G. Shin¹, Xiaoyun Zhu², Mustafa Uysal³, Zhikui Wang³, Sharad Singhal³, Arif Merchant³ - Show less +4 more•Institutions (3)

University of Michigan¹, VMware², Hewlett-Packard³

01 Apr 2009

TL;DR: Experimental evaluation with RUBiS and TPC-W benchmarks along with production-trace-driven workloads indicates that AutoControl can detect and mitigate CPU and disk I/O bottlenecks that occur over time and across multiple nodes by allocating each resource accordingly.

...read moreread less

Abstract: Virtualized data centers enable sharing of resources among hosted applications. However, it is difficult to satisfy service-level objectives(SLOs) of applications on shared infrastructure, as application workloads and resource consumption patterns change over time. In this paper, we present AutoControl, a resource control system that automatically adapts to dynamic workload changes to achieve application SLOs. AutoControl is a combination of an online model estimator and a novel multi-input, multi-output (MIMO) resource controller. The model estimator captures the complex relationship between application performance and resource allocations, while the MIMO controller allocates the right amount of multiple virtualized resources to achieve application SLOs. Our experimental evaluation with RUBiS and TPC-W benchmarks along with production-trace-driven workloads indicates that AutoControl can detect and mitigate CPU and disk I/O bottlenecks that occur over time and across multiple nodes by allocating each resource accordingly. We also show that AutoControl can be used to provide service differentiation according to the application priorities during resource contention.

...read moreread less

553 citations

Journal Article•DOI•

Diamonds with a high density of nitrogen-vacancy centers for magnetometry applications

[...]

Victor M. Acosta¹, Erik Bauch², Erik Bauch¹, Micah P. Ledbetter¹, Charles Santori³, Kai-Mei C. Fu³, Paul E. Barclay³, Raymond G. Beausoleil³, H. Linget⁴, Jean-François Roch⁴, François Treussart⁴, S. Chemerisov⁵, Wojciech Gawlik⁶, Dmitry Budker¹, Dmitry Budker⁷ - Show less +11 more•Institutions (7)

University of California, Berkeley¹, Technical University of Berlin², Hewlett-Packard³, École normale supérieure de Cachan⁴, Argonne National Laboratory⁵, Jagiellonian University⁶, Lawrence Berkeley National Laboratory⁷

09 Sep 2009-Physical Review B

TL;DR: In this paper, the optical and spin-relaxation properties of millimeter-scale diamond samples were characterized using confocal microscopy, visible and infrared absorption, and optically detected magnetic resonance.

...read moreread less

Abstract: Nitrogen-vacancy (NV) centers in millimeter-scale diamond samples were produced by irradiation and subsequent annealing under varied conditions. The optical and spin-relaxation properties of these samples were characterized using confocal microscopy, visible and infrared absorption, and optically detected magnetic resonance. The sample with the highest ${\text{NV}}^{\ensuremath{-}}$ concentration, approximately 16 ppm $(2.8\ifmmode\times\else\texttimes\fi{}{10}^{18}\text{ }{\text{cm}}^{\ensuremath{-}3})$, was prepared with no observable traces of neutrally charged vacancy defects. The effective transverse spin-relaxation time for this sample was ${T}_{2}^{\ensuremath{\ast}}=118(48)\text{ }\text{ns}$, predominately limited by residual paramagnetic nitrogen which was determined to have a concentration of 49(7) ppm. Under ideal conditions, the shot-noise limited sensitivity is projected to be $\ensuremath{\sim}150\text{ }\text{fT}/\sqrt{\text{Hz}}$ for a $100\text{ }\ensuremath{\mu}\text{m}$-scale magnetometer based on this sample. Other samples with ${\text{NV}}^{\ensuremath{-}}$ concentrations from 0.007 to 12 ppm and effective relaxation times ranging from 27 to over 291 ns were prepared and characterized.

...read moreread less

523 citations

Journal Article•DOI•

Exponential ionic drift: fast switching and low volatility of thin-film memristors

[...]

Dmitri B. Strukov¹, R. Stanley Williams¹•Institutions (1)

Hewlett-Packard¹

01 Mar 2009-Applied Physics A

TL;DR: In this paper, the authors investigate the exponential dependence of switching speeds in thin-film memristors for high electric fields and elevated temperatures, and propose a nonlinear ionic drift model to predict the volatility and switching time for various material systems.

...read moreread less

Abstract: We investigate the exponential dependence of switching speeds in thin-film memristors for high electric fields and elevated temperatures. An existing nonlinear ionic drift model and our simulation results explain the very large ratios for the state lifetime to switching speed experimentally observed in devices for which resistance switching is due to ion migration. Given the activation barriers of the drifting species, it is possible to predict the volatility and switching time for various material systems.

...read moreread less

515 citations

Proceedings Article•DOI•

What's inside the Cloud? An architectural map of the Cloud landscape

[...]

Alexander Lenk, Markus Klems, Jens Nimis, Stefan Tai, Thomas Sandholm¹ - Show less +1 more•Institutions (1)

Hewlett-Packard¹

23 May 2009

TL;DR: This work proposes an integrated Cloud computing stack architecture to serve as a reference point for future mash-ups and comparative studies and shows how the existing Cloud landscape maps into this architecture and identifies an infrastructure gap.

...read moreread less

Abstract: We propose an integrated Cloud computing stack architecture to serve as a reference point for future mash-ups and comparative studies. We also show how the existing Cloud landscape maps into this architecture and identify an infrastructure gap that we plan to address in future work.

...read moreread less

506 citations

Proceedings Article•

Sparse indexing: large scale, inline deduplication using sampling and locality

[...]

Mark Lillibridge¹, Kave Eshghi¹, Deepavali Bhagwat², Vinay Deolalikar¹, Greg Trezise, Peter Thomas Camble - Show less +2 more•Institutions (2)

Hewlett-Packard¹, University of California, Santa Cruz²

24 Feb 2009

TL;DR: Sparse indexing, a technique that uses sampling and exploits the inherent locality within backup streams to solve for large-scale backup the chunk-lookup disk bottleneck problem that inline, chunk-based deduplication schemes face, is presented.

...read moreread less

Abstract: We present sparse indexing, a technique that uses sampling and exploits the inherent locality within backup streams to solve for large-scale backup (e.g., hundreds of terabytes) the chunk-lookup disk bottleneck problem that inline, chunk-based deduplication schemes face. The problem is that these schemes traditionally require a full chunk index, which indexes every chunk, in order to determine which chunks have already been stored; unfortunately, at scale it is impractical to keep such an index in RAM and a disk-based index with one seek per incoming chunk is far too slow. We perform stream deduplication by breaking up an incoming stream into relatively large segments and deduplicating each segment against only a few of the most similar previous segments. To identify similar segments, we use sampling and a sparse index. We choose a small portion of the chunks in the stream as samples; our sparse index maps these samples to the existing segments in which they occur. Thus, we avoid the need for a full chunk index. Since only the sampled chunks' hashes are kept in RAM and the sampling rate is low, we dramatically reduce the RAM to disk ratio for effective deduplication. At the same time, only a few seeks are required per segment so the chunk-lookup disk bottleneck is avoided. Sparse indexing has recently been incorporated into number of Hewlett-Packard backup products.

...read moreread less

477 citations

Proceedings Article•DOI•

Extreme Binning: Scalable, parallel deduplication for chunk-based file backup

[...]

Deepavali Bhagwat¹, Kave Eshghi², Darrell D. E. Long¹, Mark Lillibridge²•Institutions (2)

University of California, Santa Cruz¹, Hewlett-Packard²

28 Dec 2009

TL;DR: Extreme Binning is presented, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time.

...read moreread less

Abstract: Data deduplication is an essential and critical component of backup systems. Essential, because it reduces storage space requirements, and critical, because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques require to provide reasonable throughput. We present Extreme Binning, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time. Due to lack of locality, existing techniques perform poorly on these workloads. Extreme Binning exploits file similarity instead of locality, and makes only one disk access for chunk lookup per file, which gives reasonable throughput. Multi-node backup systems built with Extreme Binning scale gracefully with the amount of input data; more backup nodes can be added to boost throughput. Each file is allocated using a stateless routing algorithm to only one node, allowing for maximum parallelization, and each backup node is autonomous with no dependency across nodes, making data management tasks robust with low overhead.

...read moreread less

Patent•

Method and system of regulating voltages

[...]

John R. Spencer¹•Institutions (1)

Hewlett-Packard¹

30 Jan 2009

TL;DR: In this paper, a switching circuit is configured to create the intermediate voltage signal based on a switching signal having a duty cycle, and wherein the duty cycle of the switching signal is open-loop with respect the intermediate signal and the first regulated voltage signal.

...read moreread less

Abstract: Regulating voltages At least some of the illustrative embodiments are systems including a switching circuit configured to produce an intermediate voltage signal from an input voltage signal, and a first voltage regulator coupled the switching circuit and configured to produce a first regulated voltage signal from the intermediate voltage signal The switching circuit is configured to create the intermediate voltage signal based on a switching signal having a duty cycle, and wherein the duty cycle of the switching signal is open-loop with respect the intermediate voltage signal and the first regulated voltage signal

...read moreread less

Patent•

Extended touch-sensitive control area for electronic device

[...]

Matias Gonzalo Duarte, Daniel Marc Gatan Shiplacoff¹, Dianne Parry Dominguez, Jeremy Godfrey Lyon¹, Paul Mercer, Peter Skillman¹ - Show less +2 more•Institutions (1)

Hewlett-Packard¹

04 May 2009

TL;DR: In this paper, a touch-sensitive display screen is enhanced by a touch sensitive control area that extends beyond the edges of the display screen, referred to as a gesture area, allowing a user to activate commands using a gesture vocabulary.

...read moreread less

Abstract: A touch-sensitive display screen is enhanced by a touch-sensitive control area that extends beyond the edges of the display screen. The touch-sensitive area outside the display screen, referred to as a "gesture area," allows a user to activate commands using a gesture vocabulary. In one aspect, the present invention allows some commands to be activated by inputting a gesture within the gesture area. Other commands can be activated by directly manipulating on-screen objects. Yet other commands can be activated by beginning a gesture within the gesture area, and finishing it on the screen (or vice versa), and/or by performing input that involves contemporaneous con-tact with both the gesture area and the screen.

...read moreread less

Proceedings Article•DOI•

Disaggregated memory for expansion and sharing in blade servers

[...]

Kevin Lim¹, Jichuan Chang², Trevor Mudge¹, Parthasarathy Ranganathan², Steven K. Reinhardt³, Thomas F. Wenisch¹ - Show less +2 more•Institutions (3)

University of Michigan¹, Hewlett-Packard², Advanced Micro Devices³

20 Jun 2009

TL;DR: It is demonstrated that memory disaggregation can provide substantial performance benefits (on average 10X) in memory constrained environments, while the sharing enabled by the solutions can improve performance-per-dollar by up to 57% when optimizing memory provisioning across multiple servers.

...read moreread less

Abstract: Analysis of technology and application trends reveals a growing imbalance in the peak compute-to-memory-capacity ratio for future servers. At the same time, the fraction contributed by memory systems to total datacenter costs and power consumption during typical usage is increasing. In response to these trends, this paper re-examines traditional compute-memory co-location on a single system and details the design of a new general-purpose architectural building block-a memory blade-that allows memory to be "disaggregated" across a system ensemble. This remote memory blade can be used for memory capacity expansion to improve performance and for sharing memory across servers to reduce provisioning and power costs. We use this memory blade building block to propose two new system architecture solutions-(1) page-swapped remote memory at the virtualization layer, and (2) block-access remote memory with support in the coherence hardware-that enable transparent memory expansion and sharing on commodity-based systems. Using simulations of a mix of enterprise benchmarks supplemented with traces from live datacenters, we demonstrate that memory disaggregation can provide substantial performance benefits (on average 10X) in memory constrained environments, while the sharing enabled by our solutions can improve performance-per-dollar by up to 57% when optimizing memory provisioning across multiple servers.

...read moreread less

Book Chapter•DOI•

A Power Benchmarking Framework for Network Devices

[...]

Priya Mahadevan¹, Puneet Sharma¹, Sujata Banerjee¹, Parthasarathy Ranganathan¹•Institutions (1)

Hewlett-Packard¹

11 May 2009

TL;DR: The hurdles in network power instrumentation are described and a power measurement study of a variety of networking gear such as hubs, edge switches, core switches, routers and wireless access points in both stand-alone mode and a production data center are presented.

...read moreread less

Abstract: Energy efficiency is becoming increasingly important in the operation of networking infrastructure, especially in enterprise and data center networks. Researchers have proposed several strategies for energy management of networking devices. However, we need a comprehensive characterization of power consumption by a variety of switches and routers to accurately quantify the savings from the various power savings schemes. In this paper, we first describe the hurdles in network power instrumentation and present a power measurement study of a variety of networking gear such as hubs, edge switches, core switches, routers and wireless access points in both stand-alone mode and a production data center. We build and describe a benchmarking suite that will allow users to measure and compare the power consumed for a large set of common configurations at any switch or router of their choice. We also propose a network energy proportionality index, which is an easily measurable metric, to compare power consumption behaviors of multiple devices.

...read moreread less

Journal Article•DOI•

Is Market Fragmentation Harming Market Quality

[...]

Maureen O'Hara¹, Mao Ye²•Institutions (2)

Cornell University¹, Hewlett-Packard²

10 Mar 2009-Social Science Research Network

TL;DR: In this paper, the authors examine how fragmentation of trading is affecting the quality of trading in U.S. markets and find that market fragmentation generally reduces transactions costs and increases execution speeds.

...read moreread less

Abstract: Equity markets world-wide have seen a proliferation of trading venues and the consequent fragmentation of order flow. In this paper, we examine how fragmentation of trading is affecting the quality of trading in U.S. markets. We propose using newly-available TRF (trade reporting facilities) volumes to proxy for fragmentation levels in individual stocks, and we use a matched sample to compare execution quality and efficiency of stocks with more and less fragmented trading. We find that market fragmentation generally reduces transactions costs and increases execution speeds. Fragmentation does increase short-term volatility, but prices are more efficient in that they are closer to being a random walk. Our results that fragmentation does not appear to harm market quality have important implications for regulatory policy.

...read moreread less

Proceedings Article•DOI•

Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning

[...]

Archana Ganapathi¹, Harumi Kuno², Umeshwar Dayal², Janet L. Wiener², Armando Fox¹, Michael I. Jordan¹, David A. Patterson¹ - Show less +3 more•Institutions (2)

University of California, Berkeley¹, Hewlett-Packard²

29 Mar 2009

TL;DR: A system that uses machine learning to accurately predict the performance metrics of database queries whose execution times range from milliseconds to hours, and was able to correctly identify both the short and long-running queries to inform workload management and capacity planning.

...read moreread less

Abstract: One of the most challenging aspects of managing a very large data warehouse is identifying how queries will behave before they start executing. Yet knowing their performance characteristics --- their runtimes and resource usage --- can solve two important problems. First, every database vendor struggles with managing unexpectedly long-running queries. When these long-running queries can be identified before they start, they can be rejected or scheduled when they will not cause extreme resource contention for the other queries in the system. Second, deciding whether a system can complete a given workload in a given time period (or a bigger system is necessary) depends on knowing the resource requirements of the queries in that workload. We have developed a system that uses machine learning to accurately predict the performance metrics of database queries whose execution times range from milliseconds to hours. For training and testing our system, we used both real customer queries and queries generated from an extended set of TPC-DS templates. The extensions mimic queries that caused customer problems. We used these queries to compare how accurately different techniques predict metrics such as elapsed time, records used, disk I/Os, and message bytes. The most promising technique was not only the most accurate, but also predicted these metrics simultaneously and using only information available prior to query execution. We validated the accuracy of this machine learning technique on a number of HP Neoview configurations. We were able to predict individual query elapsed time within 20% of its actual time for 85% of the test queries. Most importantly, we were able to correctly identify both the short and long-running (up to two hour) queries to inform workload management and capacity planning.

...read moreread less

Proceedings Article•DOI•

Real-time O(1) bilateral filtering

[...]

Qingxiong Yang¹, Kar-Han Tan², Narendra Ahuja¹•Institutions (2)

University of Illinois at Urbana–Champaign¹, Hewlett-Packard²

20 Jun 2009

TL;DR: A new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature, that yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels.

...read moreread less

Abstract: We propose a new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature. By showing that a bilateral filter can be decomposed into a number of constant time spatial filters, our method yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels. In contrast, the current available constant time algorithm requires the use of specific spatial or specific range kernels. Also, our algorithm lends itself to a parallel implementation leading to the first real-time O(1) algorithm that we know of. Meanwhile, our algorithm yields higher quality results since we are effectively quantizing the range function instead of quantizing both the range function and the input image. Empirical experiments show that our algorithm not only gives higher PSNR, but is about 10× faster than the state-of-the-art. It also has a small memory footprint, needed only 2% of the memory required by the state-of-the-art for obtaining the same quality as exact using 8-bit images. We also show that our algorithm can be easily extended for O(1) median filtering. Our bilateral filtering algorithm was tested in a number of applications, including HD video conferencing, video abstraction, highlight removal, and multi-focus imaging.

...read moreread less

Journal Article•DOI•

Quantum repeater with encoding

[...]

Liang Jiang¹, Jacob M. Taylor², Kae Nemoto³, William J. Munro³, William J. Munro⁴, Rodney Van Meter³, Rodney Van Meter⁵, Mikhail D. Lukin¹ - Show less +4 more•Institutions (5)

Harvard University¹, Massachusetts Institute of Technology², National Institute of Informatics³, Hewlett-Packard⁴, Keio University⁵

20 Mar 2009-Physical Review A

TL;DR: In this article, the authors propose an approach to implement quantum repeaters for long-distance quantum communication, which generates a backbone of encoded Bell pairs and uses the procedure of classical error correction during simultaneous entanglement connection.

...read moreread less

Abstract: We propose an approach to implement quantum repeaters for long-distance quantum communication. Our protocol generates a backbone of encoded Bell pairs and uses the procedure of classical error correction during simultaneous entanglement connection. We illustrate that the repeater protocol with simple Calderbank-Shor-Steane encoding can significantly extend the communication distance, while still maintaining a fast key generation rate.

...read moreread less

Journal Article•DOI•

A hybrid nanomemristor/transistor logic circuit capable of self-programming.

[...]

Julien Borghetti¹, Zhiyong Li¹, Joseph Straznicky¹, Xuema Li¹, Douglas A. A. Ohlberg¹, Wei Wu¹, Duncan Stewart¹, R. Stanley Williams¹ - Show less +4 more•Institutions (1)

Hewlett-Packard¹

10 Feb 2009-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: The digitally configured memristor crossbars were used to perform logic functions, to serve as a routing fabric for interconnecting the FETs and as the target for storing information.

...read moreread less

Abstract: Memristor crossbars were fabricated at 40 nm half-pitch, using nanoimprint lithography on the same substrate with Si metal-oxide-semiconductor field effect transistor (MOS FET) arrays to form fully integrated hybrid memory resistor (memristor)/transistor circuits. The digitally configured memristor crossbars were used to perform logic functions, to serve as a routing fabric for interconnecting the FETs and as the target for storing information. As an illustrative demonstration, the compound Boolean logic operation (A AND B) OR (C AND D) was performed with kilohertz frequency inputs, using resistor-based logic in a memristor crossbar with FET inverter/amplifier outputs. By routing the output signal of a logic operation back onto a target memristor inside the array, the crossbar was conditionally configured by setting the state of a nonvolatile switch. Such conditional programming illuminates the way for a variety of self-programmed logic arrays, and for electronic synaptic computing.

...read moreread less

Journal Article•DOI•

Coupled Ionic and Electronic Transport Model of Thin‐Film Semiconductor Memristive Behavior

[...]

Dmitri B. Strukov¹, Julien Borghetti¹, R. Stanley Williams¹•Institutions (1)

Hewlett-Packard¹

04 May 2009-Small

TL;DR: A more physical model based on numerical solutions of coupled drift-diffusion equations for electrons and ions with appropriate boundary conditions is provided to obtain physical insight into the transport processes responsible for memristive behavior in semiconductor films.

...read moreread less

Abstract: The memristor, the fourth passive circuit element, was predicted theoretically nearly 40 years ago, but we just recently demonstrated both an intentional material system and an analytical model that exhibited the properties of such a device. Here we provide a more physical model based on numerical solutions of coupled drift-diffusion equations for electrons and ions with appropriate boundary conditions. We simulate the dynamics of a two-terminal memristive device based on a semiconductor thinfilm with mobile dopants that are partially compensated by a small amount of immobile acceptors. We examine the mobile ion distributions, zero-bias potentials, and current‐voltage characteristics of the model for both steady-state bias conditions and for dynamical switching to obtain physical insight into the transport processes responsible for memristive behavior in semiconductor films.

...read moreread less

Proceedings Article•DOI•

HyperX: topology, routing, and packaging of efficient large-scale networks

[...]

Jung Ho Ahn¹, Nathan Binkert¹, Al Davis¹, Moray McLaren¹, Robert Schreiber¹ - Show less +1 more•Institutions (1)

Hewlett-Packard¹

14 Nov 2009

TL;DR: This work considers an extension of the hypercube and flattened butterfly topologies, the HyperX, and gives an adaptive routing algorithm, DAL, to take advantage of high-radix switch components that integrated photonics will make available.

...read moreread less

Abstract: In the push to achieve exascale performance, systems will grow to over 100,000 sockets, as growing cores-per-socket and improved single-core performance provide only part of the speedup needed. These systems will need affordable interconnect structures that scale to this level. To meet the need, we consider an extension of the hypercube and flattened butterfly topologies, the HyperX, and give an adaptive routing algorithm, DAL. HyperX takes advantage of high-radix switch components that integrated photonics will make available. Our main contributions include a formal descriptive framework, enabling a search method that finds optimal HyperX configurations; DAL; and a low cost packaging strategy for an exascale HyperX. Simulations show that HyperX can provide performance as good as a folded Clos, with fewer switches. We also describe a HyperX packaging scheme that reduces system cost. Our analysis of efficiency, performance, and packaging demonstrates that the HyperX is a strong competitor for exascale networks.

...read moreread less

Book Chapter•DOI•

A Privacy Manager for Cloud Computing

[...]

Siani Pearson¹, Yun Shen¹, Miranda Mowbray¹•Institutions (1)

Hewlett-Packard¹

22 Nov 2009

TL;DR: A privacy manager for cloud computing is described, which reduces the risk to the cloud computing user of their private data being stolen or misused, and also assists the cloud Computing provider to conform to privacy law.

...read moreread less

Abstract: We describe a privacy manager for cloud computing, which reduces the risk to the cloud computing user of their private data being stolen or misused, and also assists the cloud computing provider to conform to privacy law. We describe different possible architectures for privacy management in cloud computing; give an algebraic description of obfuscation, one of the features of the privacy manager; and describe how the privacy manager might be used to protect private metadata of online photos.

...read moreread less

Proceedings Article•DOI•

GViM: GPU-accelerated virtual machines

[...]

Vishakha Gupta¹, Ada Gavrilovska¹, Karsten Schwan¹, Harshvardhan Kharche¹, Niraj Tolia², Vanish Talwar², Parthasarathy Ranganathan² - Show less +3 more•Institutions (2)

Georgia Institute of Technology¹, Hewlett-Packard²

31 Mar 2009

TL;DR: GViM is presented, a system designed for virtualizing and managing the resources of a general purpose system accelerated by graphics processors and how such accelerators can be virtualized without additional hardware support.

...read moreread less

Abstract: The use of virtualization to abstract underlying hardware can aid in sharing such resources and in efficiently managing their use by high performance applications. Unfortunately, virtualization also prevents efficient access to accelerators, such as Graphics Processing Units (GPUs), that have become critical components in the design and architecture of HPC systems. Supporting General Purpose computing on GPUs (GPGPU) with accelerators from different vendors presents significant challenges due to proprietary programming models, heterogeneity, and the need to share accelerator resources between different Virtual Machines (VMs).To address this problem, this paper presents GViM, a system designed for virtualizing and managing the resources of a general purpose system accelerated by graphics processors. Using the NVIDIA GPU as an example, we discuss how such accelerators can be virtualized without additional hardware support and describe the basic extensions needed for resource management. Our evaluation with a Xen-based implementation of GViM demonstrate efficiency and flexibility in system usage coupled with only small performance penalties for the virtualized vs. non-virtualized solutions.

...read moreread less

Journal Article•DOI•

Writing to and reading from a nano-scale crossbar memory based on memristors.

[...]

Pascal O. Vontobel¹, Warren Robinett, Philip J. Kuekes, Duncan Stewart², Joseph Straznicky, R. Stanley Williams¹ - Show less +2 more•Institutions (2)

Hewlett-Packard¹, National Research Council²

21 Oct 2009-Nanotechnology

TL;DR: A design study for a nano-scale crossbar memory system that uses memristors with symmetrical but highly nonlinear current-voltage characteristics as memory elements and simulation results show the feasibility of these writing and reading procedures.

...read moreread less

Abstract: We present a design study for a nano-scale crossbar memory system that uses memristors with symmetrical but highly nonlinear current-voltage characteristics as memory elements. The memory is non-volatile since the memristors retain their state when un-powered. In order to address the nano-wires that make up this nano-scale crossbar, we use two coded demultiplexers implemented using mixed-scale crossbars (in which CMOS-wires cross nano-wires and in which the crosspoint junctions have one-time configurable memristors). This memory system does not utilize the kind of devices (diodes or transistors) that are normally used to isolate the memory cell being written to and read from in conventional memories. Instead, special techniques are introduced to perform the writing and the reading operation reliably by taking advantage of the nonlinearity of the type of memristors used. After discussing both writing and reading strategies for our memory system in general, we focus on a 64 x 64 memory array and present simulation results that show the feasibility of these writing and reading procedures. Besides simulating the case where all device parameters assume exactly their nominal value, we also simulate the much more realistic case where the device parameters stray around their nominal value: we observe a degradation in margins, but writing and reading is still feasible. These simulation results are based on a device model for memristors derived from measurements of fabricated devices in nano-scale crossbars using Pt and Ti nano-wires and using oxygen-depleted TiO(2) as the switching material.

...read moreread less

Patent•

Routing across a virtual network

[...]

Aled Justin Edwards¹, Anna Fischer¹, Christopher I Dalton¹, Goldsack Patrick¹•Institutions (1)

Hewlett-Packard¹

09 Mar 2009

TL;DR: In this paper, the authors proposed to use the router as part of the virtual machine manager rather than having only a switch in the VM manager to avoid the need for virtual machines for implementing gateways.

...read moreread less

Abstract: A data center can share processing resources using virtual networks. A virtual machine manager (10) hosts one or more virtual machines (11, 411), the virtual machines forming part of a segmented virtual network (34). Outgoing messages from the virtual machines have an intermediate destination address of an intermediate node in a local segment of the segmented virtual network, and the virtual machine manager has a router (18) for determining a new intermediate destination address outside the local segment, for routing the given outgoing message. By having the router as part of the virtual machine manager rather than having only a switch in the virtual machine manager, the need for virtual machines for implementing gateways is avoided. This can reduce the number of “hops” for the message between virtual entities hosted, and thus improve performance. This can help a service provider to share physical processing resources of a data center between different clients having their own virtual networks.

...read moreread less

Operating system support for NVM+DRAM hybrid main memory

[...]

Jeffrey C. Mogul¹, Eduardo Argollo¹, Mehul A. Shah¹, Paolo Faraboschi¹•Institutions (1)

Hewlett-Packard¹

18 May 2009

TL;DR: Preliminary experiments suggesting that this approach to building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM, is viable are described.

...read moreread less

Abstract: Technology trends may soon favor building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM. We describe how the operating system might manage such hybrid memories, using semantic information not available in other layers. We describe preliminary experiments suggesting that this approach is viable.

...read moreread less

Journal Article•DOI•

Crowdsourcing, attention and productivity

[...]

Bernardo A. Huberman¹, Daniel M. Romero², Fang Wu¹•Institutions (2)

Hewlett-Packard¹, Cornell University²

01 Dec 2009-Journal of Information Science

TL;DR: In this paper, the authors show that the productivity exhibited in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads, which in many cases asymptotes to no uploads whatsoever.

...read moreread less

Abstract: We show through an analysis of a massive data set from YouTube that the productivity exhibited in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads. Conversely, a lack of attention leads to a decrease in the number of videos uploaded and the consequent drop in productivity, which in many cases asymptotes to no uploads whatsoever. Moreover, short-term contributors compare their performance to the average contributor's performance while long-term contributors compare it to their own media.

...read moreread less

Journal Article•DOI•

Column-oriented database systems

[...]

Daniel J. Abadi¹, Peter Boncz, Stavros Harizopoulos²•Institutions (2)

Yale University¹, Hewlett-Packard²

01 Aug 2009

TL;DR: This tutorial presents an overview of column-oriented database system technology and addresses questions about how easily a major row-based system achieve column-store performance and the new applications that can be potentially enabled by column-stores.

...read moreread less

Abstract: Column-oriented database systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional database systems that store entire records (rows) one after the other. Reading a subset of a table's columns becomes faster, at the potential expense of excessive disk-head seeking from column to column for scattered reads or updates. After several dozens of research papers and at least a dozen of new column-store start-ups, several questions remain. Are these a new breed of systems or simply old wine in new bottles? How easily can a major row-based system achieve column-store performance? Are column-stores the answer to effortlessly support large-scale data-intensive applications? What are the new, exciting system research problems to tackle? What are the new applications that can be potentially enabled by column-stores? In this tutorial, we present an overview of column-oriented database system technology and address these and other related questions.

...read moreread less