
Showing papers published by "Hewlett-Packard" in 2009


Proceedings ArticleDOI
12 Dec 2009
TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account configuring clusters with 4 cores gives the best EDA²P and EDAP.
Abstract: This paper introduces McPAT, an integrated power, area, and timing modeling framework that supports comprehensive design space exploration for multicore and manycore processor configurations ranging from 90nm to 22nm and beyond. At the microarchitectural level, McPAT includes models for the fundamental components of a chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, integrated memory controllers, and multiple-domain clocking. At the circuit and technology levels, McPAT supports critical-path timing modeling, area modeling, and dynamic, short-circuit, and leakage power modeling for each of the device types forecast in the ITRS roadmap including bulk CMOS, SOI, and double-gate transistors. McPAT has a flexible XML interface to facilitate its use with many performance simulators. Combined with a performance simulator, McPAT enables architects to consistently quantify the cost of new ideas and assess tradeoffs of different architectures using new metrics like energy-delay-area² product (EDA²P) and energy-delay-area product (EDAP). This paper explores the interconnect options of future manycore processors by varying the degree of clustering over generations of process technologies. Clustering will bring interesting tradeoffs between area and performance because the interconnects needed to group cores into clusters incur area overhead, but many applications can make good use of them due to synergies of cache sharing. Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account configuring clusters with 4 cores gives the best EDA²P and EDAP.

2,487 citations
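The composite metrics defined in the abstract are plain products, so the clustering comparison is easy to reproduce. A minimal Python sketch, reading "area²" as area squared and using invented numbers for the two cluster configurations:

```python
def energy_delay_area_metrics(energy_j, delay_s, area_mm2):
    """Composite cost metrics named in the abstract:
    EDP = E*D, EDAP = E*D*A, EDA2P = E*D*A^2 (weights die cost more heavily)."""
    edp = energy_j * delay_s
    return {"EDP": edp, "EDAP": edp * area_mm2, "EDA2P": edp * area_mm2 ** 2}

# Hypothetical results for two designs; lower is better on every metric.
four_core = energy_delay_area_metrics(energy_j=1.1, delay_s=0.9, area_mm2=80)
eight_core = energy_delay_area_metrics(energy_j=1.0, delay_s=0.8, area_mm2=130)
# The 8-core cluster wins on pure EDP, but the area term lets the smaller
# 4-core cluster win once die cost enters the metric.
print(eight_core["EDP"] < four_core["EDP"])      # True with these numbers
print(four_core["EDA2P"] < eight_core["EDA2P"])  # True with these numbers
```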


Journal ArticleDOI
TL;DR: The nature of the oxide electroforming as an electro-reduction and vacancy creation process caused by high electric fields and enhanced by electrical Joule heating is explained with direct experimental evidence.
Abstract: Metal and semiconductor oxides are ubiquitous electronic materials. Normally insulating, oxides can change behavior under high electric fields—through 'electroforming' or 'breakdown'—critically affecting CMOS (complementary metal-oxide-semiconductor) logic, DRAM (dynamic random access memory) and flash memory, and tunnel barrier oxides. An initial irreversible electroforming process has been invariably required for obtaining metal oxide resistance switches, which may open urgently needed new avenues for advanced computer memory and logic circuits including ultra-dense non-volatile random access memory (NVRAM) and adaptive neuromorphic logic circuits. This electrical switching arises from the coupled motion of electrons and ions within the oxide material, as one of the first recognized examples of a memristor (memory-resistor) device, the fourth fundamental passive circuit element originally predicted in 1971 by Chua. A lack of device repeatability has limited technological implementation of oxide switches, however. Here we explain, with direct experimental evidence, the nature of the oxide electroforming as an electro-reduction and vacancy creation process caused by high electric fields and enhanced by electrical Joule heating. Oxygen vacancies are created and drift towards the cathode, forming localized conducting channels in the oxide. Simultaneously, O²⁻ ions drift towards the anode where they evolve O₂ gas, causing physical deformation of the junction. The problematic gas eruption and physical deformation are mitigated by shrinking to the nanoscale and controlling the electroforming voltage polarity. Better yet, electroforming problems can be largely eliminated by engineering the device structure to remove 'bulk' oxide effects in favor of interface-controlled electronic switching.

787 citations


Journal ArticleDOI
TL;DR: This issue's articles tackle topics including architecture and management of cloud computing infrastructures, SaaS and IaaS applications, discovery of services and data in cloud computing infrastructure, and cross-platform interoperability.
Abstract: Cloud computing is a disruptive technology with profound implications not only for Internet services but also for the IT sector as a whole. Its emergence promises to streamline the on-demand provisioning of software, hardware, and data as a service, achieving economies of scale in IT solutions' deployment and operation. This issue's articles tackle topics including architecture and management of cloud computing infrastructures, SaaS and IaaS applications, discovery of services and data in cloud computing infrastructures, and cross-platform interoperability. Still, several outstanding issues exist, particularly related to SLAs, security and privacy, and power efficiency. Other open issues include ownership, data transfer bottlenecks, performance unpredictability, reliability, and software licensing issues. Finally, hosted applications' business models must show a clear pathway to monetizing cloud computing. Several companies have already built Internet consumer services such as search, social networking, Web email, and online commerce that use cloud computing infrastructure. Above all, cloud computing's still unknown "killer application" will determine many of the challenges and the solutions we must develop to make this technology work in practice.

786 citations


Journal ArticleDOI
TL;DR: Hybrid reconfigurable logic circuits were fabricated by integrating memristor-based crossbars onto a foundry-built CMOS (complementary metal-oxide-semiconductor) platform using nanoimprint lithography, as well as materials and processes that were compatible with the CMOS.
Abstract: Hybrid reconfigurable logic circuits were fabricated by integrating memristor-based crossbars onto a foundry-built CMOS (complementary metal-oxide-semiconductor) platform using nanoimprint lithography, as well as materials and processes that were compatible with the CMOS. Titanium dioxide thin-film memristors served as the configuration bits and switches in a data routing network and were connected to gate-level CMOS components that acted as logic elements, in a manner similar to a field programmable gate array. We analyzed the chips using a purpose-built testing system, and demonstrated the ability to configure individual devices, use them to wire up various logic gates and a flip-flop, and then reconfigure devices.

612 citations


Proceedings ArticleDOI
Siani Pearson
23 May 2009
TL;DR: The privacy challenges that software engineers face when targeting the cloud as their production environment to offer services are assessed, and key design principles to address these are suggested.
Abstract: Privacy is an important issue for cloud computing, both in terms of legal compliance and user trust, and needs to be considered at every phase of design. In this paper the privacy challenges that software engineers face when targeting the cloud as their production environment to offer services are assessed, and key design principles to address these are suggested.

600 citations


Proceedings ArticleDOI
01 Apr 2009
TL;DR: Experimental evaluation with RUBiS and TPC-W benchmarks along with production-trace-driven workloads indicates that AutoControl can detect and mitigate CPU and disk I/O bottlenecks that occur over time and across multiple nodes by allocating each resource accordingly.
Abstract: Virtualized data centers enable sharing of resources among hosted applications. However, it is difficult to satisfy service-level objectives (SLOs) of applications on shared infrastructure, as application workloads and resource consumption patterns change over time. In this paper, we present AutoControl, a resource control system that automatically adapts to dynamic workload changes to achieve application SLOs. AutoControl is a combination of an online model estimator and a novel multi-input, multi-output (MIMO) resource controller. The model estimator captures the complex relationship between application performance and resource allocations, while the MIMO controller allocates the right amount of multiple virtualized resources to achieve application SLOs. Our experimental evaluation with RUBiS and TPC-W benchmarks along with production-trace-driven workloads indicates that AutoControl can detect and mitigate CPU and disk I/O bottlenecks that occur over time and across multiple nodes by allocating each resource accordingly. We also show that AutoControl can be used to provide service differentiation according to the application priorities during resource contention.

553 citations
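The abstract specifies an online model estimator coupled to a MIMO controller but not their internals; the sketch below substitutes a recursive-least-squares estimator and a naive proportional allocation step to convey the control loop. All names and constants are illustrative, not the paper's design:

```python
import numpy as np

class OnlineModelEstimator:
    """Recursive least squares with forgetting: tracks the drifting linear
    map from resource allocations (CPU share, disk I/O share, ...) to an
    application performance metric."""
    def __init__(self, n_resources, forgetting=0.95):
        self.w = np.zeros(n_resources)        # estimated model weights
        self.P = np.eye(n_resources) * 1e3    # inverse input covariance
        self.lam = forgetting

    def update(self, alloc, perf):
        x = np.asarray(alloc, dtype=float)
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)          # gain vector
        self.w += k * (perf - self.w @ x)     # correct by prediction error
        self.P = (self.P - np.outer(k, Px)) / self.lam

def control_step(est, alloc, perf, slo_target, caps):
    """Push allocations toward the SLO, weighting whichever resource the
    model currently thinks matters most (a stand-in for the MIMO controller)."""
    est.update(alloc, perf)
    alloc = np.asarray(alloc, dtype=float)
    weight = np.abs(est.w) / (np.abs(est.w).sum() + 1e-9)
    error = (slo_target - perf) / max(slo_target, 1e-9)
    return np.clip(alloc * (1 + error * weight), 0.05, caps)
```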


Journal ArticleDOI
TL;DR: In this paper, the optical and spin-relaxation properties of millimeter-scale diamond samples were characterized using confocal microscopy, visible and infrared absorption, and optically detected magnetic resonance.
Abstract: Nitrogen-vacancy (NV) centers in millimeter-scale diamond samples were produced by irradiation and subsequent annealing under varied conditions. The optical and spin-relaxation properties of these samples were characterized using confocal microscopy, visible and infrared absorption, and optically detected magnetic resonance. The sample with the highest NV⁻ concentration, approximately 16 ppm (2.8 × 10¹⁸ cm⁻³), was prepared with no observable traces of neutrally charged vacancy defects. The effective transverse spin-relaxation time for this sample was T₂* = 118(48) ns, predominately limited by residual paramagnetic nitrogen, which was determined to have a concentration of 49(7) ppm. Under ideal conditions, the shot-noise-limited sensitivity is projected to be ∼150 fT/√Hz for a 100 μm-scale magnetometer based on this sample. Other samples with NV⁻ concentrations from 0.007 to 12 ppm and effective relaxation times ranging from 27 to over 291 ns were prepared and characterized.

523 citations


Journal ArticleDOI
TL;DR: In this paper, the authors investigate the exponential dependence of switching speeds in thin-film memristors for high electric fields and elevated temperatures, and propose a nonlinear ionic drift model to predict the volatility and switching time for various material systems.
Abstract: We investigate the exponential dependence of switching speeds in thin-film memristors for high electric fields and elevated temperatures. An existing nonlinear ionic drift model and our simulation results explain the very large ratios for the state lifetime to switching speed experimentally observed in devices for which resistance switching is due to ion migration. Given the activation barriers of the drifting species, it is possible to predict the volatility and switching time for various material systems.

515 citations
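The exponential field and temperature dependence can be illustrated with the standard thermally activated hopping ("sinh") form of nonlinear ionic drift. Every constant below is an illustrative assumption rather than a value fitted to the paper's devices:

```python
import math

K_B = 8.617e-5  # Boltzmann constant, eV/K

def hop_velocity(field_V_per_m, T_K, Ea_eV, a_m=2.5e-10, f_Hz=1e13):
    """Ion drift velocity for hopping over barrier Ea, which the field tilts
    by e*a*E/2 per hop. With E in V/m and hop distance a in m, the energy
    e*a*E expressed in eV is numerically a*E."""
    kT = K_B * T_K
    return a_m * f_Hz * math.exp(-Ea_eV / kT) * math.sinh(a_m * field_V_per_m / (2 * kT))

film_m = 1e-8  # assumed 10 nm switching layer
t_switch = film_m / hop_velocity(5e8, 500.0, 0.9)  # high field, Joule-heated
t_retain = film_m / hop_velocity(1e6, 300.0, 0.9)  # stray field, room temperature
print(f"state lifetime / switching time ~ {t_retain / t_switch:.1e}")
```

With these made-up parameters the ratio comes out many orders of magnitude above one, the qualitative point of the abstract: the same activated, field-tilted hopping that makes switching fast at high field and temperature makes the stored state extremely long-lived at low field.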


Proceedings ArticleDOI
Alexander Lenk, Markus Klems, Jens Nimis, Stefan Tai, Thomas Sandholm
23 May 2009
TL;DR: This work proposes an integrated Cloud computing stack architecture to serve as a reference point for future mash-ups and comparative studies and shows how the existing Cloud landscape maps into this architecture and identifies an infrastructure gap.
Abstract: We propose an integrated Cloud computing stack architecture to serve as a reference point for future mash-ups and comparative studies. We also show how the existing Cloud landscape maps into this architecture and identify an infrastructure gap that we plan to address in future work.

506 citations


Proceedings Article
24 Feb 2009
TL;DR: Sparse indexing, a technique that uses sampling and exploits the inherent locality within backup streams to solve for large-scale backup the chunk-lookup disk bottleneck problem that inline, chunk-based deduplication schemes face, is presented.
Abstract: We present sparse indexing, a technique that uses sampling and exploits the inherent locality within backup streams to solve for large-scale backup (e.g., hundreds of terabytes) the chunk-lookup disk bottleneck problem that inline, chunk-based deduplication schemes face. The problem is that these schemes traditionally require a full chunk index, which indexes every chunk, in order to determine which chunks have already been stored; unfortunately, at scale it is impractical to keep such an index in RAM and a disk-based index with one seek per incoming chunk is far too slow. We perform stream deduplication by breaking up an incoming stream into relatively large segments and deduplicating each segment against only a few of the most similar previous segments. To identify similar segments, we use sampling and a sparse index. We choose a small portion of the chunks in the stream as samples; our sparse index maps these samples to the existing segments in which they occur. Thus, we avoid the need for a full chunk index. Since only the sampled chunks' hashes are kept in RAM and the sampling rate is low, we dramatically reduce the RAM to disk ratio for effective deduplication. At the same time, only a few seeks are required per segment so the chunk-lookup disk bottleneck is avoided. Sparse indexing has recently been incorporated into a number of Hewlett-Packard backup products.

477 citations
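A compact sketch of the sampling idea as the abstract describes it: only chunk hashes whose low-order bits are zero are sampled into the RAM-resident sparse index, and an incoming segment is deduplicated against only its top-scoring prior segments. The sampling rate and names are illustrative:

```python
import hashlib
from collections import defaultdict

SAMPLE_BITS = 6  # sample ~1/64 of chunk hashes (illustrative rate)

def chunk_id(chunk: bytes) -> int:
    return int.from_bytes(hashlib.sha1(chunk).digest()[:8], "big")

def is_sample(cid: int) -> bool:
    return cid & ((1 << SAMPLE_BITS) - 1) == 0

class SparseIndex:
    """RAM-resident map from *sampled* chunk ids to the segments that hold
    them; the full per-segment chunk manifests stay on disk."""
    def __init__(self):
        self.hooks = defaultdict(set)

    def champions(self, incoming_ids, k=2):
        """Score stored segments by shared samples; return the k most
        similar segments to deduplicate against."""
        votes = defaultdict(int)
        for cid in filter(is_sample, incoming_ids):
            for seg in self.hooks.get(cid, ()):
                votes[seg] += 1
        return sorted(votes, key=votes.get, reverse=True)[:k]

    def add_segment(self, seg_id, ids):
        for cid in filter(is_sample, ids):
            self.hooks[cid].add(seg_id)
```

Deduplicating a segment then needs on-disk manifest reads only for the few champion segments, which is what removes the one-seek-per-chunk bottleneck while keeping the in-RAM index small.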


Proceedings ArticleDOI
28 Dec 2009
TL;DR: Extreme Binning is presented, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time.
Abstract: Data deduplication is an essential and critical component of backup systems. Essential, because it reduces storage space requirements, and critical, because the performance of the entire backup operation depends on its throughput. Traditional backup workloads consist of large data streams with high locality, which existing deduplication techniques require to provide reasonable throughput. We present Extreme Binning, a scalable deduplication technique for non-traditional backup workloads that are made up of individual files with no locality among consecutive files in a given window of time. Due to lack of locality, existing techniques perform poorly on these workloads. Extreme Binning exploits file similarity instead of locality, and makes only one disk access for chunk lookup per file, which gives reasonable throughput. Multi-node backup systems built with Extreme Binning scale gracefully with the amount of input data; more backup nodes can be added to boost throughput. Each file is allocated using a stateless routing algorithm to only one node, allowing for maximum parallelization, and each backup node is autonomous with no dependency across nodes, making data management tasks robust with low overhead.
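Two of the abstract's claims, similarity-based binning and stateless routing, fit in a few lines. A sketch under assumed details (the representative chunk id here is the minimum hash, in the min-hash spirit; the paper's exact choice may differ):

```python
import hashlib

def chunk_id(chunk: bytes) -> int:
    return int.from_bytes(hashlib.sha1(chunk).digest()[:8], "big")

def representative_id(file_chunks) -> int:
    """Minimum chunk hash of the file: near-duplicate files almost surely
    share it, so similar files land in the same bin."""
    return min(chunk_id(c) for c in file_chunks)

def route_file(rep_id: int, n_backup_nodes: int) -> int:
    """Stateless routing: the representative alone picks the backup node,
    so any front end can route a file with no shared state or coordination."""
    return rep_id % n_backup_nodes
```

On the chosen node, only the single bin keyed by the representative is fetched from disk, giving the one disk access for chunk lookup per file that the abstract claims.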

Patent
John R. Spencer
30 Jan 2009
TL;DR: In this paper, a switching circuit is configured to create the intermediate voltage signal based on a switching signal having a duty cycle, wherein the duty cycle of the switching signal is open-loop with respect to the intermediate signal and the first regulated voltage signal.
Abstract: Regulating voltages. At least some of the illustrative embodiments are systems including a switching circuit configured to produce an intermediate voltage signal from an input voltage signal, and a first voltage regulator coupled to the switching circuit and configured to produce a first regulated voltage signal from the intermediate voltage signal. The switching circuit is configured to create the intermediate voltage signal based on a switching signal having a duty cycle, wherein the duty cycle of the switching signal is open-loop with respect to the intermediate voltage signal and the first regulated voltage signal.

Patent
04 May 2009
TL;DR: In this paper, a touch-sensitive display screen is enhanced by a touch sensitive control area that extends beyond the edges of the display screen, referred to as a gesture area, allowing a user to activate commands using a gesture vocabulary.
Abstract: A touch-sensitive display screen is enhanced by a touch-sensitive control area that extends beyond the edges of the display screen. The touch-sensitive area outside the display screen, referred to as a "gesture area," allows a user to activate commands using a gesture vocabulary. In one aspect, the present invention allows some commands to be activated by inputting a gesture within the gesture area. Other commands can be activated by directly manipulating on-screen objects. Yet other commands can be activated by beginning a gesture within the gesture area, and finishing it on the screen (or vice versa), and/or by performing input that involves contemporaneous contact with both the gesture area and the screen.

Proceedings ArticleDOI
20 Jun 2009
TL;DR: It is demonstrated that memory disaggregation can provide substantial performance benefits (on average 10X) in memory constrained environments, while the sharing enabled by the solutions can improve performance-per-dollar by up to 57% when optimizing memory provisioning across multiple servers.
Abstract: Analysis of technology and application trends reveals a growing imbalance in the peak compute-to-memory-capacity ratio for future servers. At the same time, the fraction contributed by memory systems to total datacenter costs and power consumption during typical usage is increasing. In response to these trends, this paper re-examines traditional compute-memory co-location on a single system and details the design of a new general-purpose architectural building block, a memory blade, that allows memory to be "disaggregated" across a system ensemble. This remote memory blade can be used for memory capacity expansion to improve performance and for sharing memory across servers to reduce provisioning and power costs. We use this memory blade building block to propose two new system architecture solutions: (1) page-swapped remote memory at the virtualization layer, and (2) block-access remote memory with support in the coherence hardware. Both enable transparent memory expansion and sharing on commodity-based systems. Using simulations of a mix of enterprise benchmarks supplemented with traces from live datacenters, we demonstrate that memory disaggregation can provide substantial performance benefits (on average 10X) in memory-constrained environments, while the sharing enabled by our solutions can improve performance-per-dollar by up to 57% when optimizing memory provisioning across multiple servers.
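A toy model of the first design point, page-swapped remote memory at the virtualization layer: the hypervisor keeps recently used pages in local DRAM and swaps cold pages to the shared memory blade at page granularity. Capacities, page size, and names are illustrative:

```python
from collections import OrderedDict

class PageSwappedRemoteMemory:
    """Local DRAM as an LRU cache of pages, backed by a memory blade."""
    def __init__(self, local_capacity_pages):
        self.local = OrderedDict()   # page id -> data, in LRU order
        self.remote = {}             # pages currently resident on the blade
        self.capacity = local_capacity_pages

    def access(self, page_id):
        if page_id in self.local:
            self.local.move_to_end(page_id)      # fast path: local DRAM hit
            return self.local[page_id]
        data = self.remote.pop(page_id, bytes(4096))  # zero page on first touch
        if len(self.local) >= self.capacity:
            victim, vdata = self.local.popitem(last=False)
            self.remote[victim] = vdata          # evict LRU page to the blade
        self.local[page_id] = data               # slow path: remote swap-in
        return data
```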

Book ChapterDOI
11 May 2009
TL;DR: The hurdles in network power instrumentation are described and a power measurement study of a variety of networking gear such as hubs, edge switches, core switches, routers and wireless access points in both stand-alone mode and a production data center are presented.
Abstract: Energy efficiency is becoming increasingly important in the operation of networking infrastructure, especially in enterprise and data center networks. Researchers have proposed several strategies for energy management of networking devices. However, we need a comprehensive characterization of power consumption by a variety of switches and routers to accurately quantify the savings from the various power savings schemes. In this paper, we first describe the hurdles in network power instrumentation and present a power measurement study of a variety of networking gear such as hubs, edge switches, core switches, routers and wireless access points in both stand-alone mode and a production data center. We build and describe a benchmarking suite that will allow users to measure and compare the power consumed for a large set of common configurations at any switch or router of their choice. We also propose a network energy proportionality index, which is an easily measurable metric, to compare power consumption behaviors of multiple devices.
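The abstract proposes an energy proportionality index without defining it here; one natural formulation (an assumption on our part, not necessarily the paper's exact definition) scores how much of a device's peak power actually tracks load:

```python
def energy_proportionality_index(idle_power_w, max_load_power_w):
    """0 = power draw is flat regardless of traffic (worst case);
    100 = power scales fully with load (ideal proportionality)."""
    return 100.0 * (max_load_power_w - idle_power_w) / max_load_power_w

# An edge switch drawing 90 W idle and 100 W at full load is barely
# proportional: EPI = 10.0 (illustrative numbers).
print(energy_proportionality_index(90, 100))
```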

Journal ArticleDOI
TL;DR: In this paper, the authors examine how fragmentation of trading is affecting the quality of trading in U.S. markets and find that market fragmentation generally reduces transactions costs and increases execution speeds.
Abstract: Equity markets world-wide have seen a proliferation of trading venues and the consequent fragmentation of order flow. In this paper, we examine how fragmentation of trading is affecting the quality of trading in U.S. markets. We propose using newly available TRF (trade reporting facilities) volumes to proxy for fragmentation levels in individual stocks, and we use a matched sample to compare execution quality and efficiency of stocks with more and less fragmented trading. We find that market fragmentation generally reduces transactions costs and increases execution speeds. Fragmentation does increase short-term volatility, but prices are more efficient in that they are closer to being a random walk. Our finding that fragmentation does not appear to harm market quality has important implications for regulatory policy.

Proceedings ArticleDOI
29 Mar 2009
TL;DR: A system that uses machine learning to accurately predict the performance metrics of database queries whose execution times range from milliseconds to hours, and was able to correctly identify both the short and long-running queries to inform workload management and capacity planning.
Abstract: One of the most challenging aspects of managing a very large data warehouse is identifying how queries will behave before they start executing. Yet knowing their performance characteristics --- their runtimes and resource usage --- can solve two important problems. First, every database vendor struggles with managing unexpectedly long-running queries. When these long-running queries can be identified before they start, they can be rejected or scheduled when they will not cause extreme resource contention for the other queries in the system. Second, deciding whether a system can complete a given workload in a given time period (or whether a bigger system is necessary) depends on knowing the resource requirements of the queries in that workload. We have developed a system that uses machine learning to accurately predict the performance metrics of database queries whose execution times range from milliseconds to hours. For training and testing our system, we used both real customer queries and queries generated from an extended set of TPC-DS templates. The extensions mimic queries that caused customer problems. We used these queries to compare how accurately different techniques predict metrics such as elapsed time, records used, disk I/Os, and message bytes. The most promising technique was not only the most accurate, but also predicted these metrics simultaneously and using only information available prior to query execution. We validated the accuracy of this machine learning technique on a number of HP Neoview configurations. We were able to predict individual query elapsed time within 20% of its actual time for 85% of the test queries. Most importantly, we were able to correctly identify both the short and long-running (up to two hour) queries to inform workload management and capacity planning.
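The abstract leaves the most promising learning technique unnamed, so this sketch substitutes a generic regressor over hypothetical pre-execution features (optimizer cost estimates and plan-operator counts) just to show the workflow: train on completed queries, predict before admission, gate long-running work:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical pre-execution features per query:
# [optimizer_cost, est_rows, num_joins, num_scans, num_sorts]
X_train = np.array([[1.2e3, 5e4, 2, 3, 1],
                    [9.8e5, 7e7, 6, 9, 3],
                    [3.1e2, 1e3, 0, 1, 0]])
y_train = np.array([4.2, 5600.0, 0.05])   # elapsed seconds (made up)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, np.log1p(y_train))     # log scale: runtimes span ms..hours
predicted = np.expm1(model.predict([[5.0e4, 2e6, 3, 4, 1]]))
long_running = predicted > 600            # gate admission/scheduling on the prediction
```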

Proceedings ArticleDOI
20 Jun 2009
TL;DR: A new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature, that yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels.
Abstract: We propose a new bilateral filtering algorithm with computational complexity invariant to filter kernel size, so-called O(1) or constant time in the literature. By showing that a bilateral filter can be decomposed into a number of constant time spatial filters, our method yields a new class of constant time bilateral filters that can have arbitrary spatial and arbitrary range kernels. In contrast, the current available constant time algorithm requires the use of specific spatial or specific range kernels. Also, our algorithm lends itself to a parallel implementation leading to the first real-time O(1) algorithm that we know of. Meanwhile, our algorithm yields higher quality results since we are effectively quantizing the range function instead of quantizing both the range function and the input image. Empirical experiments show that our algorithm not only gives higher PSNR, but is about 10× faster than the state-of-the-art. It also has a small memory footprint, needing only 2% of the memory required by the state-of-the-art to obtain the same quality as the exact filter on 8-bit images. We also show that our algorithm can be easily extended for O(1) median filtering. Our bilateral filtering algorithm was tested in a number of applications, including HD video conferencing, video abstraction, highlight removal, and multi-focus imaging.
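The decomposition the abstract describes, quantizing only the range function, can be sketched directly: evaluate the range kernel at a few intensity levels, run a constant-time spatial filter per level, and interpolate per pixel. This uses a box spatial filter and illustrative parameters; it is a sketch of the idea, not the paper's implementation:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def o1_bilateral(image, sigma_s=8, sigma_r=0.1, n_levels=8):
    """Constant-time bilateral sketch; expects intensities roughly in [0,1]."""
    img = image.astype(np.float64)
    levels = np.linspace(img.min(), img.max(), n_levels)
    span = max(levels[-1] - levels[0], 1e-12)
    size = int(2 * sigma_s + 1)                  # box width ~ spatial scale
    responses = []
    for L in levels:
        w = np.exp(-((img - L) ** 2) / (2 * sigma_r ** 2))  # range kernel at L
        num = uniform_filter(w * img, size)      # spatial filter: O(1) in size
        den = uniform_filter(w, size)
        responses.append(num / np.maximum(den, 1e-12))
    responses = np.stack(responses)
    # Linear interpolation between the two levels bracketing each pixel value.
    t = (img - levels[0]) / span * (n_levels - 1)
    lo = np.clip(np.floor(t).astype(int), 0, n_levels - 2)
    frac = t - lo
    r, c = np.indices(img.shape)
    return (1 - frac) * responses[lo, r, c] + frac * responses[lo + 1, r, c]
```

The cost is n_levels constant-time spatial filters regardless of kernel size, which is where the O(1) behavior comes from; the per-level loop is also trivially parallel.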

Journal ArticleDOI
TL;DR: In this article, the authors propose an approach to implement quantum repeaters for long-distance quantum communication, which generates a backbone of encoded Bell pairs and uses the procedure of classical error correction during simultaneous entanglement connection.
Abstract: We propose an approach to implement quantum repeaters for long-distance quantum communication. Our protocol generates a backbone of encoded Bell pairs and uses the procedure of classical error correction during simultaneous entanglement connection. We illustrate that the repeater protocol with simple Calderbank-Shor-Steane encoding can significantly extend the communication distance, while still maintaining a fast key generation rate.

Journal ArticleDOI
TL;DR: The digitally configured memristor crossbars were used to perform logic functions, to serve as a routing fabric for interconnecting the FETs and as the target for storing information.
Abstract: Memristor crossbars were fabricated at 40 nm half-pitch, using nanoimprint lithography on the same substrate with Si metal-oxide-semiconductor field effect transistor (MOS FET) arrays to form fully integrated hybrid memory resistor (memristor)/transistor circuits. The digitally configured memristor crossbars were used to perform logic functions, to serve as a routing fabric for interconnecting the FETs and as the target for storing information. As an illustrative demonstration, the compound Boolean logic operation (A AND B) OR (C AND D) was performed with kilohertz frequency inputs, using resistor-based logic in a memristor crossbar with FET inverter/amplifier outputs. By routing the output signal of a logic operation back onto a target memristor inside the array, the crossbar was conditionally configured by setting the state of a nonvolatile switch. Such conditional programming illuminates the way for a variety of self-programmed logic arrays, and for electronic synaptic computing.

Journal ArticleDOI
04 May 2009-Small
TL;DR: A more physical model based on numerical solutions of coupled drift-diffusion equations for electrons and ions with appropriate boundary conditions is provided to obtain physical insight into the transport processes responsible for memristive behavior in semiconductor films.
Abstract: The memristor, the fourth passive circuit element, was predicted theoretically nearly 40 years ago, but we just recently demonstrated both an intentional material system and an analytical model that exhibited the properties of such a device. Here we provide a more physical model based on numerical solutions of coupled drift-diffusion equations for electrons and ions with appropriate boundary conditions. We simulate the dynamics of a two-terminal memristive device based on a semiconductor thin film with mobile dopants that are partially compensated by a small amount of immobile acceptors. We examine the mobile ion distributions, zero-bias potentials, and current-voltage characteristics of the model for both steady-state bias conditions and for dynamical switching to obtain physical insight into the transport processes responsible for memristive behavior in semiconductor films.
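A minimal numerical sketch in the abstract's spirit: one explicit finite-difference step of a 1-D drift-diffusion equation for the mobile-dopant density, with blocking boundaries so dopants stay inside the film. The field is taken as given rather than solved self-consistently with Poisson's equation, and every constant is an illustrative assumption:

```python
import numpy as np

def step_dopant_profile(n, E, D, mu, dx, dt):
    """One explicit step of dn/dt = d/dx( D*dn/dx - mu*E*n ) for mobile
    donor density n on a uniform grid; flux is evaluated at cell faces."""
    flux = -D * np.diff(n) / dx + mu * E * 0.5 * (n[1:] + n[:-1])
    flux = np.concatenate(([0.0], flux, [0.0]))  # blocking boundaries
    return n - dt * np.diff(flux) / dx

# Illustrative run: a dopant-rich region drifting under bias in a 10 nm film.
x = np.linspace(0, 10e-9, 101)
n = np.where(x < 4e-9, 1e27, 1e25)               # dopants piled near one side
dx = x[1] - x[0]
for _ in range(2000):
    n = step_dopant_profile(n, E=1e8, D=1e-18, mu=1e-14, dx=dx, dt=1e-6)
```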

Proceedings ArticleDOI
Jung Ho Ahn, Nathan Binkert, Al Davis, Moray McLaren, Robert Schreiber
14 Nov 2009
TL;DR: This work considers an extension of the hypercube and flattened butterfly topologies, the HyperX, and gives an adaptive routing algorithm, DAL, to take advantage of high-radix switch components that integrated photonics will make available.
Abstract: In the push to achieve exascale performance, systems will grow to over 100,000 sockets, as growing cores-per-socket and improved single-core performance provide only part of the speedup needed. These systems will need affordable interconnect structures that scale to this level. To meet the need, we consider an extension of the hypercube and flattened butterfly topologies, the HyperX, and give an adaptive routing algorithm, DAL. HyperX takes advantage of high-radix switch components that integrated photonics will make available. Our main contributions include a formal descriptive framework, enabling a search method that finds optimal HyperX configurations; DAL; and a low cost packaging strategy for an exascale HyperX. Simulations show that HyperX can provide performance as good as a folded Clos, with fewer switches. We also describe a HyperX packaging scheme that reduces system cost. Our analysis of efficiency, performance, and packaging demonstrates that the HyperX is a strong competitor for exascale networks.
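The topology itself is easy to pin down: switches occupy an L-dimensional lattice, each dimension is fully connected, and a minimal route therefore needs one hop per coordinate in which source and destination differ. A small sketch of that structure (DAL's adaptive misrouting is beyond this snippet):

```python
from itertools import product

def hyperx_switches(dims):
    """HyperX switches sit on an L-dimensional lattice and connect directly
    to every other switch sharing all but one coordinate (the hypercube
    generalized to more than 2 switches per dimension)."""
    return list(product(*[range(s) for s in dims]))

def minimal_hops(a, b):
    """One hop per differing dimension, since each dimension is a clique."""
    return sum(x != y for x, y in zip(a, b))

switches = hyperx_switches((4, 4, 4))   # 64 switches; 3 links per dimension each
assert minimal_hops((0, 0, 0), (3, 1, 0)) == 2
```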

Book ChapterDOI
22 Nov 2009
TL;DR: A privacy manager for cloud computing is described, which reduces the risk to the cloud computing user of their private data being stolen or misused, and also assists the cloud Computing provider to conform to privacy law.
Abstract: We describe a privacy manager for cloud computing, which reduces the risk to the cloud computing user of their private data being stolen or misused, and also assists the cloud computing provider to conform to privacy law. We describe different possible architectures for privacy management in cloud computing; give an algebraic description of obfuscation, one of the features of the privacy manager; and describe how the privacy manager might be used to protect private metadata of online photos.

Proceedings ArticleDOI
31 Mar 2009
TL;DR: GViM, a system designed for virtualizing and managing the resources of a general purpose system accelerated by graphics processors, is presented, along with a discussion of how such accelerators can be virtualized without additional hardware support.
Abstract: The use of virtualization to abstract underlying hardware can aid in sharing such resources and in efficiently managing their use by high performance applications. Unfortunately, virtualization also prevents efficient access to accelerators, such as Graphics Processing Units (GPUs), that have become critical components in the design and architecture of HPC systems. Supporting General Purpose computing on GPUs (GPGPU) with accelerators from different vendors presents significant challenges due to proprietary programming models, heterogeneity, and the need to share accelerator resources between different Virtual Machines (VMs). To address this problem, this paper presents GViM, a system designed for virtualizing and managing the resources of a general purpose system accelerated by graphics processors. Using the NVIDIA GPU as an example, we discuss how such accelerators can be virtualized without additional hardware support and describe the basic extensions needed for resource management. Our evaluation with a Xen-based implementation of GViM demonstrates efficiency and flexibility in system usage coupled with only small performance penalties for the virtualized vs. non-virtualized solutions.

Journal ArticleDOI
TL;DR: A design study for a nano-scale crossbar memory system that uses memristors with symmetrical but highly nonlinear current-voltage characteristics as memory elements and simulation results show the feasibility of these writing and reading procedures.
Abstract: We present a design study for a nano-scale crossbar memory system that uses memristors with symmetrical but highly nonlinear current-voltage characteristics as memory elements. The memory is non-volatile since the memristors retain their state when un-powered. In order to address the nano-wires that make up this nano-scale crossbar, we use two coded demultiplexers implemented using mixed-scale crossbars (in which CMOS-wires cross nano-wires and in which the crosspoint junctions have one-time configurable memristors). This memory system does not utilize the kind of devices (diodes or transistors) that are normally used to isolate the memory cell being written to and read from in conventional memories. Instead, special techniques are introduced to perform the writing and the reading operation reliably by taking advantage of the nonlinearity of the type of memristors used. After discussing both writing and reading strategies for our memory system in general, we focus on a 64 × 64 memory array and present simulation results that show the feasibility of these writing and reading procedures. Besides simulating the case where all device parameters assume exactly their nominal value, we also simulate the much more realistic case where the device parameters stray around their nominal value: we observe a degradation in margins, but writing and reading is still feasible. These simulation results are based on a device model for memristors derived from measurements of fabricated devices in nano-scale crossbars using Pt and Ti nano-wires and using oxygen-depleted TiO₂ as the switching material.
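One standard technique of the kind the abstract alludes to is a half-select write scheme: drive the selected row and column to ±V/2 so only the selected crosspoint sees the full voltage, and rely on the memristors' strong nonlinearity to keep half-selected cells from switching or leaking. A toy illustration with invented voltages:

```python
V_WRITE = 2.0        # full write voltage (illustrative)
V_THRESHOLD = 1.2    # nonlinear device barely conducts/switches below this

def crosspoint_voltages(rows, cols, sel_row, sel_col):
    """V/2 select scheme: only the selected crosspoint sees V_WRITE;
    half-selected cells see V_WRITE/2, which strong nonlinearity keeps
    harmless, so no per-cell diode or transistor is needed."""
    volts = {}
    for r in range(rows):
        vr = V_WRITE / 2 if r == sel_row else 0.0
        for c in range(cols):
            vc = -V_WRITE / 2 if c == sel_col else 0.0
            volts[(r, c)] = vr - vc
    return volts

v = crosspoint_voltages(64, 64, sel_row=5, sel_col=9)
assert v[(5, 9)] == V_WRITE                        # selected cell: switches
assert all(abs(x) <= V_WRITE / 2 < V_THRESHOLD     # all others: safely below
           for cell, x in v.items() if cell != (5, 9))
```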

Patent
09 Mar 2009
TL;DR: In this paper, the authors propose making the router part of the virtual machine manager, rather than having only a switch in the VM manager, to avoid the need for virtual machines implementing gateways.
Abstract: A data center can share processing resources using virtual networks. A virtual machine manager (10) hosts one or more virtual machines (11, 411), the virtual machines forming part of a segmented virtual network (34). Outgoing messages from the virtual machines have an intermediate destination address of an intermediate node in a local segment of the segmented virtual network, and the virtual machine manager has a router (18) for determining a new intermediate destination address outside the local segment, for routing the given outgoing message. By having the router as part of the virtual machine manager rather than having only a switch in the virtual machine manager, the need for virtual machines for implementing gateways is avoided. This can reduce the number of “hops” for the message between virtual entities hosted, and thus improve performance. This can help a service provider to share physical processing resources of a data center between different clients having their own virtual networks.

18 May 2009
TL;DR: Preliminary experiments are described that suggest this approach of building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM, is viable.
Abstract: Technology trends may soon favor building main memory as a hybrid between DRAM and non-volatile memory, such as flash or PC-RAM. We describe how the operating system might manage such hybrid memories, using semantic information not available in other layers. We describe preliminary experiments suggesting that this approach is viable.
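A toy version of the kind of OS policy the abstract suggests: use page-level write counts (semantic information the OS can track) to keep write-hot pages in DRAM and place read-mostly pages in slower, write-limited non-volatile memory. The threshold and page names are invented for illustration:

```python
WRITE_HOT_THRESHOLD = 64  # writes per epoch (invented cutoff)

def place_page(writes_this_epoch: int) -> str:
    """Write-hot pages stay in DRAM; read-mostly pages go to NVM, where
    reads are cheap but writes are slow and wear the device."""
    return "DRAM" if writes_this_epoch >= WRITE_HOT_THRESHOLD else "NVM"

pages = {"heap:0x1000": 500, "code:0x4000": 0, "log:0x9000": 80}
placement = {p: place_page(w) for p, w in pages.items()}
# -> {'heap:0x1000': 'DRAM', 'code:0x4000': 'NVM', 'log:0x9000': 'DRAM'}
```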

Journal ArticleDOI
TL;DR: In this paper, the authors show that productivity in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads, and that a lack of attention leads to a drop in productivity which in many cases asymptotes to no uploads whatsoever.
Abstract: We show through an analysis of a massive data set from YouTube that productivity in crowdsourcing exhibits a strong positive dependence on attention, measured by the number of downloads. Conversely, a lack of attention leads to a decrease in the number of videos uploaded and the consequent drop in productivity, which in many cases asymptotes to no uploads whatsoever. Moreover, short-term contributors compare their performance to the average contributor's performance, while long-term contributors compare it to their own median performance.

Journal ArticleDOI
01 Aug 2009
TL;DR: This tutorial presents an overview of column-oriented database system technology and addresses questions including how easily a major row-based system can achieve column-store performance and what new applications column-stores can potentially enable.
Abstract: Column-oriented database systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional database systems that store entire records (rows) one after the other. Reading a subset of a table's columns becomes faster, at the potential expense of excessive disk-head seeking from column to column for scattered reads or updates. After several dozen research papers and at least a dozen new column-store start-ups, several questions remain. Are these a new breed of systems or simply old wine in new bottles? How easily can a major row-based system achieve column-store performance? Are column-stores the answer to effortlessly supporting large-scale data-intensive applications? What are the new, exciting system research problems to tackle? What are the new applications that can be potentially enabled by column-stores? In this tutorial, we present an overview of column-oriented database system technology and address these and other related questions.