
Showing papers by "Hewlett-Packard published in 2013"


Journal ArticleDOI
TL;DR: The performance requirements for computing with memristive devices are examined, along with how the outstanding challenges could be met.
Abstract: Memristive devices are electrical resistance switches that can retain a state of internal resistance based on the history of applied voltage and current. These devices can store and process information, and offer several key performance characteristics that exceed conventional integrated circuit technology. An important class of memristive devices are two-terminal resistance switches based on ionic motion, which are built from a simple conductor/insulator/conductor thin-film stack. These devices were originally conceived in the late 1960s and recent progress has led to fast, low-energy, high-endurance devices that can be scaled down to less than 10 nm and stacked in three dimensions. However, the underlying device mechanisms remain unclear, which is a significant barrier to their widespread application. Here, we review recent progress in the development and understanding of memristive devices. We also examine the performance requirements for computing with memristive devices and detail how the outstanding challenges could be met.

3,037 citations


Journal ArticleDOI
TL;DR: A neuristor built using two nanoscale Mott memristors, dynamical devices that exhibit transient memory and negative differential resistance arising from an insulating-to-conducting phase transition driven by Joule heating, exhibits the important neural functions of all-or-nothing spiking with signal gain and diverse periodic spiking.
Abstract: The Hodgkin-Huxley model for action potential generation in biological axons is central for understanding the computational capability of the nervous system and emulating its functionality. Owing to the historical success of silicon complementary metal-oxide-semiconductors, spike-based computing is primarily confined to software simulations and specialized analogue metal-oxide-semiconductor field-effect transistor circuits. However, there is interest in constructing physical systems that emulate biological functionality more directly, with the goal of improving efficiency and scale. The neuristor was proposed as an electronic device with properties similar to the Hodgkin-Huxley axon, but previous implementations were not scalable. Here we demonstrate a neuristor built using two nanoscale Mott memristors, dynamical devices that exhibit transient memory and negative differential resistance arising from an insulating-to-conducting phase transition driven by Joule heating. This neuristor exhibits the important neural functions of all-or-nothing spiking with signal gain and diverse periodic spiking, using materials and structures that are amenable to extremely high-density integration with or without silicon transistors.

792 citations


Journal ArticleDOI
TL;DR: It becomes critically important to study how the current approaches to standardization in this area can be improved, and to better understand the opportunities for the research community to contribute to the IoT field.
Abstract: Technologies to support the Internet of Things are becoming more important as the need to better understand our environments and make them smart increases. As a result, it is predicted that intelligent devices and networks, such as WSNs, will not be isolated, but connected and integrated, composing computer networks. So far, the IP-based Internet is the largest network in the world; therefore, great strides are being made to connect WSNs with the Internet. To this end, the IETF has developed a suite of protocols and open standards for accessing applications and services in wireless resource-constrained networks. However, many open challenges remain, mostly due to the complex deployment characteristics of such systems and the stringent requirements imposed by various services wishing to make use of such complex systems. Thus, it becomes critically important to study how the current approaches to standardization in this area can be improved, and at the same time better understand the opportunities for the research community to contribute to the IoT field. To this end, this article presents an overview of current standards and research activities in both industry and academia.

744 citations


Journal ArticleDOI
TL;DR: A framework is proposed for characterizing various dimensions of quality control, a critical issue in crowdsourcing systems.
Abstract: As a new distributed computing model, crowdsourcing lets people leverage the crowd's intelligence and wisdom toward solving problems. This article proposes a framework for characterizing various dimensions of quality control in crowdsourcing systems, a critical issue. The authors briefly review existing quality-control approaches, identify open issues, and look to future research directions. In the Web extra, the authors discuss both design-time and runtime approaches in more detail.

394 citations


Proceedings ArticleDOI
11 Aug 2013
TL;DR: An unsupervised model, called the Author Spamicity Model (ASM), is proposed; it works in the Bayesian setting, which facilitates modeling the spamicity of authors as latent and allows various observed behavioral footprints of reviewers to be exploited.
Abstract: Opinionated social media such as product reviews are now widely used by individuals and organizations for their decision making. However, driven by profit or fame, people try to game the system by opinion spamming (e.g., writing fake reviews) to promote or to demote some target products. In recent years, fake review detection has attracted significant attention from both the business and research communities. However, due to the difficulty of the human labeling needed for supervised learning and evaluation, the problem remains highly challenging. This work proposes a novel angle on the problem by modeling spamicity as latent. An unsupervised model, called the Author Spamicity Model (ASM), is proposed. It works in the Bayesian setting, which facilitates modeling the spamicity of authors as latent and allows us to exploit various observed behavioral footprints of reviewers. The intuition is that opinion spammers have different behavioral distributions than non-spammers. This creates a distributional divergence between the latent population distributions of two clusters: spammers and non-spammers. Model inference results in learning the population distributions of the two clusters. Several extensions of ASM are also considered, leveraging different priors. Experiments on a real-life Amazon review dataset demonstrate the effectiveness of the proposed models, which significantly outperform state-of-the-art competitors.
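The generative model itself is specified in the paper; purely to illustrate the intuition that spammers and non-spammers form clusters with divergent behavioral distributions, the Python sketch below compares two hypothetical histograms of a behavioral footprint using KL divergence. The feature, bins, and counts are assumptions, not the paper's data.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same bins."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Hypothetical histograms of one behavioral footprint (e.g., review "duplicity"
# bucketed into 5 bins) for the two latent clusters an ASM-like model would infer.
spammer_hist     = [2, 3, 10, 25, 60]   # mass concentrated at high duplicity
non_spammer_hist = [55, 25, 12, 6, 2]   # mass concentrated at low duplicity

print("KL(spammer || non-spammer):", kl_divergence(spammer_hist, non_spammer_hist))
```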

393 citations


Journal ArticleDOI
TL;DR: This paper presents an information-theoretic framework that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy, and vice versa.
Abstract: Ensuring the usefulness of electronic data sources while providing necessary privacy guarantees is an important unsolved problem. This problem drives the need for an analytical framework that can quantify the privacy of personally identifiable information while still providing a quantifiable benefit (utility) to multiple legitimate information consumers. This paper presents an information-theoretic framework that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy and vice versa. Specific contributions include: 1) stochastic data models for both categorical and numerical data; 2) utility-privacy tradeoff regions and the encoding (sanitization) schemes achieving them for both classes, and their practical relevance; and 3) modeling of prior knowledge at the user and/or data source and optimal encoding schemes for both cases.
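As a toy numerical illustration of the quantities in such a tradeoff (mutual information as privacy leakage, expected distortion as utility loss), the sketch below evaluates a randomized-response style sanitization channel over a small categorical source. The source distribution, channel, and Hamming distortion are illustrative assumptions, not the paper's models or optimal encoding schemes.

```python
import numpy as np

def mutual_information(p_x, channel):
    """I(X;Y) in bits for source p_x and sanitization channel p(y|x)."""
    p_xy = p_x[:, None] * channel            # joint distribution
    p_y = p_xy.sum(axis=0)
    ratio = np.where(p_xy > 0, p_xy / (p_x[:, None] * p_y[None, :]), 1.0)
    return float((p_xy * np.log2(ratio)).sum())

# Toy categorical source and an assumed randomized-response style sanitizer.
p_x = np.array([0.5, 0.3, 0.2])
keep = 0.8                                    # probability of reporting the true value
n = len(p_x)
channel = np.full((n, n), (1 - keep) / (n - 1))
np.fill_diagonal(channel, keep)

# Expected Hamming distortion: probability the sanitized value differs from the true one.
distortion = float((p_x[:, None] * channel * (1 - np.eye(n))).sum())
print(f"privacy leakage I(X;Y) = {mutual_information(p_x, channel):.3f} bits, "
      f"expected distortion = {distortion:.3f}")
```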

393 citations


Proceedings ArticleDOI
25 Jun 2013
TL;DR: The main observation is that natural human mobility, when combined with PHY layer information, can help in accurately estimating the angle and distance of a mobile device from a wireless access point (AP).
Abstract: Despite several years of innovative research, indoor localization is still not mainstream. Existing techniques either employ cumbersome fingerprinting or rely upon the deployment of additional infrastructure. Towards a solution that is easier to adopt, we propose CUPID, which is free from these restrictions, yet is comparable in accuracy. While existing WiFi-based solutions are highly susceptible to indoor multipath, CUPID utilizes physical layer (PHY) information to extract the signal strength and angle of only the direct path, successfully avoiding the effect of multipath reflections. Our main observation is that natural human mobility, when combined with PHY layer information, can help in accurately estimating the angle and distance of a mobile device from a wireless access point (AP). Real-world indoor experiments using off-the-shelf wireless chipsets confirm the feasibility of CUPID. In addition, while previous approaches rely on multiple APs, CUPID is able to localize a device when only a single AP is present. When a few more APs are available, CUPID can improve the median localization error to 2.7m, which is comparable to schemes that rely on expensive fingerprinting or additional infrastructure.
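The paper's estimator fuses PHY-layer (CSI) information with human mobility; purely as a rough illustration of the distance half of that idea, the sketch below inverts a standard log-distance path-loss model for the direct-path signal strength. The reference power and path-loss exponent are assumed values, and the angle estimation and mobility fusion are not modelled.

```python
def distance_from_direct_path_rss(rss_dbm, rss_at_1m_dbm=-40.0, path_loss_exp=3.0):
    """
    Invert the log-distance path-loss model to estimate distance (metres) from
    the direct-path signal strength.  rss_at_1m_dbm and path_loss_exp are
    assumed indoor values; CUPID itself extracts the direct-path energy from
    PHY-layer information, which is not modelled here.
    """
    return 10 ** ((rss_at_1m_dbm - rss_dbm) / (10.0 * path_loss_exp))

print(f"estimated distance: {distance_from_direct_path_rss(-67.0):.1f} m")
```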

384 citations


Journal ArticleDOI
21 Mar 2013 - Nature
TL;DR: A multi-directional diffractive backlight technology that permits the rendering of high-resolution, full-parallax 3D images in a very wide view zone (up to 180 degrees in principle) at an observation distance of up to a metre is introduced.
Abstract: Multiview three-dimensional (3D) displays can project the correct perspectives of a 3D image in many spatial directions simultaneously. They provide a 3D stereoscopic experience to many viewers at the same time with full motion parallax and do not require special glasses or eye tracking. None of the leading multiview 3D solutions is particularly well suited to mobile devices (watches, mobile phones or tablets), which require the combination of a thin, portable form factor, a high spatial resolution and a wide full-parallax view zone (for short viewing distance from potentially steep angles). Here we introduce a multi-directional diffractive backlight technology that permits the rendering of high-resolution, full-parallax 3D images in a very wide view zone (up to 180 degrees in principle) at an observation distance of up to a metre. The key to our design is a guided-wave illumination technique based on light-emitting diodes that produces wide-angle multiview images in colour from a thin planar transparent lightguide. Pixels associated with different views or colours are spatially multiplexed and can be independently addressed and modulated at video rate using an external shutter plane. To illustrate the capabilities of this technology, we use simple ink masks or a high-resolution commercial liquid-crystal display unit to demonstrate passive and active (30 frames per second) modulation of a 64-view backlight, producing 3D images with a spatial resolution of 88 pixels per inch and full-motion parallax in an unprecedented view zone of 90 degrees. We also present several transparent hand-held prototypes showing animated sequences of up to six different 200-view images at a resolution of 127 pixels per inch.

353 citations


Proceedings ArticleDOI
04 Nov 2013
TL;DR: A new Private Set Intersection (PSI) protocol that is extremely efficient and highly scalable compared with existing protocols, based on a novel approach that is oblivious Bloom intersection, which has linear complexity and relies mostly on efficient symmetric key operations.
Abstract: Large scale data processing brings new challenges to the design of privacy-preserving protocols: how to meet the increasing requirements of speed and throughput of modern applications, and how to scale up smoothly when the data being protected is big. Efficiency and scalability become critical criteria for privacy-preserving protocols in the age of Big Data. In this paper, we present a new Private Set Intersection (PSI) protocol that is extremely efficient and highly scalable compared with existing protocols. The protocol is based on a novel approach that we call oblivious Bloom intersection. It has linear complexity and relies mostly on efficient symmetric key operations. It has high scalability due to the fact that most operations can be parallelized easily. The protocol has two versions: a basic protocol and an enhanced protocol; the security of the two variants is analyzed and proved in the semi-honest model and the malicious model, respectively. A prototype of the basic protocol has been built. We report the result of performance evaluation and compare it against the two previously fastest PSI protocols. Our protocol is orders of magnitude faster than these two protocols. To compute the intersection of two million-element sets, our protocol needs only 41 seconds (80-bit security) and 339 seconds (256-bit security) on moderate hardware in parallel mode.
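The oblivious Bloom intersection itself combines garbled Bloom filters with oblivious transfer; to make the underlying data structure concrete, the sketch below shows only the plain, non-private Bloom-filter intersection that the protocol builds on. The filter parameters and SHA-256-based hash construction are assumptions for illustration.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=4):
        self.m, self.k = num_bits, num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-256 digests (an assumed construction).
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

# Non-private strawman: the server publishes a Bloom filter of its set and the
# client tests its own elements.  (The actual protocol hides both sets behind a
# garbled Bloom filter and oblivious transfer.)
server_set = {"alice@example.com", "bob@example.com", "carol@example.com"}
client_set = {"bob@example.com", "dave@example.com"}

bf = BloomFilter()
for element in server_set:
    bf.add(element)

intersection = {e for e in client_set if bf.might_contain(e)}
print(intersection)   # {'bob@example.com'}, barring rare false positives
```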

352 citations


Proceedings Article
01 Jan 2013
TL;DR: This work exploits the bursty nature of reviews to identify review spammers and proposes a novel evaluation method to evaluate the detected spammers automatically using supervised classification of their reviews; the proposed method outperforms strong baselines.
Abstract: Online product reviews have become an important source of user opinions. Driven by profit or fame, imposters have been writing deceptive or fake reviews to promote and/or to demote some target products or services. Such imposters are called review spammers. In the past few years, several approaches have been proposed to deal with the problem. In this work, we take a different approach, which exploits the bursty nature of reviews to identify review spammers. Bursts of reviews can be due either to the sudden popularity of products or to spam attacks. Reviewers and reviews appearing in a burst are often related in the sense that spammers tend to work with other spammers and genuine reviewers tend to appear together with other genuine reviewers. This paves the way for us to build a network of reviewers appearing in different bursts. We then model reviewers and their co-occurrence in bursts as a Markov Random Field (MRF), and employ the Loopy Belief Propagation (LBP) method to infer whether a reviewer is a spammer or not in the graph. We also propose several features and employ feature-induced message passing in the LBP framework for network inference. We further propose a novel evaluation method to evaluate the detected spammers automatically using supervised classification of their reviews. Additionally, we employ domain experts to perform a human evaluation of the identified spammers and non-spammers. Both the classification result and the human evaluation result show that the proposed method outperforms strong baselines, demonstrating the effectiveness of the method.
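For readers unfamiliar with Loopy Belief Propagation on a pairwise MRF, the following sketch runs sum-product message passing over a tiny reviewer co-occurrence graph. The graph, node priors, and homophily potential are made-up values, and the paper's feature-induced message passing is not reproduced.

```python
import numpy as np

# Toy reviewer co-occurrence graph: nodes are reviewers, edges mean "appeared in
# the same review burst".  Priors and potentials are illustrative assumptions.
edges = [(0, 1), (1, 2), (2, 3)]
priors = np.array([[0.9, 0.1],    # reviewer 0: behavioral features look benign
                   [0.5, 0.5],    # reviewers 1 and 2: uninformative priors
                   [0.5, 0.5],
                   [0.1, 0.9]])   # reviewer 3: features look like a spammer
# Pairwise potential encoding homophily: spammers tend to co-occur with spammers.
psi = np.array([[0.8, 0.2],
                [0.2, 0.8]])

def loopy_bp(edges, priors, psi, iters=20):
    n = priors.shape[0]
    msgs = {(i, j): np.ones(2) for a, b in edges for i, j in [(a, b), (b, a)]}
    nbrs = {v: [] for v in range(n)}
    for a, b in edges:
        nbrs[a].append(b); nbrs[b].append(a)
    for _ in range(iters):
        new = {}
        for (i, j) in msgs:
            incoming = np.prod([msgs[(k, i)] for k in nbrs[i] if k != j], axis=0) \
                       if len(nbrs[i]) > 1 else np.ones(2)
            m = psi.T @ (priors[i] * incoming)   # sum-product message i -> j
            new[(i, j)] = m / m.sum()
        msgs = new
    beliefs = []
    for v in range(n):
        b = priors[v] * np.prod([msgs[(k, v)] for k in nbrs[v]], axis=0)
        beliefs.append(b / b.sum())
    return np.array(beliefs)

print(loopy_bp(edges, priors, psi))   # column 1 ~ posterior probability of "spammer"
```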

319 citations


Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work proposes mapping part of a process's linear virtual address space with a direct segment, while page mapping the rest of the virtual address space, to remove the TLB miss overhead for big-memory workloads.
Abstract: Our analysis shows that many "big-memory" server workloads, such as databases, in-memory caches, and graph analytics, pay a high cost for page-based virtual memory. They consume as much as 10% of execution cycles on TLB misses, even using large pages. On the other hand, we find that these workloads use read-write permission on most pages, are provisioned not to swap, and rarely benefit from the full flexibility of page-based virtual memory. To remove the TLB miss overhead for big-memory workloads, we propose mapping part of a process's linear virtual address space with a direct segment, while page mapping the rest of the virtual address space. Direct segments use minimal hardware---base, limit and offset registers per core---to map contiguous virtual memory regions directly to contiguous physical memory. They eliminate the possibility of TLB misses for key data structures such as database buffer pools and in-memory key-value stores. Memory mapped by a direct segment may be converted back to paging when needed. We prototype direct-segment software support for x86-64 in Linux and emulate direct-segment hardware. For our workloads, direct segments eliminate almost all TLB misses and reduce the execution time wasted on TLB misses to less than 0.5%.
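A minimal software sketch of the translation rule implied above, assuming hypothetical register values and a toy dictionary in place of a real page table:

```python
PAGE_SIZE = 4096

def translate(vaddr, seg_base, seg_limit, seg_offset, page_table):
    """
    Virtual addresses inside [seg_base, seg_limit) bypass the TLB/page table
    entirely and are mapped by a single offset; everything else falls back to
    paging.  The register values and flat dict "page table" are illustrative.
    """
    if seg_base <= vaddr < seg_limit:
        return vaddr + seg_offset                    # contiguous direct mapping
    vpn, page_off = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + page_off    # conventional page lookup

page_table = {0: 7, 1: 3}                            # VPN -> PFN (toy mapping)
print(hex(translate(0x1_0000_0000, seg_base=0x1_0000_0000,
                    seg_limit=0x2_0000_0000, seg_offset=0x4000_0000,
                    page_table=page_table)))         # hits the direct segment
print(hex(translate(0x1234, 0x1_0000_0000, 0x2_0000_0000, 0x4000_0000,
                    page_table)))                    # falls back to paging
```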

Journal ArticleDOI
TL;DR: This work proposes a multiphase distributed vulnerability detection, measurement, and countermeasure selection mechanism called NICE, which is built on attack graph-based analytical models and reconfigurable virtual network-based countermeasures to significantly improve attack detection and mitigate attack consequences.
Abstract: Cloud security is one of the most important issues that has attracted a lot of research and development effort in the past few years. In particular, attackers can explore vulnerabilities of a cloud system and compromise virtual machines to deploy further large-scale Distributed Denial-of-Service (DDoS) attacks. DDoS attacks usually involve early-stage actions such as multistep exploitation, low-frequency vulnerability scanning, and compromising identified vulnerable virtual machines as zombies, and finally launching DDoS attacks through the compromised zombies. Within the cloud system, especially Infrastructure-as-a-Service (IaaS) clouds, the detection of zombie exploration attacks is extremely difficult. This is because cloud users may install vulnerable applications on their virtual machines. To prevent vulnerable virtual machines from being compromised in the cloud, we propose a multiphase distributed vulnerability detection, measurement, and countermeasure selection mechanism called NICE, which is built on attack graph-based analytical models and reconfigurable virtual network-based countermeasures. The proposed framework leverages OpenFlow network programming APIs to build a monitor and control plane over distributed programmable virtual switches to significantly improve attack detection and mitigate attack consequences. The system and security evaluations demonstrate the efficiency and effectiveness of the proposed solution.

Book ChapterDOI
Siani Pearson1
01 Jan 2013
TL;DR: This chapter assesses how security, trust and privacy issues occur in the context of cloud computing and discusses ways in which they may be addressed.
Abstract: Cloud computing refers to the underlying infrastructure for an emerging model of service provision that has the advantage of reducing cost by sharing computing and storage resources, combined with an on-demand provisioning mechanism relying on a pay-per-use business model. These new features have a direct impact on information technology (IT) budgeting but also affect traditional security, trust and privacy mechanisms. The advantages of cloud computing—its ability to scale rapidly, store data remotely and share services in a dynamic environment—can become disadvantages in maintaining a level of assurance sufficient to sustain confidence in potential customers. Some core traditional mechanisms for addressing privacy (such as model contracts) are no longer flexible or dynamic enough, so new approaches need to be developed to fit this new paradigm. In this chapter, we assess how security, trust and privacy issues occur in the context of cloud computing and discuss ways in which they may be addressed.

Journal ArticleDOI
TL;DR: A quantitative and qualitative assessment of 15 algorithms for sentence scoring available in the literature is described, and directions to improve the sentence extraction results obtained are suggested.
Abstract: Text summarization is the process of automatically creating a shorter version of one or more text documents. It is an important way of finding relevant information in large text libraries or on the Internet. Essentially, text summarization techniques are classified as extractive and abstractive. Extractive techniques perform text summarization by selecting sentences of documents according to some criteria. Abstractive summaries attempt to improve the coherence among sentences by eliminating redundancies and clarifying the context of sentences. Sentence scoring is the technique most used for extractive text summarization. This paper describes and performs a quantitative and qualitative assessment of 15 algorithms for sentence scoring available in the literature. Three different datasets (News, Blogs and Article contexts) were evaluated. In addition, directions to improve the sentence extraction results obtained are suggested.
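As a concrete (and deliberately simplistic) example of sentence scoring, the sketch below ranks sentences by a classic word-frequency criterion; it stands in for just one of the 15 scoring algorithms the paper assesses, and the toy document is an assumption.

```python
import re
from collections import Counter

def summarize(text, num_sentences=2):
    """Extractive summary using word-frequency sentence scoring."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    top = set(sorted(sentences, key=score, reverse=True)[:num_sentences])
    # Keep the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)

doc = ("Text summarization creates a shorter version of a document. "
       "Extractive techniques select sentences according to scoring criteria. "
       "Word frequency is one of the simplest scoring criteria.")
print(summarize(doc))
```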

Journal ArticleDOI
TL;DR: A new framework is presented that abstracts both the privacy and the utility requirements of smart meter data and exploits the presence of high-power but less private appliance spectra as implicit distortion noise to create an optimal privacy-preserving solution.
Abstract: The solutions offered to date for end-user privacy in smart meter measurements, a well-known challenge in the smart grid, have been tied to specific technologies such as batteries or assumptions on data usage, without quantifying the loss of benefit (utility) that results from any such approach. Using tools from information theory and a hidden Markov model for the measurements, a new framework is presented that abstracts both the privacy and the utility requirements of smart meter data. This leads to a novel privacy-utility tradeoff problem with minimal assumptions that is tractable. For a stationary Gaussian model of the electricity load, it is shown that for a desired mean-square distortion (utility) measure between the measured and revealed data, the optimal privacy-preserving solution: i) exploits the presence of high-power but less private appliance spectra as implicit distortion noise, and ii) filters out frequency components with lower power relative to a distortion threshold; this approach encompasses many previously proposed approaches to smart meter privacy.
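The optimal solution in the paper comes from an information-theoretic analysis of a Gaussian load model; the sketch below only illustrates the qualitative filtering behaviour, zeroing out low-power spectral components of a synthetic load trace. The trace and threshold are assumptions for illustration.

```python
import numpy as np

def suppress_low_power_components(load, power_threshold):
    """
    Zero out frequency components whose power falls below a threshold, as a
    crude stand-in for the spectral filtering behaviour described above.
    """
    spectrum = np.fft.rfft(load)
    power = np.abs(spectrum) ** 2 / len(load)
    spectrum[power < power_threshold] = 0.0
    return np.fft.irfft(spectrum, n=len(load))

t = np.arange(0, 24, 0.25)                                        # one day, 15-min steps
load = (2.0 + 1.5 * np.sin(2 * np.pi * t / 24)                    # synthetic kW trace
        + 0.1 * np.random.default_rng(0).normal(size=t.size))
revealed = suppress_low_power_components(load, power_threshold=0.5)
print(f"max |original - revealed| = {np.max(np.abs(load - revealed)):.3f} kW")
```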

Journal ArticleDOI
01 Nov 2013
TL;DR: Big data is changing the landscape of security tools for network monitoring, security information and event management, and forensics; however, in the eternal arms race of attack and defense, security researchers must keep exploring novel ways to mitigate and contain sophisticated attackers.
Abstract: Big data is changing the landscape of security tools for network monitoring, security information and event management, and forensics; however, in the eternal arms race of attack and defense, security researchers must keep exploring novel ways to mitigate and contain sophisticated attackers.

Proceedings ArticleDOI
07 Dec 2013
TL;DR: Kiln is a persistent memory design that adopts a nonvolatile cache and a nonvolatile main memory to enable atomic in-place updates without logging or copy-on-write and can achieve 2× performance improvement compared with NVRAM-based persistent memory with write-ahead logging.
Abstract: Persistent memory is an emerging technology which allows in-memory persistent data objects to be updated at much higher throughput than when using disks as persistent storage. Previous persistent memory designs use logging or copy-on-write mechanisms to update persistent data, which unfortunately reduces the system performance to roughly half that of a native system with no persistence support. One of the great challenges in this application class is therefore how to efficiently enable atomic, consistent, and durable updates to ensure data persistence that survives application and/or system failures. Our goal is to design a persistent memory system with performance very close to that of a native system. We propose Kiln, a persistent memory design that adopts a nonvolatile cache and a nonvolatile main memory to enable atomic in-place updates without logging or copy-on-write. Our evaluation shows that Kiln can achieve 2× performance improvement compared with NVRAM-based persistent memory with write-ahead logging. In addition, our design has numerous practical advantages: a simple and intuitive abstract interface, microarchitecture-level optimizations, fast recovery from failures, and eliminating redundant writes to nonvolatile storage media.

Journal ArticleDOI
TL;DR: Simulation results demonstrate that the proposed data-gathering algorithm can greatly shorten the moving distance of the collectors compared with the covering line approximation algorithm and is close to the optimal algorithm for small networks.
Abstract: In this paper, we propose a new data-gathering mechanism for large-scale wireless sensor networks by introducing mobility into the network. A mobile data collector, for convenience called an M-collector in this paper, could be a mobile robot or a vehicle equipped with a powerful transceiver and battery, working like a mobile base station and gathering data while moving through the field. An M-collector starts the data-gathering tour periodically from the static data sink, polls each sensor while traversing its transmission range, then directly collects data from the sensor in single-hop communications, and finally transports the data to the static sink. Since data packets are directly gathered without relays and collisions, the lifetime of sensors is expected to be prolonged. In this paper, we mainly focus on the problem of minimizing the length of each data-gathering tour and refer to this as the single-hop data-gathering problem (SHDGP). We first formalize the SHDGP into a mixed-integer program and then present a heuristic tour-planning algorithm for the case where a single M-collector is employed. For the applications with strict distance/time constraints, we consider utilizing multiple M-collectors and propose a data-gathering algorithm where multiple M-collectors traverse through several shorter subtours concurrently to satisfy the distance/time constraints. Our single-hop mobile data-gathering scheme can improve the scalability and balance the energy consumption among sensors. It can be used in both connected and disconnected networks. Simulation results demonstrate that the proposed data-gathering algorithm can greatly shorten the moving distance of the collectors compared with the covering line approximation algorithm and is close to the optimal algorithm for small networks. In addition, the proposed data-gathering scheme can significantly prolong the network lifetime compared with a network with a static data sink or a network in which the mobile collector can only move along straight lines.
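The paper formalizes SHDGP as a mixed-integer program and proposes its own tour-planning heuristic; purely as a baseline illustration of what a single-M-collector tour looks like, the sketch below builds a nearest-neighbour tour over assumed polling positions.

```python
import math

def nearest_neighbour_tour(sink, points):
    """
    Build a data-gathering tour that starts and ends at the static sink,
    always visiting the closest unvisited polling point next.  This is a
    generic TSP-style baseline, not the paper's SHDGP heuristic.
    """
    tour, current, remaining = [sink], sink, list(points)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        tour.append(nxt)
        current = nxt
    tour.append(sink)
    return tour, sum(math.dist(a, b) for a, b in zip(tour, tour[1:]))

sink = (0.0, 0.0)
sensors = [(3, 4), (6, 1), (2, 7), (8, 5)]          # assumed polling positions (metres)
tour, length = nearest_neighbour_tour(sink, sensors)
print(tour, f"tour length = {length:.1f} m")
```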

Proceedings ArticleDOI
27 Aug 2013
TL;DR: ElasticSwitch is an efficient and practical approach for providing bandwidth guarantees and is work-conserving, even in challenging situations, and can be fully implemented in hypervisors, without requiring a specific topology or any support from switches.
Abstract: While cloud computing providers offer guaranteed allocations for resources such as CPU and memory, they do not offer any guarantees for network resources. The lack of network guarantees prevents tenants from predicting lower bounds on the performance of their applications. The research community has recognized this limitation but, unfortunately, prior solutions have significant limitations: either they are inefficient, because they are not work-conserving, or they are impractical, because they require expensive switch support or congestion-free network cores. In this paper, we propose ElasticSwitch, an efficient and practical approach for providing bandwidth guarantees. ElasticSwitch is efficient because it utilizes the spare bandwidth from unreserved capacity or underutilized reservations. ElasticSwitch is practical because it can be fully implemented in hypervisors, without requiring a specific topology or any support from switches. Because hypervisors operate mostly independently, there is no need for complex coordination between them or with a central controller. Our experiments, with a prototype implementation on a 100-server testbed, demonstrate that ElasticSwitch provides bandwidth guarantees and is work-conserving, even in challenging situations.
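ElasticSwitch's actual guarantee-partitioning and rate-allocation layers are more sophisticated; the toy allocation below only illustrates combining guarantees with work-conservation, giving each tenant its guaranteed rate first and then redistributing spare link capacity to tenants with unmet demand. The numbers and the equal-share policy are assumptions.

```python
def allocate_rates(link_capacity, guarantees, demands):
    """Toy work-conserving allocation: guarantees first, then share the spare."""
    rates = {t: min(demands[t], guarantees[t]) for t in demands}
    spare = link_capacity - sum(rates.values())
    hungry = {t for t in demands if demands[t] > rates[t]}
    while spare > 1e-9 and hungry:
        share = spare / len(hungry)
        for t in list(hungry):
            extra = min(share, demands[t] - rates[t])
            rates[t] += extra
            spare -= extra
            if demands[t] - rates[t] < 1e-9:
                hungry.remove(t)
    return rates

# Tenant B soaks up the capacity tenant A leaves unused (rates in Gbps, assumed).
print(allocate_rates(10.0,
                     guarantees={"A": 4.0, "B": 4.0},
                     demands={"A": 2.0, "B": 9.0}))
```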

Proceedings ArticleDOI
23 Jun 2013
TL;DR: This work argues for an alternate architecture---Thin Servers with Smart Pipes (TSSP)---for cost-effective high-performance memcached deployment, demonstrates the potential benefits of the TSSP architecture through an FPGA prototyping platform, and shows the potential for a 6X-16X power-performance improvement over conventional server baselines.
Abstract: Distributed in-memory key-value stores, such as memcached, are central to the scalability of modern internet services. Current deployments use commodity servers with high-end processors. However, given the cost-sensitivity of internet services and the recent proliferation of volume low-power System-on-Chip (SoC) designs, we see an opportunity for alternative architectures. We undertake a detailed characterization of memcached to reveal performance and power inefficiencies. Our study considers both high-performance and low-power CPUs and NICs across a variety of carefully-designed benchmarks that exercise the range of memcached behavior. We discover that, regardless of CPU microarchitecture, memcached execution is remarkably inefficient, saturating neither network links nor available memory bandwidth. Instead, we find performance is typically limited by the per-packet processing overheads in the NIC and OS kernel---long code paths limit CPU performance due to poor branch predictability and instruction fetch bottlenecks. Our insights suggest that neither high-performance nor low-power cores provide a satisfactory power-performance trade-off, and point to a need for tighter integration of the network interface. Hence, we argue for an alternate architecture---Thin Servers with Smart Pipes (TSSP)---for cost-effective high-performance memcached deployment. TSSP couples an embedded-class low-power core to a memcached accelerator that can process GET requests entirely in hardware, offloading both network handling and data lookup. We demonstrate the potential benefits of our TSSP architecture through an FPGA prototyping platform, and show the potential for a 6X-16X power-performance improvement over conventional server baselines.

Journal ArticleDOI
TL;DR: Combining the power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks for manycore designs at the 22nm technology node shows that 8-core clustering gives the best energy-delay product, whereas when die area is taken into account, 4-core clustering gives the best EDA²P and EDAP.
Abstract: This article introduces McPAT, an integrated power, area, and timing modeling framework that supports comprehensive design space exploration for multicore and manycore processor configurations ranging from 90nm to 22nm and beyond. At the microarchitectural level, McPAT includes models for the fundamental components of a complete chip multiprocessor, including in-order and out-of-order processor cores, networks-on-chip, shared caches, and integrated system components such as memory controllers and Ethernet controllers. At the circuit level, McPAT supports detailed modeling of critical-path timing, area, and power. At the technology level, McPAT models timing, area, and power for the device types forecast in the ITRS roadmap. McPAT has a flexible XML interface to facilitate its use with many performance simulators. Combined with a performance simulator, McPAT enables architects to accurately quantify the cost of new ideas and assess trade-offs of different architectures using new metrics such as the Energy-Delay-Area² Product (EDA²P) and the Energy-Delay-Area Product (EDAP). This article explores the interconnect options of future manycore processors by varying the degree of clustering over generations of process technologies. Clustering will bring interesting trade-offs between area and performance because the interconnects needed to group cores into clusters incur area overhead, but many applications can make good use of them due to synergies from cache sharing. Combining the power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks for manycore designs at the 22nm technology node shows that 8-core clustering gives the best energy-delay product, whereas when die area is taken into account, 4-core clustering gives the best EDA²P and EDAP.
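For reference, the composite metrics mentioned above are simple products of energy, delay, and area; a small sketch with made-up input values to show the arithmetic:

```python
def energy_metrics(energy_j, delay_s, area_mm2):
    """Composite design-space metrics computed from a design point's energy, delay, and area."""
    return {
        "EDP":   energy_j * delay_s,                  # energy-delay product
        "EDAP":  energy_j * delay_s * area_mm2,       # energy-delay-area product
        "EDA2P": energy_j * delay_s * area_mm2 ** 2,  # energy-delay-area^2 product
    }

# Illustrative numbers only, not results from McPAT.
print(energy_metrics(energy_j=1.2, delay_s=0.05, area_mm2=140.0))
```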

Proceedings ArticleDOI
07 Dec 2013
TL;DR: Widx is introduced, an on-chip accelerator for database hash index lookups, which achieves both high performance and flexibility by decoupling key hashing from the list traversal, and processing multiple keys in parallel on a set of programmable walker units.
Abstract: The explosive growth in digital data and its growing role in real-time decision support motivate the design of high-performance database management systems (DBMSs). Meanwhile, slowdown in supply voltage scaling has stymied improvements in core performance and ushered in an era of power-limited chips. These developments motivate the design of DBMS accelerators that (a) maximize utility by accelerating the dominant operations, and (b) provide flexibility in the choice of DBMS, data layout, and data types. We study data analytics workloads on contemporary in-memory databases and find hash index lookups to be the largest single contributor to the overall execution time. The critical path in hash index lookups consists of ALU-intensive key hashing followed by pointer chasing through a node list. Based on these observations, we introduce Widx, an on-chip accelerator for database hash index lookups, which achieves both high performance and flexibility by (1) decoupling key hashing from the list traversal, and (2) processing multiple keys in parallel on a set of programmable walker units. Widx reduces design cost and complexity through its tight integration with a conventional core, thus eliminating the need for a dedicated TLB and cache. An evaluation of Widx on a set of modern data analytics workloads (TPC-H, TPC-DS) using full-system simulation shows an average speedup of 3.1× over an aggressive OoO core on bulk hash table operations, while reducing the OoO core energy by 83%.
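A rough software analogue of the decoupling idea, hashing a batch of probe keys before walking the bucket node lists; the bucket count, hash function, and data are illustrative assumptions, and the programmable hardware walkers are not modelled.

```python
# Stage 1 hashes a batch of probe keys; stage 2 "walks" each bucket's node list.
NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]          # each bucket is a node list

def insert(key, payload):
    buckets[hash(key) % NUM_BUCKETS].append((key, payload))

def batched_lookup(keys):
    hashed = [(k, hash(k) % NUM_BUCKETS) for k in keys]    # stage 1: key hashing
    results = {}
    for key, b in hashed:                                   # stage 2: list walking
        results[key] = next((v for k, v in buckets[b] if k == key), None)
    return results

# Toy data standing in for a database hash index.
for order_id, customer in [(101, "acme"), (102, "globex"), (103, "initech")]:
    insert(order_id, customer)
print(batched_lookup([102, 999]))
```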

Journal ArticleDOI
TL;DR: In this article, a predictive model for memristor device behavior is proposed that can be used in simulations and to guide designs.
Abstract: A key requirement for using memristors in circuits is a predictive model for device behavior that can be used in simulations and to guide designs. We analyze one of the most promising materials, tantalum oxide, for high density, low power, and high-speed memory. We perform an ensemble of measurements, including time dynamics across nine decades, to deduce the underlying state equations describing the switching in Pt/TaOx/Ta memristors. A predictive, compact model is found in good agreement with the measured data. The resulting model, compatible with SPICE, is then used to understand trends in terms of switching times and energy consumption, which in turn are important for choosing device operating points and handling interactions with other circuit elements.
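The paper's fitted TaOx state equations are not reproduced here; as an illustration of how a compact model of the general form ds/dt = f(s, v) with a state-dependent conductance would be stepped in a simulator, consider the following sketch with placeholder dynamics.

```python
import math

def simulate_memristor(v_of_t, dt=1e-9, steps=2000, s0=0.1):
    """
    Step a generic memristive state equation ds/dt = f(s, v) with forward Euler.
    The particular f and conductance law below are illustrative placeholders,
    not the fitted Pt/TaOx/Ta equations from the paper.
    """
    s, trace = s0, []
    for n in range(steps):
        v = v_of_t(n * dt)
        dsdt = 1e7 * math.sinh(v) * s * (1 - s)      # assumed switching dynamics
        s = min(max(s + dsdt * dt, 0.0), 1.0)        # state bounded in [0, 1]
        g = s * 1e-3 + (1 - s) * 1e-6                # conductance between ON and OFF
        trace.append((v, g * v, s))                  # (voltage, current, state)
    return trace

# 1 us positive pulse followed by 1 us negative pulse (assumed waveform).
trace = simulate_memristor(lambda t: 1.0 if t < 1e-6 else -1.0)
print(f"final state = {trace[-1][2]:.3f}")
```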

Journal ArticleDOI
TL;DR: Joule-heating induced conductance-switching is studied in VO2, a Mott insulator, using complementary in situ techniques including optical characterization, blackbody microscopy, scanning transmission X-ray microscopy and numerical simulations.
Abstract: Joule-heating induced conductance-switching is studied in VO2, a Mott insulator. Complementary in situ techniques including optical characterization, blackbody microscopy, scanning transmission X-ray microscopy (STXM) and numerical simulations are used. Abrupt redistribution in local temperature is shown to occur upon conductance-switching along with a structural phase transition, at the same current.

Proceedings ArticleDOI
17 Jun 2013
TL;DR: Two algorithms for data centers are developed that combine workload scheduling and local power generation to avoid the coincident peak and reduce energy expenditure; the algorithms are evaluated via numerical simulations based on real-world traces from production systems.
Abstract: Demand response is a crucial aspect of the future smart grid. It has the potential to provide significant peak demand reduction and to ease the incorporation of renewable energy into the grid. Data centers' participation in demand response is becoming increasingly important given their high and increasing energy consumption and the flexibility in demand management in data centers compared to conventional industrial facilities. In this extended abstract we briefly describe recent work in our full paper on two demand response schemes to reduce a data center's peak loads and energy expenditure: workload shifting and the use of local power generation. In our full paper, we conduct a detailed characterization study of coincident peak data over two decades from Fort Collins Utilities, Colorado, and then develop two algorithms for data centers by combining workload scheduling and local power generation to avoid the coincident peak and reduce the energy expenditure. The first algorithm optimizes the expected cost and the second one provides a good worst-case guarantee for any coincident peak pattern. We evaluate these algorithms via numerical simulations based on real-world traces from production systems. The results show that using workload shifting in combination with local generation can provide significant cost savings (up to 40% in the Fort Collins Utilities' case) compared to either alone.
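The paper's two algorithms optimize expected cost and worst-case cost; purely to illustrate the workload-shifting ingredient, the sketch below naively moves deferrable load out of hours flagged by a coincident-peak warning into the least-loaded other hours. All numbers are made up.

```python
def shift_deferrable_load(base_load, deferrable, peak_warning_hours, capacity):
    """
    Naively move deferrable work out of warned hours into the least-loaded
    other hours, subject to a per-hour capacity.  This is only an illustration
    of workload shifting, not the paper's expected-cost or worst-case algorithms.
    """
    load = list(base_load)
    displaced = sum(deferrable[h] for h in peak_warning_hours)
    candidates = [h for h in range(len(load)) if h not in peak_warning_hours]
    for h in sorted(candidates, key=lambda h: load[h]):      # fill emptiest hours first
        moved = min(max(0.0, capacity - load[h]), displaced)
        load[h] += moved
        displaced -= moved
    for h in peak_warning_hours:                             # warned hours keep only the base load
        load[h] = base_load[h] - deferrable[h]
    return load, displaced                                   # leftover unplaced work, if any

base  = [60, 55, 80, 90, 70, 65]      # kW per hour, assumed
defer = [10, 10, 30, 35, 10, 10]      # deferrable portion per hour, assumed
print(shift_deferrable_load(base, defer, peak_warning_hours=[2, 3], capacity=85))
```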

Journal ArticleDOI
TL;DR: The electrical performance and scalability of this composite material of Pt nanoparticles dispersed in silicon dioxide are examined, and devices with ultrafast switching, long state retention, and high endurance are demonstrated.
Abstract: Highly reproducible bipolar resistance switching was recently demonstrated in a composite material of Pt nanoparticles dispersed in silicon dioxide. Here, we examine the electrical performance and scalability of this system and demonstrate devices with ultrafast switching, long state retention, and high endurance (3 × 10⁷ cycles). A possible switching mechanism based on ion motion in the film is discussed in light of these observations.

Journal ArticleDOI
TL;DR: Two algorithms are developed for data centers that combine workload scheduling and local power generation to avoid the coincident peak and reduce energy expenditure; the results show that workload shifting in combination with local generation can provide significant cost savings.

Proceedings ArticleDOI
27 Aug 2013
TL;DR: A framework, Atlas, is presented that incorporates application-awareness into Software-Defined Networking (SDN), which is currently capable of L2/3/4-based policy enforcement but agnostic to higher layers.
Abstract: We present a framework, Atlas, which incorporates application-awareness into Software-Defined Networking (SDN); SDN is currently capable of L2/3/4-based policy enforcement but agnostic to higher layers. Atlas enables fine-grained, accurate and scalable application classification in SDN. It employs a machine learning (ML) based traffic classification technique, uses a crowd-sourcing approach to obtain ground truth data, and leverages SDN's data reporting mechanism and centralized control. We prototype Atlas on HP Labs wireless networks and observe 94% accuracy on average for the top 40 Android applications.

Proceedings ArticleDOI
11 Aug 2013
TL;DR: KAURI is proposed, a graph-based framework to collectively link all the named entity mentions in all tweets posted by a user via modeling the user's topics of interest; experimental results show that KAURI significantly outperforms the baseline methods in terms of accuracy, and that KAURI is efficient and scales well to tweet streams.
Abstract: Twitter has become an increasingly important source of information, with more than 400 million tweets posted per day. The task of linking the named entity mentions detected from tweets with the corresponding real world entities in the knowledge base is called tweet entity linking. This task is of practical importance and can facilitate many different tasks, such as personalized recommendation and user interest discovery. The tweet entity linking task is challenging due to the noisy, short, and informal nature of tweets. Previous methods focus on linking entities in Web documents, and largely rely on the context around the entity mention and the topical coherence between entities in the document. However, these methods cannot be effectively applied to the tweet entity linking task due to the insufficient context information contained in a tweet. In this paper, we propose KAURI, a graph-based framework to collectively link all the named entity mentions in all tweets posted by a user via modeling the user's topics of interest. Our assumption is that each user has an underlying topic interest distribution over various named entities. KAURI integrates the intra-tweet local information with the inter-tweet user interest information into a unified graph-based framework. We extensively evaluated the performance of KAURI over a manually annotated tweet corpus, and the experimental results show that KAURI significantly outperforms the baseline methods in terms of accuracy, and that KAURI is efficient and scales well to tweet streams.

Journal ArticleDOI
TL;DR: Nitrogen-vacancy (NV) centers in diamond have been explored for their potential as a solid-state alternative to trapped ions for quantum computing, and have shown an unprecedented capability to perform certain quantum processing and storage operations at room temperature.
Abstract: Much of the motivation for exploring nitrogen-vacancy (NV) centers in diamond in the past decade has been for their potential as a solid-state alternative to trapped ions for quantum computing. In this area, the NV center has exceeded expectations and even shown an unprecedented capability to perform certain quantum processing and storage operations at room temperature. The ability to operate in ambient conditions, combined with the atom-like magnetic Zeeman sensitivity, has also led to intensive investigation of NV centers as nanoscale magnetometers. Thus, aside from room-temperature solid-state quantum computers, the NV could also be used to image individual spins in biological systems, eventually leading to a new level of understanding of biomolecular interactions in living cells.