Showing papers by "Hewlett-Packard" published in 2019

Open Access

Posted Content•DOI•

KofamKOALA: KEGG ortholog assignment based on profile HMM and adaptive score threshold

Takuya Aramaki¹, Romain Blanc-Mathieu¹, Hisashi Endo¹, Koichi Ohkubo¹,², Minoru Kanehisa¹, Susumu Goto, Hiroyuki Ogata¹•Institutions (2)

Kyoto University¹, Hewlett-Packard²

08 Apr 2019-bioRxiv

TL;DR: KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds.

Abstract: Summary: KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds. KofamKOALA is faster than existing KO assignment tools with its accuracy being comparable to the best performing tools. Function annotation by KofamKOALA helps linking genes to KEGG resources such as the KEGG pathway maps and facilitates molecular network reconstruction. Availability: KofamKOALA, KofamScan, and KOfam are freely available from https://www.genome.jp/tools/kofamkoala/. Contact: ogata@kuicr.kyoto-u.ac.jp
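
The thresholding idea in this abstract can be illustrated with a minimal sketch. It assumes that a KO is assigned when a hit's score meets or exceeds that KO's own pre-computed threshold; the exact decision rule, the function name `assign_kos`, the KO identifiers, and all scores below are hypothetical and are not taken from KofamKOALA or KOfam.

```python
from typing import Dict, List, Tuple

# Hypothetical per-KO score thresholds (illustrative values, not from KOfam).
THRESHOLDS: Dict[str, float] = {"K00001": 250.0, "K00002": 180.5}


def assign_kos(
    hits: List[Tuple[str, str, float]],
    thresholds: Dict[str, float],
) -> Dict[str, List[str]]:
    """Keep (protein, KO) pairs whose score meets that KO's own threshold.

    Assumption: "score >= threshold" is the acceptance rule. Each KO has its
    own cutoff instead of one global cutoff shared by every profile.
    """
    assigned: Dict[str, List[str]] = {}
    for protein, ko, score in hits:
        cutoff = thresholds.get(ko)
        if cutoff is not None and score >= cutoff:
            assigned.setdefault(protein, []).append(ko)
    return assigned


# Made-up search hits: (protein id, KO id, profile HMM score).
hits = [
    ("geneA", "K00001", 300.2),  # above the K00001 threshold, kept
    ("geneA", "K00002", 100.0),  # below the K00002 threshold, dropped
    ("geneB", "K00002", 181.0),  # above the K00002 threshold, kept
]
result = assign_kos(hits, THRESHOLDS)
print(result)
```

With per-KO cutoffs, the same raw score can be accepted for one ortholog family and rejected for another, which is the point of making the threshold adaptive.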

457 citations

Journal Article•DOI•

Long short-term memory networks in memristor crossbar arrays

Can Li¹,², Zhongrui Wang², Mingyi Rao², Daniel Belkin², Wenhao Song², Hao Jiang², Peng Yan², Yunning Li², Peng Lin², Miao Hu¹, Ning Ge¹, John Paul Strachan¹, Mark Barnell³, Qing Wu³, R. Stanley Williams¹, Jianhua Yang², Qiangfei Xia²•Institutions (3)

Hewlett-Packard¹, University of Massachusetts Amherst², Air Force Research Laboratory³

01 Jan 2019-Nature Machine Intelligence

TL;DR: It is demonstrated experimentally that the synaptic weights shared in different time steps in an LSTM can be implemented with a memristor crossbar array, which has a small circuit footprint, can store a large number of parameters and offers in-memory computing capability that contributes to circumventing the ‘von Neumann bottleneck’.

Abstract: Recent breakthroughs in recurrent deep neural networks with long short-term memory (LSTM) units have led to major advances in artificial intelligence. However, state-of-the-art LSTM models with significantly increased complexity and a large number of parameters have a bottleneck in computing power resulting from both limited memory capacity and limited data communication bandwidth. Here we demonstrate experimentally that the synaptic weights shared in different time steps in an LSTM can be implemented with a memristor crossbar array, which has a small circuit footprint, can store a large number of parameters and offers in-memory computing capability that contributes to circumventing the ‘von Neumann bottleneck’. We illustrate the capability of our crossbar system as a core component in solving real-world problems in regression and classification, which shows that memristor LSTM is a promising low-power and low-latency hardware platform for edge inference. Deep neural networks are increasingly popular in data-intensive applications, but are power-hungry. New types of computer chips that are suited to the task of deep learning, such as memristor arrays where data handling and computing take place within the same unit, are required. A well-used deep learning model called long short-term memory, which can handle temporal sequential data analysis, is now implemented in a memristor crossbar array, promising an energy-efficient and low-footprint deep learning platform.

251 citations

Proceedings Article•DOI•

PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference

Aayush Ankit¹, Izzat El Hajj², Sai Rahul Chalamalasetti³, Geoffrey Ndu³, Martin Foltin³, R. Stanley Williams³, Paolo Faraboschi³, Wen-mei W. Hwu², John Paul Strachan³, Kaushik Roy¹, Dejan Milojicic³•Institutions (3)

Purdue University¹, University of Illinois at Urbana–Champaign², Hewlett-Packard³

04 Apr 2019

TL;DR: The Programmable Ultra-efficient Memristor-based Accelerator (PUMA) enhances memristor crossbars with general purpose execution units to enable the acceleration of a wide variety of Machine Learning (ML) inference workloads.

Abstract: Memristor crossbars are circuits capable of performing analog matrix-vector multiplications, overcoming the fundamental energy efficiency limitations of digital logic. They have been shown to be effective in special-purpose accelerators for a limited set of neural network applications. We present the Programmable Ultra-efficient Memristor-based Accelerator (PUMA) which enhances memristor crossbars with general purpose execution units to enable the acceleration of a wide variety of Machine Learning (ML) inference workloads. PUMA's microarchitecture techniques exposed through a specialized Instruction Set Architecture (ISA) retain the efficiency of in-memory computing and analog circuitry, without compromising programmability. We also present the PUMA compiler which translates high-level code to PUMA ISA. The compiler partitions the computational graph and optimizes instruction scheduling and register allocation to generate code for large and complex workloads to run on thousands of spatial cores. We have developed a detailed architecture simulator that incorporates the functionality, timing, and power models of PUMA's components to evaluate performance and energy consumption. A PUMA accelerator running at 1 GHz can reach area and power efficiency of 577 GOPS/s/mm² and 837 GOPS/s/W, respectively. Our evaluation of diverse ML applications from image recognition, machine translation, and language modelling (5M-800M synapses) shows that PUMA achieves up to 2,446× energy and 66× latency improvement for inference compared to state-of-the-art GPUs. Compared to an application-specific memristor-based accelerator, PUMA incurs small energy overheads at similar inference latency and added programmability.
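
The analog matrix-vector multiplication mentioned in this abstract can be sketched numerically. The following is an idealized model only, assuming each cell obeys Ohm's law and each column sums its currents by Kirchhoff's current law; it ignores wire resistance, device non-linearity, and noise, and is not PUMA's simulator. The function name and the conductance and voltage values are hypothetical.

```python
from typing import List


def crossbar_mvm(conductances: List[List[float]], voltages: List[float]) -> List[float]:
    """Ideal crossbar read-out: column current I_j = sum_i V_i * G[i][j].

    conductances[i][j] is the cell conductance (siemens) at row i, column j;
    voltages[i] is the voltage (volts) applied to row i.
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(voltages) != n_rows:
        raise ValueError("need one input voltage per crossbar row")
    return [
        sum(voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]


# Hypothetical 2x2 array: conductances in siemens, inputs in volts.
G = [[1e-4, 2e-4],
     [3e-4, 4e-4]]
V = [0.1, 0.2]
currents = crossbar_mvm(G, V)  # one output current (amperes) per column
print(currents)
```

In this model the whole product is obtained in a single read step, because every multiply happens in a cell and every add happens on a column wire, which is why the weights never have to leave the array.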

228 citations

Journal Article•DOI•

Reinforcement learning with analogue memristor arrays

Zhongrui Wang¹, Can Li¹, Wenhao Song¹, Mingyi Rao¹, Daniel Belkin¹, Yunning Li¹, Peng Yan¹, Hao Jiang¹, Peng Lin¹, Miao Hu², John Paul Strachan³, Ning Ge³, Mark Barnell⁴, Qing Wu⁴, Andrew G. Barto¹, Qinru Qiu⁵, R. Stanley Williams⁶, Qiangfei Xia¹, Jianhua Yang¹•Institutions (6)

University of Massachusetts Amherst¹, Binghamton University², Hewlett-Packard³, Air Force Research Laboratory⁴, Syracuse University⁵, Texas A&M University⁶

01 Mar 2019

TL;DR: An experimental demonstration of reinforcement learning on a three-layer 1-transistor 1-memristor (1T1R) network using a modified learning algorithm tailored for the authors' hybrid analogue–digital platform, which has the potential to achieve a significant boost in speed and energy efficiency.

Abstract: Reinforcement learning algorithms that use deep neural networks are a promising approach for the development of machines that can acquire knowledge and solve problems without human input or supervision. At present, however, these algorithms are implemented in software running on relatively standard complementary metal–oxide–semiconductor digital platforms, where performance will be constrained by the limits of Moore’s law and von Neumann architecture. Here, we report an experimental demonstration of reinforcement learning on a three-layer 1-transistor 1-memristor (1T1R) network using a modified learning algorithm tailored for our hybrid analogue–digital platform. To illustrate the capabilities of our approach in robust in situ training without the need for a model, we performed two classic control problems: the cart–pole and mountain car simulations. We also show that, compared with conventional digital systems in real-world reinforcement learning tasks, our hybrid analogue–digital computing system has the potential to achieve a significant boost in speed and energy efficiency. A reinforcement learning algorithm can be implemented on a hybrid analogue–digital platform based on memristive arrays for parallel and energy-efficient in situ training.

225 citations

Journal Article•DOI•

Gene Expression Changes and Community Turnover Differentially Shape the Global Ocean Metatranscriptome

Guillem Salazar¹, Lucas Paoli¹, Adriana Alberti², Jaime Huerta-Cepas³, Hans-Joachim Ruscheweyh¹, Miguelangel Cuenca¹, Christopher M. Field¹, Luis Pedro Coelho⁴, Corinne Cruaud², Stefan Engelen², Ann C. Gregory⁵, Karine Labadie², Claudie Marec⁶,⁷, Eric Pelletier², Marta Royo-Llonch⁸, Simon Roux⁵, Pablo Sánchez⁸, Hideya Uehara⁹, Ahmed A. Zayed⁵, Georg Zeller, Margaux Carmichael¹⁰, Céline Dimier¹⁰,¹¹, Joannie Ferland⁶, Stefanie Kandels, Marc Picheral¹⁰, Sergey Pisarev¹², Julie Poulain², Silvia G. Acinas, Marcel Babin, Peer Bork, Emmanuel Boss¹³, Chris Bowler¹¹, Guy Cochrane¹⁴, Colomban de Vargas, Michael J. Follows¹⁵, Gabriel Gorsky, Nigel Grimsley, Lionel Guidi, Pascal Hingamp, Daniele Iudicone, Olivier Jaillon, Stefanie Kandels-Lewis, Lee Karp-Boss¹³, Eric Karsenti¹¹, Fabrice Not, Hiroyuki Ogata, Stephane Pesant, Nicole J. Poulton¹⁶, Jeroen Raes, Christian Sardet, Sabrina Speich, Lars Stemmann, Matthew B. Sullivan⁵, Shinichi Sunagawa, Patrick Wincker•Institutions (16)

Swiss Institute of Bioinformatics¹, French Alternative Energies and Atomic Energy Commission², Technical University of Madrid³, Fudan University⁴, Ohio State University⁵, Laval University⁶, IFREMER⁷, Spanish National Research Council⁸, Hewlett-Packard⁹, University of Paris¹⁰, École Normale Supérieure¹¹, Shirshov Institute of Oceanology¹², University of Maine¹³, European Bioinformatics Institute¹⁴, Massachusetts Institute of Technology¹⁵, Bigelow Laboratory For Ocean Sciences¹⁶

14 Nov 2019-Cell

TL;DR: The relative contribution of gene expression changes is found to be significantly lower in polar than in non-polar waters, and it is hypothesized that in polar regions, alterations in community activity in response to ocean warming will be driven more strongly by changes in organismal composition than by gene regulatory mechanisms.

217 citations

Journal Article•DOI•

A Survey of DevOps Concepts and Challenges

Leonardo Leite, Carla Rocha¹, Fabio Kon, Dejan Milojicic², Paulo Meirelles•Institutions (2)

University of Brasília¹, Hewlett-Packard²

14 Nov 2019-ACM Computing Surveys

TL;DR: The present survey investigates and discusses DevOps challenges from the perspective of engineers, managers, and researchers, and develops a DevOps conceptual map, correlating the DevOps automation tools with these concepts.

Abstract: DevOps is a collaborative and multidisciplinary organizational effort to automate continuous delivery of new software updates while guaranteeing their correctness and reliability. The present survey investigates and discusses DevOps challenges from the perspective of engineers, managers, and researchers. We review the literature and develop a DevOps conceptual map, correlating the DevOps automation tools with these concepts. We then discuss their practical implications for engineers, managers, and researchers. Finally, we critically explore some of the most relevant DevOps challenges reported by the literature.

184 citations

Journal Article•DOI•

Ultra-fast NH4+ Storage: Strong H Bonding between NH4+ and Bi-layered V2O5

Shengyang Dong¹,², Woochul Shin², Heng Jiang², Xianyong Wu², Zhifei Li², John Holoubek², William F. Stickle³, Baris Key⁴, Cong Liu⁴, Jun Lu⁴, P. Alex Greaney⁵, Xiaogang Zhang¹, Xiulei Ji²•Institutions (5)

Nanjing University of Aeronautics and Astronautics¹, Oregon State University², Hewlett-Packard³, Argonne National Laboratory⁴, University of California, Riverside⁵

13 Jun 2019-Chem

TL;DR: In this paper, the authors show that the use of NH4+ results in battery performance governed by the chemical nature of the ion-electrode interaction, and that H bonding between NH4+ and a bi-layered V2O5 electrode is coupled with prominent pseudocapacitive behavior.

164 citations

Journal Article•DOI•

In situ training of feed-forward and recurrent convolutional memristor networks

Zhongrui Wang¹, Can Li¹,², Peng Lin¹, Mingyi Rao¹, Yongyang Nie¹, Wenhao Song¹, Qinru Qiu³, Yunning Li¹, Peng Yan¹, John Paul Strachan², Ning Ge², Nathan McDonald⁴, Qing Wu⁴, Miao Hu⁵, Huaqiang Wu⁶, R. Stanley Williams⁷, Qiangfei Xia¹, Jianhua Yang¹•Institutions (7)

University of Massachusetts Amherst¹, Hewlett-Packard², Syracuse University³, Air Force Research Laboratory⁴, Binghamton University⁵, Tsinghua University⁶, Texas A&M University⁷

01 Sep 2019-Nature Machine Intelligence

TL;DR: In situ training of a five-level convolutional neural network that self-adapts to non-idealities of the one-transistor one-memristor array to classify the MNIST dataset is experimentally demonstrated, achieving a 75% reduction in weights without compromising accuracy.

Abstract: The explosive growth of machine learning is largely due to the recent advancements in hardware and architecture. The engineering of network structures, taking advantage of the spatial or temporal translational isometry of patterns, naturally leads to bio-inspired, shared-weight structures such as convolutional neural networks, which have markedly reduced the number of free parameters. State-of-the-art microarchitectures commonly rely on weight-sharing techniques, but still suffer from the von Neumann bottleneck of transistor-based platforms. Here, we experimentally demonstrate the in situ training of a five-level convolutional neural network that self-adapts to non-idealities of the one-transistor one-memristor array to classify the MNIST dataset, achieving similar accuracy to the memristor-based multilayer perceptron with a reduction in trainable parameters of ~75% owing to the shared weights. In addition, the memristors encoded both spatial and temporal translational invariance simultaneously in a convolutional long short-term memory network—a memristor-based neural network with intrinsic 3D input processing—which was trained in situ to classify a synthetic MNIST sequence dataset using just 850 weights. These proof-of-principle demonstrations combine the architectural advantages of weight sharing and the area/energy efficiency boost of the memristors, paving the way to future edge artificial intelligence. Memristive devices can provide energy-efficient neural network implementations, but they must be tailored to suit different network architectures. Wang et al. develop a trainable weight-sharing mechanism for memristor-based CNNs and ConvLSTMs, achieving a 75% reduction in weights without compromising accuracy.

155 citations

Journal Article•DOI•

Low-voltage high-performance flexible digital and analog circuits based on ultrahigh-purity semiconducting carbon nanotubes.

Ting Lei¹, Leilai Shao², Yu-Qing Zheng¹, Gregory Pitner¹, Guanhua Fang¹, Chenxin Zhu¹, Sicheng Li³, Raymond G. Beausoleil³, H-S Philip Wong¹, Tsung-Ching Huang³, Kwang-Ting Cheng²,⁴, Zhenan Bao¹•Institutions (4)

Stanford University¹, University of California, Santa Barbara², Hewlett-Packard³, Hong Kong University of Science and Technology⁴

14 May 2019-Nature Communications

TL;DR: Low-voltage and high-performance digital and analog CNT TFT circuits based on high-yield and ultrahigh purity polymer-sorted semiconducting CNTs and the first tunable-gain amplifier with 1,000 gain at 20 kHz are reported.

Abstract: Carbon nanotube (CNT) thin-film transistor (TFT) is a promising candidate for flexible and wearable electronics. However, it usually suffers from low semiconducting tube purity, low device yield, and the mismatch between p- and n-type TFTs. Here, we report low-voltage and high-performance digital and analog CNT TFT circuits based on high-yield (19.9%) and ultrahigh purity (99.997%) polymer-sorted semiconducting CNTs. Using high-uniformity deposition and pseudo-CMOS design, we demonstrated CNT TFTs with good uniformity and high performance at low operation voltage of 3 V. We tested forty-four 2-µm channel 5-stage ring oscillators on the same flexible substrate (1,056 TFTs). All worked as expected with gate delays of 42.7 ± 13.1 ns. With these high-performance TFTs, we demonstrated 8-stage shift registers running at 50 kHz and the first tunable-gain amplifier with 1,000 gain at 20 kHz. These results show great potentials of using solution-processed CNT TFTs for large-scale flexible electronics. Carbon nanotube thin-film transistor is promising for solution-processed, large-scale flexible electronics, but the device yields remain poor to date. Lei et al. show low-voltage flexible digital and analog circuits based on high-purity and high-yield separation of semiconducting carbon nanotubes.
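
The ring-oscillator figures in this abstract allow a short back-of-envelope check. The sketch below assumes the textbook relation f = 1 / (2·N·τ) between oscillation frequency, stage count N, and per-stage gate delay τ; the abstract itself reports only the delay, so the resulting frequency is an estimate and not a reported measurement. The function name is hypothetical.

```python
def ring_oscillator_frequency_hz(stages: int, gate_delay_s: float) -> float:
    """Textbook estimate: a signal must traverse all stages twice per period."""
    return 1.0 / (2 * stages * gate_delay_s)


# Figures quoted in the abstract.
total_tfts = 1056
oscillators = 44
stages = 5
mean_gate_delay_s = 42.7e-9  # 42.7 ns mean gate delay

tfts_per_oscillator = total_tfts // oscillators          # 24 TFTs per oscillator
estimated_freq_hz = ring_oscillator_frequency_hz(stages, mean_gate_delay_s)

print(tfts_per_oscillator)
print(f"{estimated_freq_hz / 1e6:.2f} MHz")  # estimate under the stated assumption
```

Under that assumption the mean gate delay corresponds to an oscillation frequency of roughly 2.3 MHz, and the TFT count divides evenly across the forty-four oscillators.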

132 citations

Posted Content•

PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference

Aayush Ankit¹, Izzat El Hajj², Sai Rahul Chalamalasetti³, Geoffrey Ndu³, Martin Foltin³, R. Stanley Williams³, Paolo Faraboschi³, Wen-mei W. Hwu², John Paul Strachan³, Kaushik Roy⁴, Dejan Milojicic⁴•Institutions (4)

Purdue University¹, University of Illinois at Urbana–Champaign², Hewlett-Packard³, Association for Computing Machinery⁴

29 Jan 2019-arXiv: Emerging Technologies

TL;DR: The Programmable Ultra-efficient Memristor-based Accelerator (PUMA) is presented which enhances memristor crossbars with general purpose execution units to enable the acceleration of a wide variety of Machine Learning (ML) inference workloads.

Abstract: Memristor crossbars are circuits capable of performing analog matrix-vector multiplications, overcoming the fundamental energy efficiency limitations of digital logic. They have been shown to be effective in special-purpose accelerators for a limited set of neural network applications. We present the Programmable Ultra-efficient Memristor-based Accelerator (PUMA) which enhances memristor crossbars with general purpose execution units to enable the acceleration of a wide variety of Machine Learning (ML) inference workloads. PUMA's microarchitecture techniques exposed through a specialized Instruction Set Architecture (ISA) retain the efficiency of in-memory computing and analog circuitry, without compromising programmability. We also present the PUMA compiler which translates high-level code to PUMA ISA. The compiler partitions the computational graph and optimizes instruction scheduling and register allocation to generate code for large and complex workloads to run on thousands of spatial cores. We have developed a detailed architecture simulator that incorporates the functionality, timing, and power models of PUMA's components to evaluate performance and energy consumption. A PUMA accelerator running at 1 GHz can reach area and power efficiency of 577 GOPS/s/mm² and 837 GOPS/s/W, respectively. Our evaluation of diverse ML applications from image recognition, machine translation, and language modelling (5M-800M synapses) shows that PUMA achieves up to 2,446× energy and 66× latency improvement for inference compared to state-of-the-art GPUs. Compared to an application-specific memristor-based accelerator, PUMA incurs small energy overheads at similar inference latency and added programmability.

108 citations

Journal Article•DOI•

An Aqueous Dual‑Ion Battery Cathode of Mn3O4 via Reversible Insertion of Nitrate

Heng Jiang¹, Zhixuan Wei¹,², Lu Ma³, Yifei Yuan³, Jessica J. Hong¹, Xianyong Wu¹, Daniel P. Leonard¹, John Holoubek¹, Joshua J. Razink⁴, William F. Stickle⁵, Fei Du², Tianpin Wu³, Jun Lu³, Xiulei Ji¹•Institutions (5)

Oregon State University¹, Jilin University², Argonne National Laboratory³, University of Oregon⁴, Hewlett-Packard⁵

08 Apr 2019-Angewandte Chemie

TL;DR: Ex situ HRTEM and corresponding EDX mapping results suggest that NO3- insertion de-crystallizes the structure of Mn3O4, a finding that may open a new direction for novel low-cost aqueous dual-ion batteries.

Abstract: We report reversible electrochemical insertion of NO3- into manganese(II, III) oxide (Mn3O4) as a cathode for aqueous dual-ion batteries. Characterization by TGA, FTIR, EDX, XANES, EXAFS, and EQCM collectively provides unequivocal evidence that reversible oxidative NO3- insertion takes place inside Mn3O4. Ex situ HRTEM and corresponding EDX mapping results suggest that NO3- insertion de-crystallizes the structure of Mn3O4. Kinetic studies reveal fast migration of NO3- in the Mn3O4 structure. This finding may open a new direction for novel low-cost aqueous dual-ion batteries.

Journal Article•DOI•

A Survey of DevOps Concepts and Challenges.

Leonardo Leite¹, Carla Rocha², Fabio Kon¹, Dejan Milojicic³, Paulo Meirelles⁴•Institutions (4)

University of São Paulo¹, University of Brasília², Hewlett-Packard³, Federal University of São Paulo⁴

12 Sep 2019-arXiv: Software Engineering

TL;DR: In this article, the authors survey DevOps challenges from the perspective of engineers, managers, and researchers, and discuss the practical implications for each group.

Abstract: DevOps is a collaborative and multidisciplinary organizational effort to automate continuous delivery of new software updates while guaranteeing their correctness and reliability. The present survey investigates and discusses DevOps challenges from the perspective of engineers, managers, and researchers. We review the literature and develop a DevOps conceptual map, correlating the DevOps automation tools with these concepts. We then discuss their practical implications for engineers, managers, and researchers. Finally, we critically explore some of the most relevant DevOps challenges reported by the literature.

Journal Article•DOI•

All-2D Material Inkjet-Printed Capacitors: Toward Fully Printed Integrated Circuits

Robyn Worsley¹, Lorenzo Pimpolari², Daryl McManus¹, Ning Ge³, Robert Ionescu³, Jarrid Wittkopf³, Adriana Alieva¹, Giovanni Basso², Massimo Macucci², Giuseppe Iannaccone², Kostya S. Novoselov¹, Helen A. Holder³, Gianluca Fiori², Cinzia Casiraghi¹•Institutions (3)

University of Manchester¹, University of Pisa², Hewlett-Packard³

22 Jan 2019-ACS Nano

TL;DR: This work uses water-based and biocompatible graphene and hBN inks to fabricate all-2D material and inkjet-printed capacitors, and demonstrates an areal capacitance of 2.0 ± 0.3 nF cm⁻² for a dielectric thickness of ∼3 μm and negligible leakage currents, averaged across more than 100 devices.

Abstract: A well-defined insulating layer is of primary importance in the fabrication of passive (e.g., capacitors) and active (e.g., transistors) components in integrated circuits. One of the most widely known two-dimensional (2D) dielectric materials is hexagonal boron nitride (hBN). Solution-based techniques are cost-effective and allow simple methods to be used for device fabrication. In particular, inkjet printing is a low-cost, noncontact approach, which also allows for device design flexibility, produces no material wastage, and offers compatibility with almost any surface of interest, including flexible substrates. In this work, we use water-based and biocompatible graphene and hBN inks to fabricate all-2D material and inkjet-printed capacitors. We demonstrate an areal capacitance of 2.0 ± 0.3 nF cm⁻² for a dielectric thickness of ∼3 μm and negligible leakage currents, averaged across more than 100 devices. This gives rise to a derived dielectric constant of 6.1 ± 1.7. The inkjet printed hBN dielectric has ...

Journal Article•DOI•

III/V-on-Si MQW lasers by using a novel photonic integration method of regrowth on a bonding template.

Yingtao Hu¹, Di Liang¹, Kunal Mukherjee², Youli Li², Chong Zhang¹, Geza Kurczveil¹, Xue Huang¹, Raymond G. Beausoleil¹•Institutions (2)

Hewlett-Packard¹, University of California, Santa Barbara²

09 Oct 2019-Light-Science & Applications

TL;DR: It is demonstrated that InP-based quantum well lasers can be grown onto silicon waveguides by using a growth template, and this generic concept can be applied to other material systems to provide higher integration density, more functionalities and lower total cost for photonics as well as microelectronics, MEMS, and many other applications.

Abstract: Silicon photonics is becoming a mainstream data-transmission solution for next-generation data centers, high-performance computers, and many emerging applications. The inefficiency of light emission in silicon still requires the integration of a III/V laser chip or optical gain materials onto a silicon substrate. A number of integration approaches, including flip-chip bonding, molecule or polymer wafer bonding, and monolithic III/V epitaxy, have been extensively explored in the past decade. Here, we demonstrate a novel photonic integration method of epitaxial regrowth of III/V on a III/V-on-SOI bonding template to realize heterogeneous lasers on silicon. This method decouples the correlated root causes, i.e., lattice, thermal, and domain mismatches, which are all responsible for a large number of detrimental dislocations in the heteroepitaxy process. The grown multi-quantum well vertical p-i-n diode laser structure shows a significantly low dislocation density of 9.5 × 10⁴ cm⁻², two orders of magnitude lower than the state-of-the-art conventional monolithic growth on Si. This low dislocation density would eliminate defect-induced laser lifetime concerns for practical applications. The fabricated lasers show room-temperature pulsed and continuous-wave lasing at 1.31 μm, with a minimal threshold current density of 813 A/cm². This generic concept can be applied to other material systems to provide higher integration density, more functionalities and lower total cost for photonics as well as microelectronics, MEMS, and many other applications.

Journal Article•DOI•

Gas Sensor by Direct Growth and Functionalization of Metal Oxide/Metal Sulfide Core-Shell Nanowires on Flexible Substrates.

Daejong Yang¹, Incheol Cho, Donghwan Kim², Mi Ae Lim, Zhiyong Li³, Jong G. Ok⁴, Moonjin Lee, Inkyu Park•Institutions (4)

Kongju National University¹, Electric Power Research Institute², Hewlett-Packard³, Seoul National University of Science and Technology⁴

12 Jun 2019-ACS Applied Materials & Interfaces

TL;DR: A novel fabrication method for flexible gas sensors for toxic gases based on sequential wet chemical reaction using zinc oxide nanowires and palladium nanoparticles is developed, which shows a high sensitivity, fast response, and outstanding selectivity to other toxic gases.

Abstract: We have developed a novel fabrication method for flexible gas sensors for toxic gases based on sequential wet chemical reaction. Specifically, zinc oxide (ZnO) nanowires were locally synthesized and...

Journal Article•DOI•

Fast Spiking of a Mott VO2-Carbon Nanotube Composite Device.

Stephanie M. Bohaichuk¹, Suhas Kumar², Greg Pitner¹, Connor J. McClellan¹, Jaewoo Jeong³, Mahesh G. Samant³, H.-S. Philip Wong¹, Stuart S. P. Parkin³, R. Stanley Williams⁴, Eric Pop¹•Institutions (4)

Stanford University¹, Hewlett-Packard², IBM³, Texas A&M University⁴

21 Aug 2019-Nano Letters

TL;DR: Direct-current or voltage-driven periodic spiking with sub-20 ns pulse widths is demonstrated from a single device composed of a thin VO2 film with a metallic carbon nanotube as a nanoscale heater, without using an external capacitor.

Abstract: The recent surge of interest in brain-inspired computing and power-efficient electronics has dramatically bolstered development of computation and communication using neuron-like spiking signals. Devices that can produce rapid and energy-efficient spiking could significantly advance these applications. Here we demonstrate direct current or voltage-driven periodic spiking with sub-20 ns pulse widths from a single device composed of a thin VO2 film with a metallic carbon nanotube as a nanoscale heater, without using an external capacitor. Compared with VO2-only devices, adding the nanotube heater dramatically decreases the transient duration and pulse energy, and increases the spiking frequency, by up to 3 orders of magnitude. This is caused by heating and cooling of the VO2 across its insulator-metal transition being localized to a nanoscale conduction channel in an otherwise bulk medium. This result provides an important component of energy-efficient neuromorphic computing systems and a lithography-free technique for energy-scaling of electronic devices that operate via bulk mechanisms.

Journal Article•DOI•

Low-Conductance and Multilevel CMOS-Integrated Nanoscale Oxide Memristors

Xia Sheng¹, Catherine Graves¹, Suhas Kumar¹, Xuema Li¹, Brent Buchanan¹, Le Zheng¹, Si-Ty Lam¹, Can Li¹, John Paul Strachan¹•Institutions (1)

Hewlett-Packard¹

01 Sep 2019-Advanced electronic materials

Proceedings Article•DOI•

Designing Far Memory Data Structures: Think Outside the Box

Marcos K. Aguilera¹, Kimberly Keeton², Stanko Novakovic³, Sharad Singhal²•Institutions (3)

VMware¹, Hewlett-Packard², Microsoft³

13 May 2019

TL;DR: This paper argues that new data structures for far memory need to be built, borrowing techniques from concurrent data structures and distributed systems, and shows how to realize them using simple hardware extensions.

Abstract: Technologies like RDMA and Gen-Z, which give access to memory outside the box, are gaining in popularity. These technologies provide the abstraction of far memory, where memory is attached to the network and can be accessed by remote processors without mediation by a local processor. Unfortunately, far memory is hard to use because existing data structures are mismatched to it. We argue that we need new data structures for far memory, borrowing techniques from concurrent data structures and distributed systems. We examine the requirements of these data structures and show how to realize them using simple hardware extensions.

Proceedings Article•DOI•

Automatic Generation of Medical Imaging Diagnostic Report with Hierarchical Recurrent Neural Network

Changchang Yin¹, Buyue Qian¹, Jishang Wei², Xiaoyu Li¹, Xianli Zhang¹, Yang Li¹, Qinghua Zheng¹•Institutions (2)

Xi'an Jiaotong University¹, Hewlett-Packard²

01 Nov 2019

TL;DR: A new framework to accurately detect the abnormalities and automatically generate medical reports is presented, based on hierarchical recurrent neural network (HRNN), and a topic matching mechanism is introduced to HRNN, so as to make generated reports more accurate and diverse.

Abstract: Medical images are widely used in the medical domain for the diagnosis and treatment of diseases. Reading a medical image and summarizing its insights is a routine, yet nonetheless time-consuming task, which often represents a bottleneck in the clinical diagnosis process. Automatic report generation can relieve the issues. However, generating medical reports presents two major challenges: (i) it is hard to accurately detect all the abnormalities simultaneously, especially the rare diseases; (ii) a medical image report consists of many paragraphs and sentences, which are longer than natural image captions. We present a new framework to accurately detect the abnormalities and automatically generate medical reports. The report generation model is based on hierarchical recurrent neural network (HRNN). We introduce a topic matching mechanism to HRNN, so as to make generated reports more accurate and diverse. The soft attention mechanism is also introduced to the HRNN model. Experimental results on two image-paragraph pair datasets show that our framework outperforms all the state-of-the-art methods.

Journal Article•DOI•

Redox-based memristive devices for new computing paradigm

Regina Dittmann¹, John Paul Strachan²•Institutions (2)

Forschungszentrum Jülich¹, Hewlett-Packard²

22 Nov 2019-APL Materials

TL;DR: The status in the understanding of the most common redox-based memristive devices is presented and a rational design of the materials stacks will be required, enabling nanoscale control over the ionic dynamics that gives these devices their variety of capabilities.

Abstract: Memristive devices have been a hot topic in nanoelectronics for the last two decades in both academia and industry. Originally proposed as digital (binary) nonvolatile random access memories, research in this field was predominantly driven by the search for higher performance solid-state drive technologies (e.g., flash replacement) or higher density memories (storage class memory). However, based on their large dynamic range in resistance with analog-tunability along with complex switching dynamics, memristive devices enable revolutionary novel functions and computing paradigms. We present the prospects, opportunities, and materials challenges of memristive devices in computing applications, both near and far terms. Memristive devices offer at least three main types of novel computing applications: in-memory computing, analog computing, and state dynamics. We will present the status in the understanding of the most common redox-based memristive devices while addressing the challenges that materials research will need to tackle in the future. In order to pave the way toward novel computing paradigms, a rational design of the materials stacks will be required, enabling nanoscale control over the ionic dynamics that gives these devices their variety of capabilities.

Proceedings Article•DOI•

Source Compression with Bounded DNN Perception Loss for IoT Edge Computer Vision

Xiufeng Xie¹, Kyu-Han Kim¹•Institutions (1)

Hewlett-Packard¹

11 Oct 2019

TL;DR: GRACE is presented, a DNN-aware compression algorithm that facilitates the edge inference by significantly saving the network bandwidth consumption without disturbing the inference performance and achieves the superior compression performance over existing strategies for key DNN applications.

Abstract: IoT and deep learning based computer vision together create an immense market opportunity, but running deep neural networks (DNNs) on resource-constrained IoT devices remains challenging. Offloading DNN inference to an edge server is a promising solution, but limited wireless bandwidth bottlenecks its end-to-end performance and scalability. While IoT devices can adopt source compression to cope with the limited bandwidth, existing compression algorithms (or codecs) are designed for human eyes rather than for DNNs, and thus suffer from either low compression rates or high DNN inference errors. This paper presents GRACE, a DNN-aware compression algorithm that facilitates edge inference by significantly reducing network bandwidth consumption without disturbing the inference performance. Given a target DNN, GRACE (i) analyzes the DNN's perception model w.r.t. both spatial frequencies and colors and (ii) generates an optimized compression strategy for the model, a one-time offline process. Next, GRACE deploys the generated compression strategy at IoT devices (the source) to perform online source compression within the existing codec framework, adding no extra overhead. We prototype GRACE on JPEG (the most popular image codec framework), and our evaluation results show that GRACE achieves superior compression performance over existing strategies for key DNN applications. For semantic segmentation tasks, GRACE reduces the source size by 23% compared to JPEG with similar inference accuracy (0.38% lower than GRACE). Further, GRACE even achieves 7.5% higher inference accuracy than JPEG at the commonly used quality level of 75. For classification tasks, GRACE reduces the bandwidth consumption by 90% over JPEG with the same inference accuracy.

Journal Article•DOI•

Silicon–germanium avalanche photodiodes with direct control of electric field in charge multiplication region

Xiaoge Zeng¹, Zhihong Huang¹, Binhao Wang¹, Di Liang¹, Marco Fiorentino¹, Raymond G. Beausoleil¹•Institutions (1)

Hewlett-Packard¹

20 Jun 2019

TL;DR: A waveguide-coupled silicon-germanium avalanche photodiode (APD) detector with three electric terminals was demonstrated with breakdown voltage of −6 V, bandwidth of 18.9 GHz, DC photocurrent gain of 15, open-eye diagram at a data rate of 35 Gb/s, and sensitivity of −11.4 dBm.

Abstract: A CMOS-compatible avalanche photodiode (APD) with high speed and high sensitivity is a critical component of a low-cost, high-data-rate, and energy-efficient optical communication link. A novel waveguide-coupled silicon–germanium APD detector with three electric terminals was demonstrated with breakdown voltage of −6 V, bandwidth of 18.9 GHz, DC photocurrent gain of 15, open-eye diagram at a data rate of 35 Gb/s, and sensitivity of −11.4 dBm at a data rate of 25 Gb/s. This three-terminal APD allows high-yield fabrication in the standard CMOS process and provides robust high-sensitivity operation under small voltage supply.

Journal Article•DOI•

Analog content addressable memories with memristors

Can Li¹, Catherine Graves¹, Xia Sheng¹, Darrin Miller¹, Martin Foltin¹, Giacomo Pedretti², Giacomo Pedretti¹, John Paul Strachan¹•Institutions (2)

Hewlett-Packard¹, Polytechnic University of Milan²

18 Jul 2019-arXiv: Emerging Technologies

TL;DR: A new analog content-addressable-memory concept and circuit is proposed to overcome the limitations of conventional content-addressable memories by utilizing the analog conductance tunability of memristors.

Abstract: A content-addressable-memory compares an input search word against all rows of stored words in an array in a highly parallel manner. While supplying a very powerful functionality for many applications in pattern matching and search, it suffers from large area, cost and power consumption, limiting its use. Past improvements have been realized by using memristors to replace the static-random-access-memory cell in conventional designs, but employ similar schemes based only on binary or ternary states for storage and search. We propose a new analog content-addressable-memory concept and circuit to overcome these limitations by utilizing the analog conductance tunability of memristors. Our analog content-addressable-memory stores data within the programmable conductance and can take as input either analog or digital search values. Experimental demonstrations, scaled simulations and analysis show that our analog content-addressable-memory can reduce area and power consumption, which enables the acceleration of existing applications, but also new computing application areas.
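The search operation described in this abstract can be illustrated in software. The sketch below is not the authors' circuit: `tcam_search` models a conventional ternary search in which `X` is a wildcard, and `analog_cam_search` assumes, purely for illustration, that each analog cell accepts an interval of values. The abstract does not specify the matching rule, and both function names are hypothetical.

```python
from typing import List, Tuple

Range = Tuple[float, float]  # hypothetical (low, high) acceptance interval of one analog cell


def tcam_search(rows: List[str], word: str) -> List[int]:
    """Return indices of stored rows that match `word`; 'X' in a row matches any symbol."""
    return [
        i for i, row in enumerate(rows)
        if len(row) == len(word)
        and all(r in ("X", w) for r, w in zip(row, word))
    ]


def analog_cam_search(rows: List[List[Range]], query: List[float]) -> List[int]:
    """Return indices of rows whose every cell interval contains the matching query value.

    Assumption for illustration: an analog cell matches when low <= value <= high.
    """
    return [
        i for i, row in enumerate(rows)
        if len(row) == len(query)
        and all(low <= q <= high for (low, high), q in zip(row, query))
    ]


if __name__ == "__main__":
    print(tcam_search(["10X1", "0000", "1XX1"], "1011"))
    stored = [[(0.0, 0.5), (0.2, 0.8)], [(0.6, 1.0), (0.0, 1.0)]]
    print(analog_cam_search(stored, [0.3, 0.5]))
```

In hardware every row is compared at the same time; the list comprehensions here only reproduce the result, not the parallelism.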

Journal Article•DOI•

Multi-criteria active deep learning for image classification

Jin Yuan¹, Xingxing Hou¹, Yaoqiang Xiao¹, Da Cao¹, Weili Guan², Liqiang Nie³•Institutions (3)

Hunan University¹, Hewlett-Packard², Shandong University³

15 May 2019-Knowledge Based Systems

TL;DR: This work devised a novel solution, “multi-criteria active deep learning” (MCADL), to learn an active learning strategy for deep neural networks in image classification, and demonstrates that the proposed method consistently outperforms highly competitive active learning approaches.

Abstract: As a robust and heuristic technique in machine learning, active learning has been established as an effective method for addressing large volumes of unlabeled data; it interactively queries users (or certain information sources) to obtain desired outputs at new data points. With regard to deep learning techniques (e.g., CNN) and their applications (e.g., image classification), labeling work is of great significance, as the training processes for obtaining parameters in neural networks require abundant labeled samples. Although a few active learning algorithms have been proposed for devising certain straightforward sampling strategies (e.g., density, similarity, uncertainty, and label-based measure) for deep learning algorithms, these employ onefold sampling strategies and do not consider the relationship among multiple sampling strategies. To this end, we devised a novel solution, “multi-criteria active deep learning” (MCADL), to learn an active learning strategy for deep neural networks in image classification. Our sample selection strategy selects informative samples by considering multiple criteria simultaneously (i.e., density, similarity, uncertainty, and label-based measure). Moreover, our approach is capable of adjusting weights adaptively to fuse the results from multiple criteria effectively by exploring the utilities of the criteria at different training stages. Through extensive experiments on two popular image datasets (i.e., MNIST and CIFAR-10), we demonstrate that our proposed method consistently outperforms highly competitive active learning approaches; thereby, it can be verified that our multi-criteria active learning proposal is rational and our solution is effective.
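The idea of fusing several selection criteria can be sketched as a weighted sum of normalized per-criterion scores. This is only an illustration under stated assumptions, not the MCADL algorithm: the weights are fixed inputs here, whereas the abstract says they are adapted during training, and the names `min_max` and `select_samples` are hypothetical.

```python
from typing import Dict, List


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def select_samples(
    criteria: Dict[str, List[float]],
    weights: Dict[str, float],
    k: int,
) -> List[int]:
    """Return the indices of the k unlabeled samples with the highest fused score.

    Assumption for illustration: fused score = sum of weight * normalized score.
    """
    normalized = {name: min_max(values) for name, values in criteria.items()}
    n = len(next(iter(normalized.values())))
    fused = [
        sum(weights[name] * normalized[name][i] for name in normalized)
        for i in range(n)
    ]
    return sorted(range(n), key=lambda i: fused[i], reverse=True)[:k]


if __name__ == "__main__":
    scores = {"uncertainty": [0.9, 0.1, 0.5], "density": [0.2, 0.8, 0.6]}
    print(select_samples(scores, {"uncertainty": 0.7, "density": 0.3}, k=2))
```

Normalizing each criterion before fusion keeps one criterion with a large numeric range from dominating the others.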

Journal Article•DOI•

Understanding and Benchmarking the Impact of GDPR on Database Systems

Supreeth Shastri¹, Vinay Banakar², Melissa F. Wasserman¹, Arun Kumar³, Vijay Chidambaram¹•Institutions (3)

University of Texas at Austin¹, Hewlett-Packard², University of California, San Diego³

02 Oct 2019-arXiv: Databases

TL;DR: The analysis of GDPR from a systems perspective reveals the phenomenon of metadata explosion, wherein large quantities of metadata need to be stored along with the personal data to satisfy the GDPR requirements.

Abstract: The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata need to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR compliant. Our experiments demonstrate that the resulting GDPR-compliant systems achieve poor performance on GDPR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at this http URL

Proceedings Article•DOI•

Learning to Coordinate Video Codec with Transport Protocol for Mobile Video Telephony

Anfu Zhou¹, Huanhuan Zhang¹, Guangyuan Su¹, Leilei Wu¹, Ruoxuan Ma¹, Zhen Meng¹, Xinyu Zhang², Xiufeng Xie³, Huadong Ma¹, Xiaojiang Chen⁴•Institutions (4)

Beijing University of Posts and Telecommunications¹, University of California, San Diego², Hewlett-Packard³, Alibaba Group⁴

11 Oct 2019

TL;DR: A large-scale measurement campaign on an operational mobile video telephony service is conducted, showing that the application-layer video codec and transport-layer protocols remain highly uncoordinated, which represents one major reason for the low QoE.

Abstract: Despite the pervasive use of real-time video telephony services, users' quality of experience (QoE) remains unsatisfactory, especially over the mobile Internet. Previous work studied the problem via controlled experiments, while a systematic and in-depth investigation in the wild is still missing. To bridge the gap, we conduct a large-scale measurement campaign on an operational mobile video telephony service. Our measurement logs fine-grained performance metrics over 1 million video call sessions. Our analysis shows that the application-layer video codec and transport-layer protocols remain highly uncoordinated, which represents one major reason for the low QoE. We thus propose a machine learning based framework to resolve the issue. Instead of blindly following the transport layer's estimation of network capacity, the framework reviews historical logs of both layers and extracts high-level features of codec/network dynamics, based on which it determines the highest bitrates for forthcoming video frames without incurring congestion. To attain this ability, we train the framework on the aforementioned massive data traces using a custom-designed imitation learning algorithm, which enables it to learn from past experience. We have implemented the framework and incorporated it into the measured service. Our experiments show that it outperforms state-of-the-art solutions, improving video quality while reducing stalling time severalfold under various practical scenarios.

Journal Article•DOI•

Reversible intercalation of methyl viologen as a dicationic charge carrier in aqueous batteries.

Zhixuan Wei¹, Zhixuan Wei², Woochul Shin², Heng Jiang², Xianyong Wu², William F. Stickle³, Gang Chen¹, Jun Lu⁴, P. Alex Greaney⁵, Fei Du¹, Xiulei Ji²•Institutions (5)

Jilin University¹, Oregon State University², Hewlett-Packard³, Argonne National Laboratory⁴, University of California, Riverside⁵

19 Jul 2019-Nature Communications

TL;DR: The reversible insertion of a large molecular dication, methyl viologen, into the crystal structure of an aromatic solid electrode, 3,4,9,10-perylenetetracarboxylic dianhydride, is reported, the largest insertion charge carrier when non-solvated ever reported for batteries.

Abstract: The interactions between charge carriers and electrode structures represent one of the most important considerations in the search for new energy storage devices. Currently, ionic bonding dominates the battery chemistry. Here we report the reversible insertion of a large molecular dication, methyl viologen, into the crystal structure of an aromatic solid electrode, 3,4,9,10-perylenetetracarboxylic dianhydride. This is the largest insertion charge carrier when non-solvated ever reported for batteries; surprisingly, the kinetic properties of the (de)insertion of methyl viologen are excellent with 60% of capacity retained when the current rate is increased from 100 mA g⁻¹ to 2000 mA g⁻¹. Characterization reveals that the insertion of methyl viologen causes phase transformation of the organic host, and embodies guest-host chemical bonding. First-principles density functional theory calculations suggest strong guest-host interaction beyond the pure ionic bonding, where a large extent of covalency may exist. This study extends the boundary of battery chemistry to large molecular ions as charge carriers and also highlights the electrochemical assembly of a supramolecular system.

Journal Article•DOI•

Memristor TCAMs Accelerate Regular Expression Matching for Network Intrusion Detection

Catherine Graves¹, Can Li¹, Xia Sheng¹, Wen Ma¹, Sai Rahul Chalamalasetti¹, Darrin Miller¹, James S. Ignowski¹, Brent Buchanan¹, Le Zheng¹, Si-Ty Lam¹, Xuema Li¹, Lennie Kiyama¹, Martin Foltin¹, Matthew P. Hardy, John Paul Strachan¹•Institutions (1)

Hewlett-Packard¹

26 Aug 2019-IEEE Transactions on Nanotechnology

TL;DR: This work proposes memristor-based TCAM (ternary content-addressable memory) circuits to accelerate regular expression (RegEx) matching through in-memory processing of finite automata, demonstrating a promising path to wire-speed RegEx matching on large-scale rulesets.

Abstract: We propose memristor-based TCAM (ternary content-addressable memory) circuits to accelerate regular expression (RegEx) matching through in-memory processing of finite automata. RegEx matching is a key function in network security to find malicious actors. However, RegEx matching latency and power can be incredibly high, and current proposals are challenged to perform wire-speed matching for large rulesets. Our approach dramatically decreases operating power and enables high throughput, and the use of nanoscale memristor TCAM circuits (mTCAMs) enables compression techniques to expand rulesets. We fabricated and demonstrated nanoscale memristor TCAM cells. SPICE simulations investigate performance at scale, and an mTCAM dynamic power model using 16 nm layout parameters demonstrates ~0.2 fJ/bit/search energy for a 36 × 250 mTCAM array. A tiled architecture is proposed to implement a Snort ruleset and assess application performance. Compared to a state-of-the-art FPGA approach (2 Gbps, ~1 W), we show ×4 throughput (8 Gbps) at 55% of the power (0.55 W) without standard TCAM power-saving techniques. Our performance comparison improves further when striding (searching multiple characters at once) is considered, resulting in 47.2 Gbps at 1.2 W for our approach compared to 3.9 Gbps at 630 mW for strided FPGA NFA, demonstrating a promising path to wire-speed RegEx matching on large-scale rulesets.
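The throughput and power figures quoted in this abstract can be put on a common footing as throughput per watt. The short calculation below uses only the numbers stated above; the metric itself and the name `gbps_per_watt` are added here for illustration and do not come from the paper.

```python
def gbps_per_watt(throughput_gbps: float, power_w: float) -> float:
    """Throughput efficiency in Gbps per watt."""
    return throughput_gbps / power_w


# (throughput in Gbps, power in W), taken from the abstract above.
figures = {
    "FPGA (unstrided)": (2.0, 1.0),
    "mTCAM (unstrided)": (8.0, 0.55),
    "FPGA NFA (strided)": (3.9, 0.63),
    "mTCAM (strided)": (47.2, 1.2),
}

efficiency = {name: gbps_per_watt(t, p) for name, (t, p) in figures.items()}

if __name__ == "__main__":
    for name, value in efficiency.items():
        print(f"{name}: {value:.1f} Gbps/W")
```

On these figures the mTCAM design delivers about 7.3 times the throughput per watt of the FPGA baseline without striding, and about 6.4 times with striding.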

Journal Article•DOI•

Indium arsenide quantum dot waveguide photodiodes heterogeneously integrated on silicon

Bassem Tossoun¹, Geza Kurczveil¹, Chong Zhang¹, Antoine Descos¹, Zhihong Huang¹, Andreas Beling², Joe C. Campbell², Di Liang¹, Raymond G. Beausoleil¹•Institutions (2)

Hewlett-Packard¹, University of Virginia²

20 Oct 2019

TL;DR: The first O-band InAs quantum dot (QD) waveguide photodiode (PD) heterogeneously integrated on silicon is reported, with a demonstrated device sensitivity of −11 dBm at 10 Gb/s and open-eye diagrams up to 12.5 Gb/s.

Abstract: Silicon photonics provides a promising platform for energy-efficient interconnects within supercomputers and data centers. However, developing a complementary metal–oxide–semiconductor compatible high-speed photodetector with low dark current has long presented a challenge in the field. In this paper, we report the first O-band InAs quantum dot (QD) waveguide photodiode (PD) heterogeneously integrated on silicon. Record low dark currents as low as 0.01 nA, responsivities of 0.34 A/W at 1310 nm and 0.9 A/W at 1280 nm, and a record high 3 dB bandwidth of 15 GHz were measured. Avalanche gain was observed, and a maximum gain of up to 45 and a gain bandwidth product (GBP) of 240 GHz were achieved, which are also record high results for any QD avalanche photodetector (APD) on silicon. Additionally, we demonstrate a device sensitivity of −11 dBm at 10 Gb/s and open-eye diagrams up to 12.5 Gb/s. These QD-based PDs are able to operate as p-i-n PDs or APDs under different bias conditions and offer a promising alternative to heterogeneous InGaAs-on-silicon and SiGe counterparts in low-power optical communication links. They also leverage the same epitaxial layers and processing steps as heterogeneously integrated QD lasers, significantly simplifying the processing and reducing the cost of a fully integrated QD transceiver on silicon.

Proceedings Article•DOI•

An Active-Passive Measurement Study of TCP Performance over LTE on High-speed Rails

Jing Wang¹, Yufan Zheng¹, Ni Yunzhe¹, Chenren Xu¹, Feng Qian², Wangyang Li¹, Wantong Jiang¹, Yihua Cheng¹, Zhuo Cheng¹, Yuanjie Li³, Xiufeng Xie³, Yi Sun⁴, Zhongfeng Wang•Institutions (4)

Peking University¹, University of Minnesota², Hewlett-Packard³, Chinese Academy of Sciences⁴

05 Aug 2019

TL;DR: A large-scale active-passive measurement study of TCP performance over LTE on high-speed rail (HSR) is conducted, quantitatively studying the impact of frequent cellular handover on HSR networking performance and examining TCP CUBIC and TCP BBR in depth.

Abstract: High-speed rail (HSR) systems potentially provide a more efficient way of door-to-door transportation than airplanes. However, they also pose unprecedented challenges in delivering seamless Internet service for on-board passengers. In this paper, we conduct a large-scale active-passive measurement study of TCP performance over LTE on HSR. Our measurement targets the HSR routes in China operating at above 300 km/h. We performed extensive data collection through both controlled settings and passive monitoring, obtaining 1732.9 GB of data collected over 135719 km of trips. Leveraging such a unique dataset, we measure important performance metrics such as TCP goodput, latency, and loss rate, as well as key characteristics of TCP flows, application breakdown, and users' behaviors. We further quantitatively study the impact of frequent cellular handover on HSR networking performance, and conduct an in-depth examination of the performance of two widely deployed transport-layer protocols: TCP CUBIC and TCP BBR. Our findings reveal the performance of today's commercial HSR networks "in the wild" and identify several performance inefficiencies, which motivate us to design a simple yet effective congestion control algorithm based on BBR to further boost the throughput by up to 36.5%. Together, they highlight the need to develop dedicated protocol mechanisms that are friendly to extreme mobility.
