Showing papers by "Hewlett-Packard" published in 2020

Open Access

Journal Article•DOI•

KofamKOALA: KEGG Ortholog assignment based on profile HMM and adaptive score threshold.

Takuya Aramaki¹, Romain Blanc-Mathieu¹, Hisashi Endo¹, Koichi Ohkubo¹,², Minoru Kanehisa¹, Susumu Goto, Hiroyuki Ogata¹

Kyoto University¹, Hewlett-Packard²

01 Apr 2020-Bioinformatics

TL;DR: KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds.

Abstract: Summary: KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds. KofamKOALA is faster than existing KO assignment tools, with accuracy comparable to the best-performing tools. Function annotation by KofamKOALA helps link genes to KEGG resources such as the KEGG pathway maps and facilitates molecular network reconstruction. Availability and implementation: KofamKOALA, KofamScan and KOfam are freely available from GenomeNet (https://www.genome.jp/tools/kofamkoala/). Supplementary information: Supplementary data are available at Bioinformatics online.
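The idea of per-profile score thresholds can be illustrated with a minimal sketch. It assumes search hits are already available as (protein, KO, score) tuples. The function name `assign_kos`, the KO identifiers, the scores, and the thresholds are all hypothetical, and keeping only the single highest-scoring KO per protein is a simplification made for brevity, not a description of KofamScan's behaviour.

```python
def assign_kos(hits, thresholds):
    """Return {protein: KO}, keeping only hits that reach that KO's own threshold.

    hits: iterable of (protein_id, ko_id, score) tuples (hypothetical layout).
    thresholds: {ko_id: score threshold}, one value per profile.
    """
    best = {}
    for protein, ko, score in hits:
        if score < thresholds[ko]:
            continue  # below this profile's threshold: no assignment from this hit
        if protein not in best or score > best[protein][1]:
            best[protein] = (ko, score)
    return {protein: ko for protein, (ko, _) in best.items()}


# Made-up example values, for illustration only
thresholds = {"K00001": 250.0, "K00002": 120.0}
hits = [
    ("geneA", "K00001", 310.5),
    ("geneA", "K00002", 130.0),
    ("geneB", "K00001", 90.0),
]
assignments = assign_kos(hits, thresholds)
```

Here `geneB` receives no KO because its only hit scores below the threshold stored for that profile.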

607 citations

Journal Article•DOI•

The global rise of 3D printing during the COVID-19 pandemic

Yu Ying Clarrisa Choong¹, Hong Wei Tan², Deven C. Patel³, Wan Ting Natalie Choong¹, Chun-Hsien Chen¹, Hong Yee Low², Ming Jen Tan¹, Chandrakant D. Patel¹,⁴, Chee Kai Chua²

Nanyang Technological University¹, Singapore University of Technology and Design², Cedars-Sinai Medical Center³, Hewlett-Packard⁴

12 Aug 2020-Nature Reviews Materials

TL;DR: 3D printing enables on-demand solutions for a wide spectrum of needs, ranging from personal protection equipment to medical devices and isolation wards, and is suited to address supply-demand imbalances caused by socio-economic trends and disruptions in supply chains.

Abstract: 3D printing enables on-demand solutions for a wide spectrum of needs ranging from personal protection equipment to medical devices and isolation wards. This versatile technology is suited to address supply–demand imbalances caused by socio-economic trends and disruptions in supply chains.

181 citations

Journal Article•DOI•

Power-efficient combinatorial optimization using intrinsic noise in memristor Hopfield neural networks

Fuxi Cai¹,², Suhas Kumar², Thomas Van Vaerenbergh², Xia Sheng², Rui Liu²,³, Can Li², Zhan Liu², Martin Foltin², Shimeng Yu⁴, Qiangfei Xia⁵, Jianhua Yang⁵, Raymond G. Beausoleil², Wei Lu¹, John Paul Strachan²

University of Michigan¹, Hewlett-Packard², Arizona State University³, Georgia Institute of Technology⁴, University of Massachusetts Amherst⁵

01 Jul 2020

TL;DR: A memristor-based annealing system that uses an analogue neuromorphic architecture based on a Hopfield neural network can solve non-deterministic polynomial-time (NP)-hard max-cut problems in an approach that is potentially more efficient than current quantum, optical and digital approaches.

Abstract: To tackle important combinatorial optimization problems, a variety of annealing-inspired computing accelerators, based on several different technology platforms, have been proposed, including quantum-, optical- and electronics-based approaches. However, to be of use in industrial applications, further improvements in speed and energy efficiency are necessary. Here, we report a memristor-based annealing system that uses an energy-efficient neuromorphic architecture based on a Hopfield neural network. Our analogue–digital computing approach creates an optimization solver in which massively parallel operations are performed in a dense crossbar array that can inject the needed computational noise through the analogue array and device errors, amplified or dampened by using a novel feedback algorithm. We experimentally show that the approach can solve non-deterministic polynomial-time (NP)-hard max-cut problems by harnessing the intrinsic hardware noise. We also use experimentally grounded simulations to explore scalability with problem size, which suggest that our memristor-based approach can offer a solution throughput over four orders of magnitude higher per power consumption relative to current quantum, optical and fully digital approaches. A memristor-based annealing system that uses an analogue neuromorphic architecture based on a Hopfield neural network can solve non-deterministic polynomial (NP)-hard max-cut problems in an approach that is potentially more efficient than current quantum, optical and digital approaches.
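A purely software sketch of the general idea, noisy Hopfield-style updates applied to max-cut, is given below. It is not the authors' hardware or their feedback algorithm: pseudo-random Gaussian noise stands in for intrinsic device noise, and the function name, the decay schedule, the parameter values, and the small example graph are all assumptions chosen for illustration.

```python
import random


def cut_value(edges, spins):
    """Number of edges whose endpoints lie on opposite sides of the partition."""
    return sum(1 for u, v in edges if spins[u] != spins[v])


def noisy_hopfield_maxcut(n, edges, steps=2000, noise=1.0, decay=0.998, seed=0):
    """Asynchronous Hopfield-style updates with slowly decaying Gaussian noise.

    Software noise is a stand-in for the hardware noise described in the abstract.
    """
    rng = random.Random(seed)
    neighbors = {i: [] for i in range(n)}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)
    spins = [rng.choice((-1, 1)) for _ in range(n)]
    best_spins, best_cut = spins[:], cut_value(edges, spins)
    for _ in range(steps):
        i = rng.randrange(n)
        # Max-cut uses weight -1 per edge, so the local field is minus the neighbor sum.
        field = -sum(spins[j] for j in neighbors[i]) + rng.gauss(0.0, noise)
        spins[i] = 1 if field >= 0 else -1
        noise *= decay  # anneal: reduce the injected noise over time
        current = cut_value(edges, spins)
        if current > best_cut:
            best_spins, best_cut = spins[:], current
    return best_spins, best_cut


# Hypothetical 4-node graph: a 4-cycle plus one chord (maximum cut is 4)
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
best_spins, best_cut = noisy_hopfield_maxcut(4, edges)
```

Early on, the noise lets the state escape poor partitions; as it decays, the updates settle into ordinary energy descent. The best partition seen is tracked separately, so the result never gets worse than the starting point.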

174 citations

Journal Article•DOI•

Third-order nanocircuit elements for neuromorphic engineering

Suhas Kumar¹, R. Stanley Williams², Ziwen Wang³

Hewlett-Packard¹, Texas A&M University², Stanford University³

23 Sep 2020-Nature

TL;DR: This work shows how multiple electrophysical processes, including Mott transition dynamics, form a nanoscale third-order circuit element, and demonstrates simple transistorless networks of third-order elements that perform Boolean operations and find analogue solutions to a computationally hard graph-partitioning problem.

Abstract: Current hardware approaches to biomimetic or neuromorphic artificial intelligence rely on elaborate transistor circuits to simulate biological functions. However, these can instead be more faithfully emulated by higher-order circuit elements that naturally express neuromorphic nonlinear dynamics [1-4]. Generating neuromorphic action potentials in a circuit element theoretically requires a minimum of third-order complexity (for example, three dynamical electrophysical processes) [5], but there have been few examples of second-order neuromorphic elements, and no previous demonstration of any isolated third-order element [6-8]. Using both experiments and modelling, here we show how multiple electrophysical processes, including Mott transition dynamics, form a nanoscale third-order circuit element. We demonstrate simple transistorless networks of third-order elements that perform Boolean operations and find analogue solutions to a computationally hard graph-partitioning problem. This work paves a way towards very compact and densely functional neuromorphic computing primitives, and energy-efficient validation of neuroscientific models.

163 citations

Journal Article•DOI•

A High-Rate Aqueous Proton Battery Delivering Power Below −78 °C via an Unfrozen Phosphoric Acid

Heng Jiang¹, Woochul Shin¹, Lu Ma², Jessica J. Hong¹, Zhixuan Wei¹, Yusung Liu¹, Suoying Zhang², Xianyong Wu¹, Yunkai Xu¹, Qiubo Guo¹, Munirpallam A. Subramanian¹, William F. Stickle³, Tianpin Wu², Jun Lu², Xiulei Ji¹

Oregon State University¹, Argonne National Laboratory², Hewlett-Packard³

01 Jul 2020-Advanced Energy Materials

115 citations

Journal Article•DOI•

The building blocks of a brain-inspired computer

Jack D. Kendall, Suhas Kumar¹

Hewlett-Packard¹

14 Jan 2020-Applied physics reviews

TL;DR: This review points to the important primitives of a brain-inspired computer that could drive another decade-long wave of computer engineering.

Abstract: Computers have undergone tremendous improvements in performance over the last 60 years, but those improvements have significantly slowed down over the last decade, owing to fundamental limits in the underlying computing primitives. However, the generation of data and demand for computing are increasing exponentially with time. Thus, there is a critical need to invent new computing primitives, both hardware and algorithms, to keep up with the computing demands. The brain is a natural computer that outperforms our best computers in solving certain problems, such as instantly identifying faces or understanding natural language. This realization has led to a flurry of research into neuromorphic or brain-inspired computing that has shown promise for enhanced computing capabilities. This review points to the important primitives of a brain-inspired computer that could drive another decade-long wave of computer engineering.

113 citations

Journal Article•DOI•

Neural Multimodal Cooperative Learning Toward Micro-Video Understanding

Yinwei Wei¹, Xiang Wang², Weili Guan³, Liqiang Nie¹, Zhouchen Lin⁴, Baoquan Chen¹

Shandong University¹, National University of Singapore², Hewlett-Packard³, Peking University⁴

01 Jan 2020-IEEE Transactions on Image Processing

TL;DR: A neural multimodal cooperative learning model is presented to split the consistent component and the complementary component by a novel relation-aware attention mechanism and outperforms the state-of-the-art methods on a real-world micro-video dataset.

Abstract: The prevailing characteristics of micro-videos result in less descriptive power for each modality. The micro-video representations proposed by several pioneering efforts are limited to implicitly exploring the consistency between different modality information and ignore the complementarity. In this paper, we focus on how to explicitly separate the consistent features and the complementary features from the mixed information and harness their combination to improve the expressiveness of each modality. Toward this end, we present a neural multimodal cooperative learning (NMCL) model to split the consistent component and the complementary component by a novel relation-aware attention mechanism. Specifically, the computed attention score can be used to measure the correlation between the features extracted from different modalities. Then, a threshold is learned for each modality to distinguish the consistent and complementary features according to the score. Thereafter, we integrate the consistent parts to enhance the representations and supplement the complementary ones to reinforce the information in each modality. As to the problem of redundant information, which may cause overfitting and is hard to distinguish, we devise an attention network to dynamically capture the features that are closely related to the category and output a discriminative representation for prediction. The experimental results on a real-world micro-video dataset show that the NMCL outperforms the state-of-the-art methods. Further studies verify the effectiveness and cooperative effects brought by the attentive mechanism.

107 citations

Journal Article•DOI•

Analog content-addressable memories with memristors.

Can Li¹, Catherine Graves¹, Xia Sheng¹, Darrin Miller¹, Martin Foltin¹, Giacomo Pedretti¹,², John Paul Strachan¹

Hewlett-Packard¹, Polytechnic University of Milan²

02 Apr 2020-Nature Communications

TL;DR: An analog content-addressable memory concept and circuit is proposed that reduces area and power consumption by utilizing the analog conductance tunability of memristors.

Abstract: A content-addressable memory compares an input search word against all rows of stored words in an array in a highly parallel manner. While supplying a very powerful functionality for many applications in pattern matching and search, it suffers from large area, cost and power consumption, limiting its use. Past improvements have been realized by using memristors to replace the static random-access memory cell in conventional designs, but employ similar schemes based only on binary or ternary states for storage and search. We propose a new analog content-addressable memory concept and circuit to overcome these limitations by utilizing the analog conductance tunability of memristors. Our analog content-addressable memory stores data within the programmable conductance and can take as input either analog or digital search values. Experimental demonstrations, scaled simulations and analysis show that our analog content-addressable memory can reduce area and power consumption, which enables the acceleration of existing applications, but also new computing application areas. Designing low power and high performance content-addressable memory remains a challenge. Here, the authors demonstrate a content-addressable memory concept and circuit which leverages the analog conductance tunability of memristors, reduces power consumption, and enables new functionalities and applications.

69 citations

Journal Article•DOI•

PANTHER: A Programmable Architecture for Neural Network Training Harnessing Energy-Efficient ReRAM

Aayush Ankit¹, Izzat El Hajj², Sai Rahul Chalamalasetti³, Sapan Agarwal⁴, Matthew J. Marinella⁴, Martin Foltin³, John Paul Strachan³, Dejan Milojicic³, Wen-mei W. Hwu⁵,⁶, Kaushik Roy¹

Purdue University¹, American University of Beirut², Hewlett-Packard³, Sandia National Laboratories⁴, University of Illinois at Urbana–Champaign⁵, National Center for Supercomputing Applications⁶

01 Aug 2020-IEEE Transactions on Computers

TL;DR: PANTHER, an ISA-programmable training accelerator with compiler support, is developed and can be integrated into other accelerators in the literature to enhance their efficiency.

Abstract: The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training: both digital and hybrid digital-analog using resistive RAM (ReRAM) crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at performing matrix-vector multiplication operations that are prevalent in training. However, they still suffer from inefficiency due to the use of serial reads and writes for performing the weight gradient and update step. A few works have demonstrated the possibility of performing outer products in crossbars, which can be used to realize the weight gradient and update step without the use of serial reads and writes. However, these works have been limited to low precision operations which are not sufficient for typical training workloads. Moreover, they have been confined to a limited set of training algorithms for fully-connected layers only. To address these limitations, we propose a bit-slicing technique for enhancing the precision of ReRAM-based outer products, which is substantially different from bit-slicing for matrix-vector multiplication only. We incorporate this technique into a crossbar architecture with three variants catered to different training algorithms. To evaluate our design on different types of layers in neural networks (fully-connected, convolutional, etc.) and training algorithms, we develop PANTHER, an ISA-programmable training accelerator with compiler support. Our design can also be integrated into other accelerators in the literature to enhance their efficiency. Our evaluation shows that PANTHER achieves up to 8.02×, 54.21×, and 103× energy reductions as well as 7.16×, 4.02×, and 16× execution time reductions compared to digital accelerators, ReRAM-based accelerators, and GPUs, respectively.
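The weight gradient and update step for a fully-connected layer is a rank-1 (outer-product) update, which the following minimal software sketch spells out. It only illustrates the arithmetic, under the assumption of plain stochastic gradient descent. It does not model crossbars or the bit-slicing technique, and the function name and all numeric values are hypothetical.

```python
def outer_product_update(weights, delta, x, lr):
    """Apply W[i][j] -= lr * delta[i] * x[j] in place (a rank-1, outer-product update)."""
    for i, d in enumerate(delta):
        row = weights[i]
        for j, xj in enumerate(x):
            row[j] -= lr * d * xj
    return weights


W = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
delta = [1.0, -2.0]  # backpropagated error at the layer output (made-up values)
x = [1.0, 0.0, 2.0]  # layer input (made-up values)
outer_product_update(W, delta, x, lr=0.5)
```

Every cell `W[i][j]` depends only on `delta[i]` and `x[j]`, which is why the update can in principle be applied to all cells of an array at once rather than through serial reads and writes.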

63 citations

Journal Article•DOI•

In-Memory Computing with Memristor Content Addressable Memories for Pattern Matching.

Catherine Graves¹, Can Li¹, Xia Sheng¹, Darrin Miller¹, Jim Ignowski¹, Lennie Kiyama¹, John Paul Strachan¹

Hewlett-Packard¹

01 Sep 2020-Advanced Materials

TL;DR: The first experimental demonstration of two computational models in memristor TCAM arrays is reported: regular expression matching in a finite state machine for network security intrusion detection and definable inexact pattern matching in a Levenshtein automaton for genomic sequencing.

Abstract: The dramatic rise of data-intensive workloads has revived application-specific computational hardware for continuing speed and power improvements, frequently achieved by limiting data movement and implementing "in-memory computation". However, conventional complementary metal oxide semiconductor (CMOS) circuit designs can still suffer low power efficiency, motivating designs leveraging nonvolatile resistive random access memory (ReRAM), with many studies focusing on crossbar circuit architectures. Another circuit primitive, content addressable memory (CAM), shows great promise for mapping a diverse range of computational models for in-memory computation, with recent ReRAM-CAM designs proposed but few experimentally demonstrated. Here, programming and control of memristors across an 86 × 12 memristor ternary CAM (TCAM) array integrated with CMOS are demonstrated, and parameter tradeoffs for optimizing speed and search margin are evaluated. In addition to smaller area, this memristor TCAM results in significantly lower power due to very low programmable conductance states, motivating CAM use in a wider range of computational applications than conventional TCAMs are confined to today. Finally, the first experimental demonstration of two computational models in memristor TCAM arrays is reported: regular expression matching in a finite state machine for network security intrusion detection and definable inexact pattern matching in a Levenshtein automaton for genomic sequencing.
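For readers unfamiliar with ternary CAMs, a minimal behavioural sketch of a TCAM lookup follows. Each stored row holds 0, 1, or a don't-care symbol, and a search returns every row that matches the input word. The function name, the use of 'X' for don't-care, and the stored patterns are assumptions for illustration; real hardware compares all rows in parallel, whereas this loop visits them one by one.

```python
def tcam_search(rows, word):
    """Return indices of stored rows that match `word`; 'X' in a row is a don't-care."""
    matches = []
    for index, stored in enumerate(rows):
        same_length = len(stored) == len(word)
        if same_length and all(s in ("X", w) for s, w in zip(stored, word)):
            matches.append(index)
    return matches


# Hypothetical stored patterns
rows = ["10X1", "0XX0", "1011", "XXXX"]
hits = tcam_search(rows, "1011")
other_hits = tcam_search(rows, "0110")
```

Don't-care cells are what allow a single row to encode a set of inputs, which is useful when mapping state-machine transitions onto the array.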

56 citations

Journal Article•DOI•

A Low-Current and Analog Memristor with Ru as Mobile Species.

Jung Ho Yoon¹,²,³, Jiaming Zhang⁴, Peng Lin¹, Navnidhi K. Upadhyay¹, Peng Yan¹, Yuzi Liu², Qiangfei Xia¹, Jianhua Yang¹

University of Massachusetts Amherst¹, Argonne National Laboratory², Korea Institute of Science and Technology³, Hewlett-Packard⁴

08 Oct 2020-Advanced Materials

TL;DR: Ru is studied as a new type of mobile species for memristors to achieve low switching current, fast speed, good reliability, scalability, and analog switching property simultaneously.

Abstract: The switching parameters and device performance of memristors are predominately determined by their mobile species and matrix materials. Devices with oxygen or oxygen vacancies as the mobile species usually exhibit great retention but also need a relatively high switching current (e.g., >30 µA), while devices with Ag or Cu as cation mobile species do not require a high switching current but usually show poor retention. Here, Ru is studied as a new type of mobile species for memristors to achieve low switching current, fast speed, good reliability, scalability, and analog switching property simultaneously. An electrochemical metallization-like memristor with a stack of Pt/Ta₂O₅/Ru is developed. Migration of Ru ions is revealed by energy-dispersive X-ray spectroscopy mapping and in situ transmission electron microscopy within a sub-10 nm active device area before and after switching. The results open up a new avenue to engineer memristors for desired properties.

Journal Article•DOI•

Gated CNN: Integrating multi-scale feature layers for object detection

Jin Yuan¹, Heng-Chang Xiong¹, Yi Xiao¹, Weili Guan², Meng Wang, Richang Hong, Zhi-Yong Li¹

Hunan University¹, Hewlett-Packard²

01 Sep 2020-Pattern Recognition

TL;DR: The proposed G-CNN employs a detector with two branches to predict the locations and categories of objects, respectively, as well as an inter-class loss to help detectors learn discrepant information among categories so that the learned detectors could better differentiate similar objects of different categories.

Journal Article•DOI•

Understanding and benchmarking the impact of GDPR on database systems

Supreeth Shastri¹, Vinay Banakar², Melissa F. Wasserman¹, Arun Kumar³, Vijay Chidambaram¹

University of Texas at Austin¹, Hewlett-Packard², University of California, San Diego³

01 Mar 2020

TL;DR: GDPRbench is an open-source benchmark that consists of workloads and metrics needed to understand and assess personal-data processing database systems; the accompanying analysis also identifies new workloads that must be supported under GDPR.

Abstract: The General Data Protection Regulation (GDPR) provides new rights and protections to European people concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata need to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess personal-data processing database systems. To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR compliant. Our experiments demonstrate that the resulting GDPR-compliant systems achieve poor performance on GDPR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges towards making GDPR-compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org

Journal Article•DOI•

Image caption generation with dual attention mechanism

Maofu Liu¹, Lingjun Li¹, Huijun Hu¹, Weili Guan², Jing Tian³

Wuhan University of Science and Technology¹, Hewlett-Packard², National University of Singapore³

01 Mar 2020-Information Processing and Management

TL;DR: The label generation, attached to the textual attention mechanism, and the image caption generation, have been merged to form an end-to-end trainable framework to tackle the challenges of the lack of image information and the deviation from the core content of the image.

Abstract: At the intersection of computer vision and natural language processing, image caption generation has been an active research topic in recent years, contributing to multimodal social media translation from unstructured image data to structured text data. Conventional research has proposed a series of image captioning methods, such as template-based, retrieval-based, and encode-decode methods. Among these, the encode-decode framework is widely used in image caption generation: the encoder extracts the image features with a Convolutional Neural Network (CNN), and the decoder adopts a Recurrent Neural Network (RNN) to generate the image description. The Neural Image Caption (NIC) model has achieved good performance in image captioning; however, some challenges remain to be addressed. To tackle the challenges of the lack of image information and the deviation from the core content of the image, our proposed model explores visual attention to deepen the understanding of the image, incorporating the image labels generated by a Fully Convolutional Network (FCN) into the generation of the image caption. Furthermore, our proposed model exploits textual attention to increase the integrity of the information. Finally, the label generation, attached to the textual attention mechanism, and the image caption generation have been merged to form an end-to-end trainable framework. In this paper, extensive experiments have been carried out on the AIC-ICC image caption benchmark dataset, and the experimental results show that our proposed model is effective and feasible in image caption generation.

Journal Article•DOI•

Fluorinated co-solvent promises Li-S batteries under lean-electrolyte conditions

Woochul Shin¹, Liangdong Zhu¹, Heng Jiang¹, William F. Stickle², Chong Fang¹, Cong Liu², Jun Lu³, Xiulei Ji¹

Oregon State University¹, Hewlett-Packard², Argonne National Laboratory³

15 Jul 2020-Materials Today

TL;DR: The authors employ 1,1,2,2-tetrafluoroethyl 2,2,2-trifluoroethyl ether as a co-solvent in the electrolyte of Li-S batteries, targeting operation under lean-electrolyte conditions.

Journal Article•DOI•

A Low-Voltage Si-Ge Avalanche Photodiode for High-Speed and Energy Efficient Silicon Photonic Links

Binhao Wang¹, Zhihong Huang¹, Wayne V. Sorin¹, Xiaoge Zeng¹, Di Liang¹, Marco Fiorentino¹, Raymond G. Beausoleil¹

Hewlett-Packard¹

15 Jun 2020-Journal of Lightwave Technology

TL;DR: A waveguide Si-Ge APD with a low breakdown voltage of −10 V is demonstrated, achieving 60 Gb/s PAM4, and the APD receiver is compared to a PIN PD receiver.

Abstract: Silicon-germanium (Si-Ge) avalanche photodiodes (APDs) have large gain bandwidth product (GBP) and low excess noise due to the low impact ionization coefficient ratio of silicon. Optical receivers using APDs are able to achieve high-speed and energy efficient optical transceiver systems. We demonstrate a waveguide Si-Ge APD with low breakdown voltage of −10 V, achieving 60 Gb/s PAM4 successfully. A compact APD circuit model was constructed to allow photonic devices and transceiver circuitry co-design. The APD receiver has achieved −16 dBm sensitivity at 50 Gb/s PAM4 with a bit error rate (BER) of 2.4 × 10⁻⁴. The sensitivity of APD receivers changes with the multiplication gain. In our analysis, compared to a PIN PD receiver the APD receiver operating at optimum gain can obtain ~8 dB more sensitivity for NRZ signaling at 50 Gb/s and 3–4 dB more sensitivity for PAM4 signaling at 50–100 Gb/s. Also, the APD receiver operating at optimum gain can reduce power consumption by ~10% at PAM4 data rates of 50 Gb/s and ~15% at 100 Gb/s in a silicon carrier-depletion microring modulator based WDM photonic link.

Posted Content•

An In-Depth Analysis of the Slingshot Interconnect

Daniele De Sensi¹, Salvatore Di Girolamo¹, Kim H. McMahon², Duncan Roweth², Torsten Hoefler¹

ETH Zurich¹, Hewlett-Packard²

20 Aug 2020-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: Slingshot is an interconnection network for large scale computing systems based on high-radix switches that provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes, and it is found that applications running on Slingshot are less affected by congestion compared to previous generation networks.

Abstract: The interconnect is one of the most critical components in large scale computing systems, and its impact on the performance of applications is going to increase with the system size. In this paper, we will describe Slingshot, an interconnection network for large scale computing systems. Slingshot is based on high-radix switches, which allow building exascale and hyperscale datacenter networks with at most three switch-to-switch hops. Moreover, Slingshot provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes. Slingshot uses an optimized Ethernet protocol, which allows it to be interoperable with standard Ethernet devices while providing high performance to HPC applications. We analyze the extent to which Slingshot provides these features, evaluating it on microbenchmarks and on several applications from the datacenter and AI worlds, as well as on HPC applications. We find that applications running on Slingshot are less affected by congestion compared to previous generation networks.

Posted Content•

ShiftAddNet: A Hardware-Inspired Deep Network

Haoran You¹, Xiaohan Chen², Yongan Zhang¹, Chaojian Li¹, Sicheng Li³, Zihao Liu⁴, Zhangyang Wang², Yingyan Lin¹

Rice University¹, Texas A&M University², Hewlett-Packard³, Florida International University⁴

24 Oct 2020-arXiv: Learning

TL;DR: This paper presents ShiftAddNet, whose main inspiration is drawn from a common practice in energy-efficient hardware implementation, namely that multiplication can instead be performed with additions and logical bit-shifts, yielding a new type of deep network that involves only bit-shift and additive weight layers.

Abstract: Multiplication (e.g., convolution) is arguably a cornerstone of modern deep neural networks (DNNs). However, intensive multiplications cause expensive resource costs that challenge DNNs' deployment on resource-constrained edge devices, driving several attempts at multiplication-less deep networks. This paper presents ShiftAddNet, whose main inspiration is drawn from a common practice in energy-efficient hardware implementation, namely that multiplication can instead be performed with additions and logical bit-shifts. We leverage this idea to explicitly parameterize deep networks in this way, yielding a new type of deep network that involves only bit-shift and additive weight layers. This hardware-inspired ShiftAddNet immediately leads to both energy-efficient inference and training, without compromising the expressive capacity compared to standard DNNs. The two complementary operation types (bit-shift and add) additionally enable finer-grained control of the model's learning capacity, leading to a more flexible trade-off between accuracy and (training) efficiency, as well as improved robustness to quantization and pruning. We conduct extensive experiments and ablation studies, all backed up by our FPGA-based ShiftAddNet implementation and energy measurements. Compared to existing DNNs or other multiplication-less models, ShiftAddNet reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracies. Codes and pre-trained models are available at this https URL.
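The hardware practice the abstract refers to, replacing multiplication with bit-shifts and additions, can be shown with a small integer sketch. This is the generic shift-and-add technique plus a toy power-of-two scaling step, not the ShiftAddNet layer definitions. The function names and example values are assumptions, and the sign is handled with a plain negation for simplicity.

```python
def shift_add_multiply(x, w):
    """Multiply integers x and w using only shifts and additions (classic shift-and-add)."""
    negative = w < 0
    w = -w if negative else w
    result = 0
    shift = 0
    while w:
        if w & 1:
            result += x << shift  # add x * 2**shift for every set bit of w
        w >>= 1
        shift += 1
    return -result if negative else result


def shift_layer(x, exponents):
    """Scale each integer input by a power-of-two weight 2**p via a left shift (p >= 0)."""
    return [xi << p for xi, p in zip(x, exponents)]


product = shift_add_multiply(7, 13)           # 13 = 0b1101 -> 7 + 28 + 56
scaled = shift_layer([1, 2, 3], [0, 1, 3])    # weights 1, 2 and 8
```

Restricting a weight to a power of two turns its multiplication into a single shift, which is the cheaper operation in hardware.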

Proceedings Article•DOI•

An In-Depth Analysis of the Slingshot Interconnect

Daniele De Sensi¹, Salvatore Di Girolamo¹, Kim H. McMahon², Duncan Roweth², Torsten Hoefler¹

ETH Zurich¹, Hewlett-Packard²

20 Aug 2020

TL;DR: SLINGSHOT is an interconnection network for large scale computing systems based on high-radix switches, which allow building exascale and hyper-scale datacenter networks with at most three switch-to-switch hops.

Abstract: The interconnect is one of the most critical components in large scale computing systems, and its impact on the performance of applications is going to increase with the system size. In this paper, we will describe SLINGSHOT, an interconnection network for large scale computing systems. SLINGSHOT is based on high-radix switches, which allow building exascale and hyper-scale datacenter networks with at most three switch-to-switch hops. Moreover, SLINGSHOT provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes. SLINGSHOT uses an optimized Ethernet protocol, which allows it to be interoperable with standard Ethernet devices while providing high performance to HPC applications. We analyze the extent to which SLINGSHOT provides these features, evaluating it on microbenchmarks and on several applications from the datacenter and AI worlds, as well as on HPC applications. We find that applications running on SLINGSHOT are less affected by congestion compared to previous generation networks.

Journal Article•DOI•

Widely tunable, heterogeneously integrated quantum-dot O-band lasers on silicon

Aditya Malik¹, Joel Guo¹, Minh A. Tran¹, Geza Kurczveil², Di Liang², John E. Bowers¹•Institutions (2)

University of California, Santa Barbara¹, Hewlett-Packard²

01 Oct 2020-Photonics Research

TL;DR: Widely tunable quantum-dot lasers heterogeneously integrated on a silicon-on-insulator substrate are presented. The tuning mechanism is based on a Vernier dual-ring geometry, and a 47 nm tuning range with a 52 dB side-mode suppression ratio is observed.

Abstract: Heterogeneously integrated lasers in the O-band are a key component in realizing low-power optical interconnects for data centers and high-performance computing. Quantum-dot-based materials have been particularly appealing for light generation due to their ultralow lasing thresholds, small linewidth enhancement factor, and low sensitivity to reflections. Here, we present widely tunable quantum-dot lasers heterogeneously integrated on silicon-on-insulator substrate. The tuning mechanism is based on Vernier dual-ring geometry, and a 47 nm tuning range with 52 dB side-mode suppression ratio is observed. These parameters show an increase to 52 nm and 58 dB, respectively, when an additional wavelength filter in the form of a Mach–Zehnder interferometer is added to the cavity. The Lorentzian linewidth of the lasers is measured as low as 5.3 kHz.

Journal Article•DOI•

64 Gb/s low-voltage waveguide SiGe avalanche photodiodes with distributed Bragg reflectors

Binhao Wang¹, Zhihong Huang¹, Yuan Yuan¹, Di Liang¹, Xiaoge Zeng¹, Marco Fiorentino¹, Raymond G. Beausoleil¹•Institutions (1)

Hewlett-Packard¹

01 Jul 2020-Photonics Research

TL;DR: In this paper, the authors demonstrate low-voltage waveguide silicon-germanium avalanche photodiodes (APDs) integrated with distributed Bragg reflectors (DBRs).

Abstract: We demonstrate low-voltage waveguide silicon-germanium avalanche photodiodes (APDs) integrated with distributed Bragg reflectors (DBRs). The internal quantum efficiency is improved from 60% to 90% at 1550 nm assisted with DBRs while still achieving a 25 GHz bandwidth. A low breakdown voltage of 10 V and a gain bandwidth product of near 500 GHz are obtained. APDs with DBRs at a data rate of 64 Gb/s pulse amplitude modulation with four levels (PAM4) show a 30%–40% increase in optical modulation amplitude (OMA) compared to APDs with no DBR. A sensitivity of around −13 dBm at a data rate of 64 Gb/s PAM4 and a bit error rate of 2.4×10⁻⁴ is realized for APDs with DBRs, which improves the sensitivity by ∼2 dB compared to APDs with no DBR.

Proceedings Article•DOI•

Beyond 5G: Reliable Extreme Mobility Management

Yuanjie Li¹, Qianru Li², Zhehui Zhang², Ghufran Baig³, Lili Qiu³, Songwu Lu²•Institutions (3)

Hewlett-Packard¹, University of California, Los Angeles², University of Texas at Austin³

30 Jul 2020

TL;DR: REM (Reliable Extreme Mobility management) for 4G, 5G, and beyond is devised; evaluation with operational high-speed rail datasets shows that REM reduces failures to levels comparable to static and low mobility, with low signaling and latency cost.

Abstract: Extreme mobility has become a norm rather than an exception. However, 4G/5G mobility management is not always reliable in extreme mobility, with non-negligible failures and policy conflicts. The root cause is that existing mobility management is primarily based on wireless signal strength. While reasonable in static and low mobility, it is vulnerable to dramatic wireless dynamics from extreme mobility in triggering, decision, and execution. We devise REM, Reliable Extreme Mobility management for 4G, 5G, and beyond. REM shifts to movement-based mobility management in the delay-Doppler domain. Its signaling overlay relaxes feedback via cross-band estimation, simplifies policies with provable conflict freedom, and stabilizes signaling via scheduling-based OTFS modulation. Our evaluation with operational high-speed rail datasets shows that REM reduces failures to levels comparable to static and low mobility, with low signaling and latency cost.

Journal Article•DOI•

Integrated Coherent Ising Machines Based on Self-Phase Modulation in Microring Resonators

Nikolas Tezak¹, Thomas Van Vaerenbergh¹, Jason S. Pelc¹, Gabriel Mendoza¹, David Kielpinski¹, Hideo Mabuchi², Raymond G. Beausoleil¹•Institutions (2)

Hewlett-Packard¹, Stanford University²

01 Jan 2020-IEEE Journal of Selected Topics in Quantum Electronics

TL;DR: A symmetric nonlinear photonic device is presented as the fundamental building block that can use self-phase modulation in two microring resonators to emulate a continuously tunable, symmetrically bistable Ising node.

Abstract: We propose an integrated photonic circuit that acts as an optical coherent Ising machine and simulate its performance on the basis of some example problems. In contrast to previous all-optical approaches, the proposed integrated Ising machine does not require an optical parametric oscillator and can, hence, operate at a single wavelength, reducing the overall design complexity. We present a symmetric nonlinear photonic device as the fundamental building block that can use self-phase modulation in two microring resonators to emulate a continuously tunable, symmetrically bistable Ising node. We derive and verify using numerical simulation the key properties of the single Ising node device and the full Ising machine circuit. We estimate the full Ising machine's tolerance to realistic fabrication errors on the basis of randomly sampled example problems, and we discuss which technologies are required to obtain large-scale systems.

Journal Article•DOI•

Energy Efficiency Analysis of Comb Source Carrier-Injection Ring-Based Silicon Photonic Link

Yanir London¹, Thomas Van Vaerenbergh², Anthony Rizzo¹, Peng Sun², Jared Hulme², Geza Kurczveil², M. Ashkan Seyedi², Binhao Wang², Xiaoge Zeng², Zhihong Huang², Jinsoo Rhim², Marco Fiorentino², Keren Bergman¹•Institutions (2)

Columbia University¹, Hewlett-Packard²

01 Mar 2020-IEEE Journal of Selected Topics in Quantum Electronics

TL;DR: In this article, the authors presented an analysis of a ring-based DWDM silicon photonic (SiP) link architecture with a comb laser source and p-i-n photodetectors.

Abstract: Current electronic interconnections in high performance computing (HPC) systems are reaching their limit in supporting high data traffic demands. Dense wavelength-division multiplexed (DWDM) links have gained interest as they can potentially alleviate these interconnect bandwidth demands while also lowering the cost and energy consumption compared to traditional electronic links. In this article, we present an analysis of a ring-based DWDM silicon photonic (SiP) link architecture with a comb laser source and p-i-n photodetectors. Specifically, we consider microring resonators (MRRs) with narrow bus waveguides and carrier-injection ring modulators. We propose a new method to select the optimal comb source setting to minimize the laser power consumption at a particular data rate. Additionally, we leverage power penalty models supported by measurements to estimate the effective received optical power at the receiver input of each of the DWDM channels which yields a bit error rate (BER) of $10^{-12}$ or lower. We show that the analyzed comb source has the lowest power consumption per channel for 24 consecutive lines. For these comb settings, the maximum channel data rate of non-return to zero on-off keying (NRZ-OOK) signals is 22 Gbps, and the minimum energy consumption is 3.28 pJ/bit.

Journal Article•DOI•

Performance Requirements for Terabit-Class Silicon Photonic Links Based on Cascaded Microring Resonators

Yanir London¹, Thomas Van Vaerenbergh², Luca Ramini², Anthony Rizzo¹, Peng Sun², Geza Kurczveil², M. Ashkan Seyedi², Jinsoo Rhim², Marco Fiorentino², Keren Bergman¹•Institutions (2)

Columbia University¹, Hewlett-Packard²

01 Jul 2020-Journal of Lightwave Technology

TL;DR: This paper presents a comprehensive analysis of a comb source microring-based SiP link architecture with p-i-n photodetectors, and shows that a select few comb configurations satisfy the requirements for a 1 Tbps aggregated data rate, with energy consumption as low as 3 pJ/bit achievable.

Abstract: The electrical interconnects in high performance computing (HPC) systems are reaching their bandwidth capacities in supporting data-intensive applications. Currently, communication between compute nodes through these interconnects is the main bottleneck for overall HPC system performance. Optical interconnects based on the emerging silicon photonics (SiP) platform are considered to be a promising replacement to boost the speed of the data transfer with reduced cost and energy consumption compared to electrical interconnects. In this paper, we present a comprehensive analysis of a comb source microring-based SiP link architecture with p-i-n photodetectors. In particular, we direct our focus on improved grating coupler and bus waveguide designs to reduce the link power penalties. Additionally, we map the required performance from the comb laser to provide an aggregated data rate of 1 Tbps under the constraints of free spectral range (FSR) and nonlinearities of the microring resonators (MRRs). We show that a select few comb configurations satisfy these requirements, and energy consumption as low as 3 pJ/bit is achievable.
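The two headline figures in this abstract, an aggregated data rate of 1 Tbps and an energy of 3 pJ/bit, can be related by simple arithmetic: average power is energy per bit times bits per second. The sketch below shows that conversion. The function name `link_power_watts` is hypothetical, and the result covers only whatever contributions the paper includes in its pJ/bit figure.

```python
def link_power_watts(energy_pj_per_bit: float, data_rate_tbps: float) -> float:
    """Average link power (W) = energy per bit (J/bit) x data rate (bit/s)."""
    joules_per_bit = energy_pj_per_bit * 1e-12   # pJ  -> J
    bits_per_second = data_rate_tbps * 1e12      # Tbps -> bit/s
    return joules_per_bit * bits_per_second


if __name__ == "__main__":
    # Figures quoted in the abstract: 3 pJ/bit at 1 Tbps.
    print(link_power_watts(3.0, 1.0))
```

At these figures the link draws roughly 3 W, since the pico (1e-12) and tera (1e12) prefixes cancel.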

Journal Article•DOI•

Fully-Integrated Heterogeneous DML Transmitters for High-Performance Computing

Di Liang¹, Ashkan Roshan-Zamir², Yang-Hang Fan², Chong Zhang¹, Binhao Wang¹, Antoine Descos¹, Wenqing Shen³, Kunzhi Yu², Cheng Li¹, Gaofeng Fan², Geza Kurczveil¹, Yingtao Hu¹, Zhihong Huang¹, Marco Fiorentino¹, Satish Kumar³, Samuel Palermo², Raymond G. Beausoleil¹•Institutions (3)

Hewlett-Packard¹, Texas A&M University², Georgia Institute of Technology³

01 Jul 2020-Journal of Lightwave Technology

TL;DR: The authors discuss a strategy for developing optical link solutions for different data traffic scenarios in memory-driven HPCs, and review recent work demonstrating, for the first time, fully photonics-electronics-integrated single- and multi-wavelength directly modulated laser (DML) transmitters on silicon.

Abstract: Optical connectivity, which has been widely deployed in today's datacenters and high-performance computing (HPC) systems, is a disruptive technological revolution to the IT industry in the new Millennium. In our journey to debut an Exascale supercomputer, a completely new computing concept, called memory-driven computing, was innovated recently. This new computing architecture brings challenges and opportunities for novel optical interconnect solutions. Here, we first discuss our strategy to develop appropriate optical link solutions for different data traffic scenarios in memory-driven HPCs. Then, we present a detailed review of recent work to demonstrate fully photonics-electronics-integrated single- and multi-wavelength directly modulated laser (DML) transmitters on silicon for the first time. Compact heterogeneous microring lasers and laser arrays were fabricated as photonic engines to work with a customized complementary metal-oxide semiconductor (CMOS) driver circuit. Microring lasers based on conventional quantum well and new quantum dot lasing medium were compared in the experiment. Thermal shunt and MOS capacitor structures were integrated into the lasers for effective thermal management and ultra low-energy tuning. This enables a controllable dense wavelength division multiplexing (DWDM) link architecture in an HPC environment. An equivalent microring laser circuit model was constructed to allow photonics-electronics co-simulation. Equalization functionality in the CMOS driver circuit proved to be critical to achieve up to 14 Gb/s direct modulation with 6 dB extinction ratio. Finally, ongoing and future work towards more robust, higher speed, and more energy efficient DML transmitters is discussed.

Journal Article•DOI•

The Edge-to-Cloud Continuum

Dejan Milojicic¹•Institutions (1)

Hewlett-Packard¹

22 Oct 2020-IEEE Computer

TL;DR: Computer hosts a virtual roundtable with three experts to discuss the opportunities and obstacles regarding edge-to-cloud technology.

Abstract: Computer hosts a virtual roundtable with three experts to discuss the opportunities and obstacles regarding edge-to-cloud technology.

Proceedings Article•DOI•

Homa: An Efficient Topology and Route Management Approach in SD-WAN Overlays

Diman Zad Tootaghaj¹, Faraz Ahmed¹, Puneet Sharma¹, Mihalis Yannakakis²•Institutions (2)

Hewlett-Packard¹, Columbia University²

06 Jul 2020

TL;DR: An efficient topology and route management approach in Software-Defined Wide Area Networks (SD-WAN) is presented and a centralized control approach that minimizes the total cost while satisfying the quality of service (QoS) on all flows is proposed.

Abstract: This paper presents an efficient topology and route management approach in Software-Defined Wide Area Networks (SD-WAN). Traditional WANs suffer from low utilization and lack of a global view of the network. Therefore, during failures, topology/service/traffic changes, or new policy requirements, the system does not always converge to the global optimal state. Using Software Defined Networking architectures in WANs provides the opportunity to design WANs with higher fault tolerance, scalability, and manageability. We exploit the correlation matrix between the virtual links, derived from the monitoring system, to infer the underlying route topology and propose a route update approach that minimizes the total route update cost on all flows. We formulate the problem as an integer linear programming optimization problem and provide a centralized control approach that minimizes the total cost while satisfying the quality of service (QoS) on all flows. Experimental results on real network topologies demonstrate the effectiveness of the proposed approach in terms of disruption cost and average disrupted flows.

Posted Content•DOI•

Swarm Learning as a privacy-preserving machine learning approach for disease classification

Stefanie Warnat-Herresthal¹, Hartmut Schultze², Krishna Prasad Lingadahalli Shastry², Sathyanarayanan Manamohan², Saikat Mukherjee², Vishesh Garg², Ravi Sarveswara², Kristian Haendler³, Peter Pickkers⁴, N. Ahmad Aziz³, Sofia Ktena⁵, Christian Siever², Michael Kraut³, Milind Desai², Bruno Monet², Maria Saridaki⁵, Charles Siegel², Anna Drews³, Melanie Nuesch-Germano¹, Heidi Theis³, Mihai G. Netea⁴, Fabian J. Theis, Anna C. Aschenbrenner¹, Thomas Ulas³, Monique M.B. Breteler³, Evangelos J. Giamarellos-Bourboulis⁵, Matthijs Kox⁴, Matthias Becker³, Sorin Cheran², Woodacre Michael S², Eng Lim Goh², Joachim L. Schultze³, German Covid Omics Initiative•Institutions (5)

University of Bonn¹, Hewlett-Packard², German Center for Neurodegenerative Diseases³, Radboud University Nijmegen⁴, National and Kapodistrian University of Athens⁵

26 Jun 2020-bioRxiv

TL;DR: Swarm Learning (SL) is a decentralized machine learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination, and privacy protection without the need for a central coordinator.

Abstract: Identification of patients with life-threatening diseases including leukemias or infections such as tuberculosis and COVID-19 is an important goal of precision medicine. We recently illustrated that leukemia patients are identified by machine learning (ML) based on their blood transcriptomes. However, there is an increasing divide between what is technically possible and what is allowed because of privacy legislation. To facilitate integration of any omics data from any data owner world-wide without violating privacy laws, we here introduce Swarm Learning (SL), a decentralized machine learning approach uniting edge computing, blockchain-based peer-to-peer networking and coordination as well as privacy protection without the need for a central coordinator, thereby going beyond federated learning. Using more than 14,000 blood transcriptomes derived from over 100 individual studies with non-uniform distribution of cases and controls and significant study biases, we illustrate the feasibility of SL to develop disease classifiers based on distributed data for COVID-19, tuberculosis or leukemias that outperform those developed at individual sites. Still, SL completely protects local privacy regulations by design. We propose this approach to noticeably accelerate the introduction of precision medicine.

Journal Article•DOI•

64 Gbps PAM4 Si-Ge Waveguide Avalanche Photodiodes With Excellent Temperature Stability

Yuan Yuan¹, Zhihong Huang², Binhao Wang², Wayne V. Sorin², Xiaoge Zeng², Di Liang², Marco Fiorentino², Joe C. Campbell¹, Raymond G. Beausoleil²•Institutions (2)

University of Virginia¹, Hewlett-Packard²

01 Sep 2020-Journal of Lightwave Technology

TL;DR: A Si-Ge waveguide avalanche photodiode with extremely high temperature stability is demonstrated: the breakdown voltage increases ∼4.2 mV/°C, bandwidth reduces ∼0.09%/°C, and gain-bandwidth product reduces ∼0.24%/°C as temperature increases from 30 °C to 90 °C.

Abstract: A Si-Ge waveguide avalanche photodiode with extremely high temperature stability is demonstrated. The breakdown voltage increases ∼4.2 mV/°C, bandwidth reduces ∼0.09%/°C, and gain-bandwidth product reduces ∼0.24%/°C with temperature increased from 30 °C to 90 °C. Additionally, it maintains superior performance with low breakdown voltage of ∼10 V, high multiplication gain of >15, high bandwidth of ∼24.6 GHz, high gain-bandwidth product of >240 GHz, high internal quantum efficiency of ∼100%, and clear eye diagrams with 64 Gbps PAM4 modulation at 90 °C.
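To put the temperature coefficients quoted above in perspective, the sketch below accumulates each one over the stated 30 °C to 90 °C span. It assumes the drift is linear in temperature and that the percentage rates are relative to the 30 °C value; neither assumption is stated in the abstract, and the helper name `linear_drift` is hypothetical.

```python
def linear_drift(rate_per_degc: float, t_start_c: float, t_end_c: float) -> float:
    """Total change for a quantity drifting at a constant rate per degree C."""
    return rate_per_degc * (t_end_c - t_start_c)


# Coefficients as quoted in the abstract; signs follow "increases"/"reduces".
T_START_C, T_END_C = 30.0, 90.0
drifts = {
    "breakdown_voltage_mV": linear_drift(4.2, T_START_C, T_END_C),
    "bandwidth_percent": -linear_drift(0.09, T_START_C, T_END_C),
    "gain_bandwidth_percent": -linear_drift(0.24, T_START_C, T_END_C),
}

if __name__ == "__main__":
    for name, value in drifts.items():
        print(f"{name}: {value:+.2f}")
```

Under these assumptions the breakdown voltage shifts by about a quarter of a volt across the 60 °C span, small next to the ∼10 V breakdown voltage reported.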
