
Showing papers by "IBM" published in 2020


Journal ArticleDOI
TL;DR: CP2K as discussed by the authors is an open source electronic structure and molecular dynamics software package to perform atomistic simulations of solid-state, liquid, molecular, and biological systems, especially aimed at massively parallel and linear-scaling electronic structure methods and state-of-the-art ab initio molecular dynamics simulations.
Abstract: CP2K is an open source electronic structure and molecular dynamics software package to perform atomistic simulations of solid-state, liquid, molecular, and biological systems. It is especially aimed at massively parallel and linear-scaling electronic structure methods and state-of-the-art ab initio molecular dynamics simulations. Excellent performance for electronic structure calculations is achieved using novel algorithms implemented for modern high-performance computing systems. This review revisits the main capabilities of CP2K to perform efficient and accurate electronic structure simulations. The emphasis is put on density functional theory and multiple post–Hartree–Fock methods using the Gaussian and plane wave approach and its augmented all-electron extension.

938 citations


Journal ArticleDOI
TL;DR: This Review provides an overview of memory devices and the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing.
Abstract: Traditional von Neumann computing systems involve separate processing and memory units. However, data movement is costly in terms of time and energy and this problem is aggravated by the recent explosive growth in highly data-centric applications related to artificial intelligence. This calls for a radical departure from the traditional systems and one such non-von Neumann computational approach is in-memory computing. Hereby certain computational tasks are performed in place in the memory itself by exploiting the physical attributes of the memory devices. Both charge-based and resistance-based memory devices are being explored for in-memory computing. In this Review, we provide a broad overview of the key computational primitives enabled by these memory devices as well as their applications spanning scientific computing, signal processing, optimization, machine learning, deep learning and stochastic computing. This Review provides an overview of memory devices and the key computational primitives for in-memory computing, and examines the possibilities of applying this computing approach to a wide range of applications.
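The core primitive behind much of this work, an analog matrix-vector multiply performed inside a memristive crossbar, can be illustrated in a few lines; the sketch below is illustrative only, and the conductance range and noise level are assumptions rather than figures from the Review.

```python
import numpy as np

# Illustrative sketch of an analog matrix-vector multiply in a memristive
# crossbar: matrix entries are stored as device conductances G (Siemens),
# the input vector is applied as read voltages V, and Ohm's/Kirchhoff's laws
# give output currents I = G @ V in place, without moving the matrix.

rng = np.random.default_rng(0)

W = rng.standard_normal((4, 8))                        # weights to be stored
g_min, g_max = 1e-6, 1e-4                              # assumed conductance range (S)
G = np.interp(W, (W.min(), W.max()), (g_min, g_max))   # map weights to conductances

V = rng.uniform(0.0, 0.2, size=8)                      # inputs encoded as voltages (V)

I_ideal = G @ V                                        # ideal column currents
I_noisy = I_ideal + rng.normal(0, 0.01 * I_ideal.std(), I_ideal.shape)  # device noise

print("relative error:", np.linalg.norm(I_noisy - I_ideal) / np.linalg.norm(I_ideal))
```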

841 citations


Journal ArticleDOI
TL;DR: This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras.
Abstract: Event cameras are bio-inspired sensors that differ from conventional frame cameras: Instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes, and output a stream of events that encode the time, location and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (in the order of μs), very high dynamic range (140dB vs. 60dB), low power consumption, and high pixel bandwidth (on the order of kHz) resulting in reduced motion blur. Hence, event cameras have a large potential for robotics and computer vision in challenging scenarios for traditional cameras, such as low-latency, high speed, and high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available and the tasks that they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
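As a toy illustration of the kind of pre-processing such methods start from, the sketch below accumulates a raw event stream of (t, x, y, polarity) tuples into a signed event-count image, one common baseline representation; the array layout and field names are assumptions for illustration, not notation from the survey.

```python
import numpy as np

def events_to_count_image(events, height, width):
    """Accumulate events into a signed per-pixel count image.

    events: array of shape (N, 4) with columns (t, x, y, polarity),
            polarity in {-1, +1}. A simple baseline representation;
            real pipelines often use time surfaces or voxel grids instead.
    """
    img = np.zeros((height, width), dtype=np.float32)
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    np.add.at(img, (y, x), p)   # signed accumulation, handles repeated pixels
    return img

# Tiny synthetic stream: five events at three pixels
ev = np.array([[0.001, 2, 1, +1],
               [0.002, 2, 1, +1],
               [0.003, 5, 3, -1],
               [0.004, 0, 0, +1],
               [0.005, 5, 3, -1]])
print(events_to_count_image(ev, height=4, width=8))
```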

697 citations


Journal ArticleDOI
TL;DR: This review revisits the main capabilities of CP2K to perform efficient and accurate electronic structure simulations and puts the emphasis on density functional theory and multiple post-Hartree-Fock methods using the Gaussian and plane wave approach and its augmented all-electron extension.
Abstract: CP2K is an open source electronic structure and molecular dynamics software package to perform atomistic simulations of solid-state, liquid, molecular and biological systems. It is especially aimed at massively-parallel and linear-scaling electronic structure methods and state-of-the-art ab-initio molecular dynamics simulations. Excellent performance for electronic structure calculations is achieved using novel algorithms implemented for modern high-performance computing systems. This review revisits the main capabilities of CP2K to perform efficient and accurate electronic structure simulations. The emphasis is put on density functional theory and multiple post-Hartree-Fock methods using the Gaussian and plane wave approach and its augmented all-electron extension.

632 citations


Journal ArticleDOI
TL;DR: This Review surveys the four physical mechanisms that lead to resistive switching, discusses how resistive switching materials (RSMs) enable novel in-memory information processing that may resolve the von Neumann bottleneck, and examines the device requirements for systems based on RSMs.
Abstract: The rapid increase in information in the big-data era calls for changes to information-processing paradigms, which, in turn, demand new circuit-building blocks to overcome the decreasing cost-effectiveness of transistor scaling and the intrinsic inefficiency of using transistors in non-von Neumann computing architectures. Accordingly, resistive switching materials (RSMs) based on different physical principles have emerged for memories that could enable energy-efficient and area-efficient in-memory computing. In this Review, we survey the four physical mechanisms that lead to such resistive switching: redox reactions, phase transitions, spin-polarized tunnelling and ferroelectric polarization. We discuss how these mechanisms equip RSMs with desirable properties for representation capability, switching speed and energy, reliability and device density. These properties are the key enablers of processing-in-memory platforms, with applications ranging from neuromorphic computing and general-purpose memcomputing to cybersecurity. Finally, we examine the device requirements for such systems based on RSMs and provide suggestions to address challenges in materials engineering, device optimization, system integration and algorithm design. Resistive switching materials enable novel, in-memory information processing, which may resolve the von Neumann bottleneck. This Review focuses on how the switching mechanisms and the resultant electrical properties lead to various computing applications.

564 citations


Journal ArticleDOI
TL;DR: The nature of reversible and irreversible questions is discussed, that is, questions that may enable one to identify the nature of the source of their answers, and GPT-3, a third-generation, autoregressive language model that uses deep learning to produce human-like texts, is introduced.
Abstract: In this commentary, we discuss the nature of reversible and irreversible questions, that is, questions that may enable one to identify the nature of the source of their answers. We then introduce GPT-3, a third-generation, autoregressive language model that uses deep learning to produce human-like texts, and use the previous distinction to analyse it. We expand the analysis to present three tests based on mathematical, semantic (that is, the Turing Test), and ethical questions and show that GPT-3 is not designed to pass any of them. This is a reminder that GPT-3 does not do what it is not supposed to do, and that any interpretation of GPT-3 as the beginning of the emergence of a general form of artificial intelligence is merely uninformed science fiction. We conclude by outlining some of the significant consequences of the industrialisation of automatic and cheap production of good, semantic artefacts.

529 citations


Journal ArticleDOI
TL;DR: It is found that several variants of the SARS-CoV-2 genome exist and that the D614G clade has become the most common variant since December 2019; the evolutionary analysis indicated structured transmission, with the possibility of multiple introductions into the population.
Abstract: Objective To analyse genome variants of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Methods Between 1 February and 1 May 2020, we downloaded 10 022 SARS CoV-2 genomes from four databases. The genomes were from infected patients in 68 countries. We identified variants by extracting pairwise alignment to the reference genome NC_045512, using the EMBOSS needle. Nucleotide variants in the coding regions were converted to corresponding encoded amino acid residues. For clade analysis, we used the open source software Bayesian evolutionary analysis by sampling trees, version 2.5. Findings We identified 5775 distinct genome variants, including 2969 missense mutations, 1965 synonymous mutations, 484 mutations in the non-coding regions, 142 non-coding deletions, 100 in-frame deletions, 66 non-coding insertions, 36 stop-gained variants, 11 frameshift deletions and two in-frame insertions. The most common variants were the synonymous 3037C > T (6334 samples), P4715L in the open reading frame 1ab (6319 samples) and D614G in the spike protein (6294 samples). We identified six major clades, (that is, basal, D614G, L84S, L3606F, D448del and G392D) and 14 subclades. Regarding the base changes, the C > T mutation was the most common with 1670 distinct variants. Conclusion We found that several variants of the SARS-CoV-2 genome exist and that the D614G clade has become the most common variant since December 2019. The evolutionary analysis indicated structured transmission, with the possibility of multiple introductions into the population.
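The variant-calling step described above can be illustrated with a small pure-Python scanner over an already gapped pairwise alignment; this is a simplified sketch only (the study aligned full genomes against NC_045512 with the EMBOSS needle tool), and the toy sequences below are made up.

```python
def call_variants(ref_aln, qry_aln):
    """List substitutions, insertions and deletions from one gapped pairwise
    alignment (reference vs. query, equal length, '-' for gaps).

    Simplified illustration of the variant-calling step described in the
    abstract; it does not reproduce the EMBOSS needle alignment itself.
    """
    variants, ref_pos = [], 0
    for r, q in zip(ref_aln, qry_aln):
        if r != "-":
            ref_pos += 1                      # 1-based position on the reference
        if r == q:
            continue
        if r == "-":
            variants.append(("insertion", ref_pos, q))
        elif q == "-":
            variants.append(("deletion", ref_pos, r))
        else:
            variants.append(("substitution", ref_pos, f"{r}>{q}"))
    return variants

# Toy example (not SARS-CoV-2 sequence): one substitution, one insertion, one deletion
print(call_variants("ACGT-ACGTA", "ACTTAAC-TA"))
```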

442 citations


Journal ArticleDOI
24 Aug 2020
TL;DR: In this paper, the authors review the basic physical principles behind these various techniques for engineering quasi-particle and optical bandgaps, their bandgap tunability, and their potential and limitations for practical realization in future 2D device technologies.
Abstract: Semiconductors are the basis of many vital technologies such as electronics, computing, communications, optoelectronics, and sensing. Modern semiconductor technology can trace its origins to the invention of the point contact transistor in 1947. This demonstration paved the way for the development of discrete and integrated semiconductor devices and circuits that has helped to build a modern society where semiconductors are ubiquitous components of everyday life. A key property that determines the semiconductor electrical and optical properties is the bandgap. Beyond graphene, recently discovered two-dimensional (2D) materials possess semiconducting bandgaps ranging from the terahertz and mid-infrared in bilayer graphene and black phosphorus, visible in transition metal dichalcogenides, to the ultraviolet in hexagonal boron nitride. In particular, these 2D materials were demonstrated to exhibit highly tunable bandgaps, achieved via the control of layers number, heterostructuring, strain engineering, chemical doping, alloying, intercalation, substrate engineering, as well as an external electric field. We provide a review of the basic physical principles of these various techniques on the engineering of quasi-particle and optical bandgaps, their bandgap tunability, potentials and limitations in practical realization in future 2D device technologies.

434 citations


Journal ArticleDOI
TL;DR: The Moments in Time dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds, can serve as a new challenge to develop models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis.
Abstract: We present the Moments in Time Dataset, a large-scale human-annotated collection of one million short videos corresponding to dynamic events unfolding within three seconds. Modeling the spatial-audio-temporal dynamics even for actions occurring in 3 second videos poses many challenges: meaningful events do not include only people, but also objects, animals, and natural phenomena; visual and auditory events can be symmetrical in time (“opening” is “closing” in reverse), and either transient or sustained. We describe the annotation process of our dataset (each video is tagged with one action or activity label among 339 different classes), analyze its scale and diversity in comparison to other large-scale video datasets for action recognition, and report results of several baseline models addressing separately, and jointly, three modalities: spatial, temporal and auditory. The Moments in Time dataset, designed to have a large coverage and diversity of events in both visual and auditory modalities, can serve as a new challenge to develop models that scale to the level of complexity and abstract reasoning that a human processes on a daily basis.

416 citations


Proceedings Article
30 Apr 2020
TL;DR: This work proposes the Federated Matched Averaging (FedMA) algorithm, designed for federated learning of modern neural network architectures, e.g. convolutional neural networks (CNNs) and LSTMs, and indicates that FedMA outperforms popular state-of-the-art federated learning algorithms on deep CNN and LSTM architectures trained on real-world datasets, while improving the communication efficiency.
Abstract: Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud. We propose Federated matched averaging (FedMA) algorithm designed for federated learning of modern neural network architectures e.g. convolutional neural networks (CNNs) and LSTMs. FedMA constructs the shared global model in a layer-wise manner by matching and averaging hidden elements (i.e. channels for convolution layers; hidden states for LSTM; neurons for fully connected layers) with similar feature extraction signatures. Our experiments indicate that FedMA outperforms popular state-of-the-art federated learning algorithms on deep CNN and LSTM architectures trained on real world datasets, while improving the communication efficiency.
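A heavily simplified sketch of the matched-averaging idea for a single fully connected layer is shown below; it uses plain Hungarian matching on distances between neuron weight vectors, whereas FedMA itself relies on a Bayesian nonparametric matching formulation and proceeds layer by layer across many clients.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def matched_average(W_a, W_b):
    """Average one fully connected layer from two clients after matching
    neurons (rows) with similar weight vectors.

    Simplified stand-in for matched averaging: Hungarian matching on
    Euclidean distances between the rows of the two weight matrices.
    """
    # cost[i, j] = distance between neuron i of client A and neuron j of client B
    cost = np.linalg.norm(W_a[:, None, :] - W_b[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)       # optimal one-to-one matching
    W_b_perm = W_b[cols]                           # permute client B to align with A
    return 0.5 * (W_a[rows] + W_b_perm)            # average matched neurons

rng = np.random.default_rng(0)
W_true = rng.standard_normal((16, 32))
perm = rng.permutation(16)
W_a = W_true + 0.01 * rng.standard_normal((16, 32))
W_b = W_true[perm] + 0.01 * rng.standard_normal((16, 32))  # same neurons, permuted

W_avg = matched_average(W_a, W_b)
print("max deviation from the underlying weights:", np.abs(W_avg - W_true).max())
```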

402 citations


Proceedings Article
30 Apr 2020
TL;DR: In this paper, the authors propose to train a once-for-all network (OFA) that supports diverse architectural settings (depth, width, kernel size, and resolution) given a deployment scenario, and then select a specialized subnetwork by selecting from the OFA network without additional training.
Abstract: We address the challenging problem of efficient deep learning model deployment, where the goal is to design neural network architectures that can fit different hardware platform constraints. Most of the traditional approaches either manually design or use Neural Architecture Search (NAS) to find a specialized neural network and train it from scratch for each case, which is computationally expensive and unscalable. Our key idea is to decouple model training from architecture search to save the cost. To this end, we propose to train a once-for-all network (OFA) that supports diverse architectural settings (depth, width, kernel size, and resolution). Given a deployment scenario, we can then quickly get a specialized sub-network by selecting from the OFA network without additional training. To prevent interference between many sub-networks during training, we also propose a novel progressive shrinking algorithm, which can train a surprisingly large number of sub-networks (> 10^{19}) simultaneously, while maintaining the same accuracy as independently trained networks. Extensive experiments on various hardware platforms (CPU, GPU, mCPU, mGPU, FPGA accelerator) show that OFA consistently achieves the same level (or better) ImageNet accuracy than SOTA NAS methods while reducing orders of magnitude GPU hours and CO_2 emission than NAS. In particular, OFA requires 16x fewer GPU hours than ProxylessNAS, 19x fewer GPU hours than FBNet and 1,300x fewer GPU hours than MnasNet under 40 deployment scenarios.
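The "select, don't retrain" step can be pictured with a toy enumeration over sub-network configurations under a latency budget; the elastic dimensions, latency model, and accuracy proxy below are invented stand-ins, since OFA uses trained accuracy and latency predictors over a much larger space.

```python
import itertools
import random

# Toy sketch: enumerate sub-network configurations of an already-trained
# once-for-all network and pick one that meets a deployment constraint.
DEPTHS      = [2, 3, 4]
WIDTHS      = [0.75, 1.0, 1.25]       # width multipliers
KERNELS     = [3, 5, 7]
RESOLUTIONS = [128, 160, 192, 224]

def fake_latency_ms(d, w, k, r):      # stand-in for a per-device latency table
    return 0.4 * d * w * (k / 3) * (r / 128) ** 2

def fake_accuracy(d, w, k, r):        # stand-in for an accuracy predictor
    return 60 + 4 * d + 8 * w + 0.5 * k + 0.03 * r + random.uniform(-0.2, 0.2)

def select_subnet(latency_budget_ms):
    best = None
    for d, w, k, r in itertools.product(DEPTHS, WIDTHS, KERNELS, RESOLUTIONS):
        if fake_latency_ms(d, w, k, r) > latency_budget_ms:
            continue
        cand = (fake_accuracy(d, w, k, r), d, w, k, r)
        best = cand if best is None else max(best, cand)
    return best

random.seed(0)
print("best (accuracy, depth, width, kernel, resolution) under a 3 ms budget:",
      select_subnet(3.0))
```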

Proceedings Article
30 Apr 2020
TL;DR: This paper proposes a new defense algorithm called MART, which explicitly differentiates the misclassified and correctly classified examples during the training, and shows that MART and its variant could significantly improve the state-of-the-art adversarial robustness.
Abstract: Deep neural networks (DNNs) are vulnerable to adversarial examples crafted by imperceptible perturbations. A range of defense techniques have been proposed to improve DNN robustness to adversarial examples, among which adversarial training has been demonstrated to be the most effective. Adversarial training is often formulated as a min-max optimization problem, with the inner maximization for generating adversarial examples. However, there exists a simple, yet easily overlooked fact that adversarial examples are only defined on correctly classified (natural) examples, but inevitably, some (natural) examples will be misclassified during training. In this paper, we investigate the distinctive influence of misclassified and correctly classified examples on the final robustness of adversarial training. Specifically, we find that misclassified examples indeed have a significant impact on the final robustness. More surprisingly, we find that different maximization techniques on misclassified examples may have a negligible influence on the final robustness, while different minimization techniques are crucial. Motivated by the above discovery, we propose a new defense algorithm called {\em Misclassification Aware adveRsarial Training} (MART), which explicitly differentiates the misclassified and correctly classified examples during the training. We also propose a semi-supervised extension of MART, which can leverage the unlabeled data to further improve the robustness. Experimental results show that MART and its variant could significantly improve the state-of-the-art adversarial robustness.
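A rough PyTorch sketch of a misclassification-aware loss in the spirit of MART is given below: a boosted cross-entropy on adversarial examples plus a KL term that is up-weighted for examples the model misclassifies on clean inputs. The exact terms follow the authors' released code and may differ from this reading.

```python
import torch
import torch.nn.functional as F

def mart_style_loss(logits_adv, logits_nat, y, beta=6.0):
    """Misclassification-aware adversarial loss, sketched in the spirit of MART."""
    p_adv = F.softmax(logits_adv, dim=1)
    p_nat = F.softmax(logits_nat, dim=1)

    # Boosted CE: standard CE plus a margin term on the strongest wrong class.
    ce = F.cross_entropy(logits_adv, y)
    wrong = p_adv.clone()
    wrong.scatter_(1, y.unsqueeze(1), 0.0)                 # zero out the true class
    margin = -torch.log(1.0 - wrong.max(dim=1).values + 1e-12).mean()

    # KL(p_nat || p_adv), re-weighted by how badly the clean example is classified.
    kl = (p_nat * (torch.log(p_nat + 1e-12) - torch.log(p_adv + 1e-12))).sum(dim=1)
    true_prob_nat = p_nat.gather(1, y.unsqueeze(1)).squeeze(1)
    robust = (kl * (1.0 - true_prob_nat)).mean()

    return ce + margin + beta * robust

# Toy usage with random logits for a 10-class problem
logits_nat = torch.randn(8, 10)
logits_adv = logits_nat + 0.3 * torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
print(mart_style_loss(logits_adv, logits_nat, y))
```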

Journal ArticleDOI
TL;DR: PySCF as mentioned in this paper is a Python-based general-purpose electronic structure platform that supports first-principles simulations of molecules and solids as well as accelerates the development of new methodology and complex computational workflows.
Abstract: PySCF is a Python-based general-purpose electronic structure platform that supports first-principles simulations of molecules and solids as well as accelerates the development of new methodology and complex computational workflows. This paper explains the design and philosophy behind PySCF that enables it to meet these twin objectives. With several case studies, we show how users can easily implement their own methods using PySCF as a development environment. We then summarize the capabilities of PySCF for molecular and solid-state simulations. Finally, we describe the growing ecosystem of projects that use PySCF across the domains of quantum chemistry, materials science, machine learning, and quantum information science.
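A minimal example of the kind of workflow PySCF supports, a restricted Hartree-Fock calculation followed by an MP2 correction, is sketched below; the geometry and basis set are arbitrary choices for illustration.

```python
from pyscf import gto, scf, mp

# Minimal PySCF usage: build a molecule, run restricted Hartree-Fock,
# then compute an MP2 correlation correction on top of it.
mol = gto.M(
    atom="O 0 0 0; H 0 0.757 0.587; H 0 -0.757 0.587",
    basis="cc-pvdz",
)
mf = scf.RHF(mol)
e_hf = mf.kernel()                 # SCF total energy (Hartree)

e_corr, _ = mp.MP2(mf).kernel()    # MP2 correlation energy and t2 amplitudes
print("E(HF) =", e_hf, " E(MP2 corr) =", e_corr)
```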

Proceedings ArticleDOI
21 Apr 2020
TL;DR: An algorithm-informed XAI question bank is developed in which user needs for explainability are represented as prototypical questions users might ask about the AI, and used as a study probe to identify gaps between current XAI algorithmic work and practices to create explainable AI products.
Abstract: A surge of interest in explainable AI (XAI) has led to a vast collection of algorithmic work on the topic. While many recognize the necessity to incorporate explainability features in AI systems, how to address real-world user needs for understanding AI remains an open question. By interviewing 20 UX and design practitioners working on various AI products, we seek to identify gaps between the current XAI algorithmic work and practices to create explainable AI products. To do so, we develop an algorithm-informed XAI question bank in which user needs for explainability are represented as prototypical questions users might ask about the AI, and use it as a study probe. Our work contributes insights into the design space of XAI, informs efforts to support design practices in this space, and identifies opportunities for future XAI work. We also provide an extended XAI question bank and discuss how it can be used for creating user-centered XAI.

Journal ArticleDOI
TL;DR: A systematic evaluation of the value of AI-based decision support in skin tumor diagnosis demonstrates the superiority of human–computer collaboration over each individual approach and supports the potential of automated approaches in diagnostic medicine.
Abstract: The rapid increase in telemedicine coupled with recent advances in diagnostic artificial intelligence (AI) create the imperative to consider the opportunities and risks of inserting AI-based support into new paradigms of care. Here we build on recent achievements in the accuracy of image-based AI for skin cancer diagnosis to address the effects of varied representations of AI-based support across different levels of clinical expertise and multiple clinical workflows. We find that good quality AI-based support of clinical decision-making improves diagnostic accuracy over that of either AI or physicians alone, and that the least experienced clinicians gain the most from AI-based support. We further find that AI-based multiclass probabilities outperformed content-based image retrieval (CBIR) representations of AI in the mobile technology environment, and AI-based support had utility in simulations of second opinions and of telemedicine triage. In addition to demonstrating the potential benefits associated with good quality AI in the hands of non-expert clinicians, we find that faulty AI can mislead the entire spectrum of clinicians, including experts. Lastly, we show that insights derived from AI class-activation maps can inform improvements in human diagnosis. Together, our approach and findings offer a framework for future studies across the spectrum of image-based diagnostics to improve human-computer collaboration in clinical practice.

Journal ArticleDOI
03 Apr 2020
TL;DR: In this article, the authors proposed EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings.
Abstract: Graph representation learning resurges as a trending research subject owing to the widespread use of deep learning for Euclidean data, which inspires various creative designs of neural networks in the non-Euclidean domain, particularly graphs. With the success of these graph neural networks (GNN) in the static setting, we approach further practical scenarios where the graph dynamically evolves. Existing approaches typically resort to node embeddings and use a recurrent neural network (RNN, broadly speaking) to regulate the embeddings and learn the temporal dynamics. These methods require the knowledge of a node in the full time span (including both training and testing) and are less applicable to the frequent change of the node set. In some extreme scenarios, the node sets at different time steps may completely differ. To resolve this challenge, we propose EvolveGCN, which adapts the graph convolutional network (GCN) model along the temporal dimension without resorting to node embeddings. The proposed approach captures the dynamism of the graph sequence through using an RNN to evolve the GCN parameters. Two architectures are considered for the parameter evolution. We evaluate the proposed approach on tasks including link prediction, edge classification, and node classification. The experimental results indicate a generally higher performance of EvolveGCN compared with related approaches. The code is available at https://github.com/IBM/EvolveGCN.
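A simplified PyTorch sketch of the core idea, a recurrent cell evolving the GCN weight matrix across graph snapshots (loosely following the "-O" variant), is shown below; the released implementation at the repository above differs in detail.

```python
import torch
import torch.nn as nn

class EvolvingGCNLayer(nn.Module):
    """Simplified sketch of the EvolveGCN idea: a recurrent cell updates the
    GCN weight matrix from snapshot to snapshot, so no per-node embeddings
    need to be carried through time.
    """

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W0 = nn.Parameter(0.1 * torch.randn(in_dim, out_dim))
        # Each of the in_dim rows of W is treated as one element of a batch.
        self.cell = nn.GRUCell(input_size=out_dim, hidden_size=out_dim)

    def forward(self, snapshots):
        """snapshots: list of (A_hat, X) pairs, one per time step, where A_hat
        is the normalized adjacency (n x n) and X the node features (n x in_dim)."""
        W = self.W0
        outputs = []
        for A_hat, X in snapshots:
            W = self.cell(W, W)                       # evolve the weights recurrently
            outputs.append(torch.relu(A_hat @ X @ W)) # standard GCN propagation
        return outputs

# Toy usage: three random graph snapshots with 5 nodes and 8 features
n, f = 5, 8
snaps = [(torch.eye(n) + 0.1 * torch.rand(n, n), torch.randn(n, f)) for _ in range(3)]
layer = EvolvingGCNLayer(in_dim=f, out_dim=4)
print([h.shape for h in layer(snaps)])
```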

Journal ArticleDOI
TL;DR: In this review, a detailed snapshot of current progress in quantum algorithms for ground-state, dynamics, and thermal-state simulation is taken and their strengths and weaknesses for future developments are analyzed.
Abstract: As we begin to reach the limits of classical computing, quantum computing has emerged as a technology that has captured the imagination of the scientific world. While for many years, the ability to execute quantum algorithms was only a theoretical possibility, recent advances in hardware mean that quantum computing devices now exist that can carry out quantum computation on a limited scale. Thus, it is now a real possibility, and of central importance at this time, to assess the potential impact of quantum computers on real problems of interest. One of the earliest and most compelling applications for quantum computers is Feynman's idea of simulating quantum systems with many degrees of freedom. Such systems are found across chemistry, physics, and materials science. The particular way in which quantum computing extends classical computing means that one cannot expect arbitrary simulations to be sped up by a quantum computer, thus one must carefully identify areas where quantum advantage may be achieved. In this review, we briefly describe central problems in chemistry and materials science, in areas of electronic structure, quantum statistical mechanics, and quantum dynamics that are of potential interest for solution on a quantum computer. We then take a detailed snapshot of current progress in quantum algorithms for ground-state, dynamics, and thermal-state simulation and analyze their strengths and weaknesses for future developments.

Proceedings Article
30 Apr 2020
TL;DR: The distributed backdoor attack (DBA) is proposed: a novel threat assessment framework developed by fully exploiting the distributed nature of FL, which can evade two state-of-the-art robust FL algorithms designed against centralized backdoors.
Abstract: Backdoor attacks aim to manipulate a subset of training data by injecting adversarial triggers such that machine learning models trained on the tampered dataset will make arbitrarily (targeted) incorrect prediction on the testset with the same trigger embedded. While federated learning (FL) is capable of aggregating information provided by different parties for training a better model, its distributed learning methodology and inherently heterogeneous data distribution across parties may bring new vulnerabilities. In addition to recent centralized backdoor attacks on FL where each party embeds the same global trigger during training, we propose the distributed backdoor attack (DBA) --- a novel threat assessment framework developed by fully exploiting the distributed nature of FL. DBA decomposes a global trigger pattern into separate local patterns and embeds them into the training set of different adversarial parties respectively. Compared to standard centralized backdoors, we show that DBA is substantially more persistent and stealthy against FL on diverse datasets such as finance and image data. We conduct extensive experiments to show that the attack success rate of DBA is significantly higher than centralized backdoors under different settings. Moreover, we find that distributed attacks are indeed more insidious, as DBA can evade two state-of-the-art robust FL algorithms against centralized backdoors. We also provide explanations for the effectiveness of DBA via feature visual interpretation and feature importance ranking. To further explore the properties of DBA, we test the attack performance by varying different trigger factors, including local trigger variations (size, gap, and location), scaling factor in FL, data distribution, and poison ratio and interval. Our proposed DBA and thorough evaluation results shed light on characterizing the robustness of FL.
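A toy NumPy sketch of the trigger-decomposition idea on image data is given below: a global pixel trigger is split into local sub-patterns, and each adversarial party embeds only its own piece into a fraction of its training data. Patch size, location, target label and poison ratio are illustrative, not the paper's settings.

```python
import numpy as np

# Global trigger: a bright 2x4 patch in the image corner (pixel coordinates).
GLOBAL_TRIGGER = [(0, 0), (0, 1), (0, 2), (0, 3),
                  (1, 0), (1, 1), (1, 2), (1, 3)]

def split_trigger(pixels, num_parties):
    """Assign a disjoint slice of the global trigger to each adversarial party."""
    return [pixels[i::num_parties] for i in range(num_parties)]

def poison(images, labels, local_pixels, target_label, ratio=0.1, rng=None):
    """Embed one party's local trigger into a fraction of its training images
    and flip those labels to the attacker's target class."""
    rng = rng or np.random.default_rng(0)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(ratio * len(images)), replace=False)
    for r, c in local_pixels:
        images[idx, r, c] = 1.0          # max-intensity trigger pixel
    labels[idx] = target_label
    return images, labels

rng = np.random.default_rng(0)
local_patterns = split_trigger(GLOBAL_TRIGGER, num_parties=4)
imgs = rng.uniform(size=(100, 28, 28)).astype(np.float32)
lbls = rng.integers(0, 10, size=100)
p_imgs, p_lbls = poison(imgs, lbls, local_patterns[0], target_label=7, rng=rng)
print("labels flipped to target class:", int((p_lbls != lbls).sum()))
```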

Journal ArticleDOI
TL;DR: The field of magnetic skyrmions has been actively investigated across a wide range of topics during the last decades, as discussed by the authors, with potential applications including information storage, logic computing gates and non-conventional devices such as neuromorphic computing devices.
Abstract: The field of magnetic skyrmions has been actively investigated across a wide range of topics during the last decades. In this topical review, we mainly review and discuss key results and findings in skyrmion research since the first experimental observation of magnetic skyrmions in 2009. We particularly focus on the theoretical, computational and experimental findings and advances that are directly relevant to the spintronic applications based on magnetic skyrmions, i.e. their writing, deleting, reading and processing driven by magnetic field, electric current and thermal energy. We then review several potential applications including information storage, logic computing gates and non-conventional devices such as neuromorphic computing devices. Finally, we discuss possible future research directions on magnetic skyrmions, which also cover rich topics on other topological textures such as antiskyrmions and bimerons in antiferromagnets and frustrated magnets.

Proceedings Article
30 Apr 2020
TL;DR: This work introduces a novel means to estimate and scale the task loss gradient at each weight and activation layer's quantizer step size, such that it can be learned in conjunction with other network parameters.
Abstract: Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases. Here, we present a method for training such networks, Learned Step Size Quantization, that achieves the highest accuracy to date on the ImageNet dataset when using models, from a variety of architectures, with weights and activations quantized to 2-, 3- or 4-bits of precision, and that can train 3-bit models that reach full precision baseline accuracy. Our approach builds upon existing methods for learning weights in quantized networks by improving how the quantizer itself is configured. Specifically, we introduce a novel means to estimate and scale the task loss gradient at each weight and activation layer's quantizer step size, such that it can be learned in conjunction with other network parameters. This approach works using different levels of precision as needed for a given system and requires only a simple modification of existing training code.
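A PyTorch sketch of a learnable-step-size quantizer with the gradient-scaling and straight-through tricks described above is shown below; treat it as one reading of the method rather than the authors' implementation.

```python
import math
import torch
import torch.nn as nn

class LearnedStepQuantizer(nn.Module):
    """Sketch of a learned-step-size quantizer (in the spirit of LSQ).

    The step size s is a learnable parameter; its gradient is scaled by
    1/sqrt(N * Q_p), and a straight-through estimator passes gradients
    through the round operation. Details may differ from the paper's code.
    """

    def __init__(self, num_elements, bits=3, signed=True):
        super().__init__()
        if signed:
            self.q_n, self.q_p = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        else:
            self.q_n, self.q_p = 0, 2 ** bits - 1
        self.s = nn.Parameter(torch.tensor(0.1))
        self.grad_scale = 1.0 / math.sqrt(num_elements * self.q_p)

    def forward(self, v):
        # Scale the gradient of s without changing its value.
        s = self.s * self.grad_scale + (self.s - self.s * self.grad_scale).detach()
        v_q = torch.clamp(v / s, self.q_n, self.q_p)
        v_q = (v_q.round() - v_q).detach() + v_q     # straight-through round
        return v_q * s

# Toy usage: quantize a weight tensor to 3 bits and backprop into the step size
w = torch.randn(64, 32, requires_grad=True)
quant = LearnedStepQuantizer(num_elements=w.numel(), bits=3)
loss = (quant(w) ** 2).mean()
loss.backward()
print("grad wrt step size:", quant.s.grad.item())
```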

Proceedings ArticleDOI
27 Jan 2020
TL;DR: In this paper, the authors explore how organizations view and use explainability for stakeholder consumption and find that, currently, the majority of deployments are not for end users affected by the model but rather for machine learning engineers who use the explainability to debug the model itself.
Abstract: Explainable machine learning offers the potential to provide stakeholders with insights into model behavior by using various methods such as feature importance scores, counterfactual explanations, or influential training data. Yet there is little understanding of how organizations use these methods in practice. This study explores how organizations view and use explainability for stakeholder consumption. We find that, currently, the majority of deployments are not for end users affected by the model but rather for machine learning engineers, who use explainability to debug the model itself. There is thus a gap between explainability in practice and the goal of transparency, since explanations primarily serve internal stakeholders rather than external ones. Our study synthesizes the limitations of current explainability techniques that hamper their use for end users. To facilitate end user interaction, we develop a framework for establishing clear goals for explainability. We end by discussing concerns raised regarding explainability.

Proceedings ArticleDOI
27 Jan 2020
TL;DR: It is shown that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making, which may also depend on whether the human can bring in enough unique knowledge to complement the AI's errors.
Abstract: Today, AI is being increasingly used to help human experts make decisions in high-stakes scenarios. In these scenarios, full automation is often undesirable, not only due to the significance of the outcome, but also because human experts can draw on their domain knowledge complementary to the model's to ensure task success. We refer to these scenarios as AI-assisted decision making, where the individual strengths of the human and the AI come together to optimize the joint decision outcome. A key to their success is to appropriately calibrate human trust in the AI on a case-by-case basis; knowing when to trust or distrust the AI allows the human expert to appropriately apply their knowledge, improving decision outcomes in cases where the model is likely to perform poorly. This research conducts a case study of AI-assisted decision making in which humans and AI have comparable performance alone, and explores whether features that reveal case-specific model information can calibrate trust and improve the joint performance of the human and AI. Specifically, we study the effect of showing confidence score and local explanation for a particular prediction. Through two human experiments, we show that confidence score can help calibrate people's trust in an AI model, but trust calibration alone is not sufficient to improve AI-assisted decision making, which may also depend on whether the human can bring in enough unique knowledge to complement the AI's errors. We also highlight the problems in using local explanation for AI-assisted decision making scenarios and invite the research community to explore new approaches to explainability for calibrating human trust in AI.

Journal ArticleDOI
TL;DR: It is argued that the second enabling pillar towards this vision is the increasing power of computers and algorithms to learn, reason, and build the ‘digital twin’ of a patient.
Abstract: Providing therapies tailored to each patient is the vision of precision medicine, enabled by the increasing ability to capture extensive data about individual patients. In this position paper, we argue that the second enabling pillar towards this vision is the increasing power of computers and algorithms to learn, reason, and build the 'digital twin' of a patient. Computational models are boosting the capacity to draw diagnosis and prognosis, and future treatments will be tailored not only to current health status and data, but also to an accurate projection of the pathways to restore health by model predictions. The early steps of the digital twin in the area of cardiovascular medicine are reviewed in this article, together with a discussion of the challenges and opportunities ahead. We emphasize the synergies between mechanistic and statistical models in accelerating cardiovascular research and enabling the vision of precision medicine.

Journal ArticleDOI
01 Mar 2020
TL;DR: In this article, the accumulation and dissipation of magnetic skyrmions in ferrimagnetic multilayers can be controlled with electrical pulses to represent the variations in the synaptic weights.
Abstract: Magnetic skyrmions are topologically protected spin textures that have nanoscale dimensions and can be manipulated by an electric current. These properties make the structures potential information carriers in data storage, processing and transmission devices. However, the development of functional all-electrical electronic devices based on skyrmions remains challenging. Here we show that the current-induced creation, motion, detection and deletion of skyrmions at room temperature can be used to mimic the potentiation and depression behaviours of biological synapses. In particular, the accumulation and dissipation of magnetic skyrmions in ferrimagnetic multilayers can be controlled with electrical pulses to represent the variations in the synaptic weights. Using chip-level simulations, we demonstrate that such artificial synapses based on magnetic skyrmions could be used for neuromorphic computing tasks such as pattern recognition. For a handwritten pattern dataset, our system achieves a recognition accuracy of ~89%, which is comparable to the accuracy achieved with software-based ideal training (~93%). The electrical current-induced creation, motion, detection and deletion of skyrmions in ferrimagnetic multilayers can be used to mimic the behaviour of biological synapses, providing devices that could be used for neuromorphic computing tasks such as pattern recognition.

Proceedings ArticleDOI
01 Jul 2020
TL;DR: OneIE, a joint neural framework that aims to extract the globally optimal IE result as a graph from an input sentence, is proposed; it can be easily applied to new languages or trained in a multilingual manner, as OneIE does not use any language-specific features.
Abstract: Most existing joint neural models for Information Extraction (IE) use local task-specific classifiers to predict labels for individual instances (e.g., trigger, relation) regardless of their interactions. For example, a victim of a die event is likely to be a victim of an attack event in the same sentence. In order to capture such cross-subtask and cross-instance inter-dependencies, we propose a joint neural framework, OneIE, that aims to extract the globally optimal IE result as a graph from an input sentence. OneIE performs end-to-end IE in four stages: (1) Encoding a given sentence as contextualized word representations; (2) Identifying entity mentions and event triggers as nodes; (3) Computing label scores for all nodes and their pairwise links using local classifiers; (4) Searching for the globally optimal graph with a beam decoder. At the decoding stage, we incorporate global features to capture the cross-subtask and cross-instance interactions. Experiments show that adding global features improves the performance of our model and achieves new state-of-the-art on all subtasks. In addition, as OneIE does not use any language-specific feature, we prove it can be easily applied to new languages or trained in a multilingual manner.

Proceedings ArticleDOI
01 Nov 2020
TL;DR: A more rigorous annotation paradigm for NLP is proposed that helps to close systematic gaps in the test data: the dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets.
Abstract: Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture the abilities a dataset is intended to test. We propose a more rigorous annotation paradigm for NLP that helps to close systematic gaps in the test data. In particular, after a dataset is constructed, we recommend that the dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets. Contrast sets provide a local view of a model’s decision boundary, which can be used to more accurately evaluate a model’s true linguistic capabilities. We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets (e.g., DROP reading comprehension, UD parsing, and IMDb sentiment analysis). Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets—up to 25% in some cases. We release our contrast sets as new evaluation benchmarks and encourage future dataset construction efforts to follow similar annotation processes.

Proceedings ArticleDOI
17 Jul 2020
TL;DR: The SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval 2020) as mentioned in this paper included three subtasks corresponding to the hierarchical taxonomy of the OLID schema, and was offered in five languages: Arabic, Danish, English, Greek, and Turkish.
Abstract: We present the results and the main findings of SemEval-2020 Task 12 on Multilingual Offensive Language Identification in Social Media (OffensEval-2020). The task included three subtasks corresponding to the hierarchical taxonomy of the OLID schema from OffensEval-2019, and it was offered in five languages: Arabic, Danish, English, Greek, and Turkish. OffensEval-2020 was one of the most popular tasks at SemEval-2020, attracting a large number of participants across all subtasks and languages: a total of 528 teams signed up to participate in the task, 145 teams submitted official runs on the test data, and 70 teams submitted system description papers.

Journal ArticleDOI
TL;DR: It is proposed that accounting for polygenic background is likely to increase the accuracy of risk estimation for individuals who inherit a monogenic risk variant; in carriers of monogenic variants, the authors show that disease risk is a gradient influenced by polygenic background.
Abstract: Genetic variation can predispose to disease both through (i) monogenic risk variants that disrupt a physiologic pathway with large effect on disease and (ii) polygenic risk that involves many variants of small effect in different pathways. Few studies have explored the interplay between monogenic and polygenic risk. Here, we study 80,928 individuals to examine whether polygenic background can modify penetrance of disease in tier 1 genomic conditions - familial hypercholesterolemia, hereditary breast and ovarian cancer, and Lynch syndrome. Among carriers of a monogenic risk variant, we estimate substantial gradients in disease risk based on polygenic background - the probability of disease by age 75 years ranged from 17% to 78% for coronary artery disease, 13% to 76% for breast cancer, and 11% to 80% for colon cancer. We propose that accounting for polygenic background is likely to increase accuracy of risk estimation for individuals who inherit a monogenic risk variant.

Journal ArticleDOI
TL;DR: A theory is developed that shows signaling a firm’s fundamental quality to lenders through inventory transactions to be more efficient—it leads to less costly ope...
Abstract: We develop a theory that shows signaling a firm’s fundamental quality (e.g., its operational capabilities) to lenders through inventory transactions to be more efficient—it leads to less costly ope...

Journal ArticleDOI
03 Apr 2020
TL;DR: This work uses a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning and shows that LAMBADA improves classifiers' performance on a variety of datasets.
Abstract: Based on recent advances in natural language modeling and those in text generation capabilities, we propose a novel data augmentation method for text classification tasks. We use a powerful pre-trained neural network model to artificially synthesize new labeled data for supervised learning. We mainly focus on cases with scarce labeled data. Our method, referred to as language-model-based data augmentation (LAMBADA), involves fine-tuning a state-of-the-art language generator to a specific task through an initial training phase on the existing (usually small) labeled data. Using the fine-tuned model and given a class label, new sentences for the class are generated. Our process then filters these new sentences by using a classifier trained on the original data. In a series of experiments, we show that LAMBADA improves classifiers' performance on a variety of datasets. Moreover, LAMBADA significantly improves upon the state-of-the-art techniques for data augmentation, specifically those applicable to text classification tasks with little data.
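A sketch of the generate-then-filter loop is given below, using a Hugging Face text-generation pipeline as a stand-in for the fine-tuned generator and a TF-IDF plus logistic-regression classifier as a stand-in for the filtering classifier; the model name, prompt format, labels and thresholds are assumptions for illustration, not the paper's setup.

```python
from transformers import pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Small labeled seed set (stand-in for the scarce training data).
seed_texts = ["refund my order now", "great service, thanks a lot",
              "where is my package", "the support team was wonderful"]
seed_labels = ["complaint", "praise", "complaint", "praise"]

# Filtering classifier trained on the original data.
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(seed_texts, seed_labels)

# In LAMBADA the generator is a language model fine-tuned on label/text pairs;
# plain "gpt2" stands in for that fine-tuned model here.
generator = pipeline("text-generation", model="gpt2")

def augment(label, n_candidates=10, keep_top=3):
    prompt = f"{label}:"                     # assumed conditioning format
    outs = generator(prompt, max_new_tokens=20, num_return_sequences=n_candidates,
                     do_sample=True, pad_token_id=50256)
    texts = [o["generated_text"][len(prompt):].strip() for o in outs]
    # Keep the candidates the filter assigns to the intended label with the
    # highest confidence.
    probs = clf.predict_proba(texts)[:, list(clf.classes_).index(label)]
    ranked = sorted(zip(probs, texts), reverse=True)
    return [t for p, t in ranked[:keep_top]]

print(augment("complaint"))
```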