
Proceedings ArticleDOI
23 Apr 2018
TL;DR: This paper describes Fabric, its architecture, the rationale behind various design decisions, its most prominent implementation aspects, as well as its distributed application programming model, and shows that Fabric achieves end-to-end throughput of more than 3500 transactions per second in certain popular deployment configurations.
Abstract: Fabric is a modular and extensible open-source system for deploying and operating permissioned blockchains and one of the Hyperledger projects hosted by the Linux Foundation (www.hyperledger.org). Fabric is the first truly extensible blockchain system for running distributed applications. It supports modular consensus protocols, which allows the system to be tailored to particular use cases and trust models. Fabric is also the first blockchain system that runs distributed applications written in standard, general-purpose programming languages, without systemic dependency on a native cryptocurrency. This stands in sharp contrast to existing blockchain platforms that require "smart contracts" to be written in domain-specific languages or rely on a cryptocurrency. Fabric realizes the permissioned model using a portable notion of membership, which may be integrated with industry-standard identity management. To support such flexibility, Fabric introduces an entirely novel blockchain design and revamps the way blockchains cope with non-determinism, resource exhaustion, and performance attacks. This paper describes Fabric, its architecture, the rationale behind various design decisions, its most prominent implementation aspects, as well as its distributed application programming model. We further evaluate Fabric by implementing and benchmarking a Bitcoin-inspired digital currency. We show that Fabric achieves end-to-end throughput of more than 3500 transactions per second in certain popular deployment configurations, with sub-second latency, scaling well to over 100 peers.

2,813 citations


Journal ArticleDOI
22 Feb 2018-Nature
TL;DR: Tumours from a large cohort of patients with metastatic urothelial cancer who were treated with an anti-PD-L1 agent were examined and major determinants of clinical outcome were identified and suggested that TGFβ shapes the tumour microenvironment to restrain anti-tumour immunity by restricting T-cell infiltration.
Abstract: Therapeutic antibodies that block the programmed death-1 (PD-1)-programmed death-ligand 1 (PD-L1) pathway can induce robust and durable responses in patients with various cancers, including metastatic urothelial cancer. However, these responses only occur in a subset of patients. Elucidating the determinants of response and resistance is key to improving outcomes and developing new treatment strategies. Here we examined tumours from a large cohort of patients with metastatic urothelial cancer who were treated with an anti-PD-L1 agent (atezolizumab) and identified major determinants of clinical outcome. Response to treatment was associated with CD8+ T-effector cell phenotype and, to an even greater extent, high neoantigen or tumour mutation burden. Lack of response was associated with a signature of transforming growth factor β (TGFβ) signalling in fibroblasts. This occurred particularly in patients whose tumours showed exclusion of CD8+ T cells from the tumour parenchyma; these cells were instead found in the fibroblast- and collagen-rich peritumoural stroma, a common phenotype among patients with metastatic urothelial cancer. Using a mouse model that recapitulates this immune-excluded phenotype, we found that therapeutic co-administration of TGFβ-blocking and anti-PD-L1 antibodies reduced TGFβ signalling in stromal cells, facilitated T-cell penetration into the centre of tumours, and provoked vigorous anti-tumour immunity and tumour regression. Integration of these three independent biological features provides the best basis for understanding patient outcome in this setting and suggests that TGFβ shapes the tumour microenvironment to restrain anti-tumour immunity by restricting T-cell infiltration.

2,808 citations


Journal ArticleDOI
TL;DR: In this paper, the authors provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box decision support system. Given a problem definition, a black box type, and a desired explanation, this survey should help researchers find the proposals most useful for their own work.
Abstract: In recent years, many accurate decision support systems have been constructed as black boxes, that is as systems that hide their internal logic to the user. This lack of explanation constitutes both a practical and an ethical issue. The literature reports many approaches aimed at overcoming this crucial weakness, sometimes at the cost of sacrificing accuracy for interpretability. The applications in which black box decision systems can be used are various, and each approach is typically developed to provide a solution for a specific problem and, as a consequence, it explicitly or implicitly delineates its own definition of interpretability and explanation. The aim of this article is to provide a classification of the main problems addressed in the literature with respect to the notion of explanation and the type of black box system. Given a problem definition, a black box type, and a desired explanation, this survey should help the researcher to find the proposals more useful for his own work. The proposed classification of approaches to open black box models should also be useful for putting the many research open questions in perspective.

2,805 citations


Journal ArticleDOI
TL;DR: In this article, the international 14C calibration curves for both the Northern and Southern Hemispheres, as well as for the ocean surface layer, have been updated to include a wealth of new data and extended to 55,000 cal BP.
Abstract: Radiocarbon (14C) ages cannot provide absolutely dated chronologies for archaeological or paleoenvironmental studies directly but must be converted to calendar age equivalents using a calibration curve compensating for fluctuations in atmospheric 14C concentration. Although calibration curves are constructed from independently dated archives, they invariably require revision as new data become available and our understanding of the Earth system improves. In this volume the international 14C calibration curves for both the Northern and Southern Hemispheres, as well as for the ocean surface layer, have been updated to include a wealth of new data and extended to 55,000 cal BP. Based on tree rings, IntCal20 now extends as a fully atmospheric record to ca. 13,900 cal BP. For the older part of the timescale, IntCal20 comprises statistically integrated evidence from floating tree-ring chronologies, lacustrine and marine sediments, speleothems, and corals. We utilized improved evaluation of the timescales and location variable 14C offsets from the atmosphere (reservoir age, dead carbon fraction) for each dataset. New statistical methods have refined the structure of the calibration curves while maintaining a robust treatment of uncertainties in the 14C ages, the calendar ages and other corrections. The inclusion of modeled marine reservoir ages derived from a three-dimensional ocean circulation model has allowed us to apply more appropriate reservoir corrections to the marine 14C data rather than the previous use of constant regional offsets from the atmosphere. Here we provide an overview of the new and revised datasets and the associated methods used for the construction of the IntCal20 curve and explore potential regional offsets for tree-ring data. We discuss the main differences with respect to the previous calibration curve, IntCal13, and some of the implications for archaeology and geosciences ranging from the recent past to the time of the extinction of the Neanderthals.

2,800 citations


Journal ArticleDOI
05 Jan 2018-Science
TL;DR: Examination of the oral and gut microbiome of melanoma patients undergoing anti-programmed cell death 1 protein (PD-1) immunotherapy suggested enhanced systemic and antitumor immunity in responding patients with a favorable gut microbiome as well as in germ-free mice receiving fecal transplants from responding patients.
Abstract: Preclinical mouse models suggest that the gut microbiome modulates tumor response to checkpoint blockade immunotherapy; however, this has not been well-characterized in human cancer patients. Here we examined the oral and gut microbiome of melanoma patients undergoing anti-programmed cell death 1 protein (PD-1) immunotherapy (n = 112). Significant differences were observed in the diversity and composition of the patient gut microbiome of responders versus nonresponders. Analysis of patient fecal microbiome samples (n = 43, 30 responders, 13 nonresponders) showed significantly higher alpha diversity (P < 0.01) and relative abundance of bacteria of the Ruminococcaceae family (P < 0.01) in responding patients. Metagenomic studies revealed functional differences in gut bacteria in responders, including enrichment of anabolic pathways. Immune profiling suggested enhanced systemic and antitumor immunity in responding patients with a favorable gut microbiome as well as in germ-free mice receiving fecal transplants from responding patients. Together, these data have important implications for the treatment of melanoma patients with immune checkpoint inhibitors.

2,791 citations


Journal ArticleDOI
TL;DR: This paper reports the experimental discovery of a Weyl semimetal, tantalum arsenide (TaAs), using photoemission spectroscopy, finding that Fermi arcs terminate on the Weyl fermion nodes, consistent with their topological character.

2,789 citations


Proceedings Article
17 Feb 2017
TL;DR: The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks, but it becomes unwieldy when learning large datasets, so Mean Teacher, a method that averages model weights instead of label predictions, is proposed.
Abstract: The recently proposed Temporal Ensembling has achieved state-of-the-art results in several semi-supervised learning benchmarks. It maintains an exponential moving average of label predictions on each training example, and penalizes predictions that are inconsistent with this target. However, because the targets change only once per epoch, Temporal Ensembling becomes unwieldy when learning large datasets. To overcome this problem, we propose Mean Teacher, a method that averages model weights instead of label predictions. As an additional benefit, Mean Teacher improves test accuracy and enables training with fewer labels than Temporal Ensembling. Without changing the network architecture, Mean Teacher achieves an error rate of 4.35% on SVHN with 250 labels, outperforming Temporal Ensembling trained with 1000 labels. We also show that a good network architecture is crucial to performance. Combining Mean Teacher and Residual Networks, we improve the state of the art on CIFAR-10 with 4000 labels from 10.55% to 6.28%, and on ImageNet 2012 with 10% of the labels from 35.24% to 9.11%.
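As an illustration of the weight-averaging idea described above, here is a minimal numpy sketch of an exponential-moving-average (EMA) teacher update; the parameter lists, the alpha value, and the function name are illustrative assumptions, not the paper's released code.

import numpy as np

def ema_update(teacher_params, student_params, alpha=0.99):
    # Teacher weights become an exponential moving average of student weights,
    # in place of the label-prediction averaging used by Temporal Ensembling.
    for t, s in zip(teacher_params, student_params):
        t *= alpha
        t += (1.0 - alpha) * s   # t = alpha * t + (1 - alpha) * s, updated in place

# Toy usage: two "layers" of weights; call after every student optimizer step.
student = [np.random.randn(4, 4), np.random.randn(4)]
teacher = [p.copy() for p in student]
ema_update(teacher, student, alpha=0.99)

During training, the consistency cost would then compare student predictions with the more stable teacher predictions on the same, differently perturbed inputs.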

2,784 citations


Journal ArticleDOI
TL;DR: This work develops a novel framework to discover governing equations underlying a dynamical system simply from data measurements, leveraging advances in sparsity techniques and machine learning and using sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data.
Abstract: Extracting governing equations from data is a central challenge in many diverse areas of science and engineering. Data are abundant whereas models often remain elusive, as in climate science, neuroscience, ecology, finance, and epidemiology, to name only a few examples. In this work, we combine sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations from noisy measurement data. The only assumption about the structure of the model is that there are only a few important terms that govern the dynamics, so that the equations are sparse in the space of possible functions; this assumption holds for many physical systems in an appropriate basis. In particular, we use sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data. This results in parsimonious models that balance accuracy with model complexity to avoid overfitting. We demonstrate the algorithm on a wide range of problems, from simple canonical systems, including linear and nonlinear oscillators and the chaotic Lorenz system, to the fluid vortex shedding behind an obstacle. The fluid example illustrates the ability of this method to discover the underlying dynamics of a system that took experts in the community nearly 30 years to resolve. We also show that this method generalizes to parameterized systems and systems that are time-varying or have external forcing.
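The sparse-regression step described above can be summarized in a short sketch; the sequentially thresholded least-squares loop below is one common way to realize it, with illustrative array names and a toy threshold.

import numpy as np

def sparsify_dynamics(Theta, dXdt, lam=0.1, n_iter=10):
    # Theta: (m, p) library of candidate functions evaluated on the data.
    # dXdt:  (m, n) measured or estimated time derivatives.
    # Returns a sparse coefficient matrix Xi with dXdt ≈ Theta @ Xi.
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]        # initial dense least-squares fit
    for _ in range(n_iter):
        small = np.abs(Xi) < lam                            # coefficients to prune
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):                      # refit each equation using only
            keep = ~small[:, k]                             # the surviving library terms
            if keep.any():
                Xi[keep, k] = np.linalg.lstsq(Theta[:, keep], dXdt[:, k], rcond=None)[0]
    return Xi

The few nonzero rows of Xi then name the library terms that enter each governing equation.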

2,784 citations


Journal ArticleDOI
Bin Zhou, Yuan Lu, Kaveh Hajifathalian, James Bentham, and 494 others (170 institutions)
TL;DR: In this article, the authors used a Bayesian hierarchical model to estimate trends in diabetes prevalence, defined as fasting plasma glucose of 7.0 mmol/L or higher, or history of diagnosis with diabetes, or use of insulin or oral hypoglycaemic drugs in 200 countries and territories in 21 regions, by sex and from 1980 to 2014.

2,782 citations


Book ChapterDOI
08 Oct 2016
TL;DR: Temporal Segment Networks (TSN) as discussed by the authors combine a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video, which obtains the state-of-the-art performance on the datasets of HMDB51 and UCF101.
Abstract: Deep convolutional networks have achieved great success for visual recognition in still images. However, for action recognition in videos, the advantage over traditional methods is not so evident. This paper aims to discover the principles to design effective ConvNet architectures for action recognition in videos and learn these models given limited training samples. Our first contribution is temporal segment network (TSN), a novel framework for video-based action recognition, which is based on the idea of long-range temporal structure modeling. It combines a sparse temporal sampling strategy and video-level supervision to enable efficient and effective learning using the whole action video. The other contribution is our study on a series of good practices in learning ConvNets on video data with the help of temporal segment network. Our approach obtains the state-of-the-art performance on the datasets of HMDB51 (69.4%) and UCF101 (94.2%). We also visualize the learned ConvNet models, which qualitatively demonstrates the effectiveness of temporal segment network and the proposed good practices (Models and code at https://github.com/yjxiong/temporal-segment-networks).
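A minimal sketch of the sparse sampling and segmental consensus described above; the scoring function, segment count, and array shapes are placeholders rather than the released implementation.

import numpy as np

def tsn_video_score(frames, snippet_score_fn, num_segments=3, rng=None):
    # Split the video into equal-duration segments, sample one snippet per segment,
    # score each snippet with a ConvNet stand-in, and average the scores
    # (the segmental consensus) to obtain a video-level prediction.
    rng = rng or np.random.default_rng(0)
    bounds = np.linspace(0, len(frames), num_segments + 1, dtype=int)
    scores = []
    for k in range(num_segments):
        hi = max(bounds[k] + 1, bounds[k + 1])
        idx = rng.integers(bounds[k], hi)          # random snippet index inside segment k
        scores.append(snippet_score_fn(frames[idx]))
    return np.mean(scores, axis=0)

# Toy usage: 90 dummy frames and a dummy scorer over 10 action classes.
dummy_scorer = lambda frame: np.ones(10) * float(np.mean(frame))
video = [np.full((8, 8), i, dtype=float) for i in range(90)]
print(tsn_video_score(video, dummy_scorer, num_segments=3))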

2,778 citations


Posted Content
TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.
Abstract: For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at this http URL .
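To make the convolutional alternative concrete, here is a small numpy sketch of the causal, dilated 1-D convolution that such architectures stack (the real models add residual connections and increasing dilation); names and shapes are illustrative.

import numpy as np

def causal_dilated_conv1d(x, w, dilation=1):
    # Output at time t depends only on x[t], x[t-d], x[t-2d], ...,
    # so no future information leaks into the prediction.
    T, k = len(x), len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])        # left padding keeps the output causal
    y = np.zeros(T)
    for t in range(T):
        taps = xp[t : t + pad + 1 : dilation]      # the k causally visible samples
        y[t] = np.dot(w[::-1], taps)
    return y

# Stacking layers with dilation 1, 2, 4, ... grows the receptive field exponentially,
# which is what gives these models their long effective memory.
signal = np.arange(8, dtype=float)
print(causal_dilated_conv1d(signal, w=np.array([0.5, 0.5]), dilation=2))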

Posted Content
TL;DR: On the ImageNet-1K dataset, it is empirically show that even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy and is more effective than going deeper or wider when the authors increase the capacity.
Abstract: We present a simple, highly modularized network architecture for image classification. Our network is constructed by repeating a building block that aggregates a set of transformations with the same topology. Our simple design results in a homogeneous, multi-branch architecture that has only a few hyper-parameters to set. This strategy exposes a new dimension, which we call "cardinality" (the size of the set of transformations), as an essential factor in addition to the dimensions of depth and width. On the ImageNet-1K dataset, we empirically show that even under the restricted condition of maintaining complexity, increasing cardinality is able to improve classification accuracy. Moreover, increasing cardinality is more effective than going deeper or wider when we increase the capacity. Our models, named ResNeXt, are the foundations of our entry to the ILSVRC 2016 classification task in which we secured 2nd place. We further investigate ResNeXt on an ImageNet-5K set and the COCO detection set, also showing better results than its ResNet counterpart. The code and models are publicly available online.
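A toy numpy sketch of the aggregated-transformations idea: the block output is the input plus a sum over C parallel low-dimensional branches with identical topology, where C is the cardinality. Dense matrices stand in for the real 1x1/3x3/1x1 convolutions, and all names and sizes are illustrative.

import numpy as np

def resnext_block(x, branches):
    # y = x + sum_i T_i(x); each branch T_i projects to a narrow embedding,
    # applies a nonlinearity, and expands back. len(branches) is the cardinality.
    out = x.copy()
    for W_reduce, W_expand in branches:
        h = np.maximum(W_reduce @ x, 0.0)
        out += W_expand @ h
    return out

# Cardinality 32 with bottleneck width 4, loosely mirroring the "32x4d" template.
rng = np.random.default_rng(0)
branches = [(0.05 * rng.standard_normal((4, 256)), 0.05 * rng.standard_normal((256, 4)))
            for _ in range(32)]
y = resnext_block(rng.standard_normal(256), branches)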

Proceedings ArticleDOI
18 Apr 2019
TL;DR: This work presents SpecAugment, a simple data augmentation method for speech recognition that is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients) and achieves state-of-the-art performance on the LibriSpeech 960h and Switchboard 300h tasks, outperforming all prior work.
Abstract: We present SpecAugment, a simple data augmentation method for speech recognition. SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients). The augmentation policy consists of warping the features, masking blocks of frequency channels, and masking blocks of time steps. We apply SpecAugment on Listen, Attend and Spell networks for end-to-end speech recognition tasks. We achieve state-of-the-art performance on the LibriSpeech 960h and Switchboard 300h tasks, outperforming all prior work. On LibriSpeech, we achieve 6.8% WER on test-other without the use of a language model, and 5.8% WER with shallow fusion with a language model. This compares to the previous state-of-the-art hybrid system of 7.5% WER. For Switchboard, we achieve 7.2%/14.6% on the Switchboard/CallHome portion of the Hub5'00 test set without the use of a language model, and 6.8%/14.1% with shallow fusion, which compares to the previous state-of-the-art hybrid system at 8.3%/17.3% WER.
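The masking half of the policy described above is easy to state directly; the numpy sketch below applies frequency and time masks to a spectrogram (time warping is omitted), with illustrative parameter names and mask widths.

import numpy as np

def spec_augment(spec, num_freq_masks=2, F=8, num_time_masks=2, T=20, rng=None):
    # spec: (n_mels, n_frames) filter-bank features; returns a masked copy.
    # F and T bound the width of each frequency / time mask.
    rng = rng or np.random.default_rng()
    out = spec.copy()
    n_mels, n_frames = out.shape
    for _ in range(num_freq_masks):
        f = rng.integers(0, F + 1)                 # mask width
        f0 = rng.integers(0, max(1, n_mels - f))   # first masked channel
        out[f0 : f0 + f, :] = 0.0
    for _ in range(num_time_masks):
        t = rng.integers(0, T + 1)
        t0 = rng.integers(0, max(1, n_frames - t))
        out[:, t0 : t0 + t] = 0.0
    return out

# Toy usage on a random 80-mel, 300-frame "utterance".
masked = spec_augment(np.random.randn(80, 300), rng=np.random.default_rng(1))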

Journal ArticleDOI
TL;DR: For most cancers, 5-year net survival remains among the highest in the world in the USA and Canada, in Australia and New Zealand, and in Finland, Iceland, Norway, and Sweden, while for many cancers, Denmark is closing the survival gap with the other Nordic countries.

Journal ArticleDOI
TL;DR: A binary neutron star coalescence candidate (later designated GW170817) with merger time 12:41:04 UTC was observed through gravitational waves by the Advanced LIGO and Advanced Virgo detectors.
Abstract: On 2017 August 17 a binary neutron star coalescence candidate (later designated GW170817) with merger time 12:41:04 UTC was observed through gravitational waves by the Advanced LIGO and Advanced Virgo detectors. The Fermi Gamma-ray Burst Monitor independently detected a gamma-ray burst (GRB 170817A) with a time delay of $\sim 1.7\,{\rm{s}}$ with respect to the merger time. From the gravitational-wave signal, the source was initially localized to a sky region of 31 deg2 at a luminosity distance of ${40}_{-8}^{+8}$ Mpc and with component masses consistent with neutron stars. The component masses were later measured to be in the range 0.86 to 2.26 $\,{M}_{\odot }$. An extensive observing campaign was launched across the electromagnetic spectrum leading to the discovery of a bright optical transient (SSS17a, now with the IAU identification of AT 2017gfo) in NGC 4993 (at $\sim 40\,{\rm{Mpc}}$) less than 11 hours after the merger by the One-Meter, Two Hemisphere (1M2H) team using the 1 m Swope Telescope. The optical transient was independently detected by multiple teams within an hour. Subsequent observations targeted the object and its environment. Early ultraviolet observations revealed a blue transient that faded within 48 hours. Optical and infrared observations showed a redward evolution over ~10 days. Following early non-detections, X-ray and radio emission were discovered at the transient's position $\sim 9$ and $\sim 16$ days, respectively, after the merger. Both the X-ray and radio emission likely arise from a physical process that is distinct from the one that generates the UV/optical/near-infrared emission. No ultra-high-energy gamma-rays and no neutrino candidates consistent with the source were found in follow-up searches. These observations support the hypothesis that GW170817 was produced by the merger of two neutron stars in NGC 4993 followed by a short gamma-ray burst (GRB 170817A) and a kilonova/macronova powered by the radioactive decay of r-process nuclei synthesized in the ejecta.

Book ChapterDOI
TL;DR: In this article, the authors discuss the reasons for the persistence of corruption that have to do with frequency-dependent equilibria or intertemporal externalities, and suggest that corruption may actually improve efficiency and help growth.
Abstract: Corruption has its adverse effects not just on static efficiency but also on investment and growth. This chapter discusses the reasons for the persistence of corruption that have to do with frequency-dependent equilibria or intertemporal externalities. There are many cases where corruption is mutually beneficial between the official and his client, so neither the briber nor the bribee has an incentive to report or protest, for example, when a customs officer lets contraband through, or a tax auditor purposely overlooks a case of tax evasion, and so on. The idea of multiple equilibria in the incidence of corruption is salient in some of the recent economic theorists' explanations. There is a strand in the corruption literature, contributed both by economists and non-economists, suggesting that, in the context of pervasive and cumbersome regulations in developing countries, corruption may actually improve efficiency and help growth.

Journal ArticleDOI
TL;DR: High-income countries (HICs) continue to have the highest incidence rates for all sites, as well as for lung, colorectal, breast, and prostate cancer, although some low- and middle-income countries (LMICs) now count among those with the highest rates; applied cancer control measures are needed to reduce rates in HICs and arrest the growing burden in LMICs.
Abstract: There are limited published data on recent cancer incidence and mortality trends worldwide. We used the International Agency for Research on Cancer's CANCERMondial clearinghouse to present age-standardized cancer incidence and death rates for 2003-2007. We also present trends in incidence through 2007 and mortality through 2012 for select countries from five continents. High-income countries (HIC) continue to have the highest incidence rates for all sites, as well as for lung, colorectal, breast, and prostate cancer, although some low- and middle-income countries (LMIC) now count among those with the highest rates. Mortality rates from these cancers are declining in many HICs while they are increasing in LMICs. LMICs have the highest rates of stomach, liver, esophageal, and cervical cancer. Although rates remain high in HICs, they are plateauing or decreasing for the most common cancers due to decreases in known risk factors, screening and early detection, and improved treatment (mortality only). In contrast, rates in several LMICs are increasing for these cancers due to increases in smoking, excess body weight, and physical inactivity. LMICs also have a disproportionate burden of infection-related cancers. Applied cancer control measures are needed to reduce rates in HICs and arrest the growing burden in LMICs.

Journal ArticleDOI
TL;DR: This paper aims to demonstrate efforts towards the in-situ applicability of EMMARM, so as to provide real-time information about concrete mechanical properties such as E-modulus and compressive strength.

Journal ArticleDOI
TL;DR: A deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs is presented.
Abstract: Signal peptides (SPs) are short amino acid sequences in the amino terminus of many newly synthesized proteins that target proteins into, or across, membranes. Bioinformatic tools can predict SPs from amino acid sequences, but most cannot distinguish between various types of signal peptides. We present a deep neural network-based approach that improves SP prediction across all domains of life and distinguishes between three types of prokaryotic SPs.

Journal ArticleDOI
TL;DR: Radiomics, the high-throughput mining of quantitative image features from standard-of-care medical imaging that enables data to be extracted and applied within clinical-decision support systems to improve diagnostic, prognostic, and predictive accuracy, is gaining importance in cancer research as mentioned in this paper.
Abstract: Radiomics, the high-throughput mining of quantitative image features from standard-of-care medical imaging that enables data to be extracted and applied within clinical-decision support systems to improve diagnostic, prognostic, and predictive accuracy, is gaining importance in cancer research. Radiomic analysis exploits sophisticated image analysis tools and the rapid development and validation of medical imaging data that uses image-based signatures for precision diagnosis and treatment, providing a powerful tool in modern medicine. Herein, we describe the process of radiomics, its pitfalls, challenges, opportunities, and its capacity to improve clinical decision making, emphasizing the utility for patients with cancer. Currently, the field of radiomics lacks standardized evaluation of both the scientific integrity and the clinical relevance of the numerous published radiomics investigations resulting from the rapid growth of this area. Rigorous evaluation criteria and reporting guidelines need to be established in order for radiomics to mature as a discipline. Herein, we provide guidance for investigations to meet this urgent need in the field of radiomics.

Journal ArticleDOI
TL;DR: The mRNA-1273 vaccine as discussed by the authors is a lipid nanoparticle-encapsulated mRNA-based vaccine that encodes the prefusion stabilized full-length spike protein of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes Covid-19.
Abstract: Background Vaccines are needed to prevent coronavirus disease 2019 (Covid-19) and to protect persons who are at high risk for complications. The mRNA-1273 vaccine is a lipid nanoparticle-encapsulated mRNA-based vaccine that encodes the prefusion stabilized full-length spike protein of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes Covid-19. Methods This phase 3 randomized, observer-blinded, placebo-controlled trial was conducted at 99 centers across the United States. Persons at high risk for SARS-CoV-2 infection or its complications were randomly assigned in a 1:1 ratio to receive two intramuscular injections of mRNA-1273 (100 μg) or placebo 28 days apart. The primary end point was prevention of Covid-19 illness with onset at least 14 days after the second injection in participants who had not previously been infected with SARS-CoV-2. Results The trial enrolled 30,420 volunteers who were randomly assigned in a 1:1 ratio to receive either vaccine or placebo (15,210 participants in each group). More than 96% of participants received both injections, and 2.2% had evidence (serologic, virologic, or both) of SARS-CoV-2 infection at baseline. Symptomatic Covid-19 illness was confirmed in 185 participants in the placebo group (56.5 per 1000 person-years; 95% confidence interval [CI], 48.7 to 65.3) and in 11 participants in the mRNA-1273 group (3.3 per 1000 person-years; 95% CI, 1.7 to 6.0); vaccine efficacy was 94.1% (95% CI, 89.3 to 96.8%; P<0.001). Conclusions The mRNA-1273 vaccine showed 94.1% efficacy at preventing Covid-19 illness, including severe disease. Aside from transient local and systemic reactions, no safety concerns were identified. (Funded by the Biomedical Advanced Research and Development Authority and the National Institute of Allergy and Infectious Diseases; COVE ClinicalTrials.gov number, NCT04470427.).
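As a small worked example of how the headline figure follows from the reported incidence rates (a sketch using the rounded numbers quoted above):

# Vaccine efficacy = 1 - (incidence rate, vaccine arm) / (incidence rate, placebo arm),
# using the rates reported above in cases per 1000 person-years.
rate_placebo = 56.5   # 185 symptomatic cases in the placebo group
rate_vaccine = 3.3    # 11 symptomatic cases in the mRNA-1273 group
ve = 1.0 - rate_vaccine / rate_placebo
print(f"Vaccine efficacy point estimate: {ve:.1%}")   # about 94%, matching the reported 94.1% up to rounding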

Proceedings ArticleDOI
07 Dec 2015
TL;DR: A novel semantic segmentation algorithm by learning a deep deconvolution network on top of the convolutional layers adopted from VGG 16-layer net, which demonstrates outstanding performance in PASCAL VOC 2012 dataset.
Abstract: We propose a novel semantic segmentation algorithm by learning a deep deconvolution network. We learn the network on top of the convolutional layers adopted from the VGG 16-layer net. The deconvolution network is composed of deconvolution and unpooling layers, which identify pixelwise class labels and predict segmentation masks. We apply the trained network to each proposal in an input image, and construct the final semantic segmentation map by combining the results from all proposals in a simple manner. The proposed algorithm mitigates the limitations of the existing methods based on fully convolutional networks by integrating deep deconvolution network and proposal-wise prediction; as a result, our segmentation method typically identifies detailed structures and handles objects in multiple scales naturally. Our network demonstrates outstanding performance on the PASCAL VOC 2012 dataset, and we achieve the best accuracy (72.5%) among the methods trained without using the Microsoft COCO dataset through ensemble with the fully convolutional network.
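A small numpy sketch of the unpooling operation central to the deconvolution network: max pooling records which location won each window (the "switches"), and unpooling scatters the pooled values back to those locations, restoring spatial detail that subsequent deconvolution layers then densify. Function names and the 2x2 window are illustrative.

import numpy as np

def max_pool_with_switches(x, k=2):
    # k x k max pooling that also records the argmax positions ("switches").
    H, W = x.shape
    pooled = np.zeros((H // k, W // k))
    switches = np.zeros((H // k, W // k, 2), dtype=int)
    for i in range(H // k):
        for j in range(W // k):
            patch = x[i*k:(i+1)*k, j*k:(j+1)*k]
            r, c = np.unravel_index(np.argmax(patch), patch.shape)
            pooled[i, j] = patch[r, c]
            switches[i, j] = (i*k + r, j*k + c)
    return pooled, switches

def max_unpool(pooled, switches, out_shape):
    # Place each pooled value back at the location that produced it; all else stays zero.
    out = np.zeros(out_shape)
    for i in range(pooled.shape[0]):
        for j in range(pooled.shape[1]):
            r, c = switches[i, j]
            out[r, c] = pooled[i, j]
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
p, sw = max_pool_with_switches(x)
print(max_unpool(p, sw, x.shape))   # sparse map with each maximum restored in place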

Proceedings Article
05 Dec 2016
TL;DR: This work presents a formulation of CNNs in the context of spectral graph theory, which provides the necessary mathematical background and efficient numerical schemes to design fast localized convolutional filters on graphs.
Abstract: In this work, we are interested in generalizing convolutional neural networks (CNNs) from low-dimensional regular grids, where image, video and speech are represented, to high-dimensional irregular domains, such as social networks, brain connectomes or words' embedding, represented by graphs. We present a formulation of CNNs in the context of spectral graph theory, which provides the necessary mathematical background and efficient numerical schemes to design fast localized convolutional filters on graphs. Importantly, the proposed technique offers the same linear computational complexity and constant learning complexity as classical CNNs, while being universal to any graph structure. Experiments on MNIST and 20NEWS demonstrate the ability of this novel deep learning system to learn local, stationary, and compositional features on graphs.
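To make the fast localized filters concrete, here is a numpy sketch of K-localized Chebyshev filtering on a toy graph; the exact eigenvalue computation and dense matrices are simplifications (the appeal of the approach is that only sparse matrix-vector products are needed), and all names are illustrative.

import numpy as np

def chebyshev_graph_filter(L, x, theta):
    # y = sum_k theta[k] * T_k(L_scaled) @ x, with T_k the Chebyshev polynomials
    # and L_scaled = 2 L / lambda_max - I, so the spectrum lies in [-1, 1].
    n = L.shape[0]
    lam_max = np.linalg.eigvalsh(L).max()            # often only approximated in practice
    L_scaled = 2.0 * L / lam_max - np.eye(n)
    T_prev, T_cur = x, L_scaled @ x                  # T_0(L)x = x, T_1(L)x = L_scaled @ x
    y = theta[0] * T_prev
    if len(theta) > 1:
        y = y + theta[1] * T_cur
    for k in range(2, len(theta)):
        T_next = 2.0 * L_scaled @ T_cur - T_prev     # Chebyshev recurrence
        y = y + theta[k] * T_next
        T_prev, T_cur = T_cur, T_next
    return y

# Toy graph: a path on 4 nodes, combinatorial Laplacian L = D - A.
A = np.array([[0,1,0,0],[1,0,1,0],[0,1,0,1],[0,0,1,0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
print(chebyshev_graph_filter(L, x=np.array([1.0, 0.0, 0.0, 0.0]), theta=np.array([0.5, 0.3, 0.2])))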

Journal ArticleDOI
Dan R. Robinson, Eliezer M. Van Allen, Yi-Mi Wu, Nikolaus Schultz, Robert J. Lonigro, Juan Miguel Mosquera, Bruce Montgomery, Mary-Ellen Taplin, Colin C. Pritchard, Gerhardt Attard, Himisha Beltran, Wassim Abida, Robert K. Bradley, Jake Vinson, Xuhong Cao, Pankaj Vats, Lakshmi P. Kunju, Maha Hussain, Felix Y. Feng, Scott A. Tomlins, Kathleen A. Cooney, David Smith, Christine Brennan, Javed Siddiqui, Rohit Mehra, Yu Chen, Dana E. Rathkopf, Michael J. Morris, Stephen B. Solomon, Jeremy C. Durack, Victor E. Reuter, Anuradha Gopalan, Jianjiong Gao, Massimo Loda, Rosina T. Lis, Michaela Bowden, Stephen P. Balk, Glenn C. Gaviola, Carrie Sougnez, Manaswi Gupta, Evan Y. Yu, Elahe A. Mostaghel, Heather H. Cheng, Hyojeong Mulcahy, Lawrence D. True, Stephen R. Plymate, Heidi Dvinge, Roberta Ferraldeschi, Penny Flohr, Susana Miranda, Zafeiris Zafeiriou, Nina Tunariu, Joaquin Mateo, Raquel Perez-Lopez, Francesca Demichelis, Brian D. Robinson, Marc H. Schiffman, David M. Nanus, Scott T. Tagawa, Alexandros Sigaras, Kenneth Eng, Olivier Elemento, Andrea Sboner, Elisabeth I. Heath, Howard I. Scher, Kenneth J. Pienta, Philip W. Kantoff, Johann S. de Bono, Mark A. Rubin, Peter S. Nelson, Levi A. Garraway, Charles L. Sawyers, Arul M. Chinnaiyan
21 May 2015-Cell
TL;DR: This cohort study provides clinically actionable information that could impact treatment decisions for affected individuals and identifies new genomic alterations in PIK3CA/B, R-spondin, BRAF/RAF1, APC, β-catenin, and ZBTB16/PLZF.

Proceedings Article
06 Aug 2017
TL;DR: In this article, the authors identify two fundamental axioms (sensitivity and implementation invariance) that attribution methods ought to satisfy and use them to guide the design of a new attribution method called Integrated Gradients.
Abstract: We study the problem of attributing the prediction of a deep network to its input features, a problem previously studied by several other works. We identify two fundamental axioms, Sensitivity and Implementation Invariance, that attribution methods ought to satisfy. We show that they are not satisfied by most known attribution methods, which we consider to be a fundamental weakness of those methods. We use the axioms to guide the design of a new attribution method called Integrated Gradients. Our method requires no modification to the original network and is extremely simple to implement; it just needs a few calls to the standard gradient operator. We apply this method to a couple of image models, a couple of text models and a chemistry model, demonstrating its ability to debug networks, to extract rules from a network, and to enable users to engage with models better.
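A minimal numpy sketch of the resulting attribution rule, approximating the path integral with a Riemann sum; the toy quadratic "model" and its analytic gradient stand in for a deep network and its gradient operator.

import numpy as np

def integrated_gradients(x, baseline, grad_fn, steps=50):
    # IG_i = (x_i - x'_i) * integral over a in [0, 1] of dF/dx_i( x' + a (x - x') ) da,
    # approximated by averaging gradients at `steps` midpoints along the straight-line path.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.array([grad_fn(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Toy model F(x) = sum(x^2): attributions should sum to F(x) - F(baseline) (completeness).
f = lambda z: float(np.sum(z ** 2))
grad_f = lambda z: 2.0 * z
x, baseline = np.array([1.0, 2.0, 3.0]), np.zeros(3)
attr = integrated_gradients(x, baseline, grad_f)
print(attr, attr.sum(), f(x) - f(baseline))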

Journal ArticleDOI
TL;DR: The SQUEEZE method is documented as an alternative means of addressing the solvent disorder issue; it conveniently interfaces with the 2014 version of the least-squares refinement program SHELXL, and many twinned structures containing disordered solvents are now also treatable by SQUEEZE.
Abstract: The completion of a crystal structure determination is often hampered by the presence of embedded solvent molecules or ions that are seriously disordered. Their contribution to the calculated structure factors in the least-squares refinement of a crystal structure has to be included in some way. Traditionally, an atomistic solvent disorder model is attempted. Such an approach is generally to be preferred, but it does not always lead to a satisfactory result and may even be impossible in cases where channels in the structure are filled with continuous electron density. This paper documents the SQUEEZE method as an alternative means of addressing the solvent disorder issue. It conveniently interfaces with the 2014 version of the least-squares refinement program SHELXL [Sheldrick (2015). Acta Cryst. C71. In the press] and other refinement programs that accept externally provided fixed contributions to the calculated structure factors. The PLATON SQUEEZE tool calculates the solvent contribution to the structure factors by back-Fourier transformation of the electron density found in the solvent-accessible region of a phase-optimized difference electron-density map. The actual least-squares structure refinement is delegated to, for example, SHELXL. The current versions of PLATON SQUEEZE and SHELXL now address several of the unnecessary complications with the earlier implementation of the SQUEEZE procedure that were a necessity because least-squares refinement with the now superseded SHELXL97 program did not allow for the input of fixed externally provided contributions to the structure-factor calculation. It is no longer necessary to subtract the solvent contribution temporarily from the observed intensities to be able to use SHELXL for the least-squares refinement, since that program now accepts the solvent contribution from an external file (.fab file) if the ABIN instruction is used. In addition, many twinned structures containing disordered solvents are now also treatable by SQUEEZE. The details of a SQUEEZE calculation are now automatically included in the CIF archive file, along with the unmerged reflection data. The current implementation of the SQUEEZE procedure is described, and discussed and illustrated with three examples. Two of them are based on the reflection data of published structures and one on synthetic reflection data generated for a published structure.

Proceedings ArticleDOI
02 Apr 2017
TL;DR: This work introduces the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge, and finds that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
Abstract: Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
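The attack loop described above can be sketched end to end on a toy problem: the adversary only queries the target for labels, trains a local substitute on those labels, and transfers gradient-based perturbations crafted on the substitute. Everything below (the linear "target", the logistic-regression substitute, the FGSM-style step) is an illustrative stand-in, not the paper's setup.

import numpy as np

rng = np.random.default_rng(0)

# Hidden "target model" the adversary can only query for labels.
w_target = np.array([2.0, -1.0])
target_label = lambda X: (X @ w_target > 0).astype(float)

# 1) Query the target on synthetic inputs to build a substitute training set.
X = rng.normal(size=(200, 2))
y = target_label(X)

# 2) Train a logistic-regression substitute by gradient descent on those labels.
w_sub = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w_sub))
    w_sub -= 0.1 * X.T @ (p - y) / len(X)

# 3) Craft fast-gradient-sign perturbations against the substitute and check transfer.
eps = 0.5
p = 1.0 / (1.0 + np.exp(-X @ w_sub))
grad_x = (p - y)[:, None] * w_sub          # gradient of the substitute's loss w.r.t. the input
X_adv = X + eps * np.sign(grad_x)
print("fraction of queried points whose target label flips:", np.mean(target_label(X_adv) != y))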

Journal ArticleDOI
TL;DR: In this paper, the authors present the most comprehensive estimates of AMR burden to date, built from five broad components: the number of deaths in which infection played a role, the proportion of infectious deaths attributable to a given infectious syndrome, the proportion of infectious syndrome deaths attributable to a given pathogen, the percentage of a given pathogen resistant to an antibiotic of interest, and the excess risk of death or duration of infection associated with this resistance.

Posted Content
TL;DR: This paper proposes the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator, and shows that minimizing the objective function of LSGAN yields minimizing the Pearson χ² divergence.
Abstract: Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs) which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN yields minimizing the Pearson $\chi^2$ divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stable during the learning process. We evaluate LSGANs on five scene datasets and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
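Written out explicitly (with one common choice of target labels: 1 for real data and for the generator's target, 0 for generated data), the least-squares objectives sketched above are:

\min_D V(D) = \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[(D(x) - 1)^2\right] + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\left[D(G(z))^2\right],
\qquad
\min_G V(G) = \tfrac{1}{2}\,\mathbb{E}_{z \sim p_z}\!\left[(D(G(z)) - 1)^2\right]

Penalizing the squared distance to the target label pushes generated samples toward the decision boundary even when they are already classified correctly, which is where the stronger, non-vanishing gradients come from relative to the sigmoid cross-entropy loss.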

Posted Content
TL;DR: High quality image synthesis results are presented using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics, which naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding.
Abstract: We present high quality image synthesis results using diffusion probabilistic models, a class of latent variable models inspired by considerations from nonequilibrium thermodynamics. Our best results are obtained by training on a weighted variational bound designed according to a novel connection between diffusion probabilistic models and denoising score matching with Langevin dynamics, and our models naturally admit a progressive lossy decompression scheme that can be interpreted as a generalization of autoregressive decoding. On the unconditional CIFAR10 dataset, we obtain an Inception score of 9.46 and a state-of-the-art FID score of 3.17. On 256x256 LSUN, we obtain sample quality similar to ProgressiveGAN. Our implementation is available at this https URL
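For reference, the simplified denoising objective associated with this approach, in which a network ε_θ predicts the noise mixed into the data by the forward process, takes the form:

L_{\mathrm{simple}}(\theta) = \mathbb{E}_{t,\,x_0,\,\epsilon}\!\left[\left\lVert \epsilon - \epsilon_\theta\!\left(\sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon,\; t\right)\right\rVert^2\right],
\qquad \bar\alpha_t = \prod_{s=1}^{t}(1 - \beta_s),

where β_s is the forward-process noise schedule, t is a uniformly sampled timestep, x_0 is a training image, and ε is standard Gaussian noise.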