
Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and provides insights on cache access patterns, data compression, and sharding used to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
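Editor's note: as a brief illustration of the kind of workload the abstract describes, here is a minimal, hedged sketch using the open-source xgboost Python package on sparse input; the data, parameter values, and the choice of `tree_method="approx"` (which selects quantile-sketch-based split finding) are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch (not the paper's code): training XGBoost on sparse data.
# Assumes the `xgboost` package is installed; X_sparse and y are hypothetical.
import numpy as np
import scipy.sparse as sp
import xgboost as xgb

rng = np.random.default_rng(0)
X_sparse = sp.random(1000, 50, density=0.05, format="csr", random_state=0)
y = rng.integers(0, 2, size=1000)

dtrain = xgb.DMatrix(X_sparse, label=y)   # DMatrix accepts sparse input directly
params = {
    "objective": "binary:logistic",
    "max_depth": 6,
    "eta": 0.3,
    "tree_method": "approx",              # quantile-sketch-based approximate split finding
}
booster = xgb.train(params, dtrain, num_boost_round=50)
preds = booster.predict(dtrain)
```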

13,333 citations


28 Oct 2017
TL;DR: An automatic differentiation module of PyTorch is described — a library designed to enable rapid research on machine learning models that differentiates purely imperative programs, with a focus on extensibility and low overhead.
Abstract: In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.
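Editor's note: a minimal sketch of what "differentiation of purely imperative programs" looks like in PyTorch; the tensors and the branch are illustrative, not taken from the paper.

```python
# Minimal sketch: reverse-mode autodiff over an ordinary imperative program.
import torch

x = torch.randn(3, requires_grad=True)   # leaf tensors tracked by autograd
w = torch.randn(3, requires_grad=True)

# Plain Python control flow is differentiated as it executes.
y = (w * x).sum()
if y > 0:
    loss = y ** 2
else:
    loss = -y

loss.backward()            # populates .grad on the leaf tensors
print(x.grad, w.grad)
```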

13,268 citations


Journal ArticleDOI
TL;DR: There is evidence that human-to-human transmission has occurred among close contacts since the middle of December 2019 and considerable efforts to reduce transmission will be required to control outbreaks if similar dynamics apply elsewhere.
Abstract: Background The initial cases of novel coronavirus (2019-nCoV)–infected pneumonia (NCIP) occurred in Wuhan, Hubei Province, China, in December 2019 and January 2020. We analyzed data on the...

13,101 citations


Journal ArticleDOI
TL;DR: Many of the estimated cancer cases and deaths can be prevented through reducing the prevalence of risk factors, while increasing the effectiveness of clinical care delivery, particularly for those living in rural areas and in disadvantaged populations.
Abstract: With increasing incidence and mortality, cancer is the leading cause of death in China and is a major public health problem. Because of China's massive population (1.37 billion), previous national incidence and mortality estimates have been limited to small samples of the population using data from the 1990s or based on a specific year. With high-quality data from an additional number of population-based registries now available through the National Central Cancer Registry of China, the authors analyzed data from 72 local, population-based cancer registries (2009-2011), representing 6.5% of the population, to estimate the number of new cases and cancer deaths for 2015. Data from 22 registries were used for trend analyses (2000-2011). The results indicated that an estimated 4,292,000 new cancer cases and 2,814,000 cancer deaths would occur in China in 2015, with lung cancer being the most common incident cancer and the leading cause of cancer death. Stomach, esophageal, and liver cancers were also commonly diagnosed and were identified as leading causes of cancer death. Residents of rural areas had significantly higher age-standardized (Segi population) incidence and mortality rates for all cancers combined than urban residents (213.6 per 100,000 vs 191.5 per 100,000 for incidence; 149.0 per 100,000 vs 109.5 per 100,000 for mortality, respectively). For all cancers combined, the incidence rates were stable during 2000 through 2011 for males (+0.2% per year; P = .1), whereas they increased significantly (+2.2% per year; P < .05) among females. In contrast, the mortality rates since 2006 have decreased significantly for both males (-1.4% per year; P < .05) and females (-1.1% per year; P < .05). Many of the estimated cancer cases and deaths can be prevented through reducing the prevalence of risk factors, while increasing the effectiveness of clinical care delivery, particularly for those living in rural areas and in disadvantaged populations.

13,073 citations


Journal ArticleDOI
TL;DR: GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules, and provides a rich set of calculation types.

12,985 citations


Book
17 Apr 2015
TL;DR: A "balanced scorecard" is developed: a new performance measurement system that gives top managers a fast but comprehensive view of the business, complementing financial measures with three sets of operational measures having to do with customer satisfaction, internal processes, and the organization's ability to learn and improve.
Abstract: Frustrated by the inadequacies of traditional performance measurement systems, some managers have abandoned financial measures like return on equity and earnings per share. "Make operational improvements and the numbers will follow," the argument goes. But managers do not want to choose between financial and operational measures. Executives want a balanced presentation of measures that allow them to view the company from several perspectives simultaneously. During a year-long research project with 12 companies at the leading edge of performance measurement, the authors developed a "balanced scorecard," a new performance measurement system that gives top managers a fast but comprehensive view of the business. The balanced scorecard includes financial measures that tell the results of actions already taken. And it complements those financial measures with three sets of operational measures having to do with customer satisfaction, internal processes, and the organization's ability to learn and improve--the activities that drive future financial performance. Managers can create a balanced scorecard by translating their company's strategy and mission statements into specific goals and measures. To create the part of the scorecard that focuses on the customer perspective, for example, executives at Electronic Circuits Inc. established general goals for customer performance: get standard products to market sooner, improve customers' time-to-market, become customers' supplier of choice through partnerships, and develop innovative products tailored to customer needs. Managers translated these elements of strategy into four specific goals and identified a measure for each.

12,976 citations


Posted Content
TL;DR: This work shows that the acoustic model of a heavily used commercial system can be significantly improved by distilling the knowledge in an ensemble of models into a single model, and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
Abstract: A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model. We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse. Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.
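Editor's note: a minimal PyTorch sketch of the soft-target idea described above: the student is trained to match the teacher's temperature-softened class probabilities in addition to the hard labels. The function name, the temperature, and the mixing weight are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch (not the paper's code): knowledge-distillation loss with soft targets.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage with dummy tensors (teacher logits would normally be detached):
s = torch.randn(8, 10)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = distillation_loss(s, t, y)
```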

12,857 citations


Journal ArticleDOI
TL;DR: For assessing discriminant validity in variance-based structural equation modeling, the dominant approaches (the Fornell-Larcker criterion and the examination of cross-loadings) do not reliably detect a lack of validity in common research situations; the heterotrait-monotrait ratio of correlations is proposed as a superior alternative.
Abstract: Discriminant validity assessment has become a generally accepted prerequisite for analyzing relationships between latent variables. For variance-based structural equation modeling, such as partial least squares, the Fornell-Larcker criterion and the examination of cross-loadings are the dominant approaches for evaluating discriminant validity. By means of a simulation study, we show that these approaches do not reliably detect the lack of discriminant validity in common research situations. We therefore propose an alternative approach, based on the multitrait-multimethod matrix, to assess discriminant validity: the heterotrait-monotrait ratio of correlations. We demonstrate its superior performance by means of a Monte Carlo simulation study, in which we compare the new approach to the Fornell-Larcker criterion and the assessment of (partial) cross-loadings. Finally, we provide guidelines on how to handle discriminant validity issues in variance-based structural equation modeling.
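Editor's note: a small numpy sketch of the heterotrait-monotrait (HTMT) ratio for two constructs, given an item correlation matrix and index lists for each construct's items. All names, the simulated data, and the rough cutoff mentioned in the comment are illustrative assumptions, not a reimplementation of the authors' code.

```python
# Minimal sketch: HTMT ratio for two constructs from an item correlation matrix.
import numpy as np

def htmt(R, items_i, items_j):
    """R: item correlation matrix; items_i/items_j: item indices per construct."""
    # Mean correlation between items of different constructs (heterotrait-heteromethod).
    hetero = R[np.ix_(items_i, items_j)].mean()

    def mean_within(items):
        # Mean correlation among distinct items of the same construct
        # (monotrait-heteromethod): upper triangle, diagonal excluded.
        sub = R[np.ix_(items, items)]
        return sub[np.triu_indices_from(sub, k=1)].mean()

    return hetero / np.sqrt(mean_within(items_i) * mean_within(items_j))

# Example with a hypothetical two-construct, six-item dataset (3 items per construct):
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
items = np.column_stack([f1 + 0.5 * rng.normal(size=500) for _ in range(3)] +
                        [f2 + 0.5 * rng.normal(size=500) for _ in range(3)])
R = np.corrcoef(items, rowvar=False)
print(htmt(R, [0, 1, 2], [3, 4, 5]))   # values well below commonly cited cutoffs (~0.85/0.90) suggest discriminant validity
```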

12,855 citations


Journal ArticleDOI
TL;DR: SciPy as discussed by the authors is an open source scientific computing library for the Python programming language, which includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics.
Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.
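Editor's note: a short usage sketch touching a few of the subpackages the abstract lists (integration, minimization, linear algebra); the array contents are arbitrary and the calls are standard SciPy API, not code from the paper.

```python
# Minimal sketch: a few SciPy subpackages mentioned in the abstract.
import numpy as np
from scipy import integrate, optimize, linalg

# Numerical integration of sin(x) over [0, pi] (exact value: 2).
area, err = integrate.quad(np.sin, 0, np.pi)

# Unconstrained minimization of a simple quadratic.
res = optimize.minimize(lambda v: (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2, x0=[0.0, 0.0])

# Dense linear algebra: solve Ax = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

print(area, res.x, x)
```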

12,774 citations


Posted Content
TL;DR: The authors present some updates to YOLO!
Abstract: We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at this https URL

12,770 citations


Posted Content
TL;DR: PyTorch as discussed by the authors is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several common benchmarks.
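Editor's note: a minimal sketch of the "code as a model" style the abstract describes: the model is an ordinary Python class, runs eagerly, and moves to a GPU if one is available. Layer sizes and names are arbitrary.

```python
# Minimal sketch: an imperative PyTorch model, optionally on a GPU.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.fc2 = nn.Linear(32, 2)

    def forward(self, x):
        # Plain Python, executed eagerly: easy to step through in a debugger.
        return self.fc2(torch.relu(self.fc1(x)))

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyNet().to(device)
out = model(torch.randn(4, 16, device=device))
```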

Posted Content
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
Abstract: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
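Editor's note: a minimal sketch of the patch-embedding front end the abstract alludes to: the image is split into fixed-size patches, each patch is linearly projected, a class token is prepended, and position embeddings are added before a standard Transformer encoder (not shown). The sizes follow a common ViT configuration but are illustrative here.

```python
# Minimal sketch: turning an image into a sequence of patch embeddings (ViT-style).
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_chans=3, dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A conv with kernel = stride = patch size acts as a per-patch linear projection.
        self.proj = nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, dim))

    def forward(self, x):                                 # x: (B, C, H, W)
        x = self.proj(x).flatten(2).transpose(1, 2)       # (B, num_patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        return torch.cat([cls, x], dim=1) + self.pos_embed

tokens = PatchEmbed()(torch.randn(2, 3, 224, 224))        # (2, 197, 768), fed to a Transformer encoder
```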

Book ChapterDOI
TL;DR: SSD as mentioned in this paper discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, and combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.
Abstract: We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. Our SSD model is simple relative to methods that require object proposals because it completely eliminates proposal generation and subsequent pixel or feature resampling stage and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. Compared to other single stage methods, SSD has much better accuracy, even with a smaller input image size. For 300×300 input, SSD achieves 72.1% mAP on VOC2007 test at 58 FPS on a Nvidia Titan X and for 500×500 input, SSD achieves 75.1% mAP, outperforming a comparable state of the art Faster R-CNN model. Code is available at this https URL.
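Editor's note: a small numpy sketch of default-box generation for a single feature map, in the spirit of the abstract: boxes of several aspect ratios at one scale are centered on every cell. The real SSD tiles several feature maps with per-layer scales and extra boxes, which this deliberately omits; all numbers are illustrative.

```python
# Minimal sketch: default (anchor) boxes for one feature map, SSD-style.
import numpy as np

def default_boxes(fmap_size=8, scale=0.2, aspect_ratios=(1.0, 2.0, 0.5)):
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size   # cell center, normalized coords
            for ar in aspect_ratios:
                w, h = scale * np.sqrt(ar), scale / np.sqrt(ar)
                boxes.append([cx, cy, w, h])                        # (center_x, center_y, width, height)
    return np.array(boxes)

boxes = default_boxes()          # shape: (8*8*3, 4), one box per cell per aspect ratio
```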

Journal ArticleDOI
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, +514 more (90 institutions)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

Journal ArticleDOI
TL;DR: The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects, implementing Satterthwaite's method for approximating degrees of freedom for the t and F tests.
Abstract: One of the frequent questions by users of the mixed model function lmer of the lme4 package has been: How can I get p values for the F and t tests for objects returned by lmer? The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects. We have implemented Satterthwaite's method for approximating degrees of freedom for the t and F tests. We have also implemented the construction of Type I-III ANOVA tables. Furthermore, one may also obtain the summary as well as the anova table using the Kenward-Roger approximation for denominator degrees of freedom (based on the KRmodcomp function from the pbkrtest package). Some other convenient mixed model analysis tools, such as a step method that performs backward elimination of nonsignificant effects (both random and fixed), calculation of population means, and multiple comparison tests, together with plot facilities, are provided by the package as well.

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
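Editor's note: a minimal PyTorch sketch of the kind of training objective described above: a conditional adversarial term (the discriminator judges the input paired with either the real or the generated output) plus an L1 reconstruction term. `G`, `D`, the tensors, and the weighting are placeholders, not the released pix2pix code.

```python
# Minimal sketch: conditional-GAN + L1 generator objective (pix2pix-style).
import torch
import torch.nn.functional as F

def generator_loss(G, D, x, y, lambda_l1=100.0):
    """x: input image batch, y: target image batch; G, D: generator/discriminator modules."""
    fake = G(x)
    # Adversarial term: the discriminator sees (input, output) pairs concatenated on channels.
    pred_fake = D(torch.cat([x, fake], dim=1))
    adv = F.binary_cross_entropy_with_logits(pred_fake, torch.ones_like(pred_fake))
    # Reconstruction term: L1 distance to the ground-truth output.
    l1 = F.l1_loss(fake, y)
    return adv + lambda_l1 * l1
```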


Posted Content
TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Abstract: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.

Proceedings ArticleDOI
07 Dec 2015
TL;DR: In this paper, a Parametric Rectified Linear Unit (PReLU) was proposed to improve model fitting with nearly zero extra computational cost and little overfitting risk, which achieved a 4.94% top-5 test error on ImageNet 2012 classification dataset.
Abstract: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on the learnable activation and advanced initialization, we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66% [33]). To our knowledge, our result is the first to surpass the reported human-level performance (5.1%, [26]) on this dataset.
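Editor's note: a minimal PyTorch sketch of the two ingredients in this abstract: a PReLU activation with a learnable negative slope, and rectifier-aware ("He"/Kaiming) initialization, which scales weights by sqrt(2 / fan_in). Layer sizes are arbitrary and this uses the standard PyTorch APIs rather than the authors' code.

```python
# Minimal sketch: PReLU activation plus rectifier-aware (Kaiming/He) initialization.
import torch
import torch.nn as nn

layer = nn.Linear(256, 128)
# Kaiming-normal init: std = sqrt(2 / fan_in), derived for rectifier nonlinearities.
nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
nn.init.zeros_(layer.bias)

prelu = nn.PReLU(num_parameters=1, init=0.25)   # learnable slope for negative inputs

x = torch.randn(32, 256)
out = prelu(layer(x))
```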

Proceedings ArticleDOI
01 Oct 2017
TL;DR: CycleGAN as discussed by the authors learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.
Abstract: Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F : Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
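Editor's note: a minimal PyTorch sketch of the cycle-consistency term described above: mapping X to Y and back (and Y to X and back) should reproduce the original images, enforced with an L1 penalty alongside the adversarial losses (not shown). `G`, `F_inv`, the batches, and the weight are placeholders.

```python
# Minimal sketch: cycle-consistency loss for unpaired translation (CycleGAN-style).
import torch.nn.functional as F

def cycle_consistency_loss(G, F_inv, real_x, real_y, lambda_cyc=10.0):
    """G: X -> Y, F_inv: Y -> X. Adversarial losses for the two mappings are computed separately."""
    forward_cycle = F.l1_loss(F_inv(G(real_x)), real_x)   # F(G(x)) should stay close to x
    backward_cycle = F.l1_loss(G(F_inv(real_y)), real_y)  # G(F(y)) should stay close to y
    return lambda_cyc * (forward_cycle + backward_cycle)
```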

Journal ArticleDOI
TL;DR: This document provides updated normal values for all four cardiac chambers, including three-dimensional echocardiography and myocardial deformation, when possible, on the basis of considerably larger numbers of normal subjects, compiled from multiple databases.
Abstract: The rapid technological developments of the past decade and the changes in echocardiographic practice brought about by these developments have resulted in the need for updated recommendations to the previously published guidelines for cardiac chamber quantification, which was the goal of the joint writing group assembled by the American Society of Echocardiography and the European Association of Cardiovascular Imaging. This document provides updated normal values for all four cardiac chambers, including three-dimensional echocardiography and myocardial deformation, when possible, on the basis of considerably larger numbers of normal subjects, compiled from multiple databases. In addition, this document attempts to eliminate several minor discrepancies that existed between previously published guidelines.

Book
10 Oct 2016
TL;DR: It is argued that the standard remedies for harmful effects such as factory smoke (making the factory owner liable for the damage, taxing the owner in proportion to the smoke produced, or excluding the factory from residential districts) are inappropriate, in that they lead to results which are not necessarily, or even usually, desirable.
Abstract: This paper is concerned with those actions of business firms which have harmful effects on others. The standard example is that of a factory the smoke from which has harmful effects on those occupying neighbouring properties. The economic analysis of such a situation has usually proceeded in terms of a divergence between the private and social product of the factory, in which economists have largely followed the treatment of Pigou in The Economics of Welfare. The conclusions to which this kind of analysis seems to have led most economists is that it would be desirable to make the owner of the factory liable for the damage caused to those injured by the smoke, or alternatively, to place a tax on the factory owner varying with the amount of smoke produced and equivalent in money terms to the damage it would cause, or finally, to exclude the factory from residential districts (and presumably from other areas in which the emission of smoke would have harmful effects on others). It is my contention that the suggested courses of action are inappropriate, in that they lead to results which are not necessarily, or even usually, desirable.

Proceedings Article
20 Mar 2017
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.
Abstract: We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: this https URL
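Editor's note: the paper's own code link is preserved above as-is; as an unofficial illustration only, recent versions of torchvision (an assumption here) ship a Mask R-CNN implementation that can be run for inference in a few lines.

```python
# Unofficial sketch: inference with a pretrained Mask R-CNN from torchvision.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)            # stand-in for a real RGB image scaled to [0, 1]
with torch.no_grad():
    output = model([image])[0]             # dict with 'boxes', 'labels', 'scores', 'masks'
print(output["boxes"].shape, output["masks"].shape)
```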

Journal ArticleDOI
TL;DR: The 2016 World Health Organization Classification of Tumors of the Central Nervous System is both a conceptual and practical advance over its 2007 predecessor and is hoped that it will facilitate clinical, experimental and epidemiological studies that will lead to improvements in the lives of patients with brain tumors.
Abstract: The 2016 World Health Organization Classification of Tumors of the Central Nervous System is both a conceptual and practical advance over its 2007 predecessor. For the first time, the WHO classification of CNS tumors uses molecular parameters in addition to histology to define many tumor entities, thus formulating a concept for how CNS tumor diagnoses should be structured in the molecular era. As such, the 2016 CNS WHO presents major restructuring of the diffuse gliomas, medulloblastomas and other embryonal tumors, and incorporates new entities that are defined by both histology and molecular features, including glioblastoma, IDH-wildtype and glioblastoma, IDH-mutant; diffuse midline glioma, H3 K27M-mutant; RELA fusion-positive ependymoma; medulloblastoma, WNT-activated and medulloblastoma, SHH-activated; and embryonal tumour with multilayered rosettes, C19MC-altered. The 2016 edition has added newly recognized neoplasms, and has deleted some entities, variants and patterns that no longer have diagnostic and/or biological relevance. Other notable changes include the addition of brain invasion as a criterion for atypical meningioma and the introduction of a soft tissue-type grading system for the now combined entity of solitary fibrous tumor / hemangiopericytoma-a departure from the manner by which other CNS tumors are graded. Overall, it is hoped that the 2016 CNS WHO will facilitate clinical, experimental and epidemiological studies that will lead to improvements in the lives of patients with brain tumors.

Posted Content
TL;DR: Conditional adversarial networks (cGANs), as discussed by the authors, are a general-purpose solution to image-to-image translation problems and can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

Proceedings ArticleDOI
13 Aug 2016
TL;DR: In this article, the authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around the prediction, together with a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing that selection task as a submodular optimization problem.
Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.
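Editor's note: a compact sketch of the core LIME idea (not the released `lime` package): sample perturbations around an instance, weight them by proximity, and fit a simple weighted linear model whose coefficients serve as the local explanation. The black-box `predict_fn`, the perturbation scheme, and the kernel width are illustrative placeholders for the tabular case.

```python
# Minimal sketch: a LIME-style local linear surrogate for one instance.
import numpy as np
from sklearn.linear_model import Ridge

def explain_instance(predict_fn, x, n_samples=1000, kernel_width=0.75, rng=None):
    """predict_fn: maps an (n, d) array to class-1 probabilities; x: the (d,) instance."""
    rng = rng or np.random.default_rng(0)
    # Perturb the instance with Gaussian noise (tabular case; text/images use feature masking instead).
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))
    preds = predict_fn(Z)
    # Proximity kernel: closer perturbations get more weight in the surrogate fit.
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / (kernel_width ** 2))
    # Weighted linear model; its coefficients are the local explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights)
    return surrogate.coef_

# Usage with a toy black box:
coef = explain_instance(lambda Z: 1 / (1 + np.exp(-(Z[:, 0] - 2 * Z[:, 1]))), np.array([0.5, -1.0]))
```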

Journal ArticleDOI
TL;DR: The overall cancer death rate decreased from 215.1 (per 100,000 population) in 1991 to 168.7 in 2011, a total relative decline of 22%.
Abstract: Each year the American Cancer Society estimates the numbers of new cancer cases and deaths that will occur in the United States in the current year and compiles the most recent data on cancer incidence, mortality, and survival. Incidence data were collected by the National Cancer Institute (Surveillance, Epidemiology, and End Results [SEER] Program), the Centers for Disease Control and Prevention (National Program of Cancer Registries), and the North American Association of Central Cancer Registries. Mortality data were collected by the National Center for Health Statistics. A total of 1,658,370 new cancer cases and 589,430 cancer deaths are projected to occur in the United States in 2015. During the most recent 5 years for which there are data (2007-2011), delay-adjusted cancer incidence rates (13 oldest SEER registries) declined by 1.8% per year in men and were stable in women, while cancer death rates nationwide decreased by 1.8% per year in men and by 1.4% per year in women. The overall cancer death rate decreased from 215.1 (per 100,000 population) in 1991 to 168.7 in 2011, a total relative decline of 22%. However, the magnitude of the decline varied by state, and was generally lowest in the South (∼15%) and highest in the Northeast (≥20%). For example, there were declines of 25% to 30% in Maryland, New Jersey, Massachusetts, New York, and Delaware, which collectively averted 29,000 cancer deaths in 2011 as a result of this progress. Further gains can be accelerated by applying existing cancer control knowledge across all segments of the population.

Proceedings ArticleDOI
02 Nov 2016
TL;DR: TensorFlow as mentioned in this paper is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.
Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
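Editor's note: a minimal sketch of the dataflow idea in current TensorFlow, assuming a TF 2.x installation: `tf.function` traces a Python function into a graph that TensorFlow can then place and optimize across devices, with mutable shared state held in variables. This uses the modern API rather than the graph/session interface contemporary with the paper.

```python
# Minimal sketch: tracing a computation into a TensorFlow dataflow graph.
import tensorflow as tf

@tf.function                       # traces the Python function into a graph
def affine(x, w, b):
    return tf.matmul(x, w) + b

x = tf.random.normal((4, 3))
w = tf.Variable(tf.random.normal((3, 2)))   # shared, mutable state lives in variables
b = tf.Variable(tf.zeros((2,)))

y = affine(x, w, b)
graph = affine.get_concrete_function(x, w, b).graph   # the traced dataflow graph
print(len(graph.as_graph_def().node), "graph nodes")
```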

Journal ArticleDOI
Peter A. R. Ade, Nabila Aghanim, Monique Arnaud, M. Ashdown, +334 more (82 institutions)
TL;DR: In this article, the authors present a cosmological analysis based on full-mission Planck observations of temperature and polarization anisotropies of the cosmic microwave background (CMB) radiation.
Abstract: This paper presents cosmological results based on full-mission Planck observations of temperature and polarization anisotropies of the cosmic microwave background (CMB) radiation. Our results are in very good agreement with the 2013 analysis of the Planck nominal-mission temperature data, but with increased precision. The temperature and polarization power spectra are consistent with the standard spatially-flat 6-parameter ΛCDM cosmology with a power-law spectrum of adiabatic scalar perturbations (denoted "base ΛCDM" in this paper). From the Planck temperature data combined with Planck lensing, for this cosmology we find a Hubble constant, H0 = (67.8 ± 0.9) km s^-1 Mpc^-1, a matter density parameter Ωm = 0.308 ± 0.012, and a tilted scalar spectral index with ns = 0.968 ± 0.006, consistent with the 2013 analysis. Note that in this abstract we quote 68% confidence limits on measured parameters and 95% upper limits on other parameters. We present the first results of polarization measurements with the Low Frequency Instrument at large angular scales. Combined with the Planck temperature and lensing data, these measurements give a reionization optical depth of τ = 0.066 ± 0.016, corresponding to a reionization redshift of . These results are consistent with those from WMAP polarization measurements cleaned for dust emission using 353-GHz polarization maps from the High Frequency Instrument. We find no evidence for any departure from base ΛCDM in the neutrino sector of the theory; for example, combining Planck observations with other astrophysical data we find Neff = 3.15 ± 0.23 for the effective number of relativistic degrees of freedom, consistent with the value Neff = 3.046 of the Standard Model of particle physics. The sum of neutrino masses is constrained to ∑ mν < 0.23 eV. The spatial curvature of our Universe is found to be very close to zero, with | ΩK | < 0.005. Adding a tensor component as a single-parameter extension to base ΛCDM we find an upper limit on the tensor-to-scalar ratio of r0.002 < 0.11, consistent with the Planck 2013 results and consistent with the B-mode polarization constraints from a joint analysis of BICEP2, Keck Array, and Planck (BKP) data. Adding the BKP B-mode data to our analysis leads to a tighter constraint of r0.002 < 0.09 and disfavours inflationary models with a V(φ) ∝ φ^2 potential. The addition of Planck polarization data leads to strong constraints on deviations from a purely adiabatic spectrum of fluctuations. We find no evidence for any contribution from isocurvature perturbations or from cosmic defects. Combining Planck data with other astrophysical data, including Type Ia supernovae, the equation of state of dark energy is constrained to w = −1.006 ± 0.045, consistent with the expected value for a cosmological constant. The standard big bang nucleosynthesis predictions for the helium and deuterium abundances for the best-fit Planck base ΛCDM cosmology are in excellent agreement with observations. We also place constraints on annihilating dark matter and on possible deviations from the standard recombination history. In neither case do we find evidence for new physics. The Planck results for base ΛCDM are in good agreement with baryon acoustic oscillation data and with the JLA sample of Type Ia supernovae. However, as in the 2013 analysis, the amplitude of the fluctuation spectrum is found to be higher than inferred from some analyses of rich cluster counts and weak gravitational lensing. We show that these tensions cannot easily be resolved with simple modifications of the base ΛCDM cosmology. Apart from these tensions, the base ΛCDM cosmology provides an excellent description of the Planck CMB observations and many other astrophysical data sets.

Journal ArticleDOI
TL;DR: The latest version of STRING more than doubles the number of organisms it covers, and offers an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input.
Abstract: Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.