
Proceedings ArticleDOI
TL;DR: This paper proposes a novel sparsity-aware algorithm for sparse data and a weighted quantile sketch for approximate tree learning, and provides insights on cache access patterns, data compression, and sharding used to build a scalable tree boosting system called XGBoost.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.
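Editor's note: as a brief illustration of the kind of workload the abstract describes, here is a minimal, hedged sketch using the open-source xgboost Python package on sparse input; the data, parameter values, and the choice of `tree_method="approx"` (which selects quantile-sketch-based split finding) are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch (not the paper's code): training XGBoost on sparse data.
# Assumes the `xgboost` package is installed; X_sparse and y are hypothetical.
import numpy as np
import scipy.sparse as sp
import xgboost as xgb

rng = np.random.default_rng(0)
X_sparse = sp.random(1000, 50, density=0.05, format="csr", random_state=0)
y = rng.integers(0, 2, size=1000)

dtrain = xgb.DMatrix(X_sparse, label=y)   # DMatrix accepts sparse input directly
params = {
    "objective": "binary:logistic",
    "max_depth": 6,
    "eta": 0.3,
    "tree_method": "approx",              # quantile-sketch-based approximate split finding
}
booster = xgb.train(params, dtrain, num_boost_round=50)
preds = booster.predict(dtrain)
```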

13,333 citations


28 Oct 2017
TL;DR: An automatic differentiation module of PyTorch is described — a library designed to enable rapid research on machine learning models that differentiates purely imperative programs, with a focus on extensibility and low overhead.
Abstract: In this article, we describe an automatic differentiation module of PyTorch — a library designed to enable rapid research on machine learning models. It builds upon a few projects, most notably Lua Torch, Chainer, and HIPS Autograd [4], and provides a high performance environment with easy access to automatic differentiation of models executed on different devices (CPU and GPU). To make prototyping easier, PyTorch does not follow the symbolic approach used in many other deep learning frameworks, but focuses on differentiation of purely imperative programs, with a focus on extensibility and low overhead. Note that this preprint is a draft of certain sections from an upcoming paper covering all PyTorch features.
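Editor's note: a minimal sketch of what "differentiation of purely imperative programs" looks like in PyTorch; the tensors and the branch are illustrative, not taken from the paper.

```python
# Minimal sketch: reverse-mode autodiff over an ordinary imperative program.
import torch

x = torch.randn(3, requires_grad=True)   # leaf tensors tracked by autograd
w = torch.randn(3, requires_grad=True)

# Plain Python control flow is differentiated as it executes.
y = (w * x).sum()
if y > 0:
    loss = y ** 2
else:
    loss = -y

loss.backward()            # populates .grad on the leaf tensors
print(x.grad, w.grad)
```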

13,268 citations


Journal ArticleDOI
TL;DR: There is evidence that human-to-human transmission has occurred among close contacts since the middle of December 2019 and considerable efforts to reduce transmission will be required to control outbreaks if similar dynamics apply elsewhere.
Abstract: Background The initial cases of novel coronavirus (2019-nCoV)–infected pneumonia (NCIP) occurred in Wuhan, Hubei Province, China, in December 2019 and January 2020. We analyzed data on the...

13,101 citations


Journal ArticleDOI
TL;DR: Many of the estimated cancer cases and deaths can be prevented through reducing the prevalence of risk factors, while increasing the effectiveness of clinical care delivery, particularly for those living in rural areas and in disadvantaged populations.
Abstract: With increasing incidence and mortality, cancer is the leading cause of death in China and is a major public health problem. Because of China's massive population (1.37 billion), previous national incidence and mortality estimates have been limited to small samples of the population using data from the 1990s or based on a specific year. With high-quality data from an additional number of population-based registries now available through the National Central Cancer Registry of China, the authors analyzed data from 72 local, population-based cancer registries (2009-2011), representing 6.5% of the population, to estimate the number of new cases and cancer deaths for 2015. Data from 22 registries were used for trend analyses (2000-2011). The results indicated that an estimated 4,292,000 new cancer cases and 2,814,000 cancer deaths would occur in China in 2015, with lung cancer being the most common incident cancer and the leading cause of cancer death. Stomach, esophageal, and liver cancers were also commonly diagnosed and were identified as leading causes of cancer death. Residents of rural areas had significantly higher age-standardized (Segi population) incidence and mortality rates for all cancers combined than urban residents (213.6 per 100,000 vs 191.5 per 100,000 for incidence; 149.0 per 100,000 vs 109.5 per 100,000 for mortality, respectively). For all cancers combined, the incidence rates were stable during 2000 through 2011 for males (+0.2% per year; P = .1), whereas they increased significantly (+2.2% per year; P < .05) among females. In contrast, the mortality rates since 2006 have decreased significantly for both males (-1.4% per year; P < .05) and females (-1.1% per year; P < .05). Many of the estimated cancer cases and deaths can be prevented through reducing the prevalence of risk factors, while increasing the effectiveness of clinical care delivery, particularly for those living in rural areas and in disadvantaged populations.

13,073 citations


Journal ArticleDOI
TL;DR: GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules, and provides a rich set of calculation types.

12,985 citations


Book
17 Apr 2015
TL;DR: A "balanced scorecard" is developed: a new performance measurement system that gives top managers a fast but comprehensive view of the business, complementing financial measures with three sets of operational measures having to do with customer satisfaction, internal processes, and the organization's ability to learn and improve.
Abstract: Frustrated by the inadequacies of traditional performance measurement systems, some managers have abandoned financial measures like return on equity and earnings per share. "Make operational improvements and the numbers will follow," the argument goes. But managers do not want to choose between financial and operational measures. Executives want a balanced presentation of measures that allow them to view the company from several perspectives simultaneously. During a year-long research project with 12 companies at the leading edge of performance measurement, the authors developed a "balanced scorecard," a new performance measurement system that gives top managers a fast but comprehensive view of the business. The balanced scorecard includes financial measures that tell the results of actions already taken. And it complements those financial measures with three sets of operational measures having to do with customer satisfaction, internal processes, and the organization's ability to learn and improve--the activities that drive future financial performance. Managers can create a balanced scorecard by translating their company's strategy and mission statements into specific goals and measures. To create the part of the scorecard that focuses on the customer perspective, for example, executives at Electronic Circuits Inc. established general goals for customer performance: get standard products to market sooner, improve customers' time-to-market, become customers' supplier of choice through partnerships, and develop innovative products tailored to customer needs. Managers translated these elements of strategy into four specific goals and identified a measure for each.

12,976 citations


Posted Content
TL;DR: This work shows that the acoustic model of a heavily used commercial system can be significantly improved by distilling the knowledge in an ensemble of models into a single model, and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
Abstract: A very simple way to improve the performance of almost any machine learning algorithm is to train many different models on the same data and then to average their predictions. Unfortunately, making predictions using a whole ensemble of models is cumbersome and may be too computationally expensive to allow deployment to a large number of users, especially if the individual models are large neural nets. Caruana and his collaborators have shown that it is possible to compress the knowledge in an ensemble into a single model which is much easier to deploy and we develop this approach further using a different compression technique. We achieve some surprising results on MNIST and we show that we can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model. We also introduce a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse. Unlike a mixture of experts, these specialist models can be trained rapidly and in parallel.
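Editor's note: a minimal PyTorch sketch of the soft-target idea described above: the student is trained to match the teacher's temperature-softened class probabilities in addition to the hard labels. The function name, the temperature, and the mixing weight are illustrative assumptions, not the paper's exact recipe.

```python
# Minimal sketch (not the paper's code): knowledge-distillation loss with soft targets.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft-target term: KL divergence between temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-label term: ordinary cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage with dummy tensors (teacher logits would normally be detached):
s = torch.randn(8, 10)
t = torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = distillation_loss(s, t, y)
```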

12,857 citations


Journal ArticleDOI
TL;DR: For assessing discriminant validity in variance-based structural equation modeling, the dominant approaches (the Fornell-Larcker criterion and the examination of cross-loadings) do not reliably detect a lack of validity in common research situations; the heterotrait-monotrait ratio of correlations is proposed as a superior alternative.
Abstract: Discriminant validity assessment has become a generally accepted prerequisite for analyzing relationships between latent variables. For variance-based structural equation modeling, such as partial least squares, the Fornell-Larcker criterion and the examination of cross-loadings are the dominant approaches for evaluating discriminant validity. By means of a simulation study, we show that these approaches do not reliably detect the lack of discriminant validity in common research situations. We therefore propose an alternative approach, based on the multitrait-multimethod matrix, to assess discriminant validity: the heterotrait-monotrait ratio of correlations. We demonstrate its superior performance by means of a Monte Carlo simulation study, in which we compare the new approach to the Fornell-Larcker criterion and the assessment of (partial) cross-loadings. Finally, we provide guidelines on how to handle discriminant validity issues in variance-based structural equation modeling.
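Editor's note: a small numpy sketch of the heterotrait-monotrait (HTMT) ratio for two constructs, given an item correlation matrix and index lists for each construct's items. All names, the simulated data, and the rough cutoff mentioned in the comment are illustrative assumptions, not a reimplementation of the authors' code.

```python
# Minimal sketch: HTMT ratio for two constructs from an item correlation matrix.
import numpy as np

def htmt(R, items_i, items_j):
    """R: item correlation matrix; items_i/items_j: item indices per construct."""
    # Mean correlation between items of different constructs (heterotrait-heteromethod).
    hetero = R[np.ix_(items_i, items_j)].mean()

    def mean_within(items):
        # Mean correlation among distinct items of the same construct
        # (monotrait-heteromethod): upper triangle, diagonal excluded.
        sub = R[np.ix_(items, items)]
        return sub[np.triu_indices_from(sub, k=1)].mean()

    return hetero / np.sqrt(mean_within(items_i) * mean_within(items_j))

# Example with a hypothetical two-construct, six-item dataset (3 items per construct):
rng = np.random.default_rng(0)
f1, f2 = rng.normal(size=(2, 500))
items = np.column_stack([f1 + 0.5 * rng.normal(size=500) for _ in range(3)] +
                        [f2 + 0.5 * rng.normal(size=500) for _ in range(3)])
R = np.corrcoef(items, rowvar=False)
print(htmt(R, [0, 1, 2], [3, 4, 5]))   # values well below commonly cited cutoffs (~0.85/0.90) suggest discriminant validity
```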

12,855 citations


Journal ArticleDOI
TL;DR: SciPy as discussed by the authors is an open source scientific computing library for the Python programming language, which includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics.
Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.
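Editor's note: a short usage sketch touching a few of the subpackages the abstract lists (integration, minimization, linear algebra); the array contents are arbitrary and the calls are standard SciPy API, not code from the paper.

```python
# Minimal sketch: a few SciPy subpackages mentioned in the abstract.
import numpy as np
from scipy import integrate, optimize, linalg

# Numerical integration of sin(x) over [0, pi] (exact value: 2).
area, err = integrate.quad(np.sin, 0, np.pi)

# Unconstrained minimization of a simple quadratic.
res = optimize.minimize(lambda v: (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2, x0=[0.0, 0.0])

# Dense linear algebra: solve Ax = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = linalg.solve(A, b)

print(area, res.x, x)
```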

12,774 citations


Posted Content
TL;DR: The authors present some updates to YOLO!
Abstract: We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that's pretty swell. It's a little bigger than last time but more accurate. It's still fast though, don't worry. At 320x320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8x faster. As always, all the code is online at this https URL

12,770 citations


Posted Content
TL;DR: PyTorch as discussed by the authors is a machine learning library that provides an imperative and Pythonic programming style that makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs.
Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it provides an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several common benchmarks.
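Editor's note: a minimal sketch of the "code as a model" style the abstract describes: the model is an ordinary Python class, runs eagerly, and moves to a GPU if one is available. Layer sizes and names are arbitrary.

```python
# Minimal sketch: an imperative PyTorch model, optionally on a GPU.
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.fc2 = nn.Linear(32, 2)

    def forward(self, x):
        # Plain Python, executed eagerly: easy to step through in a debugger.
        return self.fc2(torch.relu(self.fc1(x)))

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyNet().to(device)
out = model(torch.randn(4, 16, device=device))
```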

Posted Content
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
Abstract: While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not necessary and a pure transformer applied directly to sequences of image patches can perform very well on image classification tasks. When pre-trained on large amounts of data and transferred to multiple mid-sized or small image recognition benchmarks (ImageNet, CIFAR-100, VTAB, etc.), Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
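Editor's note: a minimal sketch of the patch-embedding front end the abstract alludes to: the image is split into fixed-size patches, each patch is linearly projected, a class token is prepended, and position embeddings are added before a standard Transformer encoder (not shown). The sizes follow a common ViT configuration but are illustrative here.

```python
# Minimal sketch: turning an image into a sequence of patch embeddings (ViT-style).
import torch
import torch.nn as nn

class PatchEmbed(nn.Module):
    def __init__(self, img_size=224, patch_size=16, in_chans=3, dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A conv with kernel = stride = patch size acts as a per-patch linear projection.
        self.proj = nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, self.num_patches + 1, dim))

    def forward(self, x):                                 # x: (B, C, H, W)
        x = self.proj(x).flatten(2).transpose(1, 2)       # (B, num_patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        return torch.cat([cls, x], dim=1) + self.pos_embed

tokens = PatchEmbed()(torch.randn(2, 3, 224, 224))        # (2, 197, 768), fed to a Transformer encoder
```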

Book ChapterDOI
TL;DR: SSD as mentioned in this paper discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, and combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes.
Abstract: We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. Our SSD model is simple relative to methods that require object proposals because it completely eliminates proposal generation and subsequent pixel or feature resampling stage and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. Compared to other single stage methods, SSD has much better accuracy, even with a smaller input image size. For 300×300 input, SSD achieves 72.1% mAP on VOC2007 test at 58 FPS on a Nvidia Titan X and for 500×500 input, SSD achieves 75.1% mAP, outperforming a comparable state of the art Faster R-CNN model. Code is available at this https URL.
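Editor's note: a small numpy sketch of default-box generation for a single feature map, in the spirit of the abstract: boxes of several aspect ratios at one scale are centered on every cell. The real SSD tiles several feature maps with per-layer scales and extra boxes, which this deliberately omits; all numbers are illustrative.

```python
# Minimal sketch: default (anchor) boxes for one feature map, SSD-style.
import numpy as np

def default_boxes(fmap_size=8, scale=0.2, aspect_ratios=(1.0, 2.0, 0.5)):
    boxes = []
    for i in range(fmap_size):
        for j in range(fmap_size):
            cx, cy = (j + 0.5) / fmap_size, (i + 0.5) / fmap_size   # cell center, normalized coords
            for ar in aspect_ratios:
                w, h = scale * np.sqrt(ar), scale / np.sqrt(ar)
                boxes.append([cx, cy, w, h])                        # (center_x, center_y, width, height)
    return np.array(boxes)

boxes = default_boxes()          # shape: (8*8*3, 4), one box per cell per aspect ratio
```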

Journal ArticleDOI
Adam Auton, Gonçalo R. Abecasis, David Altshuler, Richard Durbin, +514 more (90 institutions)
01 Oct 2015-Nature
TL;DR: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations, and has reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping.
Abstract: The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies.

Journal ArticleDOI
TL;DR: The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects, implementing Satterthwaite's method for approximating degrees of freedom for the t and F tests.
Abstract: One of the frequent questions by users of the mixed model function lmer of the lme4 package has been: How can I get p values for the F and t tests for objects returned by lmer? The lmerTest package extends the 'lmerMod' class of the lme4 package by overloading the anova and summary functions to provide p values for tests of fixed effects. We have implemented Satterthwaite's method for approximating degrees of freedom for the t and F tests. We have also implemented the construction of Type I-III ANOVA tables. Furthermore, one may also obtain the summary as well as the anova table using the Kenward-Roger approximation for denominator degrees of freedom (based on the KRmodcomp function from the pbkrtest package). Some other convenient mixed model analysis tools, such as a step method that performs backward elimination of nonsignificant effects (both random and fixed), calculation of population means, and multiple comparison tests, together with plot facilities, are provided by the package as well.

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
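Editor's note: a minimal PyTorch sketch of the kind of training objective described above: a conditional adversarial term (the discriminator judges the input paired with either the real or the generated output) plus an L1 reconstruction term. `G`, `D`, the tensors, and the weighting are placeholders, not the released pix2pix code.

```python
# Minimal sketch: conditional-GAN + L1 generator objective (pix2pix-style).
import torch
import torch.nn.functional as F

def generator_loss(G, D, x, y, lambda_l1=100.0):
    """x: input image batch, y: target image batch; G, D: generator/discriminator modules."""
    fake = G(x)
    # Adversarial term: the discriminator sees (input, output) pairs concatenated on channels.
    pred_fake = D(torch.cat([x, fake], dim=1))
    adv = F.binary_cross_entropy_with_logits(pred_fake, torch.ones_like(pred_fake))
    # Reconstruction term: L1 distance to the ground-truth output.
    l1 = F.l1_loss(fake, y)
    return adv + lambda_l1 * l1
```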


Posted Content
TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Abstract: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.

Proceedings ArticleDOI
07 Dec 2015
TL;DR: In this paper, a Parametric Rectified Linear Unit (PReLU) was proposed to improve model fitting with nearly zero extra computational cost and little overfitting risk, which achieved a 4.94% top-5 test error on ImageNet 2012 classification dataset.
Abstract: Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on the learnable activation and advanced initialization, we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66% [33]). To our knowledge, our result is the first to surpass the reported human-level performance (5.1%, [26]) on this dataset.
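Editor's note: a minimal PyTorch sketch of the two ingredients in this abstract: a PReLU activation with a learnable negative slope, and rectifier-aware ("He"/Kaiming) initialization, which scales weights by sqrt(2 / fan_in). Layer sizes are arbitrary and this uses the standard PyTorch APIs rather than the authors' code.

```python
# Minimal sketch: PReLU activation plus rectifier-aware (Kaiming/He) initialization.
import torch
import torch.nn as nn

layer = nn.Linear(256, 128)
# Kaiming-normal init: std = sqrt(2 / fan_in), derived for rectifier nonlinearities.
nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
nn.init.zeros_(layer.bias)

prelu = nn.PReLU(num_parameters=1, init=0.25)   # learnable slope for negative inputs

x = torch.randn(32, 256)
out = prelu(layer(x))
```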

Proceedings ArticleDOI
01 Oct 2017
TL;DR: CycleGAN as discussed by the authors learns a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.
Abstract: Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples. Our goal is to learn a mapping G : X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping F : Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
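Editor's note: a minimal PyTorch sketch of the cycle-consistency term described above: mapping X to Y and back (and Y to X and back) should reproduce the original images, enforced with an L1 penalty alongside the adversarial losses (not shown). `G`, `F_inv`, the batches, and the weight are placeholders.

```python
# Minimal sketch: cycle-consistency loss for unpaired translation (CycleGAN-style).
import torch.nn.functional as F

def cycle_consistency_loss(G, F_inv, real_x, real_y, lambda_cyc=10.0):
    """G: X -> Y, F_inv: Y -> X. Adversarial losses for the two mappings are computed separately."""
    forward_cycle = F.l1_loss(F_inv(G(real_x)), real_x)   # F(G(x)) should stay close to x
    backward_cycle = F.l1_loss(G(F_inv(real_y)), real_y)  # G(F(y)) should stay close to y
    return lambda_cyc * (forward_cycle + backward_cycle)
```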

Journal ArticleDOI
TL;DR: This document provides updated normal values for all four cardiac chambers, including three-dimensional echocardiography and myocardial deformation, when possible, on the basis of considerably larger numbers of normal subjects, compiled from multiple databases.
Abstract: The rapid technological developments of the past decade and the changes in echocardiographic practice brought about by these developments have resulted in the need for updated recommendations to the previously published guidelines for cardiac chamber quantification, which was the goal of the joint writing group assembled by the American Society of Echocardiography and the European Association of Cardiovascular Imaging. This document provides updated normal values for all four cardiac chambers, including three-dimensional echocardiography and myocardial deformation, when possible, on the basis of considerably larger numbers of normal subjects, compiled from multiple databases. In addition, this document attempts to eliminate several minor discrepancies that existed between previously published guidelines.

Book
10 Oct 2016
TL;DR: It is argued that the standard remedies for harmful effects such as factory smoke (making the factory owner liable for the damage, taxing the owner in proportion to the smoke produced, or excluding the factory from residential districts) are inappropriate, in that they lead to results which are not necessarily, or even usually, desirable.
Abstract: This paper is concerned with those actions of business firms which have harmful effects on others. The standard example is that of a factory the smoke from which has harmful effects on those occupying neighbouring properties. The economic analysis of such a situation has usually proceeded in terms of a divergence between the private and social product of the factory, in which economists have largely followed the treatment of Pigou in The Economics of Welfare. The conclusions to which this kind of analysis seems to have led most economists is that it would be desirable to make the owner of the factory liable for the damage caused to those injured by the smoke, or alternatively, to place a tax on the factory owner varying with the amount of smoke produced and equivalent in money terms to the damage it would cause, or finally, to exclude the factory from residential districts (and presumably from other areas in which the emission of smoke would have harmful effects on others). It is my contention that the suggested courses of action are inappropriate, in that they lead to results which are not necessarily, or even usually, desirable.

Proceedings Article
20 Mar 2017
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation that outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners.
Abstract: We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. The method, called Mask R-CNN, extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition. Mask R-CNN is simple to train and adds only a small overhead to Faster R-CNN, running at 5 fps. Moreover, Mask R-CNN is easy to generalize to other tasks, e.g., allowing us to estimate human poses in the same framework. We show top results in all three tracks of the COCO suite of challenges, including instance segmentation, bounding-box object detection, and person keypoint detection. Without bells and whistles, Mask R-CNN outperforms all existing, single-model entries on every task, including the COCO 2016 challenge winners. We hope our simple and effective approach will serve as a solid baseline and help ease future research in instance-level recognition. Code has been made available at: this https URL
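Editor's note: the paper's own code link is preserved above as-is; as an unofficial illustration only, recent versions of torchvision (an assumption here) ship a Mask R-CNN implementation that can be run for inference in a few lines.

```python
# Unofficial sketch: inference with a pretrained Mask R-CNN from torchvision.
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 480, 640)            # stand-in for a real RGB image scaled to [0, 1]
with torch.no_grad():
    output = model([image])[0]             # dict with 'boxes', 'labels', 'scores', 'masks'
print(output["boxes"].shape, output["masks"].shape)
```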

Journal ArticleDOI
TL;DR: The 2016 World Health Organization Classification of Tumors of the Central Nervous System is both a conceptual and practical advance over its 2007 predecessor and is hoped that it will facilitate clinical, experimental and epidemiological studies that will lead to improvements in the lives of patients with brain tumors.
Abstract: The 2016 World Health Organization Classification of Tumors of the Central Nervous System is both a conceptual and practical advance over its 2007 predecessor. For the first time, the WHO classification of CNS tumors uses molecular parameters in addition to histology to define many tumor entities, thus formulating a concept for how CNS tumor diagnoses should be structured in the molecular era. As such, the 2016 CNS WHO presents major restructuring of the diffuse gliomas, medulloblastomas and other embryonal tumors, and incorporates new entities that are defined by both histology and molecular features, including glioblastoma, IDH-wildtype and glioblastoma, IDH-mutant; diffuse midline glioma, H3 K27M-mutant; RELA fusion-positive ependymoma; medulloblastoma, WNT-activated and medulloblastoma, SHH-activated; and embryonal tumour with multilayered rosettes, C19MC-altered. The 2016 edition has added newly recognized neoplasms, and has deleted some entities, variants and patterns that no longer have diagnostic and/or biological relevance. Other notable changes include the addition of brain invasion as a criterion for atypical meningioma and the introduction of a soft tissue-type grading system for the now combined entity of solitary fibrous tumor / hemangiopericytoma-a departure from the manner by which other CNS tumors are graded. Overall, it is hoped that the 2016 CNS WHO will facilitate clinical, experimental and epidemiological studies that will lead to improvements in the lives of patients with brain tumors.

Posted Content
TL;DR: Conditional adversarial networks (cGANs), as discussed by the authors, are a general-purpose solution to image-to-image translation problems and can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

Proceedings ArticleDOI
13 Aug 2016
TL;DR: In this article, the authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around the prediction, together with a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing that selection task as a submodular optimization problem.
Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.
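Editor's note: a compact sketch of the core LIME idea (not the released `lime` package): sample perturbations around an instance, weight them by proximity, and fit a simple weighted linear model whose coefficients serve as the local explanation. The black-box `predict_fn`, the perturbation scheme, and the kernel width are illustrative placeholders for the tabular case.

```python
# Minimal sketch: a LIME-style local linear surrogate for one instance.
import numpy as np
from sklearn.linear_model import Ridge

def explain_instance(predict_fn, x, n_samples=1000, kernel_width=0.75, rng=None):
    """predict_fn: maps an (n, d) array to class-1 probabilities; x: the (d,) instance."""
    rng = rng or np.random.default_rng(0)
    # Perturb the instance with Gaussian noise (tabular case; text/images use feature masking instead).
    Z = x + rng.normal(scale=0.5, size=(n_samples, x.shape[0]))
    preds = predict_fn(Z)
    # Proximity kernel: closer perturbations get more weight in the surrogate fit.
    dists = np.linalg.norm(Z - x, axis=1)
    weights = np.exp(-(dists ** 2) / (kernel_width ** 2))
    # Weighted linear model; its coefficients are the local explanation.
    surrogate = Ridge(alpha=1.0).fit(Z, preds, sample_weight=weights)
    return surrogate.coef_

# Usage with a toy black box:
coef = explain_instance(lambda Z: 1 / (1 + np.exp(-(Z[:, 0] - 2 * Z[:, 1]))), np.array([0.5, -1.0]))
```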

Journal ArticleDOI
TL;DR: The overall cancer death rate decreased from 215.1 (per 100,000 population) in 1991 to 168.7 in 2011, a total relative decline of 22%.
Abstract: Each year the American Cancer Society estimates the numbers of new cancer cases and deaths that will occur in the United States in the current year and compiles the most recent data on cancer incidence, mortality, and survival. Incidence data were collected by the National Cancer Institute (Surveillance, Epidemiology, and End Results [SEER] Program), the Centers for Disease Control and Prevention (National Program of Cancer Registries), and the North American Association of Central Cancer Registries. Mortality data were collected by the National Center for Health Statistics. A total of 1,658,370 new cancer cases and 589,430 cancer deaths are projected to occur in the United States in 2015. During the most recent 5 years for which there are data (2007-2011), delay-adjusted cancer incidence rates (13 oldest SEER registries) declined by 1.8% per year in men and were stable in women, while cancer death rates nationwide decreased by 1.8% per year in men and by 1.4% per year in women. The overall cancer death rate decreased from 215.1 (per 100,000 population) in 1991 to 168.7 in 2011, a total relative decline of 22%. However, the magnitude of the decline varied by state, and was generally lowest in the South (∼15%) and highest in the Northeast (≥20%). For example, there were declines of 25% to 30% in Maryland, New Jersey, Massachusetts, New York, and Delaware, which collectively averted 29,000 cancer deaths in 2011 as a result of this progress. Further gains can be accelerated by applying existing cancer control knowledge across all segments of the population.

Proceedings ArticleDOI
02 Nov 2016
TL;DR: TensorFlow as mentioned in this paper is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.
Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.
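Editor's note: a minimal sketch of the dataflow idea in current TensorFlow, assuming a TF 2.x installation: `tf.function` traces a Python function into a graph that TensorFlow can then place and optimize across devices, with mutable shared state held in variables. This uses the modern API rather than the graph/session interface contemporary with the paper.

```python
# Minimal sketch: tracing a computation into a TensorFlow dataflow graph.
import tensorflow as tf

@tf.function                       # traces the Python function into a graph
def affine(x, w, b):
    return tf.matmul(x, w) + b

x = tf.random.normal((4, 3))
w = tf.Variable(tf.random.normal((3, 2)))   # shared, mutable state lives in variables
b = tf.Variable(tf.zeros((2,)))

y = affine(x, w, b)
graph = affine.get_concrete_function(x, w, b).graph   # the traced dataflow graph
print(len(graph.as_graph_def().node), "graph nodes")
```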

Journal ArticleDOI
Peter A. R. Ade, Nabila Aghanim, Monique Arnaud, M. Ashdown, +334 more (82 institutions)
TL;DR: In this article, the authors present a cosmological analysis based on full-mission Planck observations of temperature and polarization anisotropies of the cosmic microwave background (CMB) radiation.
Abstract: This paper presents cosmological results based on full-mission Planck observations of temperature and polarization anisotropies of the cosmic microwave background (CMB) radiation. Our results are in very good agreement with the 2013 analysis of the Planck nominal-mission temperature data, but with increased precision. The temperature and polarization power spectra are consistent with the standard spatially-flat 6-parameter ΛCDM cosmology with a power-law spectrum of adiabatic scalar perturbations (denoted "base ΛCDM" in this paper). From the Planck temperature data combined with Planck lensing, for this cosmology we find a Hubble constant, H0 = (67.8 ± 0.9) km s^-1 Mpc^-1, a matter density parameter Ωm = 0.308 ± 0.012, and a tilted scalar spectral index with ns = 0.968 ± 0.006, consistent with the 2013 analysis. Note that in this abstract we quote 68% confidence limits on measured parameters and 95% upper limits on other parameters. We present the first results of polarization measurements with the Low Frequency Instrument at large angular scales. Combined with the Planck temperature and lensing data, these measurements give a reionization optical depth of τ = 0.066 ± 0.016, corresponding to a reionization redshift of . These results are consistent with those from WMAP polarization measurements cleaned for dust emission using 353-GHz polarization maps from the High Frequency Instrument. We find no evidence for any departure from base ΛCDM in the neutrino sector of the theory; for example, combining Planck observations with other astrophysical data we find Neff = 3.15 ± 0.23 for the effective number of relativistic degrees of freedom, consistent with the value Neff = 3.046 of the Standard Model of particle physics. The sum of neutrino masses is constrained to ∑ mν < 0.23 eV. The spatial curvature of our Universe is found to be very close to zero, with | ΩK | < 0.005. Adding a tensor component as a single-parameter extension to base ΛCDM we find an upper limit on the tensor-to-scalar ratio of r0.002 < 0.11, consistent with the Planck 2013 results and consistent with the B-mode polarization constraints from a joint analysis of BICEP2, Keck Array, and Planck (BKP) data. Adding the BKP B-mode data to our analysis leads to a tighter constraint of r0.002 < 0.09 and disfavours inflationary models with a V(φ) ∝ φ^2 potential. The addition of Planck polarization data leads to strong constraints on deviations from a purely adiabatic spectrum of fluctuations. We find no evidence for any contribution from isocurvature perturbations or from cosmic defects. Combining Planck data with other astrophysical data, including Type Ia supernovae, the equation of state of dark energy is constrained to w = −1.006 ± 0.045, consistent with the expected value for a cosmological constant. The standard big bang nucleosynthesis predictions for the helium and deuterium abundances for the best-fit Planck base ΛCDM cosmology are in excellent agreement with observations. We also place constraints on annihilating dark matter and on possible deviations from the standard recombination history. In neither case do we find evidence for new physics. The Planck results for base ΛCDM are in good agreement with baryon acoustic oscillation data and with the JLA sample of Type Ia supernovae. However, as in the 2013 analysis, the amplitude of the fluctuation spectrum is found to be higher than inferred from some analyses of rich cluster counts and weak gravitational lensing. We show that these tensions cannot easily be resolved with simple modifications of the base ΛCDM cosmology. Apart from these tensions, the base ΛCDM cosmology provides an excellent description of the Planck CMB observations and many other astrophysical data sets.

Journal ArticleDOI
TL;DR: The latest version of STRING more than doubles the number of organisms it covers, and offers an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input.
Abstract: Proteins and their functional interactions form the backbone of the cellular machinery. Their connectivity network needs to be considered for the full understanding of biological phenomena, but the available information on protein-protein associations is incomplete and exhibits varying levels of annotation granularity and reliability. The STRING database aims to collect, score and integrate all publicly available sources of protein-protein interaction information, and to complement these with computational predictions. Its goal is to achieve a comprehensive and objective global network, including direct (physical) as well as indirect (functional) interactions. The latest version of STRING (11.0) more than doubles the number of organisms it covers, to 5090. The most important new feature is an option to upload entire, genome-wide datasets as input, allowing users to visualize subsets as interaction networks and to perform gene-set enrichment analysis on the entire input. For the enrichment analysis, STRING implements well-known classification systems such as Gene Ontology and KEGG, but also offers additional, new classification systems based on high-throughput text-mining as well as on a hierarchical clustering of the association network itself. The STRING resource is available online at https://string-db.org/.