
Showing papers by "University of Washington" published in 2016


Proceedings ArticleDOI
27 Jun 2016
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Abstract: We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class probabilities. A single neural network predicts bounding boxes and class probabilities directly from full images in one evaluation. Since the whole detection pipeline is a single network, it can be optimized end-to-end directly on detection performance. Our unified architecture is extremely fast. Our base YOLO model processes images in real-time at 45 frames per second. A smaller version of the network, Fast YOLO, processes an astounding 155 frames per second while still achieving double the mAP of other real-time detectors. Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background. Finally, YOLO learns very general representations of objects. It outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

27,256 citations
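
To make the regression framing concrete, here is a hypothetical numpy sketch of how a YOLO-style output tensor could be decoded into detections. The grid size, box count, and class count follow the paper's PASCAL VOC configuration, but the `decode` function and its threshold are illustrative stand-ins, not the authors' code.

```python
import numpy as np

S, B, C = 7, 2, 20  # grid cells per side, boxes per cell, classes (VOC setup)

def decode(pred, conf_thresh=0.2):
    """pred: (S, S, B*5 + C) network output; returns (x, y, w, h, score, cls) tuples."""
    detections = []
    for i in range(S):
        for j in range(S):
            cell = pred[i, j]
            class_probs = cell[B * 5:]            # one class distribution per cell
            for b in range(B):
                x, y, w, h, conf = cell[b * 5 : b * 5 + 5]
                scores = conf * class_probs       # class-specific confidence
                c = int(np.argmax(scores))
                if scores[c] >= conf_thresh:
                    # (x, y) are offsets within cell (i, j); w, h are image-relative
                    detections.append(((j + x) / S, (i + y) / S, w, h, float(scores[c]), c))
    return detections

pred = np.random.rand(S, S, B * 5 + C)            # stand-in for a network output
print(len(decode(pred)), "boxes above threshold (before non-max suppression)")
```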


Proceedings ArticleDOI
13 Aug 2016
TL;DR: XGBoost, as discussed by the authors, combines a sparsity-aware algorithm for sparse data with a weighted quantile sketch for approximate tree learning, achieving state-of-the-art results on many machine learning challenges.
Abstract: Tree boosting is a highly effective and widely used machine learning method. In this paper, we describe a scalable end-to-end tree boosting system called XGBoost, which is used widely by data scientists to achieve state-of-the-art results on many machine learning challenges. We propose a novel sparsity-aware algorithm for sparse data and weighted quantile sketch for approximate tree learning. More importantly, we provide insights on cache access patterns, data compression and sharding to build a scalable tree boosting system. By combining these insights, XGBoost scales beyond billions of examples using far fewer resources than existing systems.

14,872 citations
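
A minimal usage sketch of the system described above, assuming the open-source xgboost Python package (`pip install xgboost`); the synthetic data, parameter values, and split are invented for illustration. The injected NaN entries exercise the sparsity-aware split finding, since DMatrix treats NaN as missing by default.

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
X[rng.random(X.shape) < 0.1] = np.nan            # missing values -> sparsity-aware splits

dtrain = xgb.DMatrix(X[:800], label=y[:800])     # DMatrix treats NaN as missing
dtest = xgb.DMatrix(X[800:], label=y[800:])

params = {"objective": "binary:logistic", "max_depth": 4, "eta": 0.1,
          "eval_metric": "logloss"}
bst = xgb.train(params, dtrain, num_boost_round=100,
                evals=[(dtest, "test")], verbose_eval=False)

pred = bst.predict(dtest)                        # predicted probabilities
print("test accuracy:", ((pred > 0.5).astype(int) == y[800:]).mean())
```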


Proceedings ArticleDOI
13 Aug 2016
TL;DR: In this paper, the authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around the prediction, together with a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem.
Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

11,104 citations
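
The core idea of fitting an interpretable model "locally around the prediction" can be sketched from scratch. The sketch below, assuming scikit-learn, perturbs one instance, weights the perturbations by a proximity kernel, and fits a weighted linear surrogate; the released LIME package additionally enforces sparsity (e.g. via K-LASSO), and all names and constants here are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

x0 = X[0]                                        # instance to explain
Z = x0 + rng.normal(scale=0.5, size=(1000, 5))   # local perturbations around x0
fz = black_box.predict_proba(Z)[:, 1]            # black-box outputs on perturbations
d = np.linalg.norm(Z - x0, axis=1)
w = np.exp(-(d ** 2) / 0.5)                      # exponential proximity kernel

surrogate = Ridge(alpha=1.0).fit(Z, fz, sample_weight=w)
for i, coef in enumerate(surrogate.coef_):
    print(f"feature {i}: local weight {coef:+.3f}")
```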


Journal ArticleDOI
B. P. Abbott1, Richard J. Abbott1, T. D. Abbott2, Matthew Abernathy1 +1008 more (96 institutions)
TL;DR: This is the first direct detection of gravitational waves and the first observation of a binary black hole merger, and these observations demonstrate the existence of binary stellar-mass black hole systems.
Abstract: On September 14, 2015 at 09:50:45 UTC the two detectors of the Laser Interferometer Gravitational-Wave Observatory simultaneously observed a transient gravitational-wave signal. The signal sweeps upwards in frequency from 35 to 250 Hz with a peak gravitational-wave strain of $1.0 \times 10^{-21}$. It matches the waveform predicted by general relativity for the inspiral and merger of a pair of black holes and the ringdown of the resulting single black hole. The signal was observed with a matched-filter signal-to-noise ratio of 24 and a false alarm rate estimated to be less than 1 event per 203 000 years, equivalent to a significance greater than $5.1\sigma$. The source lies at a luminosity distance of $410^{+160}_{-180}$ Mpc corresponding to a redshift $z = 0.09^{+0.03}_{-0.04}$. In the source frame, the initial black hole masses are $36^{+5}_{-4} M_\odot$ and $29^{+4}_{-4} M_\odot$, and the final black hole mass is $62^{+4}_{-4} M_\odot$, with $3.0^{+0.5}_{-0.5} M_\odot c^2$ radiated in gravitational waves. All uncertainties define 90% credible intervals. These observations demonstrate the existence of binary stellar-mass black hole systems. This is the first direct detection of gravitational waves and the first observation of a binary black hole merger.

9,596 citations
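
As a toy illustration of the matched-filter statistic quoted above (SNR of 24), the sketch below correlates a unit-norm chirp-like template against white unit-variance noise containing a scaled copy of the signal. Real LIGO searches whiten the data and use physical template banks, so this is only a schematic, and the waveform and amplitude are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 4096                                   # sample rate (Hz)
t = np.arange(0, 1.0, 1 / fs)

# Chirp-like template: frequency sweeping upward under a Gaussian envelope
template = np.sin(2 * np.pi * (35 + 100 * t) * t) * np.exp(-((t - 0.5) ** 2) / 0.01)
template /= np.linalg.norm(template)        # unit norm

data = rng.normal(size=t.size) + 8.0 * template   # white noise + scaled signal

# For unit-variance white noise and a unit-norm template, the correlation at
# each lag is already the matched-filter SNR time series.
snr = np.correlate(data, template, mode="same")
print(f"peak matched-filter SNR ~ {np.abs(snr).max():.1f} (injected amplitude 8)")
```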


Journal ArticleDOI
Monkol Lek, Konrad J. Karczewski1, Konrad J. Karczewski2, Eric Vallabh Minikel2, Eric Vallabh Minikel1, Kaitlin E. Samocha, Eric Banks1, Timothy Fennell1, Anne H. O’Donnell-Luria3, Anne H. O’Donnell-Luria2, Anne H. O’Donnell-Luria1, James S. Ware, Andrew J. Hill1, Andrew J. Hill2, Andrew J. Hill4, Beryl B. Cummings2, Beryl B. Cummings1, Taru Tukiainen2, Taru Tukiainen1, Daniel P. Birnbaum1, Jack A. Kosmicki, Laramie E. Duncan2, Laramie E. Duncan1, Karol Estrada2, Karol Estrada1, Fengmei Zhao1, Fengmei Zhao2, James Zou1, Emma Pierce-Hoffman1, Emma Pierce-Hoffman2, Joanne Berghout5, David Neil Cooper6, Nicole A. Deflaux7, Mark A. DePristo1, Ron Do, Jason Flannick2, Jason Flannick1, Menachem Fromer, Laura D. Gauthier1, Jackie Goldstein1, Jackie Goldstein2, Namrata Gupta1, Daniel P. Howrigan2, Daniel P. Howrigan1, Adam Kiezun1, Mitja I. Kurki1, Mitja I. Kurki2, Ami Levy Moonshine1, Pradeep Natarajan, Lorena Orozco, Gina M. Peloso1, Gina M. Peloso2, Ryan Poplin1, Manuel A. Rivas1, Valentin Ruano-Rubio1, Samuel A. Rose1, Douglas M. Ruderfer8, Khalid Shakir1, Peter D. Stenson6, Christine Stevens1, Brett Thomas2, Brett Thomas1, Grace Tiao1, María Teresa Tusié-Luna, Ben Weisburd1, Hong-Hee Won9, Dongmei Yu, David Altshuler1, David Altshuler10, Diego Ardissino, Michael Boehnke11, John Danesh12, Stacey Donnelly1, Roberto Elosua, Jose C. Florez1, Jose C. Florez2, Stacey Gabriel1, Gad Getz2, Gad Getz1, Stephen J. Glatt13, Christina M. Hultman14, Sekar Kathiresan, Markku Laakso15, Steven A. McCarroll1, Steven A. McCarroll2, Mark I. McCarthy16, Mark I. McCarthy17, Dermot P.B. McGovern18, Ruth McPherson19, Benjamin M. Neale1, Benjamin M. Neale2, Aarno Palotie, Shaun Purcell8, Danish Saleheen20, Jeremiah M. Scharf, Pamela Sklar, Patrick F. Sullivan21, Patrick F. Sullivan14, Jaakko Tuomilehto22, Ming T. Tsuang23, Hugh Watkins16, Hugh Watkins17, James G. Wilson24, Mark J. Daly1, Mark J. Daly2, Daniel G. MacArthur2, Daniel G. MacArthur1 
18 Aug 2016-Nature
TL;DR: The aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC) provides direct evidence for the presence of widespread mutational recurrence.
Abstract: Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.

8,758 citations
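
A hypothetical sketch of the kind of frequency-based filtering the catalogue enables, assuming pandas; the variant IDs, column names, and threshold are invented for illustration.

```python
import pandas as pd

candidates = pd.DataFrame({
    "variant": ["1-55505647-G-T", "2-21229160-C-T", "17-41276045-T-C"],
    "exac_af": [0.0, 0.004, 0.12],           # allele frequency in ExAC
    "consequence": ["stop_gained", "missense", "synonymous"],
})

# For a rare, severe, fully penetrant dominant disorder, variants that are
# common in a 60,706-person reference panel are implausible causes.
rare = candidates[candidates["exac_af"] < 0.0001]
print(rare[["variant", "consequence"]])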


Posted Content
TL;DR: YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories, is introduced; the authors propose improvements to the YOLO detection method, both novel and drawn from prior work, as well as a method to jointly train on object detection and classification.
Abstract: We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster RCNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don't have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.

8,505 citations


Journal ArticleDOI
Daniel J. Klionsky1, Kotb Abdelmohsen2, Akihisa Abe3, Joynal Abedin4 +2519 more (695 institutions)
TL;DR: In this paper, the authors present a set of guidelines for the selection and interpretation of methods for use by investigators who aim to examine macroautophagy and related processes, as well as for reviewers who need to provide realistic and reasonable critiques of papers that are focused on these processes.
Abstract: In 2008 we published the first set of guidelines for standardizing research in autophagy. Since then, research on this topic has continued to accelerate, and many new scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Accordingly, it is important to update these guidelines for monitoring autophagy in different organisms. Various reviews have described the range of assays that have been used for this purpose. Nevertheless, there continues to be confusion regarding acceptable methods to measure autophagy, especially in multicellular eukaryotes. For example, a key point that needs to be emphasized is that there is a difference between measurements that monitor the numbers or volume of autophagic elements (e.g., autophagosomes or autolysosomes) at any stage of the autophagic process versus those that measure flux through the autophagy pathway (i.e., the complete process including the amount and rate of cargo sequestered and degraded). In particular, a block in macroautophagy that results in autophagosome accumulation must be differentiated from stimuli that increase autophagic activity, defined as increased autophagy induction coupled with increased delivery to, and degradation within, lysosomes (in most higher eukaryotes and some protists such as Dictyostelium) or the vacuole (in plants and fungi). In other words, it is especially important that investigators new to the field understand that the appearance of more autophagosomes does not necessarily equate with more autophagy. In fact, in many cases, autophagosomes accumulate because of a block in trafficking to lysosomes without a concomitant change in autophagosome biogenesis, whereas an increase in autolysosomes may reflect a reduction in degradative activity. It is worth emphasizing here that lysosomal digestion is a stage of autophagy and evaluating its competence is a crucial part of the evaluation of autophagic flux, or complete autophagy. Here, we present a set of guidelines for the selection and interpretation of methods for use by investigators who aim to examine macroautophagy and related processes, as well as for reviewers who need to provide realistic and reasonable critiques of papers that are focused on these processes. These guidelines are not meant to be a formulaic set of rules, because the appropriate assays depend in part on the question being asked and the system being used. In addition, we emphasize that no individual assay is guaranteed to be the most appropriate one in every situation, and we strongly recommend the use of multiple assays to monitor autophagy. Along these lines, because of the potential for pleiotropic effects due to blocking autophagy through genetic manipulation, it is imperative to target by gene knockout or RNA interference more than one autophagy-related protein. In addition, some individual Atg proteins, or groups of proteins, are involved in other cellular pathways implying that not all Atg proteins can be used as a specific marker for an autophagic process. In these guidelines, we consider these various methods of assessing autophagy and what information can, or cannot, be obtained from them. Finally, by discussing the merits and limits of particular assays, we hope to encourage technical innovation in the field.

5,187 citations


Journal ArticleDOI
Theo Vos1, Christine Allen1, Megha Arora1, Ryan M Barber1 +696 more (260 institutions)
TL;DR: The Global Burden of Diseases, Injuries, and Risk Factors Study 2015 (GBD 2015), as discussed by the authors, estimated the incidence, prevalence, and years lived with disability for diseases and injuries at the global, regional, and national scale over the period 1990 to 2015.

5,050 citations


Journal ArticleDOI
Haidong Wang1, Mohsen Naghavi1, Christine Allen1, Ryan M Barber1 +841 more (293 institutions)
TL;DR: The Global Burden of Disease 2015 Study provides a comprehensive assessment of all-cause and cause-specific mortality for 249 causes in 195 countries and territories from 1980 to 2015, finding that several countries in sub-Saharan Africa had very large gains in life expectancy, rebounding from an era of exceedingly high loss of life due to HIV/AIDS.

4,804 citations


Journal ArticleDOI
TL;DR: This report describes the process of radiomics, its challenges, and its potential power to facilitate better clinical decision making, particularly in the care of patients with cancer.
Abstract: In the past decade, the field of medical image analysis has grown exponentially, with an increased number of pattern recognition tools and an increase in data set sizes. These advances have facilitated the development of processes for high-throughput extraction of quantitative features that result in the conversion of images into mineable data and the subsequent analysis of these data for decision support; this practice is termed radiomics. This is in contrast to the traditional practice of treating medical images as pictures intended solely for visual interpretation. Radiomic data contain first-, second-, and higher-order statistics. These data are combined with other patient data and are mined with sophisticated bioinformatics tools to develop models that may potentially improve diagnostic, prognostic, and predictive accuracy. Because radiomics analyses are intended to be conducted with standard of care images, it is conceivable that conversion of digital images to mineable data will eventually become routine practice. This report describes the process of radiomics, its challenges, and its potential power to facilitate better clinical decision making, particularly in the care of patients with cancer.
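
As a sketch of the "first-order statistics" mentioned above, assuming numpy and scipy: these features are simple summaries of the intensity distribution inside a region of interest. The image, mask, and bin count here are invented stand-ins for a real scan and segmentation.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
image = rng.normal(loc=100, scale=20, size=(64, 64))   # stand-in for a CT slice
mask = np.zeros_like(image, dtype=bool)
mask[20:40, 20:40] = True                              # stand-in tumour ROI

voxels = image[mask]
hist, _ = np.histogram(voxels, bins=32)
p = hist / hist.sum()                                  # normalized intensity histogram

features = {
    "mean": voxels.mean(),
    "std": voxels.std(),
    "skewness": stats.skew(voxels),
    "kurtosis": stats.kurtosis(voxels),
    "entropy": -(p[p > 0] * np.log2(p[p > 0])).sum(),  # histogram entropy
}
print(features)
```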

Journal ArticleDOI
B. P. Abbott1, Richard J. Abbott1, T. D. Abbott2, M. R. Abernathy3 +970 more (114 institutions)
TL;DR: This second gravitational-wave observation provides improved constraints on stellar populations and on deviations from general relativity.
Abstract: We report the observation of a gravitational-wave signal produced by the coalescence of two stellar-mass black holes. The signal, GW151226, was observed by the twin detectors of the Laser Interferometer Gravitational-Wave Observatory (LIGO) on December 26, 2015 at 03:38:53 UTC. The signal was initially identified within 70 s by an online matched-filter search targeting binary coalescences. Subsequent off-line analyses recovered GW151226 with a network signal-to-noise ratio of 13 and a significance greater than $5\sigma$. The signal persisted in the LIGO frequency band for approximately 1 s, increasing in frequency and amplitude over about 55 cycles from 35 to 450 Hz, and reached a peak gravitational strain of $3.4^{+0.7}_{-0.9} \times 10^{-22}$. The inferred source-frame initial black hole masses are $14.2^{+8.3}_{-3.7} M_\odot$ and $7.5^{+2.3}_{-2.3} M_\odot$, and the final black hole mass is $20.8^{+6.1}_{-1.7} M_\odot$. We find that at least one of the component black holes has spin greater than 0.2. This source is located at a luminosity distance of $440^{+180}_{-190}$ Mpc corresponding to a redshift $0.09^{+0.03}_{-0.04}$. All uncertainties define a 90% credible interval. This second gravitational-wave observation provides improved constraints on stellar populations and on deviations from general relativity.

Book ChapterDOI
08 Oct 2016
TL;DR: The Binary-Weight-Network version of AlexNet is compared with recent network binarization methods, BinaryConnect and BinaryNets, and outperforms these methods by large margins on ImageNet, more than 16% in top-1 accuracy.
Abstract: We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values resulting in 32× memory saving. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58× faster convolutional operations (in terms of number of the high precision operations) and 32× memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real-time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a Binary-Weight-Network version of AlexNet is the same as the full-precision AlexNet. We compare our method with recent network binarization methods, BinaryConnect and BinaryNets, and outperform these methods by large margins on ImageNet, more than 16% in top-1 accuracy. Our code is available at: http://allenai.org/plato/xnornet.
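
The Binary-Weight-Network approximation has a closed form given in the paper: each real-valued filter W is replaced by αB with B = sign(W) and α the mean of |W|, the optimal scaling factor. A small numpy sketch (the filter shape here is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3, 64))          # one real-valued convolutional filter

B = np.sign(W)                           # 1-bit weights
alpha = np.abs(W).mean()                 # optimal scale: ||W||_1 / n
W_approx = alpha * B

err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(f"relative approximation error: {err:.3f}")
print("memory: 32 bits -> 1 bit per weight (~32x saving)")
```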

Journal ArticleDOI
TL;DR: This work develops a novel framework to discover governing equations underlying a dynamical system simply from data measurements, leveraging advances in sparsity techniques and machine learning and using sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data.
Abstract: Extracting governing equations from data is a central challenge in many diverse areas of science and engineering. Data are abundant whereas models often remain elusive, as in climate science, neuroscience, ecology, finance, and epidemiology, to name only a few examples. In this work, we combine sparsity-promoting techniques and machine learning with nonlinear dynamical systems to discover governing equations from noisy measurement data. The only assumption about the structure of the model is that there are only a few important terms that govern the dynamics, so that the equations are sparse in the space of possible functions; this assumption holds for many physical systems in an appropriate basis. In particular, we use sparse regression to determine the fewest terms in the dynamic governing equations required to accurately represent the data. This results in parsimonious models that balance accuracy with model complexity to avoid overfitting. We demonstrate the algorithm on a wide range of problems, from simple canonical systems, including linear and nonlinear oscillators and the chaotic Lorenz system, to the fluid vortex shedding behind an obstacle. The fluid example illustrates the ability of this method to discover the underlying dynamics of a system that took experts in the community nearly 30 years to resolve. We also show that this method generalizes to parameterized systems and systems that are time-varying or have external forcing.
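
A compact numpy/scipy sketch of the approach (SINDy) on the Lorenz system mentioned above. For clarity the derivatives come from the known right-hand side rather than numerical differentiation, and the candidate library and threshold are illustrative choices, not the authors' exact settings.

```python
import numpy as np
from scipy.integrate import solve_ivp

def lorenz(t, s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return [sigma * (y - x), x * (rho - z) - y, x * y - beta * z]

t = np.linspace(0, 20, 8000)
sol = solve_ivp(lorenz, (0, 20), [-8.0, 7.0, 27.0], t_eval=t)
X = sol.y.T
dX = np.array([lorenz(0, s) for s in X])         # exact derivatives, for clarity

# Candidate library: [1, x, y, z, x^2, xy, xz, y^2, yz, z^2]
x, y, z = X.T
Theta = np.column_stack([np.ones_like(x), x, y, z,
                         x * x, x * y, x * z, y * y, y * z, z * z])

Xi = np.linalg.lstsq(Theta, dX, rcond=None)[0]   # initial least squares
for _ in range(10):                              # sequential thresholding
    Xi[np.abs(Xi) < 0.1] = 0.0
    for k in range(3):                           # refit only the active terms
        active = np.abs(Xi[:, k]) > 0
        Xi[active, k] = np.linalg.lstsq(Theta[:, active], dX[:, k], rcond=None)[0]

names = ["1", "x", "y", "z", "xx", "xy", "xz", "yy", "yz", "zz"]
for k, lhs in enumerate(["dx/dt", "dy/dt", "dz/dt"]):
    terms = [f"{Xi[i, k]:+.2f} {names[i]}" for i in np.nonzero(Xi[:, k])[0]]
    print(lhs, "=", " ".join(terms))
```

With exact derivatives the thresholded regression recovers the Lorenz equations' six nonzero coefficients and zeros out the rest, which is the parsimony the abstract describes.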

Journal ArticleDOI
23 Feb 2016-JAMA
TL;DR: To evaluate the validity of clinical criteria for identifying patients with suspected infection who are at risk of sepsis, a new model was derived using multivariable logistic regression in a split sample.
Abstract: RESULTS In the primary cohort, 148 907 encounters had suspected infection (n = 74 453 derivation; n = 74 454 validation), of whom 6347 (4%) died. Among ICU encounters in the validation cohort (n = 7932 with suspected infection, of whom 1289 [16%] died), the predictive validity for in-hospital mortality was lower for SIRS (AUROC = 0.64; 95% CI, 0.62-0.66) and qSOFA (AUROC = 0.66; 95% CI, 0.64-0.68) vs SOFA (AUROC = 0.74; 95% CI, 0.73-0.76; P < .001 for both) or LODS (AUROC = 0.75; 95% CI, 0.73-0.76; P < .001 for both). Among non-ICU encounters in the validation cohort (n = 66 522 with suspected infection, of whom 1886 [3%] died), qSOFA had predictive validity (AUROC = 0.81; 95% CI, 0.80-0.82) that was greater than SOFA (AUROC = 0.79; 95% CI, 0.78-0.80; P < .001) and SIRS (AUROC = 0.76; 95% CI, 0.75-0.77; P < .001). Relative to qSOFA scores lower than 2, encounters with qSOFA scores of 2 or higher had a 3- to 14-fold increase in hospital mortality across baseline risk deciles. Findings were similar in external data sets and for the secondary outcome.
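
For reference, the qSOFA score evaluated in this study is a simple bedside tally: one point each for respiratory rate ≥ 22/min, systolic blood pressure ≤ 100 mm Hg, and altered mentation (Glasgow Coma Scale score < 15). A direct translation, with an invented example patient:

```python
def qsofa(resp_rate: float, sbp: float, gcs: int) -> int:
    """Quick SOFA: one point per criterion, range 0-3."""
    return int(resp_rate >= 22) + int(sbp <= 100) + int(gcs < 15)

# Encounters with qSOFA >= 2 had a 3- to 14-fold increase in hospital
# mortality across baseline risk deciles in the validation cohort.
print(qsofa(resp_rate=24, sbp=95, gcs=14))   # -> 3, highest-risk stratum
```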

Journal ArticleDOI
Mingxun Wang1, Jeremy Carver1, Vanessa V. Phelan2, Laura M. Sanchez2, Neha Garg2, Yao Peng1, Don D. Nguyen1, Jeramie D. Watrous2, Clifford A. Kapono1, Tal Luzzatto-Knaan2, Carla Porto2, Amina Bouslimani2, Alexey V. Melnik2, Michael J. Meehan2, Wei-Ting Liu3, Max Crüsemann4, Paul D. Boudreau4, Eduardo Esquenazi, Mario Sandoval-Calderón5, Roland D. Kersten6, Laura A. Pace2, Robert A. Quinn7, Katherine R. Duncan8, Cheng-Chih Hsu1, Dimitrios J. Floros1, Ronnie G. Gavilan, Karin Kleigrewe4, Trent R. Northen9, Rachel J. Dutton10, Delphine Parrot11, Erin E. Carlson12, Bertrand Aigle13, Charlotte Frydenlund Michelsen14, Lars Jelsbak14, Christian Sohlenkamp5, Pavel A. Pevzner1, Anna Edlund15, Anna Edlund16, Jeffrey S. McLean17, Jeffrey S. McLean15, Jörn Piel18, Brian T. Murphy19, Lena Gerwick4, Chih-Chuang Liaw20, Yu-Liang Yang21, Hans-Ulrich Humpf22, Maria Maansson14, Robert A. Keyzers23, Amy C. Sims24, Andrew R. Johnson25, Ashley M. Sidebottom25, Brian E. Sedio26, Andreas Klitgaard14, Charles B. Larson4, Charles B. Larson2, Cristopher A. Boya P., Daniel Torres-Mendoza, David Gonzalez2, Denise Brentan Silva27, Denise Brentan Silva28, Lucas Miranda Marques27, Daniel P. Demarque27, Egle Pociute, Ellis C. O’Neill4, Enora Briand11, Enora Briand4, Eric J. N. Helfrich18, Eve A. Granatosky29, Evgenia Glukhov4, Florian Ryffel18, Hailey Houson, Hosein Mohimani1, Jenan J. Kharbush4, Yi Zeng1, Julia A. Vorholt18, Kenji L. Kurita30, Pep Charusanti1, Kerry L. McPhail31, Kristian Fog Nielsen14, Lisa Vuong, Maryam Elfeki19, Matthew F. Traxler32, Niclas Engene33, Nobuhiro Koyama2, Oliver B. Vining31, Ralph S. Baric24, Ricardo Pianta Rodrigues da Silva27, Samantha J. Mascuch4, Sophie Tomasi11, Stefan Jenkins9, Venkat R. Macherla, Thomas Hoffman, Vinayak Agarwal4, Philip G. Williams34, Jingqui Dai34, Ram P. Neupane34, Joshua R. Gurr34, Andrés M. C. Rodríguez27, Anne Lamsa1, Chen Zhang1, Kathleen Dorrestein2, Brendan M. Duggan2, Jehad Almaliti2, Pierre-Marie Allard35, Prasad Phapale, Louis-Félix Nothias36, Theodore Alexandrov, Marc Litaudon36, Jean-Luc Wolfender35, Jennifer E. Kyle37, Thomas O. Metz37, Tyler Peryea38, Dac-Trung Nguyen38, Danielle VanLeer38, Paul Shinn38, Ajit Jadhav38, Rolf Müller, Katrina M. Waters37, Wenyuan Shi15, Xueting Liu39, Lixin Zhang39, Rob Knight1, Paul R. Jensen4, Bernhard O. Palsson1, Kit Pogliano1, Roger G. Linington30, Marcelino Gutiérrez, Norberto Peporine Lopes27, William H. Gerwick2, William H. Gerwick4, Bradley S. Moore2, Bradley S. Moore4, Pieter C. Dorrestein2, Pieter C. Dorrestein4, Nuno Bandeira2, Nuno Bandeira1 
TL;DR: In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations, and data-driven social networking should facilitate identification of spectra and foster collaborations.
Abstract: The potential of the diverse chemistries present in natural products (NP) for biotechnology and medicine remains untapped because NP databases are not searchable with raw data and the NP community has no way to share data other than in published papers. Although mass spectrometry (MS) techniques are well-suited to high-throughput characterization of NP, there is a pressing need for an infrastructure to enable sharing and curation of data. We present Global Natural Products Social Molecular Networking (GNPS; http://gnps.ucsd.edu), an open-access knowledge base for community-wide organization and sharing of raw, processed or identified tandem mass (MS/MS) spectrometry data. In GNPS, crowdsourced curation of freely available community-wide reference MS libraries will underpin improved annotations. Data-driven social-networking should facilitate identification of spectra and foster collaborations. We also introduce the concept of 'living data' through continuous reanalysis of deposited data.
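
The molecular networking behind GNPS rests on scoring MS/MS spectral similarity. Below is a simplified numpy sketch that greedily pairs peaks within an m/z tolerance and computes a cosine score; GNPS itself uses a more elaborate "modified cosine" that also allows peak pairs offset by the precursor mass difference, and the spectra here are invented.

```python
import numpy as np

def cosine_score(spec_a, spec_b, tol=0.02):
    """spec_*: arrays of (mz, intensity) rows; returns a cosine score in [0, 1]."""
    a, b = np.asarray(spec_a, float), np.asarray(spec_b, float)
    na, nb = np.linalg.norm(a[:, 1]), np.linalg.norm(b[:, 1])
    # Candidate peak pairs within the m/z tolerance, best products first
    pairs = [(ia, ib) for ia in range(len(a)) for ib in range(len(b))
             if abs(a[ia, 0] - b[ib, 0]) <= tol]
    pairs.sort(key=lambda p: a[p[0], 1] * b[p[1], 1], reverse=True)
    used, score = set(), 0.0
    for ia, ib in pairs:                 # each peak may be matched only once
        if ("a", ia) in used or ("b", ib) in used:
            continue
        used.update([("a", ia), ("b", ib)])
        score += a[ia, 1] * b[ib, 1]
    return score / (na * nb)

s1 = [(100.10, 5.0), (150.20, 10.0), (200.30, 2.0)]
s2 = [(100.10, 4.0), (150.21, 9.0), (250.40, 1.0)]
print(f"cosine similarity: {cosine_score(s1, s2):.2f}")
```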

Journal ArticleDOI
Shane A. McCarthy1, Sayantan Das2, Warren W. Kretzschmar3, Olivier Delaneau4, Andrew R. Wood5, Alexander Teumer6, Hyun Min Kang2, Christian Fuchsberger2, Petr Danecek1, Kevin Sharp3, Yang Luo1, C Sidore7, Alan Kwong2, Nicholas J. Timpson8, Seppo Koskinen, Scott I. Vrieze9, Laura J. Scott2, He Zhang2, Anubha Mahajan3, Jan H. Veldink, Ulrike Peters10, Ulrike Peters11, Carlos N. Pato12, Cornelia M. van Duijn13, Christopher E. Gillies2, Ilaria Gandin14, Massimo Mezzavilla, Arthur Gilly1, Massimiliano Cocca14, Michela Traglia, Andrea Angius7, Jeffrey C. Barrett1, D.I. Boomsma15, Kari Branham2, Gerome Breen16, Gerome Breen17, Chad M. Brummett2, Fabio Busonero7, Harry Campbell18, Andrew T. Chan19, Sai Chen2, Emily Y. Chew20, Francis S. Collins20, Laura J Corbin8, George Davey Smith8, George Dedoussis21, Marcus Dörr6, Aliki-Eleni Farmaki21, Luigi Ferrucci20, Lukas Forer22, Ross M. Fraser2, Stacey Gabriel23, Shawn Levy, Leif Groop24, Leif Groop25, Tabitha A. Harrison10, Andrew T. Hattersley5, Oddgeir L. Holmen26, Kristian Hveem26, Matthias Kretzler2, James Lee27, Matt McGue28, Thomas Meitinger29, David Melzer5, Josine L. Min8, Karen L. Mohlke30, John B. Vincent31, Matthias Nauck6, Deborah A. Nickerson11, Aarno Palotie19, Aarno Palotie23, Michele T. Pato12, Nicola Pirastu14, Melvin G. McInnis2, J. Brent Richards32, J. Brent Richards16, Cinzia Sala, Veikko Salomaa, David Schlessinger20, Sebastian Schoenherr22, P. Eline Slagboom33, Kerrin S. Small16, Tim D. Spector16, Dwight Stambolian34, Marcus A. Tuke5, Jaakko Tuomilehto, Leonard H. van den Berg, Wouter van Rheenen, Uwe Völker6, Cisca Wijmenga35, Daniela Toniolo, Eleftheria Zeggini1, Paolo Gasparini14, Matthew G. Sampson2, James F. Wilson18, Timothy M. Frayling5, Paul I.W. de Bakker36, Morris A. Swertz35, Steven A. McCarroll19, Charles Kooperberg10, Annelot M. Dekker, David Altshuler, Cristen J. Willer2, William G. Iacono28, Samuli Ripatti24, Nicole Soranzo27, Nicole Soranzo1, Klaudia Walter1, Anand Swaroop20, Francesco Cucca7, Carl A. Anderson1, Richard M. Myers, Michael Boehnke2, Mark I. McCarthy3, Mark I. McCarthy37, Richard Durbin1, Gonçalo R. Abecasis2, Jonathan Marchini3 
TL;DR: A reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies.
Abstract: We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole-genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.

Journal ArticleDOI
07 Oct 2016-Science
TL;DR: Nanoscale phase stabilization of CsPbI3 quantum dots (QDs) to low temperatures is demonstrated, enabling their use as the active component of efficient optoelectronic devices, and the formation of α-CsPbI3 QD films that are phase-stable for months in ambient air is described.
Abstract: We show nanoscale phase stabilization of CsPbI3 quantum dots (QDs) to low temperatures that can be used as the active component of efficient optoelectronic devices. CsPbI3 is an all-inorganic analog to the hybrid organic cation halide perovskites, but the cubic phase of bulk CsPbI3 (α-CsPbI3), the variant with the desirable band gap, is only stable at high temperatures. We describe the formation of α-CsPbI3 QD films that are phase-stable for months in ambient air. The films exhibit long-range electronic transport and were used to fabricate colloidal perovskite QD photovoltaic cells with an open-circuit voltage of 1.23 volts and efficiency of 10.77%. These devices also function as light-emitting diodes with low turn-on voltage and tunable emission.

Journal ArticleDOI
21 Jun 2016-JAMA
TL;DR: It is concluded with high certainty that screening for colorectal cancer in average-risk, asymptomatic adults aged 50 to 75 years is of substantial net benefit.
Abstract: Importance Colorectal cancer is the second leading cause of cancer death in the United States. In 2016, an estimated 134 000 persons will be diagnosed with the disease, and about 49 000 will die from it. Colorectal cancer is most frequently diagnosed among adults aged 65 to 74 years; the median age at death from colorectal cancer is 73 years. Objective To update the 2008 US Preventive Services Task Force (USPSTF) recommendation on screening for colorectal cancer. Evidence Review The USPSTF reviewed the evidence on the effectiveness of screening with colonoscopy, flexible sigmoidoscopy, computed tomography colonography, the guaiac-based fecal occult blood test, the fecal immunochemical test, the multitargeted stool DNA test, and the methylated SEPT9 DNA test in reducing the incidence of and mortality from colorectal cancer or all-cause mortality; the harms of these screening tests; and the test performance characteristics of these tests for detecting adenomatous polyps, advanced adenomas based on size, or both, as well as colorectal cancer. The USPSTF also commissioned a comparative modeling study to provide information on optimal starting and stopping ages and screening intervals across the different available screening methods. Findings The USPSTF concludes with high certainty that screening for colorectal cancer in average-risk, asymptomatic adults aged 50 to 75 years is of substantial net benefit. Multiple screening strategies are available to choose from, with different levels of evidence to support their effectiveness, as well as unique advantages and limitations, although there are no empirical data to demonstrate that any of the reviewed strategies provide a greater net benefit. Screening for colorectal cancer is a substantially underused preventive health strategy in the United States. Conclusions and Recommendations The USPSTF recommends screening for colorectal cancer starting at age 50 years and continuing until age 75 years (A recommendation). The decision to screen for colorectal cancer in adults aged 76 to 85 years should be an individual one, taking into account the patient’s overall health and prior screening history (C recommendation).

Journal ArticleDOI
13 Sep 2016-JAMA
TL;DR: The Second Panel on Cost-Effectiveness in Health and Medicine reviewed the current status of the field of cost-effectiveness analysis and developed a new set of recommendations, including the recommendation to perform analyses from 2 reference case perspectives and to provide an impact inventory to clarify included consequences.
Abstract: Importance Since publication of the report by the Panel on Cost-Effectiveness in Health and Medicine in 1996, researchers have advanced the methods of cost-effectiveness analysis, and policy makers have experimented with its application. The need to deliver health care efficiently and the importance of using analytic techniques to understand the clinical and economic consequences of strategies to improve health have increased in recent years. Objective To review the state of the field and provide recommendations to improve the quality of cost-effectiveness analyses. The intended audiences include researchers, government policy makers, public health officials, health care administrators, payers, businesses, clinicians, patients, and consumers. Design In 2012, the Second Panel on Cost-Effectiveness in Health and Medicine was formed and included 2 co-chairs, 13 members, and 3 additional members of a leadership group. These members were selected on the basis of their experience in the field to provide broad expertise in the design, conduct, and use of cost-effectiveness analyses. Over the next 3.5 years, the panel developed recommendations by consensus. These recommendations were then reviewed by invited external reviewers and through a public posting process. Findings The concept of a “reference case” and a set of standard methodological practices that all cost-effectiveness analyses should follow to improve quality and comparability are recommended. All cost-effectiveness analyses should report 2 reference case analyses: one based on a health care sector perspective and another based on a societal perspective. The use of an “impact inventory,” which is a structured table that contains consequences (both inside and outside the formal health care sector), intended to clarify the scope and boundaries of the 2 reference case analyses is also recommended. This special communication reviews these recommendations and others concerning the estimation of the consequences of interventions, the valuation of health outcomes, and the reporting of cost-effectiveness analyses. Conclusions and Relevance The Second Panel reviewed the current status of the field of cost-effectiveness analysis and developed a new set of recommendations. Major changes include the recommendation to perform analyses from 2 reference case perspectives and to provide an impact inventory to clarify included consequences.
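
The quantity at the heart of such analyses is the incremental cost-effectiveness ratio (ICER). The sketch below uses invented numbers to show why the panel's two recommended reference-case perspectives can yield different results: the societal perspective counts consequences outside the formal health care sector, so the costs differ.

```python
def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost per quality-adjusted life-year (QALY) gained."""
    return (cost_new - cost_old) / (qaly_new - qaly_old)

# Hypothetical reference-case analyses of one intervention vs. usual care;
# the societal perspective adds non-health-sector costs (patient time,
# productivity), which is what the "impact inventory" is meant to itemize.
health_sector = icer(cost_new=52_000, qaly_new=8.1, cost_old=40_000, qaly_old=7.5)
societal = icer(cost_new=58_000, qaly_new=8.1, cost_old=44_000, qaly_old=7.5)
print(f"health care sector: ${health_sector:,.0f}/QALY")
print(f"societal:           ${societal:,.0f}/QALY")
```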

Journal ArticleDOI
TL;DR: These recommendations address the best approaches for antibiotic stewardship programs to influence the optimal use of antibiotics.
Abstract: Evidence-based guidelines for implementation and measurement of antibiotic stewardship interventions in inpatient populations including long-term care were prepared by a multidisciplinary expert panel of the Infectious Diseases Society of America and the Society for Healthcare Epidemiology of America. The panel included clinicians and investigators representing internal medicine, emergency medicine, microbiology, critical care, surgery, epidemiology, pharmacy, and adult and pediatric infectious diseases specialties. These recommendations address the best approaches for antibiotic stewardship programs to influence the optimal use of antibiotics.

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate a highly reversible zinc/manganese oxide system in which an optimal mild aqueous ZnSO4-based solution is used as the electrolyte, and nanofibres of a manganese oxide phase, α-MnO2, are used as the cathode.
Abstract: Rechargeable aqueous batteries such as alkaline zinc/manganese oxide batteries are highly desirable for large-scale energy storage owing to their low cost and high safety; however, cycling stability is a major issue for their applications. Here we demonstrate a highly reversible zinc/manganese oxide system in which optimal mild aqueous ZnSO4-based solution is used as the electrolyte, and nanofibres of a manganese oxide phase, α-MnO2, are used as the cathode. We show that a chemical conversion reaction mechanism between α-MnO2 and H+ is mainly responsible for the good performance of the system. This includes an operating voltage of 1.44 V, a capacity of 285 mAh g−1 (MnO2), and capacity retention of 92% over 5,000 cycles. The Zn metal anode also shows high stability. This finding opens new opportunities for the development of low-cost, high-performance rechargeable aqueous batteries. Rechargeable aqueous batteries are attractive owing to their relatively low cost and safety. Here the authors report an aqueous zinc/manganese oxide battery that operates via a conversion reaction mechanism and exhibits a long-term cycling stability.

Posted Content
TL;DR: XNOR-Nets as discussed by the authors approximate convolutions using primarily binary operations, which results in 58x faster convolutional operations and 32x memory savings, and outperforms BinaryConnect and BinaryNets by large margins on ImageNet.
Abstract: We propose two efficient approximations to standard convolutional neural networks: Binary-Weight-Networks and XNOR-Networks. In Binary-Weight-Networks, the filters are approximated with binary values resulting in 32x memory saving. In XNOR-Networks, both the filters and the input to convolutional layers are binary. XNOR-Networks approximate convolutions using primarily binary operations. This results in 58x faster convolutional operations and 32x memory savings. XNOR-Nets offer the possibility of running state-of-the-art networks on CPUs (rather than GPUs) in real-time. Our binary networks are simple, accurate, efficient, and work on challenging visual tasks. We evaluate our approach on the ImageNet classification task. The classification accuracy with a Binary-Weight-Network version of AlexNet is only 2.9% less than the full-precision AlexNet (in top-1 measure). We compare our method with recent network binarization methods, BinaryConnect and BinaryNets, and outperform these methods by large margins on ImageNet, more than 16% in top-1 accuracy.

Journal ArticleDOI
TL;DR: This updated version of mclust adds new covariance structures, dimension reduction capabilities for visualisation, model selection criteria, initialisation strategies for the EM algorithm, and bootstrap-based inference, making it a full-featured R package for data analysis via finite mixture modelling.
Abstract: Finite mixture models are being used increasingly to model a wide variety of random phenomena for clustering, classification and density estimation. mclust is a powerful and popular package which allows modelling of data as a Gaussian finite mixture with different covariance structures and different numbers of mixture components, for a variety of purposes of analysis. Recently, version 5 of the package has been made available on CRAN. This updated version adds new covariance structures, dimension reduction capabilities for visualisation, model selection criteria, initialisation strategies for the EM algorithm, and bootstrap-based inference, making it a full-featured R package for data analysis via finite mixture modelling.
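
mclust is an R package; as a rough Python analogue of its workflow, the sketch below (assuming scikit-learn) fits Gaussian mixtures over several covariance structures and component counts and selects the model by BIC, the criterion mclust uses. Note that scikit-learn's BIC is lower-is-better, the opposite sign convention to mclust's; the data here are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1.0, size=(200, 2)),
               rng.normal(4, 0.5, size=(200, 2))])

best = None
for cov in ["full", "tied", "diag", "spherical"]:   # covariance structures
    for k in range(1, 6):                           # number of components
        gm = GaussianMixture(n_components=k, covariance_type=cov,
                             random_state=0).fit(X)
        bic = gm.bic(X)                             # lower is better in sklearn
        if best is None or bic < best[0]:
            best = (bic, cov, k, gm)

bic, cov, k, model = best
print(f"selected: {k} components, '{cov}' covariance (BIC={bic:.1f})")
labels = model.predict(X)                           # cluster assignments
```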

Journal ArticleDOI
TL;DR: The American Pain Society, with input from the American Society of Anesthesiologists, developed a clinical practice guideline to promote evidence-based, effective, and safer postoperative pain management in children and adults.

Journal ArticleDOI
TL;DR: In this article, the authors review the latest advances in valleytronics, which have largely been enabled by the isolation of 2D materials (such as graphene and semiconducting transition metal dichalcogenides) that host an easily accessible electronic valley degree of freedom, allowing for dynamic control.
Abstract: Semiconductor technology is currently based on the manipulation of electronic charge; however, electrons have additional degrees of freedom, such as spin and valley, that can be used to encode and process information. Over the past several decades, there has been significant progress in manipulating electron spin for semiconductor spintronic devices, motivated by potential spin-based information processing and storage applications. However, experimental progress towards manipulating the valley degree of freedom for potential valleytronic devices has been limited until very recently. We review the latest advances in valleytronics, which have largely been enabled by the isolation of 2D materials (such as graphene and semiconducting transition metal dichalcogenides) that host an easily accessible electronic valley degree of freedom, allowing for dynamic control. The energy extrema of an electronic band are referred to as valleys. In 2D materials, two distinguishable valleys can be used to encode information and explore other valleytronic applications.

Proceedings Article
19 Jun 2016
TL;DR: Deep Embedded Clustering (DEC) as discussed by the authors learns a mapping from the data space to a lower-dimensional feature space in which it iteratively optimizes a clustering objective.
Abstract: Clustering is central to many data-driven application domains and has been studied extensively in terms of distance functions and grouping algorithms. Relatively little work has focused on learning representations for clustering. In this paper, we propose Deep Embedded Clustering (DEC), a method that simultaneously learns feature representations and cluster assignments using deep neural networks. DEC learns a mapping from the data space to a lower-dimensional feature space in which it iteratively optimizes a clustering objective. Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods.
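
The clustering objective that DEC iteratively optimizes can be written down directly from the paper: soft assignments under a Student's t kernel and a sharpened self-training target distribution. A numpy sketch, where the embeddings and centroids are random stand-ins for the encoder output and initialized cluster centres:

```python
import numpy as np

rng = np.random.default_rng(0)
Z = rng.normal(size=(100, 10))            # embedded points (from the encoder)
mu = rng.normal(size=(4, 10))             # cluster centroids

# q_ij ∝ (1 + ||z_i - mu_j||^2)^(-1)   (Student's t kernel, one degree of freedom)
d2 = ((Z[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
q = 1.0 / (1.0 + d2)
q /= q.sum(axis=1, keepdims=True)

# Target p_ij ∝ q_ij^2 / f_j with cluster frequencies f_j = sum_i q_ij;
# training minimizes KL(P || Q) with respect to the encoder and centroids.
f = q.sum(axis=0)
p = (q ** 2) / f
p /= p.sum(axis=1, keepdims=True)

kl = (p * np.log(p / q)).sum()
print(f"KL(P || Q) = {kl:.3f}")
```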

Proceedings Article
04 Nov 2016
TL;DR: The BIDAF network is introduced: a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization.
Abstract: Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query. Recently, attention mechanisms have been successfully extended to MC. Typically these methods use attention to focus on a small portion of the context and summarize it with a fixed-size vector, couple attentions temporally, and/or often form a uni-directional attention. In this paper we introduce the Bi-Directional Attention Flow (BIDAF) network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization. Our experimental evaluations show that our model achieves the state-of-the-art results in Stanford Question Answering Dataset (SQuAD) and CNN/DailyMail cloze test.

Journal ArticleDOI
TL;DR: It is established that high CAR-T cell doses and tumor burden increase the risks of severe cytokine release syndrome and neurotoxicity, and serum biomarkers that allow testing of early intervention strategies in patients at the highest risk of toxicity are identified.
Abstract: BACKGROUND. T cells that have been modified to express a CD19-specific chimeric antigen receptor (CAR) have antitumor activity in B cell malignancies; however, identification of the factors that determine toxicity and efficacy of these T cells has been challenging in prior studies in which phenotypically heterogeneous CAR–T cell products were prepared from unselected T cells. METHODS. We conducted a clinical trial to evaluate CD19 CAR–T cells that were manufactured from defined CD4+ and CD8+ T cell subsets and administered in a defined CD4+:CD8+ composition to adults with B cell acute lymphoblastic leukemia after lymphodepletion chemotherapy. RESULTS. The defined composition product was remarkably potent, as 27 of 29 patients (93%) achieved BM remission, as determined by flow cytometry. We established that high CAR–T cell doses and tumor burden increase the risks of severe cytokine release syndrome and neurotoxicity. Moreover, we identified serum biomarkers that allow testing of early intervention strategies in patients at the highest risk of toxicity. Risk-stratified CAR–T cell dosing based on BM disease burden decreased toxicity. CD8+ T cell–mediated anti-CAR transgene product immune responses developed after CAR–T cell infusion in some patients, limited CAR–T cell persistence, and increased relapse risk. Addition of fludarabine to the lymphodepletion regimen improved CAR–T cell persistence and disease-free survival. CONCLUSION. Immunotherapy with a CAR–T cell product of defined composition enabled identification of factors that correlated with CAR–T cell expansion, persistence, and toxicity and facilitated design of lymphodepletion and CAR–T cell dosing strategies that mitigated toxicity and improved disease-free survival. TRIAL REGISTRATION. ClinicalTrials.gov NCT01865617. FUNDING. R01-CA136551; Life Science Development Fund; Juno Therapeutics; Bezos Family Foundation.

Journal ArticleDOI
28 Jan 2016-Cell
TL;DR: The complete set of genes associated with 1,122 diffuse grade II-III-IV gliomas was defined from The Cancer Genome Atlas, and molecular profiles were used to improve disease classification, identify molecular correlations, and provide insights into the progression from low- to high-grade disease.