
Posted Content
TL;DR: A small change in the stylization architecture results in a significant qualitative improvement in the generated images, and can be used to train high-performance architectures for real-time image generation.
Abstract: In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and to applying the latter both at training and testing times. The resulting method can be used to train high-performance architectures for real-time image generation. The code is made available on github at this https URL. Full paper can be found at arXiv:1701.02096.
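The architectural change described above amounts to normalizing each image with its own per-channel statistics, rather than statistics pooled over a batch. A minimal plain-Python sketch of instance normalization (an illustration only, not the authors' implementation, which operates on 4-D tensors):

```python
import math

def instance_norm(image, eps=1e-5):
    """Normalize each channel of a SINGLE image using that image's
    own mean and variance. Unlike batch norm, no statistics are
    shared across images, so behavior is identical at training
    and testing time."""
    normalized = []
    for channel in image:  # channel: flat list of pixel values
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(var + eps)
        normalized.append([(x - mean) / std for x in channel])
    return normalized

img = [[1.0, 2.0, 3.0, 4.0]]        # one channel, four pixels
out = instance_norm(img)[0]
print(round(sum(out), 6))           # ~0.0: zero mean per channel
```

Each channel comes out with (approximately) zero mean and unit variance, regardless of what other images are in the batch.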

3,118 citations


Proceedings ArticleDOI
21 Mar 2016
TL;DR: This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
Abstract: Deep learning takes advantage of large datasets and computationally efficient training algorithms to outperform other approaches at various machine learning tasks. However, imperfections in the training phase of deep neural networks make them vulnerable to adversarial samples: inputs crafted by adversaries with the intent of causing deep neural networks to misclassify. In this work, we formalize the space of adversaries against deep neural networks (DNNs) and introduce a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs. In an application to computer vision, we show that our algorithms can reliably produce samples correctly classified by human subjects but misclassified in specific targets by a DNN with a 97% adversarial success rate while only modifying on average 4.02% of the input features per sample. We then evaluate the vulnerability of different sample classes to adversarial perturbations by defining a hardness measure. Finally, we describe preliminary work outlining defenses against adversarial samples by defining a predictive measure of distance between a benign input and a target classification.

3,114 citations


Journal ArticleDOI
TL;DR: An updated summary of recent advances in the field of nanomedicines and nano-based drug delivery systems is presented, through comprehensive scrutiny of the discovery and application of nanomaterials in improving both the efficacy of novel and old drugs and selective diagnosis through disease marker molecules.
Abstract: Nanomedicine and nano delivery systems are a relatively new but rapidly developing science where materials in the nanoscale range are employed to serve as diagnostic tools or to deliver therapeutic agents to specific targeted sites in a controlled manner. Nanotechnology offers multiple benefits in treating chronic human diseases by site-specific, target-oriented delivery of precise medicines. Recently, there have been a number of outstanding applications of nanomedicine (chemotherapeutic agents, biological agents, immunotherapeutic agents, etc.) in the treatment of various diseases. The current review presents an updated summary of recent advances in the field of nanomedicines and nano-based drug delivery systems through comprehensive scrutiny of the discovery and application of nanomaterials in improving both the efficacy of novel and old drugs (e.g., natural products) and selective diagnosis through disease marker molecules. The opportunities and challenges of nanomedicines in drug delivery, from synthetic/natural sources to their clinical applications, are also discussed. In addition, we have included information regarding the trends and perspectives in the nanomedicine area.

3,112 citations


Proceedings ArticleDOI
21 Aug 2015
TL;DR: The Stanford Natural Language Inference (SNLI) corpus as discussed by the authors is a large-scale collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning.
Abstract: Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limited by the lack of large-scale resources. To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude larger than all other resources of its type. This increase in scale allows lexicalized classifiers to outperform some sophisticated existing entailment models, and it allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.

3,100 citations


Journal ArticleDOI
TL;DR: The immune system recognizes and is poised to eliminate cancer but is held in check by inhibitory receptors and ligands, so drugs interrupting immune checkpoints, such as anti-CTLA-4, anti-PD-1, and others in early development, can unleash anti-tumor immunity and mediate durable cancer regressions.

3,097 citations


Posted Content
Barret Zoph1, Quoc V. Le1
TL;DR: This paper uses a recurrent network to generate the model descriptions of neural networks and trains this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set.
Abstract: Neural networks are powerful and flexible models that work well for many difficult learning tasks in image, speech and natural language understanding. Despite their success, neural networks are still hard to design. In this paper, we use a recurrent network to generate the model descriptions of neural networks and train this RNN with reinforcement learning to maximize the expected accuracy of the generated architectures on a validation set. On the CIFAR-10 dataset, our method, starting from scratch, can design a novel network architecture that rivals the best human-invented architecture in terms of test set accuracy. Our CIFAR-10 model achieves a test error rate of 3.65, which is 0.09 percent better and 1.05x faster than the previous state-of-the-art model that used a similar architectural scheme. On the Penn Treebank dataset, our model can compose a novel recurrent cell that outperforms the widely-used LSTM cell, and other state-of-the-art baselines. Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model. The cell can also be transferred to the character language modeling task on PTB and achieves a state-of-the-art perplexity of 1.214.

3,095 citations


Journal ArticleDOI
Daniel D Murray1, Kazuo Suzuki1, Matthew Law1, Jonel Trebicka2  +1486 moreInstitutions (9)
14 Oct 2015-PLOS ONE
TL;DR: No associations with mortality were found with any of the circulating miRNAs studied; these results cast doubt on the effectiveness of circulating miRNAs as early predictors of mortality or of the major underlying diseases that contribute to mortality in participants treated for HIV-1 infection.
Abstract: Introduction The use of anti-retroviral therapy (ART) has dramatically reduced HIV-1 associated morbidity and mortality. However, HIV-1 infected individuals have increased rates of morbidity and mortality compared to the non-HIV-1 infected population and this appears to be related to end-organ diseases collectively referred to as Serious Non-AIDS Events (SNAEs). Circulating miRNAs are reported as promising biomarkers for a number of human disease conditions including those that constitute SNAEs. Our study sought to investigate the potential of selected miRNAs in predicting mortality in HIV-1 infected ART treated individuals. Materials and Methods A set of miRNAs was chosen based on published associations with human disease conditions that constitute SNAEs. This case-control study compared 126 cases (individuals who died whilst on therapy) and 247 matched controls (individuals who remained alive). Cases and controls were ART treated participants of two pivotal HIV-1 trials. The relative abundance of each miRNA in serum was measured by RT-qPCR. Associations with mortality (all-cause, cardiovascular and malignancy) were assessed by logistic regression analysis. Correlations between miRNAs and CD4+ T cell count, hs-CRP, IL-6 and D-dimer were also assessed. Results None of the selected miRNAs was associated with all-cause, cardiovascular or malignancy mortality. The levels of three miRNAs (miRs -21, -122 and -200a) correlated with IL-6 while miR-21 also correlated with D-dimer. Additionally, the abundance of miRs -31, -150 and -223 correlated with baseline CD4+ T cell count while the same three miRNAs plus miR-145 correlated with nadir CD4+ T cell count. Discussion No associations with mortality were found with any circulating miRNA studied. These results cast doubt on the effectiveness of circulating miRNA as early predictors of mortality or the major underlying diseases that contribute to mortality in participants treated for HIV-1 infection.

3,094 citations


Journal ArticleDOI
TL;DR: In patients with unresectable hepatocellular carcinoma, atezolizumab combined with bevacizumab resulted in better overall and progression-free survival outcomes than sorafenib.
Abstract: Background The combination of atezolizumab and bevacizumab showed encouraging antitumor activity and safety in a phase 1b trial involving patients with unresectable hepatocellular carcinom...

3,085 citations


Journal ArticleDOI
Nabila Aghanim1, Yashar Akrami2, Yashar Akrami3, Yashar Akrami4  +229 moreInstitutions (70)
TL;DR: In this paper, the cosmological parameter results from the final full-mission Planck measurements of the CMB anisotropies were presented, showing good consistency with the standard spatially-flat 6-parameter ΛCDM cosmology having a power-law spectrum of adiabatic scalar perturbations, from polarization, temperature, and lensing, separately and in combination.
Abstract: We present cosmological parameter results from the final full-mission Planck measurements of the CMB anisotropies. We find good consistency with the standard spatially-flat 6-parameter $\Lambda$CDM cosmology having a power-law spectrum of adiabatic scalar perturbations (denoted "base $\Lambda$CDM" in this paper), from polarization, temperature, and lensing, separately and in combination. A combined analysis gives dark matter density $\Omega_c h^2 = 0.120\pm 0.001$, baryon density $\Omega_b h^2 = 0.0224\pm 0.0001$, scalar spectral index $n_s = 0.965\pm 0.004$, and optical depth $\tau = 0.054\pm 0.007$ (in this abstract we quote $68\,\%$ confidence regions on measured parameters and $95\,\%$ on upper limits). The angular acoustic scale is measured to $0.03\,\%$ precision, with $100\theta_*=1.0411\pm 0.0003$. These results are only weakly dependent on the cosmological model and remain stable, with somewhat increased errors, in many commonly considered extensions. Assuming the base-$\Lambda$CDM cosmology, the inferred late-Universe parameters are: Hubble constant $H_0 = (67.4\pm 0.5)$ km/s/Mpc; matter density parameter $\Omega_m = 0.315\pm 0.007$; and matter fluctuation amplitude $\sigma_8 = 0.811\pm 0.006$. We find no compelling evidence for extensions to the base-$\Lambda$CDM model. Combining with BAO we constrain the effective extra relativistic degrees of freedom to be $N_{\rm eff} = 2.99\pm 0.17$, and the neutrino mass is tightly constrained to $\sum m_\nu < 0.12$ eV. The CMB spectra continue to prefer higher lensing amplitudes than predicted in base-$\Lambda$CDM at over $2\,\sigma$, which pulls some parameters that affect the lensing amplitude away from the base-$\Lambda$CDM model; however, this is not supported by the lensing reconstruction or (in models that also change the background geometry) BAO data. (Abridged)

3,077 citations


Journal ArticleDOI
TL;DR: Osimertinib showed efficacy superior to that of standard EGFR‐TKIs in the first‐line treatment of EGFR mutation–positive advanced NSCLC, with a similar safety profile and lower rates of serious adverse events.
Abstract: BackgroundOsimertinib is an oral, third-generation, irreversible epidermal growth factor receptor tyrosine kinase inhibitor (EGFR-TKI) that selectively inhibits both EGFR-TKI–sensitizing and EGFR T790M resistance mutations. We compared osimertinib with standard EGFR-TKIs in patients with previously untreated, EGFR mutation–positive advanced non–small-cell lung cancer (NSCLC). MethodsIn this double-blind, phase 3 trial, we randomly assigned 556 patients with previously untreated, EGFR mutation–positive (exon 19 deletion or L858R) advanced NSCLC in a 1:1 ratio to receive either osimertinib (at a dose of 80 mg once daily) or a standard EGFR-TKI (gefitinib at a dose of 250 mg once daily or erlotinib at a dose of 150 mg once daily). The primary end point was investigator-assessed progression-free survival. ResultsThe median progression-free survival was significantly longer with osimertinib than with standard EGFR-TKIs (18.9 months vs. 10.2 months; hazard ratio for disease progression or death, 0.46; 95% confi...

3,074 citations


Posted Content
TL;DR: This paper evaluates a custom ASIC-called a Tensor Processing Unit (TPU)-deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN) and compares it to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the samedatacenters.
Abstract: Many architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. This paper evaluates a custom ASIC---called a Tensor Processing Unit (TPU)---deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN). The heart of the TPU is a 65,536 8-bit MAC matrix multiply unit that offers a peak throughput of 92 TeraOps/second (TOPS) and a large (28 MiB) software-managed on-chip memory. The TPU's deterministic execution model is a better match to the 99th-percentile response-time requirement of our NN applications than are the time-varying optimizations of CPUs and GPUs (caches, out-of-order execution, multithreading, multiprocessing, prefetching, ...) that help average throughput more than guaranteed latency. The lack of such features helps explain why, despite having myriad MACs and a big memory, the TPU is relatively small and low power. We compare the TPU to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters. Our workload, written in the high-level TensorFlow framework, uses production NN applications (MLPs, CNNs, and LSTMs) that represent 95% of our datacenters' NN inference demand. Despite low utilization for some applications, the TPU is on average about 15X - 30X faster than its contemporary GPU or CPU, with TOPS/Watt about 30X - 80X higher. Moreover, using the GPU's GDDR5 memory in the TPU would triple achieved TOPS and raise TOPS/Watt to nearly 70X the GPU and 200X the CPU.
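The quoted 92 TOPS peak follows directly from the MAC count and clock rate. A quick arithmetic check; note the 700 MHz clock is taken from the full TPU paper, not stated in this abstract:

```python
# Back-of-envelope check of the quoted ~92 TOPS peak throughput.
# The abstract gives the MAC count; the 700 MHz clock rate comes
# from the full TPU paper and is an assumption here.
macs = 256 * 256            # 65,536 8-bit MACs in the matrix unit
ops_per_mac = 2             # each MAC counts as a multiply + an add
clock_hz = 700e6            # TPU v1 clock rate (from the full paper)

peak_tops = macs * ops_per_mac * clock_hz / 1e12
print(round(peak_tops, 2))  # 91.75, i.e. the ~92 TOPS quoted above
```

The "92 TeraOps/second" figure in the abstract is exactly this product, rounded up.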

Journal ArticleDOI
08 Apr 2016-Science
TL;DR: The authors begin to unravel the cellular ecosystem of tumors and show how single-cell genomics offers insights with implications for both targeted and immune therapies.
Abstract: To explore the distinct genotypic and phenotypic states of melanoma tumors, we applied single-cell RNA sequencing (RNA-seq) to 4645 single cells isolated from 19 patients, profiling malignant, immune, stromal, and endothelial cells. Malignant cells within the same tumor displayed transcriptional heterogeneity associated with the cell cycle, spatial context, and a drug-resistance program. In particular, all tumors harbored malignant cells from two distinct transcriptional cell states, such that tumors characterized by high levels of the MITF transcription factor also contained cells with low MITF and elevated levels of the AXL kinase. Single-cell analyses suggested distinct tumor microenvironmental patterns, including cell-to-cell interactions. Analysis of tumor-infiltrating T cells revealed exhaustion programs, their connection to T cell activation and clonal expansion, and their variability across patients. Overall, we begin to unravel the cellular ecosystem of tumors and how single-cell genomics offers insights with implications for both targeted and immune therapies.

Journal ArticleDOI
TL;DR: The largest declines in risk exposure from 2010 to 2019 were among a set of risks that are strongly linked to social and economic development, including household air pollution; unsafe water, sanitation, and handwashing; and child growth failure.

Proceedings ArticleDOI
01 Sep 2015
TL;DR: VoxNet is proposed, an architecture to tackle the problem of robust object recognition by integrating a volumetric Occupancy Grid representation with a supervised 3D Convolutional Neural Network (3D CNN).
Abstract: Robust object recognition is a crucial skill for robots operating autonomously in real world environments. Range sensors such as LiDAR and RGBD cameras are increasingly found in modern robotic systems, providing a rich source of 3D information that can aid in this task. However, many current systems do not fully utilize this information and have trouble efficiently dealing with large amounts of point cloud data. In this paper, we propose VoxNet, an architecture to tackle this problem by integrating a volumetric Occupancy Grid representation with a supervised 3D Convolutional Neural Network (3D CNN). We evaluate our approach on publicly available benchmarks using LiDAR, RGBD, and CAD data. VoxNet achieves accuracy beyond the state of the art while labeling hundreds of instances per second.

Journal ArticleDOI
TL;DR: The distribution of children’s COVID-19 cases varied with time and space, and most of the cases were concentrated in Hubei province and surrounding areas, providing strong evidence of human-to-human transmission.
Abstract: OBJECTIVE: To identify the epidemiological characteristics and transmission patterns of pediatric patients with the 2019 novel coronavirus disease (COVID-19) in China. METHODS: Nationwide case series of 2135 pediatric patients with COVID-19 reported to the Chinese Center for Disease Control and Prevention from January 16, 2020, to February 8, 2020, were included. The epidemic curves were constructed by key dates of disease onset and case diagnosis. Onset-to-diagnosis curves were constructed by fitting a log-normal distribution to data on both onset and diagnosis dates. RESULTS: There were 728 (34.1%) laboratory-confirmed cases and 1407 (65.9%) suspected cases. The median age of all patients was 7 years (interquartile range: 2–13 years), and 1208 case patients (56.6%) were boys. More than 90% of all patients had asymptomatic, mild, or moderate cases. The median time from illness onset to diagnoses was 2 days (range: 0–42 days). There was a rapid increase of disease at the early stage of the epidemic, and then there was a gradual and steady decrease. The disease rapidly spread from Hubei province to surrounding provinces over time. More children were infected in Hubei province than any other province. CONCLUSIONS: Children of all ages appeared susceptible to COVID-19, and there was no significant sex difference. Although clinical manifestations of children’s COVID-19 cases were generally less severe than those of adult patients, young children, particularly infants, were vulnerable to infection. The distribution of children’s COVID-19 cases varied with time and space, and most of the cases were concentrated in Hubei province and surrounding areas. Furthermore, this study provides strong evidence of human-to-human transmission.

Proceedings Article
07 Dec 2015
TL;DR: In this paper, the use of character-level convolutional networks (ConvNets) for text classification has been explored and compared with traditional models such as bag of words, n-grams and their TFIDF variants.
Abstract: This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.
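The character-level input the abstract refers to is produced by a simple quantization step: each character becomes a one-hot vector over a fixed alphabet, and out-of-alphabet characters become all-zero rows. A sketch of that step (the paper's alphabet has 70 symbols; the short alphabet below is purely illustrative):

```python
# Character quantization for a character-level ConvNet: one-hot
# encode each character over a fixed alphabet; characters outside
# the alphabet (punctuation here) map to all-zero rows. Alphabet
# and frame length are illustrative choices, not the paper's.
alphabet = "abcdefghijklmnopqrstuvwxyz "
index = {ch: i for i, ch in enumerate(alphabet)}

def quantize(text, max_len=16):
    text = text.lower()[:max_len].ljust(max_len)  # truncate or pad
    frame = []
    for ch in text:
        row = [0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1
        frame.append(row)
    return frame  # max_len x alphabet_size binary matrix

frame = quantize("Hello, world!")
print(len(frame), len(frame[0]))   # 16 27
```

The resulting fixed-size binary matrix is what the 1-D convolutions slide over, with no tokenization or word vocabulary involved.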

Proceedings ArticleDOI
27 Jun 2016
TL;DR: This work proposes a new SfM technique that improves upon the state of the art to make a further step towards building a truly general-purpose pipeline.
Abstract: Incremental Structure-from-Motion is a prevalent strategy for 3D reconstruction from unordered image collections. While incremental reconstruction systems have tremendously advanced in all regards, robustness, accuracy, completeness, and scalability remain the key problems towards building a truly general-purpose pipeline. We propose a new SfM technique that improves upon the state of the art to make a further step towards this ultimate goal. The full reconstruction pipeline is released to the public as an open-source implementation.

Journal ArticleDOI
TL;DR: Simulation results demonstrate that an IRS-aided single-cell wireless system can achieve the same rate performance as a benchmark massive MIMO system without using IRS, but with significantly reduced active antennas/RF chains.
Abstract: Intelligent reflecting surface (IRS) is a revolutionary and transformative technology for achieving spectrum and energy efficient wireless communication cost-effectively in the future. Specifically, an IRS consists of a large number of low-cost passive elements, each able to reflect the incident signal independently with an adjustable phase shift so as to collaboratively achieve three-dimensional (3D) passive beamforming without the need for any transmit radio-frequency (RF) chains. In this paper, we study an IRS-aided single-cell wireless system where one IRS is deployed to assist in the communications between a multi-antenna access point (AP) and multiple single-antenna users. We formulate and solve new problems to minimize the total transmit power at the AP by jointly optimizing the transmit beamforming by the active antenna array at the AP and the reflect beamforming by the passive phase shifters at the IRS, subject to users’ individual signal-to-interference-plus-noise ratio (SINR) constraints. Moreover, we analyze the asymptotic performance of IRS’s passive beamforming with an infinitely large number of reflecting elements and compare it to that of traditional active beamforming/relaying. Simulation results demonstrate that an IRS-aided MIMO system can achieve the same rate performance as a benchmark massive MIMO system without using IRS, but with significantly reduced active antennas/RF chains. We also draw useful insights into optimally deploying IRS in future wireless systems.
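The core mechanism of the passive beamforming described above is that each reflecting element rotates its reflection phase to cancel the phase of its cascaded AP-element-user channel, so all N reflected paths add coherently at the user. A toy stdlib-Python illustration (random unit-gain channels as placeholders, not a physical channel model):

```python
import cmath
import math
import random

# Toy passive-beamforming illustration: each IRS element applies a
# phase shift exp(-j*arg(h)) that cancels the phase of its cascaded
# channel h, so all reflected paths combine coherently at the user.
random.seed(1)
N = 64
channels = [cmath.rect(1.0, random.uniform(0, 2 * math.pi))
            for _ in range(N)]            # unit-gain cascaded channels

no_irs_gain = abs(sum(channels))          # random phases partly cancel
aligned_gain = abs(sum(h * cmath.exp(-1j * cmath.phase(h))
                       for h in channels))

print(round(aligned_gain, 6))             # ~64: amplitude grows as N
print(aligned_gain >= no_irs_gain)        # True
```

With coherent alignment the received amplitude grows linearly in N (so power grows as N squared), which is why very large numbers of cheap passive elements can substitute for active antennas.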

Journal ArticleDOI
25 Jun 2020-Cell
TL;DR: Using HLA class I and II predicted peptide ‘megapools’, circulating SARS-CoV-2−specific CD8+ and CD4+ T cells were identified in ∼70% and 100% of COVID-19 convalescent patients, respectively, suggesting cross-reactive T cell recognition between circulating ‘common cold’ coronaviruses and SARS-CoV-2.

Proceedings ArticleDOI
07 Jun 2015
TL;DR: In this article, the authors show that it is possible to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence.
Abstract: Deep neural networks (DNNs) have recently been achieving state-of-the-art performance on a variety of pattern-recognition tasks, most notably visual classification problems. Given that DNNs are now able to classify objects in images with near-human-level performance, questions naturally arise as to what differences remain between computer and human vision. A recent study [30] revealed that changing an image (e.g. of a lion) in a way imperceptible to humans can cause a DNN to label the image as something else entirely (e.g. mislabeling a lion a library). Here we show a related result: it is easy to produce images that are completely unrecognizable to humans, but that state-of-the-art DNNs believe to be recognizable objects with 99.99% confidence (e.g. labeling with certainty that white noise static is a lion). Specifically, we take convolutional neural networks trained to perform well on either the ImageNet or MNIST datasets and then find images with evolutionary algorithms or gradient ascent that DNNs label with high confidence as belonging to each dataset class. It is possible to produce images totally unrecognizable to human eyes that DNNs believe with near certainty are familiar objects, which we call “fooling images” (more generally, fooling examples). Our results shed light on interesting differences between human vision and current DNNs, and raise questions about the generality of DNN computer vision.
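The gradient-ascent route to such fooling images can be shown at toy scale: freeze a classifier's weights and ascend its confidence with respect to the input. The single logistic unit below is a made-up stand-in for the ImageNet/MNIST networks the paper actually uses:

```python
import math

# Toy stand-in for a trained classifier: one logistic unit with
# fixed, made-up weights (the paper uses full CNNs, not this).
w = [(-1) ** i * (0.3 + 0.05 * i) for i in range(16)]

def confidence(x):
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1 / (1 + math.exp(-z))

# Gradient ascent on the INPUT while the weights stay frozen --
# the same recipe the paper uses to synthesize images that the
# network labels with near certainty.
x = [0.0] * 16                       # start from a blank "image"
for _ in range(200):
    p = confidence(x)
    grad = [p * (1 - p) * wi for wi in w]   # d(confidence)/d(x_i)
    x = [xi + 0.5 * gi for xi, gi in zip(x, grad)]

print(confidence(x) > 0.99)          # True: near-certain on junk input
```

Nothing constrains the ascended input to look like anything meaningful, which is exactly the gap between "high confidence" and "recognizable to a human" that the paper exposes.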

Journal ArticleDOI
14 Oct 2016-Science
TL;DR: This work shows that the small and oxidation-stable rubidium cation (Rb+) can be embedded into a “cation cascade” to create perovskite materials with excellent material properties and achieved stabilized efficiencies of up to 21.6% on small areas.
Abstract: All of the cations currently used in perovskite solar cells abide by the tolerance factor for incorporation into the lattice. We show that the small and oxidation-stable rubidium cation (Rb + ) can be embedded into a “cation cascade” to create perovskite materials with excellent material properties. We achieved stabilized efficiencies of up to 21.6% (average value, 20.2%) on small areas (and a stabilized 19.0% on a cell 0.5 square centimeters in area) as well as an electroluminescence of 3.8%. The open-circuit voltage of 1.24 volts at a band gap of 1.63 electron volts leads to a loss in potential of 0.39 volts, versus 0.4 volts for commercial silicon cells. Polymer-coated cells maintained 95% of their initial performance at 85°C for 500 hours under full illumination and maximum power point tracking.

Journal ArticleDOI
TL;DR: At the global level, DALYs and HALE continue to show improvements, underscoring the importance of continued health interventions; the burden of disease has changed in most locations in pace with gross domestic product per person, education, and family planning.

Posted Content
TL;DR: It is shown that it is possible to overcome this limitation of connectionist models and train networks that can maintain expertise on tasks that they have not experienced for a long time, by selectively slowing down learning on the weights important for previous tasks.
Abstract: The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST handwritten digit dataset and by learning several Atari 2600 games sequentially.
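The "selectively slowing down learning" described above (elastic weight consolidation in the full paper) boils down to a quadratic penalty that anchors each weight to its old-task value, scaled by how important that weight was. A minimal sketch, with made-up numbers for illustration:

```python
# Minimal sketch of the paper's core idea: while training on a new
# task, penalize movement of each weight away from its old-task
# value, weighted by its importance (Fisher information in the
# full method). All numbers below are made up for illustration.

def ewc_loss(new_task_loss, weights, old_weights, importance, lam=1.0):
    penalty = sum(f * (w - w_old) ** 2
                  for w, w_old, f in zip(weights, old_weights, importance))
    return new_task_loss + (lam / 2) * penalty

old_w = [1.0, -2.0]
fisher = [5.0, 0.01]      # first weight mattered a lot on the old task

# Moving the IMPORTANT weight by 1.0 is punished far more than
# moving the unimportant one by the same amount -- this asymmetry
# is what slows forgetting.
loss_move_important = ewc_loss(0.0, [2.0, -2.0], old_w, fisher)
loss_move_unimportant = ewc_loss(0.0, [1.0, -1.0], old_w, fisher)
print(loss_move_important, loss_move_unimportant)   # 2.5 0.005
```

Weights that were critical for old tasks become stiff while unimportant ones stay free to adapt, which is how one network accumulates expertise across tasks.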

01 Jan 2015
TL;DR: The Penn World Table (PWT), as described in this paper, provides real GDP comparisons across countries and over time; version 8 of PWT expands on previous versions in three respects.
Abstract: We describe the theory and practice of real GDP comparisons across countries and over time. Effective with version 8, the Penn World Table (PWT) will be taken over by the University of California, Davis and the University of Groningen, with continued input from Alan Heston at the University of Pennsylvania. Version 8 will expand on previous versions of PWT in three respects. First, it will distinguish real GDP on the expenditure side from real GDP on the output side, which differ by the terms of trade faced by countries. Second, it will distinguish growth rates of GDP based on national accounts data from growth rates that are benchmarked in multiple years to cross-country price data. Third, data on capital stocks will be reintroduced. Some illustrative results from PWT version 8 are discussed, including new results that show how the Penn effect is not emergent but a stable relationship over time.

Journal ArticleDOI
TL;DR: Comparing the performance of UMAP with five other tools, it is found that UMAP provides the fastest run times, highest reproducibility and the most meaningful organization of cell clusters.
Abstract: Advances in single-cell technologies have enabled high-resolution dissection of tissue composition. Several tools for dimensionality reduction are available to analyze the large number of parameters generated in single-cell studies. Recently, a nonlinear dimensionality-reduction technique, uniform manifold approximation and projection (UMAP), was developed for the analysis of any type of high-dimensional data. Here we apply it to biological data, using three well-characterized mass cytometry and single-cell RNA sequencing datasets. Comparing the performance of UMAP with five other tools, we find that UMAP provides the fastest run times, highest reproducibility and the most meaningful organization of cell clusters. The work highlights the use of UMAP for improved visualization and interpretation of single-cell data.

Proceedings ArticleDOI
07 Aug 2019
TL;DR: CutMix as discussed by the authors augments the training data by cutting and pasting patches among training images, where the ground truth labels are also mixed proportionally to the area of the patches.
Abstract: Regional dropout strategies have been proposed to enhance performance of convolutional neural network classifiers. They have proved to be effective for guiding the model to attend to less discriminative parts of objects (e.g. leg as opposed to head of a person), thereby letting the network generalize better and have better object localization capabilities. On the other hand, current methods for regional dropout remove informative pixels on training images by overlaying a patch of either black pixels or random noise. Such removal is not desirable because it suffers from information loss, causing inefficiency in training. We therefore propose the CutMix augmentation strategy: patches are cut and pasted among training images where the ground truth labels are also mixed proportionally to the area of the patches. By making efficient use of training pixels and retaining the regularization effect of regional dropout, CutMix consistently outperforms state-of-the-art augmentation strategies on CIFAR and ImageNet classification tasks, as well as on the ImageNet weakly-supervised localization task. Moreover, unlike previous augmentation methods, our CutMix-trained ImageNet classifier, when used as a pretrained model, results in consistent performance gain in Pascal detection and MS-COCO image captioning benchmarks. We also show that CutMix can improve the model robustness against input corruptions and its out-of-distribution detection performance.
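The cut-and-paste-with-mixed-labels recipe above can be sketched in a few lines of plain Python on 2-D lists (the real method works on image tensors and samples the mixing ratio from a Beta(alpha, alpha) distribution; with alpha = 1 that is uniform, as used below):

```python
import random

def cutmix(img_a, img_b, label_a, label_b, rng=random.Random(0)):
    """Paste a random rectangle from img_b into img_a and mix the
    labels in proportion to the pasted area -- the CutMix recipe,
    shown on plain 2-D lists instead of image tensors."""
    h, w = len(img_a), len(img_a[0])
    lam = rng.random()                    # mixing ratio ~ Beta(1, 1)
    cut_h = int(h * (1 - lam) ** 0.5)     # rectangle area ~ (1-lam)*h*w
    cut_w = int(w * (1 - lam) ** 0.5)
    y0 = rng.randrange(h - cut_h + 1)     # random rectangle position
    x0 = rng.randrange(w - cut_w + 1)
    mixed = [row[:] for row in img_a]
    for y in range(y0, y0 + cut_h):
        for x in range(x0, x0 + cut_w):
            mixed[y][x] = img_b[y][x]
    area = cut_h * cut_w / (h * w)        # actual pasted fraction
    mixed_label = [(1 - area) * la + area * lb
                   for la, lb in zip(label_a, label_b)]
    return mixed, mixed_label

a = [[0.0] * 8 for _ in range(8)]         # all-zero "image", class 0
b = [[1.0] * 8 for _ in range(8)]         # all-one "image", class 1
mixed, label = cutmix(a, b, [1.0, 0.0], [0.0, 1.0])
ones = sum(sum(row) for row in mixed) / 64
print(ones == label[1])                   # pasted area matches the label mix
```

Because the two toy images are all-zeros and all-ones, the fraction of ones in the mixed image equals the mixed label weight exactly, making the proportional-label rule easy to verify.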

Posted Content
TL;DR: XLNet is proposed, a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and overcomes the limitations of BERT thanks to its autoregressive formulation.
Abstract: With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.
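The permutation-based factorization can be made concrete by building the attention constraint for one sampled order: a token may attend only to tokens that appear no later than it in the sampled order. This is a minimal sketch of that masking rule (the content-stream variant, where a token may also see itself); `permutation_mask` is a hypothetical helper, not XLNet's actual implementation.

```python
import numpy as np

def permutation_mask(order):
    """Build a content-attention mask for one sampled factorization order.
    mask[i, j] is True when position i may attend to position j, i.e. when
    j is predicted at the same step as i or earlier in the sampled order."""
    order = np.asarray(order)
    t = len(order)
    rank = np.empty(t, dtype=int)
    rank[order] = np.arange(t)            # rank[pos] = step at which pos is predicted
    # position i can see position j iff rank[j] <= rank[i]
    return rank[None, :] <= rank[:, None]
```

For the natural left-to-right order this reduces to the familiar causal (lower-triangular) mask; averaging the training objective over random orders is what exposes each position to bidirectional context.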

Journal ArticleDOI
TL;DR: Nivolumab had substantial therapeutic activity and an acceptable safety profile in patients with previously heavily treated relapsed or refractory Hodgkin's lymphoma.
Abstract: BACKGROUND Preclinical studies suggest that Reed–Sternberg cells exploit the programmed death 1 (PD-1) pathway to evade immune detection. In classic Hodgkin’s lymphoma, alterations in chromosome 9p24.1 increase the abundance of the PD-1 ligands, PD-L1 and PD-L2, and promote their induction through Janus kinase (JAK)–signal transducer and activator of transcription (STAT) signaling. We hypothesized that nivolumab, a PD-1–blocking antibody, could inhibit tumor immune evasion in patients with relapsed or refractory Hodgkin’s lymphoma. METHODS In this ongoing study, 23 patients with relapsed or refractory Hodgkin’s lymphoma that had already been heavily treated received nivolumab (at a dose of 3 mg per kilogram of body weight) every 2 weeks until they had a complete response, tumor progression, or excessive toxic effects. Study objectives were measurement of safety and efficacy and assessment of the PDL1 and PDL2 (also called CD274 and PDCD1LG2, respectively) loci and PD-L1 and PD-L2 protein expression. RESULTS Of the 23 study patients, 78% were enrolled in the study after a relapse following autologous stem-cell transplantation and 78% after a relapse following the receipt of brentuximab vedotin. Drug-related adverse events of any grade and of grade 3 occurred in 78% and 22% of patients, respectively. An objective response was reported in 20 patients (87%), including 17% with a complete response and 70% with a partial response; the remaining 3 patients (13%) had stable disease. The rate of progression-free survival at 24 weeks was 86%; 11 patients were continuing to participate in the study. Reasons for discontinuation included stem-cell transplantation (in 6 patients), disease progression (in 4 patients), and drug toxicity (in 2 patients). Analyses of pretreatment tumor specimens from 10 patients revealed copy-number gains in PDL1 and PDL2 and increased expression of these ligands. 
Reed–Sternberg cells showed nuclear positivity of phosphorylated STAT3, indicative of active JAK-STAT signaling. CONCLUSIONS Nivolumab had substantial therapeutic activity and an acceptable safety profile in patients with previously heavily treated relapsed or refractory Hodgkin’s lymphoma. (Funded by Bristol-Myers Squibb and others; ClinicalTrials.gov number, NCT01592370.)

Proceedings Article
H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Aguera y Arcas
10 Apr 2017
TL;DR: In this paper, the authors presented a decentralized approach for federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation, considering five different model architectures and four datasets.
Abstract: Modern mobile devices have access to a wealth of data suitable for learning models, which in turn can greatly improve the user experience on the device. For example, language models can improve speech recognition and text entry, and image models can automatically select good photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may preclude logging to the data center and training there using conventional approaches. We advocate an alternative that leaves the training data distributed on the mobile devices, and learns a shared model by aggregating locally-computed updates. We term this decentralized approach Federated Learning. We present a practical method for the federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation, considering five different model architectures and four datasets. These experiments demonstrate the approach is robust to the unbalanced and non-IID data distributions that are a defining characteristic of this setting. Communication costs are the principal constraint, and we show a reduction in required communication rounds by 10-100x as compared to synchronized stochastic gradient descent.
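The iterative model averaging at the core of this approach can be sketched as a server loop: each round, every client refines the current global weights on its own data, and the server averages the results weighted by each client's example count. This is a simplified illustration, not the paper's FederatedAveraging pseudocode; `local_step` stands in for the local SGD epochs, and all clients participate every round (the paper samples a fraction).

```python
import numpy as np

def federated_averaging(global_w, client_data, local_step, rounds=10):
    """Minimal FedAvg sketch. client_data is a list of (x, y) shards that
    never leave the clients; only updated weights are sent to the server."""
    for _ in range(rounds):
        sizes, updates = [], []
        for x, y in client_data:
            w = local_step(global_w, x, y)   # local training on the device
            sizes.append(len(x))
            updates.append(w)
        total = sum(sizes)
        # Server: weighted average of client models by number of examples
        global_w = sum((n / total) * w for n, w in zip(sizes, updates))
    return global_w
```

Weighting by example count means a client holding three times the data pulls the average three times as hard, which is what makes the scheme tolerant of the unbalanced shards the abstract describes.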

Journal ArticleDOI
05 Mar 2018-Nature
TL;DR: It is shown experimentally that when the twist angle between the two graphene layers is close to the ‘magic’ angle the electronic band structure near zero Fermi energy becomes flat, owing to strong interlayer coupling, and these flat bands exhibit insulating states at half-filling, which are not expected in the absence of correlations between electrons.
Abstract: A van der Waals heterostructure is a type of metamaterial that consists of vertically stacked two-dimensional building blocks held together by the van der Waals forces between the layers. This design means that the properties of van der Waals heterostructures can be engineered precisely, even more so than those of two-dimensional materials. One such property is the 'twist' angle between different layers in the heterostructure. This angle has a crucial role in the electronic properties of van der Waals heterostructures, but does not have a direct analogue in other types of heterostructure, such as semiconductors grown using molecular beam epitaxy. For small twist angles, the moiré pattern that is produced by the lattice misorientation between the two-dimensional layers creates long-range modulation of the stacking order. So far, studies of the effects of the twist angle in van der Waals heterostructures have concentrated mostly on heterostructures consisting of monolayer graphene on top of hexagonal boron nitride, which exhibit relatively weak interlayer interaction owing to the large bandgap in hexagonal boron nitride. Here we study a heterostructure consisting of bilayer graphene, in which the two graphene layers are twisted relative to each other by a certain angle. We show experimentally that, as predicted theoretically, when this angle is close to the 'magic' angle the electronic band structure near zero Fermi energy becomes flat, owing to strong interlayer coupling. These flat bands exhibit insulating states at half-filling, which are not expected in the absence of correlations between electrons. We show that these correlated states at half-filling are consistent with Mott-like insulator states, which can arise from electrons being localized in the superlattice that is induced by the moiré pattern.
These properties of magic-angle-twisted bilayer graphene heterostructures suggest that these materials could be used to study other exotic many-body quantum phases in two dimensions in the absence of a magnetic field. The accessibility of the flat bands through electrical tunability and the bandwidth tunability through the twist angle could pave the way towards more exotic correlated systems, such as unconventional superconductors and quantum spin liquids.
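For a sense of the length scales involved, the period of the moiré superlattice described above follows the standard small-twist geometric relation λ = a / (2 sin(θ/2)) for two identical lattices twisted by θ. This sketch (the relation and the graphene lattice constant are textbook values, not taken from this paper) evaluates it near the magic angle.

```python
import numpy as np

A_GRAPHENE_NM = 0.246  # graphene lattice constant, in nanometres

def moire_period_nm(theta_deg, a_nm=A_GRAPHENE_NM):
    """Moire superlattice period lambda = a / (2 sin(theta/2)) for two
    identical lattices twisted by theta degrees."""
    theta = np.radians(theta_deg)
    return a_nm / (2.0 * np.sin(theta / 2.0))
```

Near the magic angle of about 1.1 degrees this gives a period of roughly 13 nm, some fifty lattice constants, which is why the moiré superlattice can localize electrons and produce the flat bands the abstract reports.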