
Showing papers by "Massachusetts Institute of Technology" published in 2017


Journal ArticleDOI
B. P. Abbott, Richard J. Abbott, T. D. Abbott, Fausto Acernese, +1131 more · Institutions (123)
TL;DR: The association of GRB 170817A, detected by Fermi-GBM 1.7 s after the coalescence, corroborates the hypothesis of a neutron star merger and provides the first direct evidence of a link between these mergers and short γ-ray bursts.
Abstract: On August 17, 2017 at 12:41:04 UTC, the Advanced LIGO and Advanced Virgo gravitational-wave detectors made their first observation of a binary neutron star inspiral. The signal, GW170817, was detected with a combined signal-to-noise ratio of 32.4 and a false-alarm-rate estimate of less than one per 8.0×10^{4} years. We infer the component masses of the binary to be between 0.86 and 2.26 M_{⊙}, in agreement with masses of known neutron stars. Restricting the component spins to the range inferred in binary neutron stars, we find the component masses to be in the range 1.17-1.60 M_{⊙}, with the total mass of the system 2.74_{-0.01}^{+0.04} M_{⊙}. The source was localized within a sky region of 28 deg^{2} (90% probability) and had a luminosity distance of 40_{-14}^{+8} Mpc, the closest and most precisely localized gravitational-wave signal yet. The association with the γ-ray burst GRB 170817A, detected by Fermi-GBM 1.7 s after the coalescence, corroborates the hypothesis of a neutron star merger and provides the first direct evidence of a link between these mergers and short γ-ray bursts. Subsequent identification of transient counterparts across the electromagnetic spectrum in the same location further supports the interpretation of this event as a neutron star merger. This unprecedented joint gravitational and electromagnetic observation provides insight into astrophysics, dense matter, gravitation, and cosmology.

7,327 citations


Posted Content
TL;DR: This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
Abstract: Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples---inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. Code and pre-trained models are available at this https URL and this https URL.
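The inner maximization of the robust-optimization view is commonly approximated with projected gradient descent (PGD) in an L∞ ball around the input. A minimal NumPy sketch of such an inner attack step, assuming a caller-supplied `grad_fn` that returns the loss gradient with respect to the input; the function name and defaults are illustrative, not the paper's released code:

```python
import numpy as np

def pgd_attack(grad_fn, x, eps=0.3, alpha=0.05, steps=40, rng=None):
    """Ascend the loss within an L-infinity ball of radius eps around x."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Random start inside the ball, as used in PGD-style attacks
    x_adv = x + rng.uniform(-eps, eps, x.shape)
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))  # signed gradient step
        x_adv = np.clip(x_adv, x - eps, x + eps)         # project back into the ball
    return x_adv
```

Adversarial training then minimizes the loss on `x_adv` instead of `x`, which is the min-max formulation the abstract refers to.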

5,789 citations


Journal ArticleDOI
07 Jun 2017-Nature
TL;DR: Xu et al. as mentioned in this paper used magneto-optical Kerr effect microscopy to show that monolayer chromium triiodide (CrI3) is an Ising ferromagnet with out-of-plane spin orientation.
Abstract: Magneto-optical Kerr effect microscopy is used to show that monolayer chromium triiodide is an Ising ferromagnet with out-of-plane spin orientation. The question of what happens to the properties of a material when it is thinned down to atomic-scale thickness has for a long time been a largely hypothetical one. In the past decade, new experimental methods have made it possible to isolate and measure a range of two-dimensional structures, enabling many theoretical predictions to be tested. But it has been a particular challenge to observe intrinsic magnetic effects, which could shed light on the longstanding fundamental question of whether intrinsic long-range magnetic order can robustly exist in two dimensions. In this issue of Nature, two groups address this challenge and report ferromagnetism in atomically thin crystals. Xiang Zhang and colleagues measured atomic layers of Cr2Ge2Te6 and observed ferromagnetic ordering with a transition temperature that, unusually, can be controlled using small magnetic fields. Xiaodong Xu and colleagues measured atomic layers of CrI3 and observed ferromagnetic ordering that, remarkably, was suppressed in double layers of CrI3, but restored in triple layers. The two studies demonstrate a platform with which to test fundamental properties of purely two-dimensional magnets. Since the discovery of graphene1, the family of two-dimensional materials has grown, displaying a broad range of electronic properties. Recent additions include semiconductors with spin–valley coupling2, Ising superconductors3,4,5 that can be tuned into a quantum metal6, possible Mott insulators with tunable charge-density waves7, and topological semimetals with edge transport8,9. However, no two-dimensional crystal with intrinsic magnetism has yet been discovered10,11,12,13,14; such a crystal would be useful in many technologies from sensing to data storage15. 
Theoretically, magnetic order is prohibited in the two-dimensional isotropic Heisenberg model at finite temperatures by the Mermin–Wagner theorem16. Magnetic anisotropy removes this restriction, however, and enables, for instance, the occurrence of two-dimensional Ising ferromagnetism. Here we use magneto-optical Kerr effect microscopy to demonstrate that monolayer chromium triiodide (CrI3) is an Ising ferromagnet with out-of-plane spin orientation. Its Curie temperature of 45 kelvin is only slightly lower than that of the bulk crystal, 61 kelvin, which is consistent with a weak interlayer coupling. Moreover, our studies suggest a layer-dependent magnetic phase, highlighting thickness-dependent physical properties typical of van der Waals crystals17,18,19. Remarkably, bilayer CrI3 displays suppressed magnetization with a metamagnetic effect20, whereas in trilayer CrI3 the interlayer ferromagnetism observed in the bulk crystal is restored. This work creates opportunities for studying magnetism by harnessing the unusual features of atomically thin materials, such as electrical control for realizing magnetoelectronics12, and van der Waals engineering to produce interface phenomena15.

3,802 citations


Proceedings Article
25 Oct 2017
TL;DR: This work proposes mixup, a simple learning principle that trains a neural network on convex combinations of pairs of examples and their labels, which improves the generalization of state-of-the-art neural network architectures.
Abstract: Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
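The convex-combination rule is simple enough to sketch directly (a minimal NumPy illustration with hypothetical names, assuming one-hot label vectors; not the authors' released code):

```python
import numpy as np

def mixup_batch(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix two batches: lam ~ Beta(alpha, alpha) weighs inputs and labels alike."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2   # convex combination of inputs
    y = lam * y1 + (1 - lam) * y2   # same combination of one-hot labels
    return x, y
```

In practice the second batch is just a shuffled copy of the first, so mixup adds almost no cost per training step.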

3,787 citations


Journal ArticleDOI
B. P. Abbott, Richard J. Abbott, T. D. Abbott, Fausto Acernese, +1195 more · Institutions (139)
TL;DR: In this paper, the authors used the observed time delay of $(+1.74\pm 0.05)\,{\rm{s}}$ between GRB 170817A and GW170817 to constrain the difference between the speed of gravity and the speed of light to be between $-3\times {10}^{-15}$ and $+7\times {10}^{-16}$ times the speed of light.
Abstract: On 2017 August 17, the gravitational-wave event GW170817 was observed by the Advanced LIGO and Virgo detectors, and the gamma-ray burst (GRB) GRB 170817A was observed independently by the Fermi Gamma-ray Burst Monitor, and the Anti-Coincidence Shield for the Spectrometer for the International Gamma-Ray Astrophysics Laboratory. The probability of the near-simultaneous temporal and spatial observation of GRB 170817A and GW170817 occurring by chance is $5.0\times {10}^{-8}$. We therefore confirm binary neutron star mergers as a progenitor of short GRBs. The association of GW170817 and GRB 170817A provides new insight into fundamental physics and the origin of short GRBs. We use the observed time delay of $(+1.74\pm 0.05)\,{\rm{s}}$ between GRB 170817A and GW170817 to: (i) constrain the difference between the speed of gravity and the speed of light to be between $-3\times {10}^{-15}$ and $+7\times {10}^{-16}$ times the speed of light, (ii) place new bounds on the violation of Lorentz invariance, (iii) present a new test of the equivalence principle by constraining the Shapiro delay between gravitational and electromagnetic radiation. We also use the time delay to constrain the size and bulk Lorentz factor of the region emitting the gamma-rays. GRB 170817A is the closest short GRB with a known distance, but is between 2 and 6 orders of magnitude less energetic than other bursts with measured redshift. A new generation of gamma-ray detectors, and subthreshold searches in existing detectors, will be essential to detect similar short bursts at greater distances. Finally, we predict a joint detection rate for the Fermi Gamma-ray Burst Monitor and the Advanced LIGO and Virgo detectors of 0.1–1.4 per year during the 2018–2019 observing run and 0.3–1.7 per year at design sensitivity.
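The quoted speed bound follows from order-of-magnitude arithmetic: if the entire observed delay Δt were due to a propagation-speed difference over distance D, then |Δv|/c ≈ c·Δt/D. A sketch in Python, assuming the conservative ~26 Mpc lower-bound distance used in this style of analysis (the distance value here is an assumption for illustration, not quoted from this listing):

```python
# Order-of-magnitude check of the speed-of-gravity constraint.
C = 2.998e8           # speed of light, m/s
MPC = 3.086e22        # metres per megaparsec
DT = 1.74             # observed GRB-GW delay, s
D_LOW = 26 * MPC      # assumed conservative lower bound on source distance, m

# Fractional speed difference if the whole delay is due to propagation
frac = C * DT / D_LOW
print(f"|v_gw - v_light|/c <~ {frac:.1e}")
```

This lands near the $+7\times 10^{-16}$ upper bound quoted in the abstract; the $-3\times 10^{-15}$ side additionally folds in an assumed maximum intrinsic delay (of order 10 s) between merger and gamma-ray emission.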

2,633 citations


Journal ArticleDOI
TL;DR: This book is dedicated to the memory of those who have served in the armed forces and their families during the conflicts of the twentieth century.

2,628 citations


Journal ArticleDOI
B. P. Abbott, Richard J. Abbott, T. D. Abbott, Fausto Acernese, +1062 more · Institutions (115)
TL;DR: The magnitude of modifications to the gravitational-wave dispersion relation is constrain, the graviton mass is bound to m_{g}≤7.7×10^{-23} eV/c^{2} and null tests of general relativity are performed, finding that GW170104 is consistent with general relativity.
Abstract: We describe the observation of GW170104, a gravitational-wave signal produced by the coalescence of a pair of stellar-mass black holes. The signal was measured on January 4, 2017 at 10:11:58.6 UTC by the twin advanced detectors of the Laser Interferometer Gravitational-Wave Observatory during their second observing run, with a network signal-to-noise ratio of 13 and a false alarm rate less than 1 in 70 000 years. The inferred component black hole masses are 31.2^{+8.4}_{−6.0} M_⊙ and 19.4^{+5.3}_{−5.9} M_⊙ (at the 90% credible level). The black hole spins are best constrained through measurement of the effective inspiral spin parameter, a mass-weighted combination of the spin components perpendicular to the orbital plane, χ_eff = −0.12^{+0.21}_{−0.30}. This result implies that spin configurations with both component spins positively aligned with the orbital angular momentum are disfavored. The source luminosity distance is 880^{+450}_{−390} Mpc, corresponding to a redshift of z = 0.18^{+0.08}_{−0.07}. We constrain the magnitude of modifications to the gravitational-wave dispersion relation and perform null tests of general relativity. Assuming that gravitons are dispersed in vacuum like massive particles, we bound the graviton mass to m_g ≤ 7.7 × 10^{−23} eV/c^2. In all cases, we find that GW170104 is consistent with general relativity.

2,569 citations


Journal ArticleDOI
20 Nov 2017
TL;DR: In this paper, the authors provide a comprehensive tutorial and survey of recent advances toward enabling efficient processing of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of deep neural networks, either solely via hardware design changes or via joint hardware and DNN algorithm changes.
Abstract: Deep neural networks (DNNs) are currently widely used for many artificial intelligence (AI) applications including computer vision, speech recognition, and robotics. While DNNs deliver state-of-the-art accuracy on many AI tasks, it comes at the cost of high computational complexity. Accordingly, techniques that enable efficient processing of DNNs to improve energy efficiency and throughput without sacrificing application accuracy or increasing hardware cost are critical to the wide deployment of DNNs in AI systems. This article aims to provide a comprehensive tutorial and survey about the recent advances toward the goal of enabling efficient processing of DNNs. Specifically, it will provide an overview of DNNs, discuss various hardware platforms and architectures that support DNNs, and highlight key trends in reducing the computation cost of DNNs either solely via hardware design changes or via joint hardware design and DNN algorithm changes. It will also summarize various development resources that enable researchers and practitioners to quickly get started in this field, and highlight important benchmarking metrics and design considerations that should be used for evaluating the rapidly growing number of DNN hardware designs, optionally including algorithmic codesigns, being proposed in academia and industry. The reader will take away the following concepts from this article: understand the key design considerations for DNNs; be able to evaluate different DNN hardware implementations with benchmarks and comparison metrics; understand the tradeoffs between various hardware architectures and platforms; be able to evaluate the utility of various DNN design techniques for efficient processing; and understand recent implementation trends and opportunities.

2,391 citations


Proceedings ArticleDOI
21 Jul 2017
TL;DR: The ADE20K dataset, spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts, is introduced and it is shown that the trained scene parsing networks can lead to applications such as image content removal and scene synthesis.
Abstract: Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the community's efforts in data collection, there are still few image datasets covering a wide range of scenes and object categories with dense and detailed annotations for scene parsing. In this paper, we introduce and analyze the ADE20K dataset, spanning diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. A scene parsing benchmark is built upon the ADE20K with 150 object and stuff classes included. Several segmentation baseline models are evaluated on the benchmark. A novel network design called Cascade Segmentation Module is proposed to parse a scene into stuff, objects, and object parts in a cascade and improve over the baselines. We further show that the trained scene parsing networks can lead to applications such as image content removal and scene synthesis.

2,233 citations


Journal ArticleDOI
TL;DR: Eyeriss as mentioned in this paper is an accelerator for state-of-the-art deep convolutional neural networks (CNNs) that optimizes for the energy efficiency of the entire system, including the accelerator chip and off-chip DRAM, by reconfiguring the architecture.
Abstract: Eyeriss is an accelerator for state-of-the-art deep convolutional neural networks (CNNs). It optimizes for the energy efficiency of the entire system, including the accelerator chip and off-chip DRAM, for various CNN shapes by reconfiguring the architecture. CNNs are widely used in modern AI systems but also bring challenges on throughput and energy efficiency to the underlying hardware. This is because their computation requires a large amount of data, creating significant data movement between on-chip and off-chip memory that is more energy-consuming than the computation itself. Minimizing data movement energy cost for any CNN shape, therefore, is the key to high throughput and energy efficiency. Eyeriss achieves these goals by using a proposed processing dataflow, called row stationary (RS), on a spatial architecture with 168 processing elements. RS dataflow reconfigures the computation mapping of a given shape, which optimizes energy efficiency by maximally reusing data locally to reduce expensive data movement, such as DRAM accesses. Compression and data gating are also applied to further improve energy efficiency. Eyeriss processes the convolutional layers at 35 frames/s and 0.0029 DRAM accesses/multiply-and-accumulate (MAC) for AlexNet at 278 mW (batch size $N = 4$), and 0.7 frames/s and 0.0035 DRAM accesses/MAC for VGG-16 at 236 mW ($N = 3$).

2,165 citations


Journal ArticleDOI
13 Sep 2017-Nature
TL;DR: The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers.
Abstract: Recent progress implies that a crossover between machine learning and quantum information processing benefits both fields. Traditional machine learning has dramatically improved the benchmarking an ...

Journal ArticleDOI
12 Dec 2017-JAMA
TL;DR: In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints.
Abstract: Importance: Application of deep learning algorithms to whole-slide pathology images can potentially improve diagnostic accuracy and efficiency.
Objective: Assess the performance of automated deep learning algorithms at detecting metastases in hematoxylin and eosin–stained tissue sections of lymph nodes of women with breast cancer and compare it with pathologists' diagnoses in a diagnostic setting.
Design, Setting, and Participants: Researcher challenge competition (CAMELYON16) to develop automated solutions for detecting lymph node metastases (November 2015-November 2016). A training data set of whole-slide images from 2 centers in the Netherlands with (n = 110) and without (n = 160) nodal metastases verified by immunohistochemical staining was provided to challenge participants to build algorithms. Algorithm performance was evaluated in an independent test set of 129 whole-slide images (49 with and 80 without metastases). The same test set of corresponding glass slides was also evaluated by a panel of 11 pathologists with time constraint (WTC) from the Netherlands to ascertain likelihood of nodal metastases for each slide in a flexible 2-hour session, simulating routine pathology workflow, and by 1 pathologist without time constraint (WOTC).
Exposures: Deep learning algorithms submitted as part of a challenge competition or pathologist interpretation.
Main Outcomes and Measures: The presence of specific metastatic foci and the absence vs presence of lymph node metastasis in a slide or image using receiver operating characteristic curve analysis. The 11 pathologists participating in the simulation exercise rated their diagnostic confidence as definitely normal, probably normal, equivocal, probably tumor, or definitely tumor.
Results: The area under the receiver operating characteristic curve (AUC) for the algorithms ranged from 0.556 to 0.994. The top-performing algorithm achieved a lesion-level, true-positive fraction comparable with that of the pathologist WOTC (72.4% [95% CI, 64.3%-80.4%]) at a mean of 0.0125 false-positives per normal whole-slide image. For the whole-slide image classification task, the best algorithm (AUC, 0.994 [95% CI, 0.983-0.999]) performed significantly better than the pathologists WTC in a diagnostic simulation (mean AUC, 0.810 [range, 0.738-0.884]; P …).
Conclusions and Relevance: In the setting of a challenge competition, some deep learning algorithms achieved better diagnostic performance than a panel of 11 pathologists participating in a simulation exercise designed to mimic routine pathology workflow; algorithm performance was comparable with an expert pathologist interpreting whole-slide images without time constraints. Whether this approach has clinical utility will require evaluation in a clinical setting.

Journal ArticleDOI
29 Nov 2017-Nature
TL;DR: This work demonstrates a method for creating controlled many-body quantum matter that combines deterministically prepared, reconfigurable arrays of individually trapped cold atoms with strong, coherent interactions enabled by excitation to Rydberg states, and realizes a programmable Ising-type quantum spin model with tunable interactions and system sizes of up to 51 qubits.
Abstract: Controllable, coherent many-body systems can provide insights into the fundamental properties of quantum matter, enable the realization of new quantum phases and could ultimately lead to computational systems that outperform existing computers based on classical approaches. Here we demonstrate a method for creating controlled many-body quantum matter that combines deterministically prepared, reconfigurable arrays of individually trapped cold atoms with strong, coherent interactions enabled by excitation to Rydberg states. We realize a programmable Ising-type quantum spin model with tunable interactions and system sizes of up to 51 qubits. Within this model, we observe phase transitions into spatially ordered states that break various discrete symmetries, verify the high-fidelity preparation of these states and investigate the dynamics across the phase transition in large arrays of atoms. In particular, we observe robust many-body dynamics corresponding to persistent oscillations of the order after a rapid quantum quench that results from a sudden transition across the phase boundary. Our method provides a way of exploring many-body phenomena on a programmable quantum simulator and could enable realizations of new quantum algorithms.

Journal ArticleDOI
TL;DR: In this article, a review of recent progress in cognitive science suggests that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it.
Abstract: Recent progress in artificial intelligence has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats that of humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it. Specifically, we argue that these machines should (1) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (2) ground learning in intuitive theories of physics and psychology to support and enrich the knowledge that is learned; and (3) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes toward these goals that can combine the strengths of recent neural network advances with more structured cognitive models.

Journal ArticleDOI
B. P. Abbott, Richard J. Abbott, T. D. Abbott, Fausto Acernese, +1113 more · Institutions (117)
TL;DR: For the first time, the nature of gravitational-wave polarizations from the antenna response of the LIGO-Virgo network is tested, thus enabling a new class of phenomenological tests of gravity.
Abstract: On August 14, 2017 at 10:30:43 UTC, the Advanced Virgo detector and the two Advanced LIGO detectors coherently observed a transient gravitational-wave signal produced by the coalescence of two stellar mass black holes, with a false-alarm rate of ≲1 in 27 000 years. The signal was observed with a three-detector network matched-filter signal-to-noise ratio of 18. The inferred masses of the initial black holes are 30.5^{+5.7}_{−3.0} M⊙ and 25.3^{+2.8}_{−4.2} M⊙ (at the 90% credible level). The luminosity distance of the source is 540^{+130}_{−210} Mpc, corresponding to a redshift of z = 0.11^{+0.03}_{−0.04}. A network of three detectors improves the sky localization of the source, reducing the area of the 90% credible region from 1160 deg^2 using only the two LIGO detectors to 60 deg^2 using all three detectors. For the first time, we can test the nature of gravitational-wave polarizations from the antenna response of the LIGO-Virgo network, thus enabling a new class of phenomenological tests of gravity.

Journal ArticleDOI
01 Jul 2017
TL;DR: A new architecture for a fully optical neural network is demonstrated that enables a computational speed enhancement of at least two orders of magnitude and a power efficiency improvement of three orders of magnitude over state-of-the-art electronics.
Abstract: Artificial Neural Networks have dramatically improved performance for many machine learning tasks. We demonstrate a new architecture for a fully optical neural network that enables a computational speed enhancement of at least two orders of magnitude and three orders of magnitude in power efficiency over state-of-the-art electronics.

Journal ArticleDOI
28 Apr 2017-Science
TL;DR: A Cas13a-based molecular detection platform, termed Specific High-Sensitivity Enzymatic Reporter UnLOCKing (SHERLOCK), is used to detect specific strains of Zika and Dengue virus, distinguish pathogenic bacteria, genotype human DNA, and identify mutations in cell-free tumor DNA.
Abstract: Rapid, inexpensive, and sensitive nucleic acid detection may aid point-of-care pathogen detection, genotyping, and disease monitoring. The RNA-guided, RNA-targeting clustered regularly interspaced short palindromic repeats (CRISPR) effector Cas13a (previously known as C2c2) exhibits a “collateral effect” of promiscuous ribonuclease activity upon target recognition. We combine the collateral effect of Cas13a with isothermal amplification to establish a CRISPR-based diagnostic (CRISPR-Dx), providing rapid DNA or RNA detection with attomolar sensitivity and single-base mismatch specificity. We use this Cas13a-based molecular detection platform, termed Specific High-Sensitivity Enzymatic Reporter UnLOCKing (SHERLOCK), to detect specific strains of Zika and Dengue virus, distinguish pathogenic bacteria, genotype human DNA, and identify mutations in cell-free tumor DNA. Furthermore, SHERLOCK reaction reagents can be lyophilized for cold-chain independence and long-term storage and be readily reconstituted on paper for field applications.

Journal ArticleDOI
09 Feb 2017-Cell
TL;DR: The cellular and molecular mechanisms involved in metastasis are summarized, with a focus on carcinomas where the most is known, and the general principles of metastasis that have begun to emerge are highlighted.

Journal ArticleDOI
TL;DR: This article proposes a vision-based method using a deep architecture of convolutional neural networks (CNNs) for detecting concrete cracks without calculating the defect features, and shows considerably better performance, finding concrete cracks in realistic situations.
Abstract: A number of image processing techniques (IPTs) have been implemented for detecting civil infrastructure defects to partially replace human-conducted onsite inspections. These IPTs are primarily used to manipulate images to extract defect features, such as cracks in concrete and steel surfaces. However, the extensively varying real-world situations (e.g., lighting and shadow changes) can lead to challenges to the wide adoption of IPTs. To overcome these challenges, this article proposes a vision-based method using a deep architecture of convolutional neural networks (CNNs) for detecting concrete cracks without calculating the defect features. As CNNs are capable of learning image features automatically, the proposed method works without the conjugation of IPTs for extracting features. The designed CNN is trained on 40 K images of 256 × 256 pixel resolution and, consequently, achieves about 98% accuracy. The trained CNN is combined with a sliding window technique to scan any image size larger than 256 × 256 pixels. The robustness and adaptability of the proposed approach are tested on 55 images of 5,888 × 3,584 pixel resolution taken from a different structure, which is not used for the training and validation processes, under various conditions (e.g., strong light spots, shadows, and very thin cracks). Comparative studies are conducted to examine the performance of the proposed CNN against traditional Canny and Sobel edge detection methods. The results show that the proposed method performs considerably better and can indeed find concrete cracks in realistic situations.
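The sliding-window scan used to apply a fixed-size classifier to arbitrarily large images can be sketched as follows (an illustrative helper with hypothetical names, not the paper's implementation):

```python
def sliding_windows(height, width, win=256, stride=256):
    """Yield (top, left) corners of win x win crops covering an image."""
    for top in range(0, height - win + 1, stride):
        for left in range(0, width - win + 1, stride):
            yield top, left
```

Each crop is then classified crack/no-crack by the trained CNN; image edges not covered by a full window would need padding or an overlapping stride in practice.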

Proceedings Article
01 Jan 2017
TL;DR: This article showed that deep neural networks can fit a random labeling of the training data, and that this phenomenon is qualitatively unaffected by explicit regularization, and occurs even if the true images are replaced by completely unstructured random noise.
Abstract: Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. We interpret our experimental findings by comparison with traditional models.
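The randomization test at the heart of this experiment simply replaces some fraction of the training labels with uniform random classes before training (an illustrative NumPy helper, not the authors' code):

```python
import numpy as np

def randomize_labels(y, num_classes, frac=1.0, rng=None):
    """Replace a fraction `frac` of the labels in y with uniform random classes."""
    if rng is None:
        rng = np.random.default_rng(0)
    y = np.asarray(y).copy()
    mask = rng.random(len(y)) < frac       # which labels to corrupt
    y[mask] = rng.integers(0, num_classes, mask.sum())
    return y
```

A network that still reaches 100% training accuracy on the randomized labels is memorizing, which is the phenomenon the paper shows standard regularizers do not prevent.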

Journal ArticleDOI
01 Nov 2017-Nature
TL;DR: A meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project is presented, creating both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth’s microbial diversity.
Abstract: Our growing awareness of the microbial world’s importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth’s microbial diversity.

Journal ArticleDOI
TL;DR: In this paper, the authors discuss the link between the epithelial-to-mesenchymal transition (EMT) and the cancer stem cell (CSC) phenotype and discuss how this knowledge can contribute to improvements in clinical practice.
Abstract: The success of anticancer therapy is usually limited by the development of drug resistance. Such acquired resistance is driven, in part, by intratumoural heterogeneity - that is, the phenotypic diversity of cancer cells co-inhabiting a single tumour mass. The introduction of the cancer stem cell (CSC) concept, which posits the presence of minor subpopulations of CSCs that are uniquely capable of seeding new tumours, has provided a framework for understanding one dimension of intratumoural heterogeneity. This concept, taken together with the identification of the epithelial-to-mesenchymal transition (EMT) programme as a critical regulator of the CSC phenotype, offers an opportunity to investigate the nature of intratumoural heterogeneity and a possible mechanistic basis for anticancer drug resistance. In fact, accumulating evidence indicates that conventional therapies often fail to eradicate carcinoma cells that have entered the CSC state via activation of the EMT programme, thereby permitting CSC-mediated clinical relapse. In this Review, we summarize our current understanding of the link between the EMT programme and the CSC state, and also discuss how this knowledge can contribute to improvements in clinical practice.

Journal ArticleDOI
TL;DR: This work shows that Ni3(2,3,6,7,10,11-hexaiminotriphenylene)2 (Ni3(HITP)2), a MOF with high electrical conductivity, can serve as the sole electrode material in an EDLC, the first example of a supercapacitor made entirely from neat MOFs as active materials, without conductive additives or other binders.
Abstract: Using MOFs as active electrodes in electrochemical double layer capacitors has so far proved difficult. An electrically conductive MOF used as an electrode is now shown to exhibit electrochemical performance similar to most carbon-based materials. Owing to their high power density and superior cyclability relative to batteries, electrochemical double layer capacitors (EDLCs) have emerged as an important electrical energy storage technology that will play a critical role in the large-scale deployment of intermittent renewable energy sources, smart power grids, and electrical vehicles1,2,3. Because the capacitance and charge–discharge rates of EDLCs scale with surface area and electrical conductivity, respectively, porous carbons such as activated carbon, carbon nanotubes and crosslinked or holey graphenes are used exclusively as the active electrode materials in EDLCs4,5,6,7,8,9. One class of materials whose surface area far exceeds that of activated carbons, potentially allowing them to challenge the dominance of carbon electrodes in EDLCs, is metal–organic frameworks (MOFs)10. The high porosity of MOFs, however, is conventionally coupled to very poor electrical conductivity, which has thus far prevented the use of these materials as active electrodes in EDLCs. Here, we show that Ni3(2,3,6,7,10,11-hexaiminotriphenylene)2 (Ni3(HITP)2), a MOF with high electrical conductivity11, can serve as the sole electrode material in an EDLC. This is the first example of a supercapacitor made entirely from neat MOFs as active materials, without conductive additives or other binders. The MOF-based device shows an areal capacitance that exceeds those of most carbon-based materials and capacity retention greater than 90% over 10,000 cycles, in line with commercial devices. 
Given the established structural and compositional tunability of MOFs, these results herald the advent of a new generation of supercapacitors whose active electrode materials can be tuned rationally, at the molecular level.

Journal ArticleDOI
05 May 2017-Science
TL;DR: The advances in making hydrogels with improved mechanical strength and greater flexibility for use in a wide range of applications are reviewed, foreseeing opportunities in the further development of more sophisticated fabrication methods that allow better-controlled hydrogel architecture across multiple length scales.
Abstract: BACKGROUND Hydrogels are formed through the cross-linking of hydrophilic polymer chains within an aqueous microenvironment. The gelation can be achieved through a variety of mechanisms, spanning physical entanglement of polymer chains, electrostatic interactions, and covalent chemical cross-linking. The water-rich nature of hydrogels makes them broadly applicable to many areas, including tissue engineering, drug delivery, soft electronics, and actuators. Conventional hydrogels usually possess limited mechanical strength and are prone to permanent breakage. The lack of desired dynamic cues and structural complexity within the hydrogels has further limited their functions. Broadened applications of hydrogels, however, require advanced engineering of parameters such as mechanics and spatiotemporal presentation of active or bioactive moieties, as well as manipulation of multiscale shape, structure, and architecture. ADVANCES Hydrogels with substantially improved physicochemical properties have been enabled by rational design at the molecular level and control over multiscale architecture. For example, formulations that combine permanent polymer networks with reversibly bonding chains for energy dissipation show strong toughness and stretchability. Similar strategies may also substantially enhance the bonding affinity of hydrogels at interfaces with solids by covalently anchoring the polymer networks of tough hydrogels onto solid surfaces. Shear-thinning hydrogels that feature reversible bonds impart a fluidic nature upon application of shear forces and return back to their gel states once the forces are released. Self-healing hydrogels based on nanomaterial hybridization, electrostatic interactions, and slide-ring configurations exhibit an excellent ability to heal themselves spontaneously after damage. 
Additionally, harnessing techniques that can dynamically and precisely configure hydrogels have resulted in flexibility to regulate their architecture, activity, and functionality. Dynamic modulations of polymer chain physics and chemistry can lead to temporal alteration of hydrogel structures in a programmed manner. Three-dimensional printing enables architectural control of hydrogels at high precision, with a potential to further integrate elements that enable change of hydrogel configurations along prescribed paths. OUTLOOK We envision the continuation of innovation in new bioorthogonal chemistries for making hydrogels, enabling their fabrication in the presence of biological species without impairing cellular or biomolecule functions. We also foresee opportunities in the further development of more sophisticated fabrication methods that allow better-controlled hydrogel architecture across multiple length scales. In addition, technologies that precisely regulate the physicochemical properties of hydrogels in spatiotemporally controlled manners are crucial in controlling their dynamics, such as degradation and dynamic presentation of biomolecules. We believe that the fabrication of hydrogels should be coupled with end applications in a feedback loop in order to achieve optimal designs through iterations. In the end, it is the combination of multiscale constituents and complementary strategies that will enable new applications of this important class of materials.

Journal ArticleDOI
TL;DR: The presence of CHIP in peripheral‐blood cells was associated with nearly a doubling in the risk of coronary heart disease in humans and with accelerated atherosclerosis in mice.
Abstract: Background Clonal hematopoiesis of indeterminate potential (CHIP), which is defined as the presence of an expanded somatic blood-cell clone in persons without other hematologic abnormalities, is common among older persons and is associated with an increased risk of hematologic cancer. We previously found preliminary evidence for an association between CHIP and atherosclerotic cardiovascular disease, but the nature of this association was unclear. Methods We used whole-exome sequencing to detect the presence of CHIP in peripheral-blood cells and associated such presence with coronary heart disease using samples from four case–control studies that together enrolled 4726 participants with coronary heart disease and 3529 controls. To assess causality, we perturbed the function of Tet2, the second most commonly mutated gene linked to clonal hematopoiesis, in the hematopoietic cells of atherosclerosis-prone mice. Results In nested case–control analyses from two prospective cohorts, carriers of CHIP had a risk of c...
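The "nearly a doubling in the risk" headline corresponds to an odds ratio near 2 in case–control data. A toy calculation with entirely hypothetical 2×2 counts (not the study's numbers) shows the arithmetic:

```python
# Entirely hypothetical 2x2 counts, for illustration only (not the study's data):
# rows = CHIP carrier / non-carrier, columns = CHD case / control.
cases_carrier, controls_carrier = 80, 100
cases_noncarrier, controls_noncarrier = 1600, 4000

# odds ratio = (carrier odds of being a case) / (non-carrier odds)
odds_ratio = (cases_carrier / controls_carrier) / (cases_noncarrier / controls_noncarrier)
print(f"odds ratio = {odds_ratio:.2f}")       # -> 2.00 with these made-up counts
```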

Journal ArticleDOI
TL;DR: This review provides an objective and comprehensive account of the cellular uptake of NPs and the underlying parameters controlling the nano-cellular interactions, along with the available analytical techniques to follow and track these processes.
Abstract: Nanoscale materials are increasingly found in consumer goods, electronics, and pharmaceuticals. While these particles interact with the body in myriad ways, their beneficial and/or deleterious effects ultimately arise from interactions at the cellular and subcellular level. Nanoparticles (NPs) can modulate cell fate, induce or prevent mutations, initiate cell–cell communication, and modulate cell structure in a manner dictated largely by phenomena at the nano–bio interface. Recent advances in chemical synthesis have yielded new nanoscale materials with precisely defined biochemical features, and emerging analytical techniques have shed light on nuanced and context-dependent nano-bio interactions within cells. In this review, we provide an objective and comprehensive account of our current understanding of the cellular uptake of NPs and the underlying parameters controlling the nano-cellular interactions, along with the available analytical techniques to follow and track these processes.

Posted Content
TL;DR: Mixup as discussed by the authors trains a neural network on convex combinations of pairs of examples and their labels, and regularizes the neural network to favor simple linear behavior in between training examples, which improves the generalization of state-of-the-art neural network architectures.
Abstract: Large deep neural networks are powerful, but exhibit undesirable behaviors such as memorization and sensitivity to adversarial examples. In this work, we propose mixup, a simple learning principle to alleviate these issues. In essence, mixup trains a neural network on convex combinations of pairs of examples and their labels. By doing so, mixup regularizes the neural network to favor simple linear behavior in-between training examples. Our experiments on the ImageNet-2012, CIFAR-10, CIFAR-100, Google commands and UCI datasets show that mixup improves the generalization of state-of-the-art neural network architectures. We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.
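The training rule the abstract describes, convex combinations of example pairs and their labels with a Beta-distributed mixing weight, can be sketched in a few lines. The α value below is an arbitrary illustration; the paper treats it as a hyperparameter.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Convex combination of two examples and their (one-hot) labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)              # mixing weight lambda ~ Beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

# toy usage: two 4-"pixel" inputs with one-hot labels over 3 classes
x1, y1 = np.ones(4), np.array([1.0, 0.0, 0.0])
x2, y2 = np.zeros(4), np.array([0.0, 1.0, 0.0])
x, y = mixup(x1, y1, x2, y2, rng=np.random.default_rng(0))
assert np.isclose(y.sum(), 1.0)               # mixed label is still a distribution
```

Because labels are mixed with the same λ as inputs, the target stays a valid probability vector, which is what pushes the network toward linear behavior between training points.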

Journal ArticleDOI
23 Feb 2017-Nature
TL;DR: The observations reveal that at least seven planets with sizes and masses similar to those of Earth revolve around TRAPPIST-1, and the six inner planets form a near-resonant chain, such that their orbital periods are near-ratios of small integers.
Abstract: One aim of modern astronomy is to detect temperate, Earth-like exoplanets that are well suited for atmospheric characterization. Recently, three Earth-sized planets were detected that transit (that is, pass in front of) a star with a mass just eight per cent that of the Sun, located 12 parsecs away. The transiting configuration of these planets, combined with the Jupiter-like size of their host star—named TRAPPIST-1—makes possible in-depth studies of their atmospheric properties with present-day and future astronomical facilities. Here we report the results of a photometric monitoring campaign of that star from the ground and space. Our observations reveal that at least seven planets with sizes and masses similar to those of Earth revolve around TRAPPIST-1. The six inner planets form a near-resonant chain, such that their orbital periods (1.51, 2.42, 4.04, 6.06, 9.1 and 12.35 days) are near-ratios of small integers. This architecture suggests that the planets formed farther from the star and migrated inwards. Moreover, the seven planets have equilibrium temperatures low enough to make possible the presence of liquid water on their surfaces.
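The near-resonant chain can be verified from the quoted periods alone: each adjacent period ratio falls within about 2% of a ratio of small integers. The specific commensurabilities below (8:5, 5:3, 3:2, 3:2, 4:3) are the commonly cited ones, not taken from this abstract.

```python
# Adjacent-period ratios for the six inner TRAPPIST-1 planets, using the
# periods quoted in the abstract (days).
periods = [1.51, 2.42, 4.04, 6.06, 9.1, 12.35]
resonances = [(8, 5), (5, 3), (3, 2), (3, 2), (4, 3)]

for (p_in, p_out), (num, den) in zip(zip(periods, periods[1:]), resonances):
    ratio = p_out / p_in
    target = num / den
    # each ratio lies within ~2% of the small-integer commensurability
    assert abs(ratio - target) / target < 0.02
    print(f"{p_out}/{p_in} = {ratio:.3f}  ~  {num}:{den} = {target:.3f}")
```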

Journal ArticleDOI
21 Apr 2017-Science
TL;DR: This refined analysis has identified, among others, a previously unknown dendritic cell population that potently activates T cells and reclassify pDCs as the originally described “natural interferon-producing cells (IPCs)” with weaker T cell proliferation induction ability.
Abstract: INTRODUCTION Dendritic cells (DCs) and monocytes consist of multiple specialized subtypes that play a central role in pathogen sensing, phagocytosis, and antigen presentation. However, their identities and interrelationships are not fully understood, as these populations have historically been defined by a combination of morphology, physical properties, localization, functions, developmental origins, and expression of a restricted set of surface markers. RATIONALE To overcome this inherently biased strategy for cell identification, we performed single-cell RNA sequencing of ~2400 cells isolated from healthy blood donors and enriched for HLA-DR+ lineage− cells. This single-cell profiling strategy and unbiased genomic classification, together with follow-up profiling and functional and phenotypic characterization of prospectively isolated subsets, led us to identify and validate six DC subtypes and four monocyte subtypes, and thus revise the taxonomy of these cells. RESULTS Our study reveals: 1) A new DC subset, representing 2 to 3% of the DC populations across all 10 donors tested, characterized by the expression of AXL, SIGLEC1, and SIGLEC6 antigens, named AS DCs. The AS DC population further divides into two populations captured in the traditionally defined plasmacytoid DC (pDC) and CD1C+ conventional DC (cDC) gates. This split is further reflected through AS DC gene expression signatures spanning a spectrum between cDC-like and pDC-like gene sets. Although AS DCs share properties with pDCs, they more potently activate T cells. This discovery led us to reclassify pDCs as the originally described “natural interferon-producing cells (IPCs)” with weaker T cell proliferation induction ability. 2) A new subdivision within the CD1C+ DC subset: one defined by a major histocompatibility complex class II–like gene set and one by a CD14+ monocyte–like prominent gene set. These CD1C+ DC subsets, which can be enriched by combining CD1C with CD32B, CD36, and CD163 antigens, can both potently induce T cell proliferation. 3) The existence of a circulating and dividing cDC progenitor giving rise to CD1C+ and CLEC9A+ DCs through in vitro differentiation assays. This blood precursor is defined by the expression of CD100+CD34int and observed at a frequency of ~0.02% of the LIN− HLA-DR+ fraction. 4) Two additional monocyte populations: one expressing classical monocyte genes and cytotoxic genes, and the other with unknown functions. 5) Evidence for a relationship between blastic plasmacytoid DC neoplasia (BPDCN) cells and healthy DCs. CONCLUSION Our revised taxonomy will enable more accurate functional and developmental analyses as well as immune monitoring in health and disease. The discovery of AS DCs within the traditionally defined pDC population explains many of the cDC properties previously assigned to pDCs, highlighting the need to revisit the definition of pDCs. Furthermore, the discovery of blood cDC progenitors represents a new therapeutic target readily accessible in the bloodstream for manipulation, as well as a new source for better in vitro DC generation. Although the current results focus on DCs and monocytes, a similar strategy can be applied to build a comprehensive human immune cell atlas.

Journal ArticleDOI
09 Feb 2017-Cell
TL;DR: In this paper, the authors define pathways that are limiting for cancer progression and seek to understand the context specificity of metabolic preferences and liabilities in malignant cells, knowledge that can guide more effective targeting of metabolism to help patients.