
Showing papers by "Massachusetts Institute of Technology" published in 2011


Journal ArticleDOI
04 Mar 2011-Cell
TL;DR: Recognition of the widespread applicability of these concepts will increasingly affect the development of new means to treat human cancer.

51,099 citations


Journal ArticleDOI
TL;DR: The Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available, providing a unified solution for transcriptome reconstruction in any sample.
Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
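The de Bruijn graph construction at the heart of the abstract can be sketched in a few lines. This is a toy illustration, not Trinity's actual Inchworm/Chrysalis/Butterfly pipeline: the reads, the k-mer size, and the simple-path walk are all illustrative assumptions.

```python
# Toy de Bruijn-graph assembly: each k-mer contributes an edge from its
# (k-1)-mer prefix to its (k-1)-mer suffix; unambiguous paths are contigs.
from collections import defaultdict

def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers it extends to."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def walk_unambiguous(graph, start):
    """Follow the graph while each node has exactly one successor."""
    contig, node, seen = start, start, {start}
    while len(graph[node]) == 1:
        (nxt,) = graph[node]
        if nxt in seen:          # stop rather than loop on a cycle
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig

# Overlapping reads from a single toy transcript
reads = ["ATGGCGT", "GGCGTAC", "CGTACCA"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")   # reconstructs the full transcript
```

Real assemblers must additionally handle sequencing errors, repeated k-mers, and branching paths (which is where alternatively spliced isoforms appear as alternative routes through the graph).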

15,665 citations


Journal ArticleDOI
TL;DR: In this article, the authors argue that the flood of diverse genome-wide data calls for efficient and intuitive visualization tools that can scale to very large data sets and flexibly integrate multiple data types, including clinical data.
Abstract: Rapid improvements in sequencing and array-based platforms are resulting in a flood of diverse genome-wide data, including data from exome and whole-genome sequencing, epigenetic surveys, expression profiling of coding and noncoding RNAs, single nucleotide polymorphism (SNP) and copy number profiling, and functional assays. Analysis of these large, diverse data sets holds the promise of a more comprehensive understanding of the genome and its relation to human disease. Experienced and knowledgeable human review is an essential component of this process, complementing computational approaches. This calls for efficient and intuitive visualization tools able to scale to very large data sets and to flexibly integrate multiple data types, including clinical data. However, the sheer volume and scope of data pose a significant challenge to the development of such tools.

10,798 citations


Journal ArticleDOI
TL;DR: In this paper, the organization of networks in the human cerebrum was explored using resting-state functional connectivity MRI data from 1,000 subjects and a clustering approach was employed to identify and replicate networks of functionally coupled regions across the cerebral cortex.
Abstract: Information processing in the cerebral cortex involves interactions among distributed areas. Anatomical connectivity suggests that certain areas form local hierarchical relations such as within the visual system. Other connectivity patterns, particularly among association areas, suggest the presence of large-scale circuits without clear hierarchical relations. In this study the organization of networks in the human cerebrum was explored using resting-state functional connectivity MRI. Data from 1,000 subjects were registered using surface-based alignment. A clustering approach was employed to identify and replicate networks of functionally coupled regions across the cerebral cortex. The results revealed local networks confined to sensory and motor cortices as well as distributed networks of association regions. Within the sensory and motor cortices, functional connectivity followed topographic representations across adjacent areas. In association cortex, the connectivity patterns often showed abrupt transitions between network boundaries. Focused analyses were performed to better understand properties of network connectivity. A canonical sensory-motor pathway involving primary visual area, putative middle temporal area complex (MT+), lateral intraparietal area, and frontal eye field was analyzed to explore how interactions might arise within and between networks. Results showed that adjacent regions of the MT+ complex demonstrate differential connectivity consistent with a hierarchical pathway that spans networks. The functional connectivity of parietal and prefrontal association cortices was next explored. Distinct connectivity profiles of neighboring regions suggest they participate in distributed networks that, while showing evidence for interactions, are embedded within largely parallel, interdigitated circuits. 
We conclude by discussing the organization of these large-scale cerebral networks in relation to monkey anatomy and their potential evolutionary expansion in humans to support cognition.

6,284 citations


Journal ArticleDOI
Debra A. Bell1, Andrew Berchuck2, Michael J. Birrer3, Jeremy Chien1  +282 moreInstitutions (35)
30 Jun 2011-Nature
TL;DR: It is reported that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes.
Abstract: A catalogue of molecular aberrations that cause ovarian cancer is critical for developing and deploying therapies that will improve patients' lives. The Cancer Genome Atlas project has analysed messenger RNA expression, microRNA expression, promoter methylation and DNA copy number in 489 high-grade serous ovarian adenocarcinomas and the DNA sequences of exons from coding genes in 316 of these tumours. Here we report that high-grade serous ovarian cancer is characterized by TP53 mutations in almost all tumours (96%); low prevalence but statistically recurrent somatic mutations in nine further genes including NF1, BRCA1, BRCA2, RB1 and CDK12; 113 significant focal DNA copy number aberrations; and promoter methylation events involving 168 genes. Analyses delineated four ovarian cancer transcriptional subtypes, three microRNA subtypes, four promoter methylation subtypes and a transcriptional signature associated with survival duration, and shed new light on the impact that tumours with BRCA1/2 (BRCA1 or BRCA2) and CCNE1 aberrations have on survival. Pathway analyses suggested that homologous recombination is defective in about half of the tumours analysed, and that NOTCH and FOXM1 signalling are involved in serous ovarian cancer pathophysiology.

5,878 citations


Book
02 Sep 2011
TL;DR: This research addresses the need for software measures in object-oriented (OO) design through the development and implementation of a new suite of metrics for OO design, and suggests ways in which managers may use these metrics for process improvement.
Abstract: Given the central role that software development plays in the delivery and application of information technology, managers are increasingly focusing on process improvement in the software development area. This demand has spurred the provision of a number of new and/or improved approaches to software development, with perhaps the most prominent being object-orientation (OO). In addition, the focus on process improvement has increased the demand for software measures, or metrics with which to manage the process. The need for such metrics is particularly acute when an organization is adopting a new technology for which established practices have yet to be developed. This research addresses these needs through the development and implementation of a new suite of metrics for OO design. Metrics developed in previous research, while contributing to the field's understanding of software development processes, have generally been subject to serious criticisms, including the lack of a theoretical base. Following Wand and Weber (1989), the theoretical base chosen for the metrics was the ontology of Bunge (1977). Six design metrics are developed, and then analytically evaluated against Weyuker's (1988) proposed set of measurement principles. An automated data collection tool was then developed and implemented to collect an empirical sample of these metrics at two field sites in order to demonstrate their feasibility and suggest ways in which managers may use these metrics for process improvement.

5,476 citations


Journal ArticleDOI
TL;DR: The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, make gem5 a valuable full-system simulation tool.
Abstract: The gem5 simulation infrastructure is the merger of the best aspects of the M5 [4] and GEMS [9] simulators. M5 provides a highly configurable simulation framework, multiple ISAs, and diverse CPU models. GEMS complements these features with a detailed and flexible memory system, including support for multiple cache coherence protocols and interconnect models. Currently, gem5 supports most commercial ISAs (ARM, ALPHA, MIPS, Power, SPARC, and x86), including booting Linux on three of them (ARM, ALPHA, and x86). The project is the result of the combined efforts of many academic and industrial institutions, including AMD, ARM, HP, MIPS, Princeton, MIT, and the Universities of Michigan, Texas, and Wisconsin. Over the past ten years, M5 and GEMS have been used in hundreds of publications and have been downloaded tens of thousands of times. The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, make gem5 a valuable full-system simulation tool.

4,039 citations


Journal ArticleDOI
25 Mar 2011-Science
TL;DR: It is suggested that metastasis can be portrayed as a two-phase process: the first phase involves the physical translocation of a cancer cell to a distant organ, whereas the second encompasses the ability of the cancer cell to develop into a metastatic lesion at that distant site.
Abstract: Metastasis causes most cancer deaths, yet this process remains one of the most enigmatic aspects of the disease. Building on new mechanistic insights emerging from recent research, we offer our perspective on the metastatic process and reflect on possible paths of future exploration. We suggest that metastasis can be portrayed as a two-phase process: The first phase involves the physical translocation of a cancer cell to a distant organ, whereas the second encompasses the ability of the cancer cell to develop into a metastatic lesion at that distant site. Although much remains to be learned about the second phase, we feel that an understanding of the first phase is now within sight, due in part to a better understanding of how cancer cell behavior can be modified by a cell-biological program called the epithelial-to-mesenchymal transition.

3,993 citations


Journal ArticleDOI
09 Dec 2011-Science
TL;DR: The high activity of BSCF was predicted from a design principle established by systematic examination of more than 10 transition metal oxides, which showed that the intrinsic OER activity exhibits a volcano-shaped dependence on the occupancy of the 3d electron with an eg symmetry of surface transition metal cations in an oxide.
Abstract: The efficiency of many energy storage technologies, such as rechargeable metal-air batteries and hydrogen production from water splitting, is limited by the slow kinetics of the oxygen evolution reaction (OER). We found that Ba0.5Sr0.5Co0.8Fe0.2O3−δ (BSCF) catalyzes the OER with intrinsic activity that is at least an order of magnitude higher than that of the state-of-the-art iridium oxide catalyst in alkaline media. The high activity of BSCF was predicted from a design principle established by systematic examination of more than 10 transition metal oxides, which showed that the intrinsic OER activity exhibits a volcano-shaped dependence on the occupancy of the 3d electron with an eg symmetry of surface transition metal cations in an oxide. The peak OER activity was predicted to be at an eg occupancy close to unity, with high covalency of transition metal–oxygen bonds.

3,876 citations


Journal ArticleDOI
TL;DR: Photodynamic therapy (PDT) is a clinically approved, minimally invasive therapeutic procedure that can exert a selective cytotoxic activity toward malignant cells, as discussed by the authors; it can prolong survival in patients with inoperable cancers and significantly improve quality of life.
Abstract: Photodynamic therapy (PDT) is a clinically approved, minimally invasive therapeutic procedure that can exert a selective cytotoxic activity toward malignant cells. The procedure involves administration of a photosensitizing agent followed by irradiation at a wavelength corresponding to an absorbance band of the sensitizer. In the presence of oxygen, a series of events lead to direct tumor cell death, damage to the microvasculature, and induction of a local inflammatory reaction. Clinical studies revealed that PDT can be curative, particularly in early stage tumors. It can prolong survival in patients with inoperable cancers and significantly improve quality of life. Minimal normal tissue toxicity, negligible systemic effects, greatly reduced long-term morbidity, lack of intrinsic or acquired resistance mechanisms, and excellent cosmetic as well as organ function-sparing effects of this treatment make it a valuable therapeutic option for combination treatments. With a number of recent technological improvements, PDT has the potential to become integrated into the mainstream of cancer treatment. CA Cancer J Clin 2011;61:250-281.

3,770 citations


Journal ArticleDOI
TL;DR: Mammalian TOR complex 1 (mTORC1) and mTORC2 exert their actions by regulating other important kinases, such as S6 kinase (S6K) and Akt.
Abstract: In all eukaryotes, the target of rapamycin (TOR) signalling pathway couples energy and nutrient abundance to the execution of cell growth and division, owing to the ability of TOR protein kinase to simultaneously sense energy, nutrients and stress and, in metazoans, growth factors. Mammalian TOR complex 1 (mTORC1) and mTORC2 exert their actions by regulating other important kinases, such as S6 kinase (S6K) and Akt. In the past few years, a significant advance in our understanding of the regulation and functions of mTOR has revealed the crucial involvement of this signalling pathway in the onset and progression of diabetes, cancer and ageing.

Book
25 Aug 2011
TL;DR: The authors analyzed the format of typical bonus contracts, providing a more complete characterization of their accounting incentive effects than earlier studies, and found that accrual policies of managers are related to income-reporting incentives of their bonus contracts.
Abstract: Studies examining managerial accounting decisions postulate that executives rewarded by earnings-based bonuses select accounting procedures that increase their compensation. The empirical results of these studies are conflicting. This paper analyzes the format of typical bonus contracts, providing a more complete characterization of their accounting incentive effects than earlier studies. The test results suggest that (1) accrual policies of managers are related to income-reporting incentives of their bonus contracts, and (2) changes in accounting procedures by managers are associated with adoption or modification of their bonus plan.

Proceedings ArticleDOI
06 Nov 2011
TL;DR: This paper uses the largest action video database to-date with 51 action categories, which in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube, to evaluate the performance of two representative computer vision systems for action recognition and explore the robustness of these methods under various conditions.
Abstract: With nearly one billion online videos viewed everyday, an emerging new frontier in computer vision research is recognition and search in video. While much effort has been devoted to the collection and annotation of large scalable static image datasets containing thousands of image categories, human action datasets lag far behind. Current action recognition databases contain on the order of ten different action categories collected under fairly controlled conditions. State-of-the-art performance on these datasets is now near ceiling and thus there is a need for the design and creation of new benchmarks. To address this issue we collected the largest action video database to-date with 51 action categories, which in total contain around 7,000 manually annotated clips extracted from a variety of sources ranging from digitized movies to YouTube. We use this database to evaluate the performance of two representative computer vision systems for action recognition and explore the robustness of these methods under various conditions such as camera motion, viewpoint, video quality and occlusion.

Journal ArticleDOI
TL;DR: This paper extends the authors' previous work on a new speaker representation for speaker verification: a low-dimensional speaker- and channel-dependent space defined using a simple factor analysis, named the total variability space because it models both speaker and channel variabilities.
Abstract: This paper presents an extension of our previous work which proposes a new speaker representation for speaker verification. In this modeling, a new low-dimensional speaker- and channel-dependent space is defined using a simple factor analysis. This space is named the total variability space because it models both speaker and channel variabilities. Two speaker verification systems are proposed which use this new representation. The first system is a support vector machine-based system that uses the cosine kernel to estimate the similarity between the input data. The second system directly uses the cosine similarity as the final decision score. We tested three channel compensation techniques in the total variability space, which are within-class covariance normalization (WCCN), linear discriminate analysis (LDA), and nuisance attribute projection (NAP). We found that the best results are obtained when LDA is followed by WCCN. We achieved an equal error rate (EER) of 1.12% and MinDCF of 0.0094 using the cosine distance scoring on the male English trials of the core condition of the NIST 2008 Speaker Recognition Evaluation dataset. We also obtained 4% absolute EER improvement for both-gender trials on the 10 s-10 s condition compared to the classical joint factor analysis scoring.
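The second system described in the abstract reduces verification to a cosine similarity between two total-variability vectors ("i-vectors") compared against a threshold. The sketch below shows only that final scoring step; the i-vectors and the threshold are invented toy values, and extracting real i-vectors (and the WCCN/LDA compensation) is well beyond this snippet.

```python
# Cosine-distance scoring between an enrolled and a test i-vector,
# as in the paper's second (non-SVM) backend. Vectors are toy data.
import math

def cosine_score(w_target, w_test):
    """Cosine similarity between two i-vectors."""
    dot = sum(a * b for a, b in zip(w_target, w_test))
    norm = (math.sqrt(sum(a * a for a in w_target))
            * math.sqrt(sum(b * b for b in w_test)))
    return dot / norm

def verify(w_target, w_test, threshold=0.5):
    """Accept the trial if the score clears the decision threshold."""
    return cosine_score(w_target, w_test) >= threshold

same_speaker = cosine_score([1.0, 0.9, 0.1], [0.9, 1.0, 0.2])
diff_speaker = cosine_score([1.0, 0.9, 0.1], [-0.8, 0.1, 1.0])
```

Because the score is a direct function of the two vectors, no target-speaker model needs to be trained, which is what makes this backend so much simpler than the SVM system.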

Book
08 Sep 2011
TL;DR: A two-dimensional classificatory scheme highlighting ten different approaches to the measurement of business performance in strategy research is developed in this article, where the first dimension concerns the use of financial versus broader operational criteria, while the second dimension focuses on two alternate data sources (primary versus secondary).
Abstract: A two-dimensional classificatory scheme highlighting ten different approaches to the measurement of business performance in strategy research is developed. The first dimension concerns the use of financial versus broader operational criteria, while the second focuses on two alternate data sources (primary versus secondary). The scheme permits the classification of an exhaustive coverage of measurement approaches and is useful for discussing their relative merits and demerits. Implications for operationalizing business performance in future strategy research are discussed.

Journal ArticleDOI
TL;DR: In this paper, the authors studied the asymptotic behavior of the cost of the solution returned by stochastic sampling-based path planning algorithms as the number of samples increases.
Abstract: During the last decade, sampling-based path planning algorithms, such as probabilistic roadmaps (PRM) and rapidly exploring random trees (RRT), have been shown to work well in practice and possess theoretical guarantees such as probabilistic completeness. However, little effort has been devoted to the formal analysis of the quality of the solution returned by such algorithms, e.g. as a function of the number of samples. The purpose of this paper is to fill this gap, by rigorously analyzing the asymptotic behavior of the cost of the solution returned by stochastic sampling-based algorithms as the number of samples increases. A number of negative results are provided, characterizing existing algorithms, e.g. showing that, under mild technical conditions, the cost of the solution returned by broadly used sampling-based algorithms converges almost surely to a non-optimal value. The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e. such that the cost of the returned solution converges almost surely to the optimum. Moreover, it is shown that the computational complexity of the new algorithms is within a constant factor of that of their probabilistically complete (but not asymptotically optimal) counterparts. The analysis in this paper hinges on novel connections between stochastic sampling-based path planning algorithms and the theory of random geometric graphs.
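The two ingredients that make RRT* asymptotically optimal, choosing the lowest-cost parent among nearby nodes and rewiring neighbors through the new node, can be sketched in an obstacle-free unit square. The step size, connection radius, and iteration count below are arbitrary choices, and the sketch leaves descendant costs stale after a rewire, which a full implementation would propagate.

```python
# Minimal obstacle-free 2D RRT* sketch: sample, steer, connect to the
# cheapest nearby parent, then rewire neighbors through the new node.
import math, random

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def rrt_star(start, n_iter=300, step=0.15, radius=0.3, seed=1):
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    cost = {0: 0.0}
    for _ in range(n_iter):
        x = (rng.random(), rng.random())                    # sample free space
        near_i = min(range(len(nodes)), key=lambda i: dist(nodes[i], x))
        d = dist(nodes[near_i], x)
        if d > step:                                         # steer toward sample
            t = step / d
            x = (nodes[near_i][0] + t * (x[0] - nodes[near_i][0]),
                 nodes[near_i][1] + t * (x[1] - nodes[near_i][1]))
        # choose the parent minimizing cost-to-come (RRT* vs plain RRT)
        near = [i for i in range(len(nodes)) if dist(nodes[i], x) <= radius]
        best = min(near, key=lambda i: cost[i] + dist(nodes[i], x))
        new_i = len(nodes)
        nodes.append(x)
        parent[new_i] = best
        cost[new_i] = cost[best] + dist(nodes[best], x)
        for i in near:                                       # rewire step
            c_through = cost[new_i] + dist(x, nodes[i])
            if c_through < cost[i]:                          # shorter via new node
                parent[i] = new_i                            # (descendant costs stale)
                cost[i] = c_through
    return nodes, parent, cost

nodes, parent, cost = rrt_star(start=(0.0, 0.0))
```

Dropping the choose-parent and rewire steps recovers plain RRT, whose returned cost the paper shows converges almost surely to a non-optimal value.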

Journal ArticleDOI
14 Oct 2011-Cell
TL;DR: The invasion-metastasis cascade is a multistep cell-biological process that involves dissemination of cancer cells to anatomically distant organ sites and their subsequent adaptation to foreign tissue microenvironments as mentioned in this paper.

Journal ArticleDOI
TL;DR: It is found that lincRNA expression is strikingly tissue-specific compared with coding genes, and that lincRNAs are typically coexpressed with their neighboring genes, albeit to an extent similar to that of pairs of neighboring protein-coding genes.
Abstract: Large intergenic noncoding RNAs (lincRNAs) are emerging as key regulators of diverse cellular processes. Determining the function of individual lincRNAs remains a challenge. Recent advances in RNA sequencing (RNA-seq) and computational methods allow for an unprecedented analysis of such transcripts. Here, we present an integrative approach to define a reference catalog of >8000 human lincRNAs. Our catalog unifies previously existing annotation sources with transcripts we assembled from RNA-seq data collected from ~4 billion RNA-seq reads across 24 tissues and cell types. We characterize each lincRNA by a panorama of >30 properties, including sequence, structural, transcriptional, and orthology features. We found that lincRNA expression is strikingly tissue-specific compared with coding genes, and that lincRNAs are typically coexpressed with their neighboring genes, albeit to an extent similar to that of pairs of neighboring protein-coding genes. We distinguish an additional subset of transcripts that have high evolutionary conservation but may include short ORFs and may serve as either lincRNAs or small peptides. Our integrated, comprehensive, yet conservative reference catalog of human lincRNAs reveals the global properties of lincRNAs and will facilitate experimental studies and further functional classification of these genes.
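One simple way to quantify the tissue specificity the catalog reports is an entropy-based score over a gene's expression across tissues: broad expression scores near 0, single-tissue expression near 1. This score is an illustrative stand-in, not the metric the paper actually uses, and the expression vectors are invented.

```python
# Entropy-based tissue-specificity score: 1 - H(p)/log(T) for a gene's
# expression distribution p over T tissues.
import math

def specificity(expr):
    total = sum(expr)
    probs = [e / total for e in expr if e > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(len(expr))

broad = specificity([10, 10, 10, 10])   # equally expressed in 4 tissues
narrow = specificity([40, 0, 0, 0])     # expressed in a single tissue
```

Under a score like this, the abstract's finding amounts to lincRNAs clustering at markedly higher values than protein-coding genes.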

Journal ArticleDOI
TL;DR: Quantum metrology is the use of quantum techniques such as entanglement to yield higher statistical precision than purely classical approaches; as discussed by the authors, the central limit theorem implies that classical averaging reduces the error only in proportion to the square root of the number of repetitions.
Abstract: The statistical error in any estimation can be reduced by repeating the measurement and averaging the results. The central limit theorem implies that the reduction is proportional to the square root of the number of repetitions. Quantum metrology is the use of quantum techniques such as entanglement to yield higher statistical precision than purely classical approaches. In this Review, we analyse some of the most promising recent developments of this research field and point out some of the new experiments. We then look at one of the major new trends of the field: analyses of the effects of noise and experimental imperfections.
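The classical scaling the review starts from is easy to see numerically: averaging N independent measurements shrinks the statistical error like 1/sqrt(N), so quadrupling N halves it. The Gaussian noise model, sample sizes, and trial count below are arbitrary choices for the demonstration; quantum strategies beat this limit, approaching a 1/N (Heisenberg) scaling.

```python
# Empirical check of the 1/sqrt(N) ("standard quantum limit") scaling:
# the spread of the mean of 400 measurements should be about half the
# spread of the mean of 100 measurements.
import random, statistics

def spread_of_mean(n, trials=2000, seed=7):
    """Empirical standard deviation of the mean of n noisy measurements."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

err_100 = spread_of_mean(100)
err_400 = spread_of_mean(400)
ratio = err_100 / err_400   # should be close to sqrt(400/100) = 2
```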

Journal ArticleDOI
12 May 2011-Nature
TL;DR: In this article, the authors developed analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system's entire dynamics.
Abstract: The ultimate proof of our understanding of natural or technological systems is reflected in our ability to control them. Although control theory offers mathematical tools for steering engineered and natural systems towards a desired state, a framework to control complex self-organized systems is lacking. Here we develop analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system's entire dynamics. We apply these tools to several real networks, finding that the number of driver nodes is determined mainly by the network's degree distribution. We show that sparse inhomogeneous networks, which emerge in many real complex systems, are the most difficult to control, but that dense and homogeneous networks can be controlled using a few driver nodes. Counterintuitively, we find that in both model and real systems the driver nodes tend to avoid the high-degree nodes.
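The paper's structural-controllability recipe can be sketched concretely: the minimum number of driver nodes equals N minus the size of a maximum matching of the directed edges (with at least one driver always required). The augmenting-path matcher and the toy graphs below are illustrative, not the paper's implementation.

```python
# Minimum driver nodes via maximum matching: match each edge's
# out-endpoint (left side) to its in-endpoint (right side); unmatched
# nodes on the right must be driven directly.
def max_matching(n, edges):
    """Maximum bipartite matching between out-endpoints and in-endpoints."""
    succ = {u: [] for u in range(n)}
    for u, v in edges:
        succ[u].append(v)
    match_in = {}                      # in-endpoint -> matched out-endpoint

    def augment(u, visited):
        for v in succ[u]:
            if v in visited:
                continue
            visited.add(v)
            if v not in match_in or augment(match_in[v], visited):
                match_in[v] = u
                return True
        return False

    return sum(augment(u, set()) for u in range(n))

def driver_nodes(n, edges):
    return max(n - max_matching(n, edges), 1)

chain = [(0, 1), (1, 2), (2, 3)]       # directed path: one driver suffices
star = [(0, 1), (0, 2), (0, 3)]        # one hub driving 3 leaves
```

The star example shows the counterintuitive point in the abstract: the high-degree hub can be matched to only one leaf, so every other leaf needs its own driver.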

Journal ArticleDOI
05 May 2011-Nature
TL;DR: This study presents a general framework for deciphering cis-regulatory connections and their roles in disease, and maps nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions.
Abstract: Chromatin profiling has emerged as a powerful means of genome annotation and detection of regulatory activity. The approach is especially well suited to the characterization of non-coding portions of the genome, which critically contribute to cellular phenotypes yet remain largely uncharted. Here we map nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions. Focusing on cell-type-specific patterns of promoters and enhancers, we define multicell activity profiles for chromatin state, gene expression, regulatory motif enrichment and regulator expression. We use correlations between these profiles to link enhancers to putative target genes, and predict the cell-type-specific activators and repressors that modulate them. The resulting annotations and regulatory predictions have implications for the interpretation of genome-wide association studies. Top-scoring disease single nucleotide polymorphisms are frequently positioned within enhancer elements specifically active in relevant cell types, and in some cases affect a motif instance for a predicted regulator, thus suggesting a mechanism for the association. Our study presents a general framework for deciphering cis-regulatory connections and their roles in disease.
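The enhancer-to-gene linking step described above rests on correlating activity profiles across cell types. The sketch below shows that idea in miniature with a Pearson correlation; the profiles, gene names, and "pick the best-correlated neighbor" rule are invented toy assumptions, not the paper's actual statistics.

```python
# Link an enhancer to a putative target gene by correlating its activity
# profile across cell types with nearby genes' expression profiles.
import math

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var = math.sqrt(sum((x - ma) ** 2 for x in a)
                    * sum((y - mb) ** 2 for y in b))
    return cov / var

enhancer = [9, 1, 8, 0, 7]                     # active in cell types 1, 3, 5
genes = {"GENE_A": [10, 2, 9, 1, 8],           # co-active across cell types
         "GENE_B": [1, 9, 2, 8, 1]}            # anti-correlated
target = max(genes, key=lambda g: pearson(enhancer, genes[g]))
```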

Book
14 Sep 2011
TL;DR: The paper is intended to raise awareness of the far-reaching implications of the architecture of the product, to create a vocabulary for discussing and addressing the decisions and issues that are linked to product architecture, and to identify and discuss specific trade-offs associated with the choice of a product architecture.
Abstract: Product architecture is the scheme by which the function of a product is allocated to physical components. This paper further defines product architecture, provides a typology of product architectures, and articulates the potential linkages between the architecture of the product and five areas of managerial importance: (1) product change; (2) product variety; (3) component standardization; (4) product performance; and (5) product development management. The paper is conceptual and foundational, synthesizing fragments from several different disciplines, including software engineering, design theory, operations management and product development management. The paper is intended to raise awareness of the far-reaching implications of the architecture of the product, to create a vocabulary for discussing and addressing the decisions and issues that are linked to product architecture, and to identify and discuss specific trade-offs associated with the choice of a product architecture.

Book
31 Jan 2011
TL;DR: Containing a wealth of figures and exercises, this well-known textbook is ideal for courses on the subject, and will interest beginning graduate students and researchers in physics, computer science, mathematics, and electrical engineering.
Abstract: One of the most cited books in physics of all time, Quantum Computation and Quantum Information remains the best textbook in this exciting field of science. This 10th anniversary edition includes an introduction from the authors setting the work in context. This comprehensive textbook describes such remarkable effects as fast quantum algorithms, quantum teleportation, quantum cryptography and quantum error-correction. Quantum mechanics and computer science are introduced before moving on to describe what a quantum computer is, how it can be used to solve problems faster than 'classical' computers and its real-world implementation. It concludes with an in-depth treatment of quantum information. Containing a wealth of figures and exercises, this well-known textbook is ideal for courses on the subject, and will interest beginning graduate students and researchers in physics, computer science, mathematics, and electrical engineering.


Proceedings ArticleDOI
20 Jun 2011
TL;DR: A comparison study using a set of popular datasets, evaluated based on a number of criteria including: relative data bias, cross-dataset generalization, effects of closed-world assumption, and sample value is presented.
Abstract: Datasets are an integral part of contemporary object recognition research. They have been the chief reason for the considerable progress in the field, not just as source of large amounts of training data, but also as means of measuring and comparing performance of competing algorithms. At the same time, datasets have often been blamed for narrowing the focus of object recognition research, reducing it to a single benchmark performance number. Indeed, some datasets, that started out as data capture efforts aimed at representing the visual world, have become closed worlds unto themselves (e.g. the Corel world, the Caltech-101 world, the PASCAL VOC world). With the focus on beating the latest benchmark numbers on the latest dataset, have we perhaps lost sight of the original purpose? The goal of this paper is to take stock of the current state of recognition datasets. We present a comparison study using a set of popular datasets, evaluated based on a number of criteria including: relative data bias, cross-dataset generalization, effects of closed-world assumption, and sample value. The experimental results, some rather surprising, suggest directions that can improve dataset collection as well as algorithm evaluation protocols. But more broadly, the hope is to stimulate discussion in the community regarding this very important, but largely neglected issue.

Journal ArticleDOI
16 Dec 2011-Science
TL;DR: A measure of dependence for two-variable relationships: the maximal information coefficient (MIC), which captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination of the data relative to the regression function.
Abstract: Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R2) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.
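The core idea behind MIC, maximizing a normalized mutual information over grids of different resolutions, can be sketched in a few lines. The toy version below uses fixed quantile bins rather than the dynamic-programming partition search of the real MINE algorithm, so its scores only approximate MIC; the function names and the `max_cells` default are illustrative assumptions (the paper's B(n) = n^0.6 bound on grid size is used).

```python
import math
from itertools import product

def quantile_bins(values, k):
    """Assign each value to one of k roughly equal-frequency bins."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(values), k - 1)
    return bins

def mutual_information(bx, by, nx, ny):
    """Mutual information (in bits) of the joint histogram of two binnings."""
    n = len(bx)
    joint = [[0] * ny for _ in range(nx)]
    for a, b in zip(bx, by):
        joint[a][b] += 1
    px = [sum(row) / n for row in joint]
    py = [sum(joint[i][j] for i in range(nx)) / n for j in range(ny)]
    mi = 0.0
    for i, j in product(range(nx), range(ny)):
        pij = joint[i][j] / n
        if pij > 0:
            mi += pij * math.log2(pij / (px[i] * py[j]))
    return mi

def toy_mic(x, y, max_cells=None):
    """Max over grid sizes of MI / log2(min(nx, ny)) -- a crude stand-in for MIC."""
    n = len(x)
    if max_cells is None:
        max_cells = int(n ** 0.6)  # B(n) = n^0.6 grid-size bound from the paper
    best = 0.0
    for nx in range(2, max_cells + 1):
        for ny in range(2, max_cells // nx + 1):
            mi = mutual_information(quantile_bins(x, nx), quantile_bins(y, ny),
                                    nx, ny)
            best = max(best, mi / math.log2(min(nx, ny)))
    return best
```

On a noiseless functional relationship such as y = x² the score reaches 1 (the quantile bins of x and y align perfectly), while for independent variables it stays near 0, matching the behavior the abstract describes for functional and absent associations.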

Book
01 Jan 2011
TL;DR: In Alone Together, MIT technology and society professor Sherry Turkle explores the power of our new tools and toys to dramatically alter our social lives, arguing that despite the hand-waving of today's self-described prophets of the future, it will be the next generation who chart the path between isolation and connectivity.
Abstract: Consider Facebook: it's human contact, only easier to engage with and easier to avoid. Developing technology promises closeness. Sometimes it delivers, but much of our modern life leaves us less connected with people and more connected to simulations of them. In Alone Together, MIT technology and society professor Sherry Turkle explores the power of our new tools and toys to dramatically alter our social lives. It's a nuanced exploration of what we are looking for, and sacrificing, in a world of electronic companions and social networking tools, and an argument that, despite the hand-waving of today's self-described prophets of the future, it will be the next generation who will chart the path between isolation and connectivity.

Journal ArticleDOI
14 Jan 2011-Science
TL;DR: This work surveys the vast terrain of 'culturomics,' focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000, and shows how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, and the pursuit of fame.
Abstract: We constructed a corpus of digitized texts containing about 4% of all books ever printed. Analysis of this corpus enables us to investigate cultural trends quantitatively. We survey the vast terrain of 'culturomics,' focusing on linguistic and cultural phenomena that were reflected in the English language between 1800 and 2000. We show how this approach can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology. Culturomics extends the boundaries of rigorous quantitative inquiry to a wide array of new phenomena spanning the social sciences and the humanities.

Journal ArticleDOI
TL;DR: In this paper, the authors provide a detailed accounting of the biosynthetic requirements to construct a new cell and illustrate the importance of glycolysis in providing carbons to generate biomass.
Abstract: Warburg's observation that cancer cells exhibit a high rate of glycolysis even in the presence of oxygen (aerobic glycolysis) sparked debate over the role of glycolysis in normal and cancer cells. Although it has been established that defects in mitochondrial respiration are not the cause of cancer or aerobic glycolysis, the advantages of enhanced glycolysis in cancer remain controversial. Many cells ranging from microbes to lymphocytes use aerobic glycolysis during rapid proliferation, which suggests it may play a fundamental role in supporting cell growth. Here, we review how glycolysis contributes to the metabolic processes of dividing cells. We provide a detailed accounting of the biosynthetic requirements to construct a new cell and illustrate the importance of glycolysis in providing carbons to generate biomass. We argue that the major function of aerobic glycolysis is to maintain high levels of glycolytic intermediates to support anabolic reactions in cells, thus providing an explanation for why proliferating cells favor this metabolic program.

Journal ArticleDOI
26 Aug 2011-Science
TL;DR: In this article, the authors analyzed whole-exome sequencing data from 74 tumor-normal pairs and found that at least 30% of cases harbored mutations in genes that regulate squamous differentiation (for example, NOTCH1, IRF6, and TP63), implicating its dysregulation as a major driver of HNSCC carcinogenesis.
Abstract: Head and neck squamous cell carcinoma (HNSCC) is a common, morbid, and frequently lethal malignancy. To uncover its mutational spectrum, we analyzed whole-exome sequencing data from 74 tumor-normal pairs. The majority exhibited a mutational profile consistent with tobacco exposure; human papillomavirus was detectable by sequencing DNA from infected tumors. In addition to identifying previously known HNSCC genes (TP53, CDKN2A, PTEN, PIK3CA, and HRAS), our analysis revealed many genes not previously implicated in this malignancy. At least 30% of cases harbored mutations in genes that regulate squamous differentiation (for example, NOTCH1, IRF6, and TP63), implicating its dysregulation as a major driver of HNSCC carcinogenesis. More generally, the results indicate the ability of large-scale sequencing to reveal fundamental tumorigenic mechanisms.