Author

Lauren Linton

Bio: Lauren Linton is an academic researcher from the Massachusetts Institute of Technology. The author has contributed to research on the topics of the human genome and SNP arrays. The author has an h-index of 6 and has co-authored 6 publications receiving 25,355 citations.

Papers
Journal Article
Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum and 245 more authors (29 institutions)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
15 Feb 2001-Nature
TL;DR: This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
Abstract: We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exons (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
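The density figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the variable names are hypothetical, and the implied amount of sequence and the exonic fraction are inferences from the quoted numbers, not values stated in the abstract.

```python
# Figures quoted in the abstract.
snp_count = 1_420_000      # 1.42 million SNPs in the map
spacing_bp = 1_900         # one SNP every 1.9 kb of available sequence
exonic_snps = 60_000       # estimated SNPs falling within exons

# Inference: total sequence implied by the count and the average spacing.
implied_sequence_bp = snp_count * spacing_bp

# Inference: share of mapped SNPs estimated to lie in exons.
exon_fraction = exonic_snps / snp_count

print(f"Implied available sequence: {implied_sequence_bp / 1e9:.2f} Gb")
print(f"Exonic share of SNPs: {exon_fraction:.1%}")
```

Under these assumptions the map corresponds to roughly 2.7 Gb of available sequence, with about 4% of SNPs in exons.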

2,908 citations

Journal Article
28 Sep 2000-Nature
TL;DR: A simple but powerful method, called reduced representation shotgun (RRS) sequencing, for creating SNP maps, which facilitates the rapid, inexpensive construction of SNP maps in biomedically and agriculturally important species.
Abstract: Most genomic variation is attributable to single nucleotide polymorphisms (SNPs), which therefore offer the highest resolution for tracking disease genes and population history. It has been proposed that a dense map of 30,000-500,000 SNPs can be used to scan the human genome for haplotypes associated with common diseases. Here we describe a simple but powerful method, called reduced representation shotgun (RRS) sequencing, for creating SNP maps. RRS re-samples specific subsets of the genome from several individuals, and compares the resulting sequences using a highly accurate SNP detection algorithm. The method can be extended by alignment to available genome sequence, increasing the yield of SNPs and providing map positions. These methods are being used by The SNP Consortium, an international collaboration of academic centres, pharmaceutical companies and a private foundation, to discover and release at least 300,000 human SNPs. We have discovered 47,172 human SNPs by RRS, and in total the Consortium has identified 148,459 SNPs. More broadly, RRS facilitates the rapid, inexpensive construction of SNP maps in biomedically and agriculturally important species. SNPs discovered by RRS also offer unique advantages for large-scale genotyping.

749 citations

Journal Article
TL;DR: The complete genome sequence of an acetate-utilizing methanogen, Methanosarcina acetivorans C2A, is reported, which indicates the likelihood of undiscovered natural energy sources for methanogenesis, whereas the presence of single-subunit carbon monoxide dehydrogenases raises the possibility of nonmethanogenic growth.
Abstract: The Archaea remain the most poorly understood domain of life despite their importance to the biosphere. Methanogenesis, which plays a pivotal role in the global carbon cycle, is unique to the Archaea. Each year, an estimated 900 million metric tons of methane are biologically produced, representing the major global source for this greenhouse gas and contributing significantly to global warming (Schlesinger 1997). Methanogenesis is critical to the waste-treatment industry and biologically produced methane also represents an important alternative fuel source. At least two-thirds of the methane in nature is derived from acetate, although only two genera of methanogens are known to be capable of utilizing this substrate. We report here the first complete genome sequence of an acetate-utilizing (acetoclastic) methanogen, Methanosarcina acetivorans C2A. The Methanosarcineae are metabolically and physiologically the most versatile methanogens. Only Methanosarcina species possess all three known pathways for methanogenesis (Fig. 1) and are capable of utilizing no less than nine methanogenic substrates, including acetate. In contrast, all other orders of methanogens possess a single pathway for methanogenesis, and many utilize no more than two substrates. Among methanogens, the Methanosarcineae also display extensive environmental diversity. Individual species of Methanosarcina have been found in freshwater and marine sediments, decaying leaves and garden soils, oil wells, sewage and animal waste digesters and lagoons, thermophilic digesters, feces of herbivorous animals, and the rumens of ungulates (Zinder 1993).
Figure 1. Three pathways for methanogenesis. Methanogenesis is a form of anaerobic respiration using a variety of one-carbon (C-1) compounds or acetic acid as a terminal electron acceptor. All three pathways converge on the reduction of methyl-CoM to methane (CH ...
The Methanosarcineae are unique among the Archaea in forming complex multicellular structures during different phases of growth and in response to environmental change (Fig. 2). Within the Methanosarcineae, a number of distinct morphological forms have been characterized, including single cells with and without a cell envelope, as well as multicellular packets and lamina (Macario and Conway de Macario 2001). Packets and lamina display internal morphological heterogeneity, suggesting the possibility of cellular differentiation. Moreover, it has been suggested that cells within lamina may display differential production of extracellular material, a potential form of cellular specialization (Macario and Conway de Macario 2001). The formation of multicellular structures has been proposed to act as an adaptation to stress and likely plays a role in the ability of Methanosarcina species to colonize diverse environments.
Figure 2. Different morphological forms of Methanosarcina acetivorans. Thin-section electron micrographs showing M. acetivorans growing as both single cells (center of micrograph) and within multicellular aggregates (top left, bottom right). Cells were harvested ...
Significantly, powerful methods for genetic analysis exist for Methanosarcina species. These tools include plasmid shuttle vectors (Metcalf et al. 1997), very high efficiency transformation (Metcalf et al. 1997), random in vivo transposon mutagenesis (Zhang et al. 2000), directed mutagenesis of specific genes (Zhang et al. 2000), multiple selectable markers (Boccazzi et al. 2000), reporter gene fusions (M. Pritchett and W. Metcalf, unpubl.), integration vectors (Conway de Macario et al. 1996), and anaerobic incubators for large-scale growth of methanogens on solid media (Metcalf et al. 1998).
Furthermore, and in contrast to other known methanogens, genetic analysis can be used to study the process of methanogenesis: Because Methanosarcina species are able to utilize each of the three known methanogenic pathways, mutants in a single pathway are viable (M. Pritchett and W. Metcalf, unpubl.). The availability of genetic methods allowing immediate exploitation of genomic sequence, coupled with the genetic, physiological, and environmental diversity of M. acetivorans make this species an outstanding model organism for the study of archaeal biology. For these reasons, we set out to study the genome of M. acetivorans.

626 citations


Cited by
Journal Article
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations

Journal Article
J. Craig Venter, Mark Raymond Adams, Eugene W. Myers, Peter W. Li and 269 more authors (12 institutions)
16 Feb 2001-Science
TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.
Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history.
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
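The read, coverage and polymorphism figures quoted in this abstract are related by simple arithmetic. The following sketch is a rough consistency check under stated assumptions: the variable names are hypothetical, and the mean read length, the genome size implied by the coverage, and the expected number of differences are derived here, not reported in the abstract.

```python
# Figures quoted in the abstract.
total_sequenced_bp = 14_800_000_000   # 14.8 billion bp of raw sequence
read_count = 27_271_853               # high-quality sequence reads
fold_coverage = 5.11                  # stated coverage of the genome
consensus_bp = 2_910_000_000          # 2.91 billion bp consensus sequence
bp_per_difference = 1_250             # one differing bp per 1250 on average

# Inference: average read length implied by the totals.
mean_read_len = total_sequenced_bp / read_count

# Inference: genome size implied by the coverage definition
# (coverage = total sequenced bases / genome size).
implied_genome_bp = total_sequenced_bp / fold_coverage

# Inference: expected differences between two random haploid genomes
# across the consensus sequence.
expected_differences = consensus_bp // bp_per_difference

print(f"Mean read length: {mean_read_len:.0f} bp")
print(f"Implied genome size: {implied_genome_bp / 1e9:.2f} Gb")
print(f"Expected pairwise differences: {expected_differences:,}")
```

The implied genome size of about 2.9 Gb is consistent with the 2.91-billion bp consensus, and the implied mean read length is a little over 540 bp.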

12,098 citations

Journal Article
14 Jan 2005-Cell
TL;DR: In a four-genome analysis of 3' UTRs, approximately 13,000 regulatory relationships were detected above the estimate of false-positive predictions, thereby implicating as miRNA targets more than 5300 human genes, which represented 30% of the gene set.

11,624 citations

Journal Article
TL;DR: A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu.
Abstract: As vertebrate genome sequences near completion and research refocuses to their analysis, the issue of effective genome annotation display becomes critical. A mature web tool for rapid and reliable display of any requested portion of the genome at any scale, together with several dozen aligned annotation tracks, is provided at http://genome.ucsc.edu. This browser displays assembly contigs and gaps, mRNA and expressed sequence tag alignments, multiple gene predictions, cross-species homologies, single nucleotide polymorphisms, sequence-tagged sites, radiation hybrid data, transposon repeats, and more as a stack of coregistered tracks. Text and sequence-based searches provide quick and precise access to any region of specific interest. Secondary links from individual features lead to sequence details and supplementary off-site databases. One-half of the annotation tracks are computed at the University of California, Santa Cruz from publicly available sequence data; collaborators worldwide provide the rest. Users can stably add their own custom tracks to the browser for educational or research purposes. The conceptual and technical framework of the browser, its underlying MYSQL database, and overall use are described. The web site currently serves over 50,000 pages per day to over 3000 different users.

9,605 citations