Author

James C. Mullikin

Bio: James C. Mullikin is an academic researcher from National Institutes of Health. The author has contributed to research in topics: Exome sequencing & Genome. The author has an hindex of 90, co-authored 198 publications receiving 85708 citations. Previous affiliations of James C. Mullikin include Wellcome Trust & Wellcome Trust Sanger Institute.


Papers
Journal Article
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1, +245 more; Institutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
Robert H. Waterston1, Kerstin Lindblad-Toh2, Ewan Birney, Jane Rogers3, +219 more; Institutions (26)
05 Dec 2002-Nature
TL;DR: The results of an international collaboration to produce a high-quality draft sequence of the mouse genome are reported and an initial comparative analysis of the mouse and human genomes is presented, describing some of the insights that can be gleaned from the two sequences.
Abstract: The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.

6,643 citations

Journal Article
John W. Belmont1, Paul Hardenbol, Thomas D. Willis, Fuli Yu1, Huanming Yang2, Lan Yang Ch'Ang, Wei Huang3, Bin Liu2, Yan Shen3, Paul K.H. Tam4, Lap-Chee Tsui4, Mary M.Y. Waye5, Jeffrey Tze Fei Wong6, Changqing Zeng2, Qingrun Zhang2, Mark S. Chee7, Luana Galver7, Semyon Kruglyak7, Sarah S. Murray7, Arnold Oliphant7, Alexandre Montpetit8, Fanny Chagnon8, Vincent Ferretti8, Martin Leboeuf8, Michael S. Phillips8, Andrei Verner8, Shenghui Duan9, Denise L. Lind10, Raymond D. Miller9, John P. Rice9, Nancy L. Saccone9, Patricia Taillon-Miller9, Ming Xiao10, Akihiro Sekine, Koki Sorimachi, Yoichi Tanaka, Tatsuhiko Tsunoda, Eiji Yoshino, David R. Bentley11, Sarah E. Hunt11, Don Powell11, Houcan Zhang12, Ichiro Matsuda13, Yoshimitsu Fukushima14, Darryl Macer15, Eiko Suda15, Charles N. Rotimi16, Clement Adebamowo17, Toyin Aniagwu17, Patricia A. Marshall18, Olayemi Matthew17, Chibuzor Nkwodimmah17, Charmaine D.M. Royal16, Mark Leppert19, Missy Dixon19, Fiona Cunningham20, Ardavan Kanani20, Gudmundur A. Thorisson20, Peter E. Chen21, David J. Cutler21, Carl S. Kashuk21, Peter Donnelly22, Jonathan Marchini22, Gilean McVean22, Simon Myers22, Lon R. Cardon22, Andrew P. Morris22, Bruce S. Weir23, James C. Mullikin24, Michael Feolo24, Mark J. Daly25, Renzong Qiu26, Alastair Kent, Georgia M. Dunston16, Kazuto Kato27, Norio Niikawa28, Jessica Watkin29, Richard A. Gibbs1, Erica Sodergren1, George M. Weinstock1, Richard K. Wilson9, Lucinda Fulton9, Jane Rogers11, Bruce W. Birren25, Hua Han2, Hongguang Wang, Martin Godbout30, John C. Wallenburg8, Paul L'Archevêque, Guy Bellemare, Kazuo Todani, Takashi Fujita, Satoshi Tanaka, Arthur L. Holden, Francis S. Collins24, Lisa D. Brooks24, Jean E. McEwen24, Mark S. Guyer24, Elke Jordan31, Jane Peterson24, Jack Spiegel24, Lawrence M. Sung32, Lynn F. Zacharia24, Karen Kennedy29, Michael Dunn29, Richard Seabrook29, Mark Shillito, Barbara Skene29, John Stewart29, David Valle21, Ellen Wright Clayton33, Lynn B. Jorde19, Aravinda Chakravarti21, Mildred K. Cho34, Troy Duster35, Troy Duster36, Morris W. Foster37, Maria Jasperse38, Bartha Maria Knoppers39, Pui-Yan Kwok10, Julio Licinio40, Jeffrey C. Long41, Pilar N. Ossorio42, Vivian Ota Wang33, Charles N. Rotimi16, Patricia Spallone29, Patricia Spallone43, Sharon F. Terry44, Eric S. Lander25, Eric H. Lai45, Deborah A. Nickerson46, Gonçalo R. Abecasis41, David Altshuler47, Michael Boehnke41, Panos Deloukas11, Julie A. Douglas41, Stacey Gabriel25, Richard R. Hudson48, Thomas J. Hudson8, Leonid Kruglyak49, Yusuke Nakamura50, Robert L. Nussbaum24, Stephen F. Schaffner25, Stephen T. Sherry24, Lincoln Stein20, Toshihiro Tanaka
18 Dec 2003-Nature
TL;DR: The HapMap will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance the ability to choose targets for therapeutic intervention.
Abstract: The goal of the International HapMap Project is to determine the common patterns of DNA sequence variation in the human genome and to make this information freely available in the public domain. An international consortium is developing a map of these patterns across the genome by determining the genotypes of one million or more sequence variants, their frequencies and the degree of association between them, in DNA samples from populations with ancestry from parts of Africa, Asia and Europe. The HapMap will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance our ability to choose targets for therapeutic intervention.

5,926 citations

Journal Article
John W. Belmont1, Andrew Boudreau, Suzanne M. Leal1, Paul Hardenbol, +229 more; Institutions (40)
27 Oct 2005
TL;DR: A public database of common variation in the human genome: more than one million single nucleotide polymorphisms for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted.
Abstract: Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.

5,479 citations

Journal Article
14 Jun 2007-Nature
TL;DR: Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.
Abstract: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

5,091 citations


Cited by
Journal Article
TL;DR: Burrows-Wheeler Alignment tool (BWA) is implemented, a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps.
Abstract: Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ~10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net

43,862 citations

Journal Article
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1, +245 more; Institutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
TL;DR: The GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
Abstract: Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS—the 1000 Genome pilot alone includes nearly five terabases—make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.

20,557 citations

Journal Article
TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches; multiple processor cores can be used simultaneously to achieve even greater alignment speeds.
Abstract: Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source (http://bowtie.cbcb.umd.edu).

20,335 citations

28 Jul 2005
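TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells and the placenta, and plays a key role in adhesion and immune evasion. The var gene family of each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.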

18,940 citations