Author

Linda Hsie

Bio: Linda Hsie is an academic researcher from Affymetrix. The author has contributed to research in topics: Single-nucleotide polymorphism & SNP genotyping. The author has an h-index of 8 and has co-authored 8 publications that have received 4011 citations. Previous affiliations of Linda Hsie include Massachusetts Institute of Technology.

Papers
Journal Article
15 May 1998-Science
TL;DR: In a large-scale survey for SNPs, human genomic DNA was examined by a combination of gel-based sequencing and high-density variation-detection DNA chips, and a genetic map was constructed showing the location of 2227 of the candidate SNPs.
Abstract: Single-nucleotide polymorphisms (SNPs) are the most frequent type of variation in the human genome, and they provide powerful tools for a variety of medical genetic studies. In a large-scale survey for SNPs, 2.3 megabases of human genomic DNA was examined by a combination of gel-based sequencing and high-density variation-detection DNA chips. A total of 3241 candidate SNPs were identified. A genetic map was constructed showing the location of 2227 of these SNPs. Prototype genotyping chips were developed that allow simultaneous genotyping of 500 SNPs. The results provide a characterization of human diversity at the nucleotide level and demonstrate the feasibility of large-scale identification of human SNPs.

2,383 citations

Journal Article
TL;DR: The degree of nucleotide polymorphism across these human genes, and orthologous great ape sequences, is highly variable and is correlated with the effects of functional conservation on gene sequences.
Abstract: Sequence variation in human genes is largely confined to single-nucleotide polymorphisms (SNPs) and is valuable in tests of association with common diseases and pharmacogenetic traits. We performed a systematic and comprehensive survey of molecular variation to assess the nature, pattern and frequency of SNPs in 75 candidate human genes for blood-pressure homeostasis and hypertension. We assayed 28 Mb (190 kb in 148 alleles) of genomic sequence, comprising the 5´ and 3´ untranslated regions (UTRs), introns and coding sequence of these genes, for sequence differences in individuals of African and Northern European descent using high-density variant detection arrays (VDAs). We identified 874 candidate human SNPs, of which 22% were confirmed by DNA sequencing to reveal a discordancy rate of 21% for VDA detection. The SNPs detected have an average minor allele frequency of 11%, and 387 are within the coding sequence (cSNPs). Of all cSNPs, 54% lead to a predicted change in the protein sequence, implying a high level of human protein diversity. These protein-altering SNPs are 38% of the total number of such SNPs expected, are more likely to be population-specific and are rarer in the human population, directly demonstrating the effects of natural selection on human genes. Overall, the degree of nucleotide polymorphism across these human genes, and orthologous great ape sequences, is highly variable and is correlated with the effects of functional conservation on gene sequences.
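The survey size quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the figures stated in the abstract; the variable names are illustrative and the decimal definitions of kb and Mb are an assumption.

```python
# Back-of-the-envelope check of the survey size quoted in the abstract.
# Assumes decimal units: 1 kb = 1,000 bases, 1 Mb = 1,000,000 bases.
KB = 1_000
MB = 1_000_000

bases_per_allele = 190 * KB  # 190 kb of sequence per allele (from the abstract)
alleles = 148                # number of alleles assayed (from the abstract)

total_bases = bases_per_allele * alleles
total_mb = total_bases / MB

candidate_snps = 874         # candidate SNPs identified (from the abstract)
coding_snps = 387            # SNPs within coding sequence (from the abstract)
coding_fraction = coding_snps / candidate_snps

print(f"Total sequence assayed: {total_mb:.2f} Mb")
print(f"Fraction of candidate SNPs in coding sequence: {coding_fraction:.1%}")
```

The product comes out slightly above 28 Mb, which is consistent with the rounded figure in the abstract.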

1,050 citations

Journal Article
TL;DR: In a diverse human population set, it was found that SNP alleles with higher frequencies were more likely to be ancestral than less frequently occurring alleles.
Abstract: Here we report the application of high-density oligonucleotide array (DNA chip)-based analysis to determine the distant history of single nucleotide polymorphisms (SNPs) in current human populations. We analysed orthologues for 397 human SNP sites (identified in CEPH pedigrees from Amish, Venezuelan and Utah populations) from 23 common chimpanzee, 19 pygmy chimpanzee and 11 gorilla genomic DNA samples. From this data we determined 214 proposed ancestral alleles (the sequence found in the last common ancestor of humans and chimpanzees). In a diverse human population set, we found that SNP alleles with higher frequencies were more likely to be ancestral than less frequently occurring alleles. There were, however, exceptions. We also found three shared human/pygmy chimpanzee polymorphisms, all involving CpG dinucleotides, and two shared human/gorilla polymorphisms, one involving a CpG dinucleotide. We demonstrate that microarray-based assays allow rapid comparative sequence analysis of intra- and interspecies genetic variation.
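One simple way to picture the ancestral-allele calls described in this abstract is a parsimony-style rule: the human allele that also appears in the great ape samples is proposed as ancestral. The sketch below is an illustration under that assumption only; the function name and the handling of ambiguous sites are hypothetical and are not taken from the paper.

```python
def infer_ancestral_allele(human_alleles, outgroup_alleles):
    """Propose an ancestral allele for one SNP site.

    Returns the single human allele that is also observed in the outgroup
    (e.g. chimpanzee or gorilla) samples. Returns None when no human allele
    is shared, or when more than one is shared (a shared polymorphism),
    because this simple rule cannot decide in those cases.
    """
    shared = set(human_alleles) & set(outgroup_alleles)
    if len(shared) == 1:
        return next(iter(shared))
    return None


# Hypothetical example sites, not data from the study.
print(infer_ancestral_allele({"C", "T"}, {"C"}))       # one shared allele
print(infer_ancestral_allele({"C", "T"}, {"C", "T"}))  # shared polymorphism
print(infer_ancestral_allele({"A", "G"}, {"T"}))       # no shared allele
```

Returning None for shared polymorphisms keeps such sites out of the ancestral calls rather than guessing.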

409 citations

Journal Article
TL;DR: Based on the preliminary results, using oligonucleotide arrays to genotype several thousand polymorphic loci simultaneously appears feasible.

134 citations

Journal Article
01 Jan 2002-Genomics
TL;DR: The results show that long-range haplotypes can be obtained easily with this resource and that a collection of such samples is a simple way to obtain reference haplotypes for association studies in various populations.

37 citations


Cited by
Book Chapter
TL;DR: This chapter assumes acquaintance with the principles and practice of PCR, as outlined in, for example, refs. 1–4.
Abstract: Designing PCR and sequencing primers are essential activities for molecular biologists around the world. This chapter assumes acquaintance with the principles and practice of PCR, as outlined in, for example, refs. 1–4. Primer3 is a computer program that suggests PCR primers for a variety of applications, for example to create STSs (sequence tagged sites) for radiation hybrid mapping (5), or to amplify sequences for single nucleotide polymorphism discovery (6). Primer3 can also select single primers for sequencing reactions and can design oligonucleotide hybridization probes. In selecting oligos for primers or hybridization probes, Primer3 can consider many factors. These include oligo melting temperature, length, GC content, 3′ stability, estimated secondary structure, the likelihood of annealing to or amplifying undesirable sequences (for example interspersed repeats), the likelihood of primer–dimer formation between two copies of the same primer, and the accuracy of the source sequence. In the design of primer pairs Primer3 can consider product size and melting temperature, the likelihood of primer–dimer formation between the two primers in the pair, the difference between primer melting temperatures, and primer location relative to particular regions of interest or to be avoided.
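Two of the factors listed in this abstract, GC content and melting temperature, can be illustrated with a short sketch. It uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C) as a stand-in; this is an assumption made for illustration and is not the model Primer3 uses, which relies on more detailed thermodynamic calculations. The function names and the example primer are hypothetical.

```python
def gc_content(oligo):
    """Fraction of bases in the oligo that are G or C."""
    oligo = oligo.upper()
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def wallace_tm(oligo):
    """Rough melting temperature in degrees C by the Wallace rule.

    Tm = 2 * (A + T) + 4 * (G + C). Only a rule of thumb for short
    oligos; not the calculation performed by Primer3.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


primer = "AGCGTTACGGATCCAGTCAA"  # hypothetical 20-mer, for illustration only
print(f"length={len(primer)} GC={gc_content(primer):.0%} Tm~{wallace_tm(primer)} C")
```

A primer-design tool would compute such quantities for many candidate oligos and reject those outside the acceptable ranges.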

16,407 citations

Journal Article
J. Craig Venter1, Mark Raymond Adams1, Eugene W. Myers1, Peter W. Li1, +269 more authors, 12 institutions
16 Feb 2001-Science
TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.
Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies-a whole-genome assembly and a regional chromosome assembly-were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
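The sequencing figures in this abstract can be related to one another with simple ratios. The sketch below uses only numbers quoted in the abstract; the variable names are illustrative, and using the 2.91-billion bp consensus length as the denominator for coverage is an assumption, which may be why the result differs slightly from the quoted 5.11-fold figure.

```python
# Figures quoted in the abstract.
total_read_bases = 14.8e9   # bp of sequence generated
consensus_bases = 2.91e9    # bp in the euchromatic consensus sequence
reads = 27_271_853          # high-quality sequence reads

# Fold coverage = sequenced bases / target length (denominator is an assumption).
coverage = total_read_bases / consensus_bases

# Average length of one read.
mean_read_length = total_read_bases / reads

# "1 bp per 1250" expressed as a per-base rate, and the implied number of
# differences between two haploid genomes over the consensus length.
snp_rate = 1 / 1250
expected_differences = consensus_bases * snp_rate

print(f"Approximate coverage: {coverage:.2f}-fold")
print(f"Mean read length: {mean_read_length:.0f} bp")
print(f"Expected differences between two haploid genomes: {expected_differences:,.0f}")
```

The implied number of pairwise differences is of the same order as the 2.1 million SNP locations reported in the abstract.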

12,098 citations

Journal Article
Fumio Tajima1
30 Oct 1989-Genomics
TL;DR: It is suggested that the natural selection against large insertion/deletion is so weak that a large amount of variation is maintained in a population.

11,521 citations

Journal Article
01 Apr 2001-Genetics
TL;DR: It was concluded that selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval.
Abstract: Recent advances in molecular genetic techniques will make dense marker maps available and genotyping many individuals for these markers feasible. Here we attempted to estimate the effects of ∼50,000 marker haplotypes simultaneously from a limited number of phenotypic records. A genome of 1000 cM was simulated with a marker spacing of 1 cM. The markers surrounding every 1-cM region were combined into marker haplotypes. Due to finite population size (Ne = 100), the marker haplotypes were in linkage disequilibrium with the QTL located between the markers. Using least squares, all haplotype effects could not be estimated simultaneously. When only the biggest effects were included, they were overestimated and the accuracy of predicting genetic values of the offspring of the recorded animals was only 0.32. Best linear unbiased prediction of haplotype effects assumed equal variances associated to each 1-cM chromosomal segment, which yielded an accuracy of 0.73, although this assumption was far from true. Bayesian methods that assumed a prior distribution of the variance associated with each chromosome segment increased this accuracy to 0.85, even when the prior was not correct. It was concluded that selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval.

6,036 citations

Journal Article
John W. Belmont1, Paul Hardenbol, Thomas D. Willis, Fuli Yu1, Huanming Yang2, Lan Yang Ch'Ang, Wei Huang3, Bin Liu2, Yan Shen3, Paul K.H. Tam4, Lap-Chee Tsui4, Mary M.Y. Waye5, Jeffrey Tze Fei Wong6, Changqing Zeng2, Qingrun Zhang2, Mark S. Chee7, Luana Galver7, Semyon Kruglyak7, Sarah S. Murray7, Arnold Oliphant7, Alexandre Montpetit8, Fanny Chagnon8, Vincent Ferretti8, Martin Leboeuf8, Michael S. Phillips8, Andrei Verner8, Shenghui Duan9, Denise L. Lind10, Raymond D. Miller9, John P. Rice9, Nancy L. Saccone9, Patricia Taillon-Miller9, Ming Xiao10, Akihiro Sekine, Koki Sorimachi, Yoichi Tanaka, Tatsuhiko Tsunoda, Eiji Yoshino, David R. Bentley11, Sarah E. Hunt11, Don Powell11, Houcan Zhang12, Ichiro Matsuda13, Yoshimitsu Fukushima14, Darryl Macer15, Eiko Suda15, Charles N. Rotimi16, Clement Adebamowo17, Toyin Aniagwu17, Patricia A. Marshall18, Olayemi Matthew17, Chibuzor Nkwodimmah17, Charmaine D.M. Royal16, Mark Leppert19, Missy Dixon19, Fiona Cunningham20, Ardavan Kanani20, Gudmundur A. Thorisson20, Peter E. Chen21, David J. Cutler21, Carl S. Kashuk21, Peter Donnelly22, Jonathan Marchini22, Gilean McVean22, Simon Myers22, Lon R. Cardon22, Andrew P. Morris22, Bruce S. Weir23, James C. Mullikin24, Michael Feolo24, Mark J. Daly25, Renzong Qiu26, Alastair Kent, Georgia M. Dunston16, Kazuto Kato27, Norio Niikawa28, Jessica Watkin29, Richard A. Gibbs1, Erica Sodergren1, George M. Weinstock1, Richard K. Wilson9, Lucinda Fulton9, Jane Rogers11, Bruce W. Birren25, Hua Han2, Hongguang Wang, Martin Godbout30, John C. Wallenburg8, Paul L'Archevêque, Guy Bellemare, Kazuo Todani, Takashi Fujita, Satoshi Tanaka, Arthur L. Holden, Francis S. Collins24, Lisa D. Brooks24, Jean E. McEwen24, Mark S. Guyer24, Elke Jordan31, Jane Peterson24, Jack Spiegel24, Lawrence M. Sung32, Lynn F. Zacharia24, Karen Kennedy29, Michael Dunn29, Richard Seabrook29, Mark Shillito, Barbara Skene29, John Stewart29, David Valle21, Ellen Wright Clayton33, Lynn B. Jorde19, Aravinda Chakravarti21, Mildred K. Cho34, Troy Duster35, Troy Duster36, Morris W. Foster37, Maria Jasperse38, Bartha Maria Knoppers39, Pui-Yan Kwok10, Julio Licinio40, Jeffrey C. Long41, Pilar N. Ossorio42, Vivian Ota Wang33, Charles N. Rotimi16, Patricia Spallone29, Patricia Spallone43, Sharon F. Terry44, Eric S. Lander25, Eric H. Lai45, Deborah A. Nickerson46, Gonçalo R. Abecasis41, David Altshuler47, Michael Boehnke41, Panos Deloukas11, Julie A. Douglas41, Stacey Gabriel25, Richard R. Hudson48, Thomas J. Hudson8, Leonid Kruglyak49, Yusuke Nakamura50, Robert L. Nussbaum24, Stephen F. Schaffner25, Stephen T. Sherry24, Lincoln Stein20, Toshihiro Tanaka
18 Dec 2003-Nature
TL;DR: The HapMap will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance the ability to choose targets for therapeutic intervention.
Abstract: The goal of the International HapMap Project is to determine the common patterns of DNA sequence variation in the human genome and to make this information freely available in the public domain. An international consortium is developing a map of these patterns across the genome by determining the genotypes of one million or more sequence variants, their frequencies and the degree of association between them, in DNA samples from populations with ancestry from parts of Africa, Asia and Europe. The HapMap will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance our ability to choose targets for therapeutic intervention.

5,926 citations