
Showing papers in "Cold Spring Harbor Symposia on Quantitative Biology in 2003"


Journal ArticleDOI
TL;DR: An integrated SNP genotyping system combining a highly multiplexed assay with an accurate readout technology based on random arrays of DNA-coated beads was developed to help enable genome-wide association studies and other large-scale genetic analysis projects.
Abstract: The genetic factors underlying common disease are largely unknown. Discovery of disease-causing genes will transform our knowledge of the genetic contribution to human disease, lead to new genetic screens, and underpin research into new cures and improved lifestyles. The sequencing of the human genome has catalyzed efforts to search for disease genes by the strategy of associating sequence variants with measurable phenotypes. In particular, the Human Genome Project and follow-on efforts to characterize genetic variation have resulted in the discovery of millions of single-nucleotide polymorphisms (SNPs) (Patil et al. 2001; Sachidanandam et al. 2001; Reich et al. 2003). This represents a significant fraction of common genetic variation in the human genome and creates an unprecedented opportunity to associate genes with phenotypes via large-scale SNP genotyping studies. To make use of this information, efficient and accurate SNP genotyping technologies are needed. However, most methods were designed to analyze only one or a few SNPs per assay, and are costly to scale up (Kwok 2001; Syvanen 2001). To help enable genome-wide association studies and other large-scale genetic analysis projects, we have developed an integrated SNP genotyping system that combines a highly multiplexed assay with an accurate readout technology based on random arrays of DNA-coated beads (Michael et al. 1998; Oliphant et al. 2002; Gunderson et al. 2004). Our aim was to reduce costs and increase productivity by ~2 orders of magnitude. We chose a multiplexed approach because it is more easily scalable and is intrinsically cost-efficient (Wang et al. 1998). Although existing multiplexed approaches lacked the combination of accuracy, robustness, scalability, and cost-effectiveness needed for truly large-scale endeavors, we hypothesized that some of these limitations could be overcome by designing an assay specifically for multiplexing. To increase throughput and decrease costs by ~2 orders of magnitude, it was necessary to eliminate bottlenecks throughout the genotyping process. It was also desirable to minimize sources of variability and human error in order to ensure data quality and reproducibility. We therefore took a systems-level view of technology design, development, and integration. Although the focus of this paper is on a novel, highly multiplexed genotyping assay, the GoldenGate™ assay, four other key technologies that were developed in parallel, as part of the complete BeadLab system (Oliphant et al. 2002), are briefly described below. BEADARRAY™ PLATFORM: We developed an array technology based on random assembly of beads in microwells located at the end of an …
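The abstract describes an allele-specific, intensity-based readout but gives no code. As a rough illustration only (not the BeadLab pipeline; the intensity cutoffs and the two-channel model are assumptions), the sketch below assigns AA/AB/BB calls from a pair of allele-specific signal intensities using a normalized allelic angle.

```python
import math

def call_genotype(x, y, min_intensity=200.0, hom_cut=0.2):
    """Call a genotype from two allele-specific signal intensities.

    x, y -- background-subtracted intensities for alleles A and B.
    Thresholds are illustrative, not values from the GoldenGate assay.
    """
    r = x + y                      # overall signal strength
    if r < min_intensity:          # too dim to call reliably
        return "NC"                # no-call
    theta = (2.0 / math.pi) * math.atan2(y, x)   # 0 = pure A, 1 = pure B
    if theta < hom_cut:
        return "AA"
    if theta > 1.0 - hom_cut:
        return "BB"
    return "AB"

# Example: strong A signal with weak B signal -> homozygous AA
print(call_genotype(1800.0, 150.0))   # AA
print(call_genotype(900.0, 950.0))    # AB
```

In practice, cluster positions are learned from many samples rather than fixed by hand; the fixed cutoffs here are purely illustrative.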

678 citations


Journal ArticleDOI
TL;DR: The ultimate goal is not only to describe detailed gene expression profiles, but also to gain a greater understanding of the organization of gene regulatory networks and to determine how they control cell function during development and differentiation in C. elegans.
Abstract: The availability of the complete sequence of the human genome and that of the nematode Caenorhabditis elegans allows the large-scale identification and analysis of orthologs of human genes in an organism amenable to detailed genetic and molecular analyses. We are determining gene expression profiles in specific cells, tissues, and developmental stages in C. elegans. Our ultimate goal is not only to describe detailed gene expression profiles, but also to gain a greater understanding of the organization of gene regulatory networks and to determine how they control cell function during development and differentiation. The use of C. elegans as a platform to investigate the details of gene regulatory networks has several major advantages. Two key advantages are that it is the simplest multicellular organism for which there is a complete sequence (C. elegans Sequencing Consortium 1998), and it is the only multicellular organism for which there is a completely documented cell lineage (Sulston and Horvitz 1977; Sulston et al. 1983). C. elegans is amenable to both forward and reverse genetics (for review, see Riddle et al. 1997). A 2-week life span and a generation time of just 3 days allow experimental procedures to be much shorter, more flexible, and more cost-effective than the use of mouse or zebrafish models for genomic analyses. Finally, the small size, transparency, and limited cell number of the worm make it possible to observe many complex cellular and developmental processes that cannot easily be observed in more complex organisms. Morphogenesis of organs and tissues can be observed at the level of a single cell (White et al. 1986). As events have shown, investigating the details of C. elegans biology can lead to fundamental observations about human health and biology (Sulston 1976; Hedgecock et al. 1983; Ellis and Horvitz 1986). We are using complementary approaches to examine gene expression in C. elegans. We are constructing transgenic animals containing promoter::green fluorescent protein (GFP) fusions of nematode orthologs of human genes. These transgenic animals are examined to determine the time and tissue expression pattern of the promoter::GFP constructs. Concurrently, we are undertaking serial analysis of gene expression (SAGE) on all developmental stages of intact animals and on selected purified cells. Tissues and selected cells are isolated using a fluorescence-activated cell sorter (FACS) to sort promoter::GFP-marked cell populations. To date, we have purified to near homogeneity cell populations for embryonic muscle, gut, and a subset of neurons. The SAGE and promoter::GFP expression data are publicly available at http://elegans.bcgsc.bc.ca.
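The SAGE component of this approach lends itself to a small illustration. The sketch below is not the authors' code: the 10-bp tag length, the handling of the NlaIII anchoring site, and the toy transcripts are all assumptions. It extracts the tag immediately 3' of the 3'-most CATG in each transcript and tallies tag counts as a proxy for transcript abundance.

```python
from collections import Counter

TAG_LEN = 10          # classic SAGE tag length; LongSAGE uses longer tags
ANCHOR = "CATG"       # NlaIII anchoring-enzyme recognition site

def sage_tag(transcript):
    """Return the SAGE tag: TAG_LEN bases 3' of the 3'-most CATG, or None."""
    pos = transcript.rfind(ANCHOR)
    if pos == -1:
        return None                       # transcript lacks an anchoring site
    start = pos + len(ANCHOR)
    tag = transcript[start:start + TAG_LEN]
    return tag if len(tag) == TAG_LEN else None

# Toy transcript sequences (hypothetical, not C. elegans data)
transcripts = [
    "GGCATGAAACCCGGGTTTAGCATGTTTTAAAACCCCGGAAA",
    "TTTCATGTTTTAAAACCCCGGAAACGTACGT",
]
counts = Counter(t for t in map(sage_tag, transcripts) if t)
for tag, n in counts.items():
    print(tag, n)        # identical tags from different reads are pooled
```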

315 citations



Journal ArticleDOI
TL;DR: Presents the computational strategy that led to the rough estimate that ~5% of the human genome lies in short segments that appear to be under selection based on comparison with mouse, with details on scoring functions, data preparation, and statistical techniques.
Abstract: Genome sequences have recently become available for two mammalian genomes, human (Lander et al. 2001; Venter et al. 2001) and mouse (Waterston et al. 2002). This raises the possibility of using comparative genomics to estimate what fraction of the human genome evolves under purifying selection. Lacking genomes of other mammals, this comparative exercise is still in its preliminary stages. However, a rough estimate has been made that ~5% of the human genome is in short segments that appear to be under selection based on comparison with mouse (Waterston et al. 2002). Here, as a basis for future refinements, we present the computational strategy that led to this estimate, providing details on scoring functions, data preparation, and statistical techniques. We also describe stability analyses, control experiments, and tests for the effects of artifacts that were performed to establish the robustness of our results, and discuss possible alternate interpretations. Our strategy hinges on three elements: (1) the construction of various collections of short aligned windows of the human genome (e.g., 50 bp), in particular a large collection of such windows that are very likely to have evolved neutrally since the divergence of human and mouse ("ancestral repeats," relics of transposons that were present in the genome of our common ancestor with mouse); (2) the development of a score function quantifying conservation in short aligned windows and providing a satisfactory "template" for neutral behavior when computed on windows in ancestral repeats; and (3) statistical techniques to estimate and compare the score distributions for genome-wide and ancestral-repeat windows, and thus infer an upper bound on the share of genome-wide windows that are compatible with the neutral template. The remaining share of the genome is populated by windows that are too conserved to be modeled by the neutral template, and hence are either evolving under purifying selection, or are evolving neutrally but are experiencing fewer substitutions than nearby windows in ancestral repeats for some unknown reason. Because ancestral transposons have been inactive since their insertion in the genome of the common ancestor of human and mouse, they are one type of human DNA that is most likely to have evolved free of any selective pressure. The rate of substitution in these sites between human and mouse is similar to, but slightly less than, that observed in fourfold degenerate sites from codons, and covaries regionally with that rate (Waterston et al. 2002; Hardison et al. 2003). …
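To make the window-based strategy concrete, a hedged numerical sketch follows. It is not the authors' code: the identity-fraction score stands in for the paper's more refined score function, and the simulated score arrays are toy data. It scores aligned windows, builds a neutral template from ancestral-repeat windows, and bounds the neutral-compatible share of genome-wide windows by the smallest ratio of the two cumulative distributions over low-score thresholds.

```python
import numpy as np

def window_score(human, mouse):
    """Conservation score for one aligned window: fraction of identical,
    non-gap bases (a stand-in for the paper's more refined score function)."""
    pairs = [(h, m) for h, m in zip(human, mouse) if h != "-" and m != "-"]
    return sum(h == m for h, m in pairs) / len(pairs) if pairs else 0.0

def neutral_share_bound(genome_scores, ar_scores, thresholds):
    """Upper bound on the fraction of genome-wide windows compatible with the
    neutral (ancestral-repeat) template.  If genome = p*neutral + (1-p)*selected
    and selected windows are shifted toward high scores, then for low
    thresholds s, P_genome(score <= s) ~= p * P_neutral(score <= s); the
    minimum of that ratio over s therefore bounds p from above."""
    g, a = np.asarray(genome_scores), np.asarray(ar_scores)
    ratios = [np.mean(g <= s) / np.mean(a <= s)
              for s in thresholds if np.mean(a <= s) > 0]
    return min(ratios)

# Example score for one short aligned window (toy sequences)
print(window_score("ACGTACGTAC", "ACGTTCGTAC"))   # 0.9

# Toy score sets: ancestral-repeat windows average ~66% identity; the
# genome-wide set also contains 5% of windows near 90% identity ("conserved").
rng = np.random.default_rng(0)
ar = rng.normal(0.66, 0.05, 100_000)
genome = np.concatenate([rng.normal(0.66, 0.05, 95_000),
                         rng.normal(0.90, 0.03, 5_000)])
p_neutral = neutral_share_bound(genome, ar, np.linspace(0.60, 0.72, 13))
# Prints a conservative estimate of the share under selection in the toy mixture.
print(f"share under selection (at least): {1 - p_neutral:.3f}")
```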

67 citations


Journal ArticleDOI
TL;DR: The framework of genetic variability in the genome reflects both evolutionary adaptive processes that are locus-specific and population-level forces that affect all the components of the genome equally.
Abstract: DNA molecules are organic elements of information storage imbued with imperfect copying processes. Thus, they are fundamental repositories of an organism’s evolutionary history. The field of human molecular evolution is predicated on the concept that patterns of DNA sequence variation in living populations encode aspects of human heritage shaped by a constellation of evolutionary influences. The framework of genetic variability in the genome reflects both evolutionary adaptive processes that are locus-specific and population-level forces that affect all the components of the genome equally. Genetic research often focuses on distinguishing inconsistencies in patterns of variation between genomic regions to help bridge the gap between particular genes and traits, including matters of function and malfunction. Alternatively, genomic DNA is also an archive of those aspects of human evolutionary processes reflective of population-level forces like drift, subdivision, size fluctuation, and migration. By studying the degree of genetic molec…


Journal ArticleDOI
TL;DR: Schizophrenia (SCZ) and bipolar affective disorder (BPAD) (formerly termed manic-depressive illness) are severe, disabling psychiatric illnesses that feature prominently in the top ten causes of disability worldwide.
Abstract: Schizophrenia (SCZ) and bipolar affective disorder (BPAD) (formerly termed manic-depressive illness) are severe, disabling psychiatric illnesses that feature prominently in the top ten causes of disability worldwide (Lopez and Murray 1998). Each will affect about 1% of the population in their lifetime.


Journal ArticleDOI
TL;DR: Systematic identification of regulatory regions and gene networks in mammals has turned out to be extremely difficult, largely due to the size and complexity of the genomes (and hence the diversity of cell/tissue types and the complexity of developmental stages); approaches to this problem are outlined.
Abstract: INTRODUCTION: Since the celebrated discovery of the Watson-Crick double-helix structure of DNA, it has taken 50 years for the human genome to be sequenced. It may very well take another 50 years for the functional information to be fully decoded. Until recently, genome research has mainly focused on coding regions, where the immediate questions are "where are the protein-coding regions?" and "what are the functions of the gene products?". Increasingly, the field is advancing toward noncoding regions, where the central questions become "where are the regulatory regions?" and "how do they control gene expression?". In 1961, Jacob and Monod published "On the regulation of gene activity" at the 26th Cold Spring Harbor Symposium on Quantitative Biology, in which some of the fundamental concepts of gene regulation were first elegantly formulated. Regulatory regions are most fundamental, because all gene structures are defined by and recognized through the cis-elements in such regions; furthermore, what a gene does in vivo is intimately related to when, where, and how much it is expressed. A phenotype, upon which the selective force acts, is the integrated result of gene function and regulation. It has been argued that animal diversity is mainly due to the evolutionary expansion of regulatory complexity (Levine and Tjian 2003). Because most regulation occurs at the transcriptional level and the initiation of transcription is largely determined by the promoter located at the beginning of each gene, identification of promoters and the cis-regulatory elements within them has become a prerequisite for understanding gene regulation. For a few model organisms with compact genomes (such as phage, bacteria, and yeast), many of the gene regulatory pathways or networks have been worked out. But for mammalian systems, such as human, systematic identification of regulatory regions and gene networks has turned out to be extremely difficult, largely due to the size and complexity of the genomes (and hence the diversity of cell/tissue types and the complexity of developmental stages). Here I outline our approaches to this problem. As genome research is data- and technology-driven, many approaches in the field can soon become obsolete once new or more data or technologies become available. I will try to state generic ideas and methodologies that may evolve with, or be refined by, new data or technologies. I will also try to …
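Because the abstract discusses promoter and cis-regulatory element identification only in general terms, a minimal sketch of one standard ingredient, scanning a promoter with a log-odds position weight matrix (PWM), is added here for concreteness. The motif counts, pseudocount, score threshold, and toy promoter sequence are all hypothetical; this is not the author's method.

```python
import math

# Hypothetical base counts for a 4-bp motif, one list entry per motif position.
counts = {
    "A": [8, 1, 0, 9],
    "C": [1, 0, 9, 0],
    "G": [0, 9, 1, 1],
    "T": [1, 0, 0, 0],
}
N = 10          # sequences in the (hypothetical) training alignment
BG = 0.25       # uniform background base frequency

# Log-odds position weight matrix with a +1 pseudocount per cell.
pwm = {b: [math.log2((c + 1) / (N + 4) / BG) for c in col]
       for b, col in counts.items()}
width = len(counts["A"])

def scan(seq, threshold=3.0):
    """Yield (position, score) for windows scoring above the threshold."""
    for i in range(len(seq) - width + 1):
        score = sum(pwm[seq[i + j]][j] for j in range(width))
        if score >= threshold:
            yield i, score

promoter = "TTAGCCAGCAATTAGGCTAAGCAT"   # toy promoter sequence
for pos, score in scan(promoter):
    print(f"hit at {pos}: {promoter[pos:pos+width]} score={score:.2f}")
```

Real promoter-finding pipelines combine many such signals (comparative conservation, CpG islands, transcript evidence), but the PWM scan illustrates the basic cis-element search step.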

Journal ArticleDOI
TL;DR: The predicted availability of the complete sequence in the first half of 2003 provided the ideal backdrop to a Symposium that focused not just on the details of the sequence, but on the power of information it contains to transform scientific investigations into fundamental biological processes and the causes of human disease.
Abstract: In 2001, as we considered topics for future Symposia, it rapidly became clear what the focus of attention should be in 2003. It had escaped no-one's notice that there was a momentous event to be celebrated that year—the 50th anniversary of the proposal by James Watson and Francis Crick of a structure for DNA now famously known as the double helix. And of great significance at Cold Spring Harbor was the 35th anniversary in 2003 of Jim Watson's appointment as Laboratory Director and the beginning of his unique influence on all aspects of this institution. These anniversaries would clearly require special attention and were indeed recognized in February at a remarkable meeting at the Laboratory devoted to DNA and at a spectacular gala in New York. For the Symposium topic, however, we turned to another achievement that, even two years in advance, could be seen as the most significant milestone in biological science since the discovery of the double helix: the completion of the sequence of the human genome. Begun formally in 1990, with Jim Watson as its first director, the federally funded effort to map and sequence the entire human DNA molecule had resulted in the publication of a draft sequence in 2001. The predicted availability of the complete sequence in the first half of 2003 provided the ideal backdrop to a Symposium that focused not just on the details of the sequence, but on the power of information it contains to transform scientific investigations into fundamental biological processes and the causes of human disease.

Journal ArticleDOI
TL;DR: Positional identification of structural and regulatory quantitative trait nucleotides in domestic animal species, and their role in conservation, are studied.
Abstract: Positional identification of structural and regulatory quantitative trait nucleotides in domestic animal species.