Showing papers by "Jane Rogers" published in 2007


Journal Article
14 Jun 2007-Nature
TL;DR: Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.
Abstract: We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.

5,091 citations


Journal Article
18 Oct 2007-Nature
TL;DR: The Phase II HapMap is described, which characterizes over 3.1 million human single nucleotide polymorphisms genotyped in 270 individuals from four geographically diverse populations and includes 25–35% of common SNP variation in the populations surveyed, and increased differentiation at non-synonymous, compared to synonymous, SNPs is demonstrated.
Abstract: We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.

4,565 citations


Journal Article
Pardis C. Sabeti, Patrick Varilly, and 255 more authors (50 institutions)
18 Oct 2007-Nature
TL;DR: An analysis of over 3 million polymorphisms from HapMap2 applies 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection, together with new methods based on cross-population comparisons to discover alleles that have swept to near-fixation within a population.
Abstract: With the advent of dense maps of human genetic variation, it is now possible to detect positive natural selection across the human genome. Here we report an analysis of over 3 million polymorphisms from the International HapMap Project Phase 2 (HapMap2). We used 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection, and we also developed new methods that are based on cross-population comparisons to discover alleles that have swept to near-fixation within a population. The analysis reveals more than 300 strong candidate regions. Focusing on the strongest 22 regions, we develop a heuristic for scrutinizing these regions to identify candidate targets of selection. In a complementary analysis, we identify 26 non-synonymous, coding, single nucleotide polymorphisms showing regional evidence of positive selection. Examination of these candidates highlights three cases in which two genes in a common biological process have apparently undergone positive selection in the same population: LARGE and DMD, both related to infection by the Lassa virus, in West Africa; SLC24A5 and SLC45A2, both involved in skin pigmentation, in Europe; and EDAR and EDA2R, both involved in development of hair follicles, in Asia.

1,778 citations


Journal Article
TL;DR: It is shown that engineered haplodeficiency of Il2 gene expression not only reduces T cell IL-2 production by twofold but also mimics the autoimmune dysregulatory effects of the naturally occurring susceptibility alleles of Il2.
Abstract: Autoimmune diseases are thought to result from imbalances in normal immune physiology and regulation. Here, we show that autoimmune disease susceptibility and resistance alleles on mouse chromosome 3 (Idd3) correlate with differential expression of the key immunoregulatory cytokine interleukin-2 (IL-2). In order to test directly that an approximately twofold reduction in IL-2 underpins the Idd3-linked destabilization of immune homeostasis, we show that engineered haplodeficiency of Il2 gene expression not only reduces T cell IL-2 production by twofold but also mimics the autoimmune dysregulatory effects of the naturally occurring susceptibility alleles of Il2. Reduced IL-2 production achieved by either genetic mechanism correlates with reduced function of CD4⁺CD25⁺ regulatory T cells, which are critical for maintaining immune homeostasis.

377 citations


Journal Article
01 Mar 2007-Genomics
TL;DR: This expansion of the original ORFeome resource greatly increases the potential experimental search space for large-scale proteomics studies, which will lead to the generation of more comprehensive datasets.

272 citations


Journal Article
TL;DR: Insight is provided into the DNA breakage and repair processes operative in somatic genome rearrangement and how the evolutionary histories of individual cancers can be reconstructed from large-scale cancer genome sequencing.
Abstract: For decades, cytogenetic studies have demonstrated that somatically acquired structural rearrangements of the genome are a common feature of most classes of human cancer. However, the characteristics of these rearrangements at sequence-level resolution have thus far been subject to very limited description. One process that is dependent upon somatic genome rearrangement is gene amplification, a mechanism often exploited by cancer cells to increase copy number and hence expression of dominantly acting cancer genes. The mechanisms underlying gene amplification are complex but must involve chromosome breakage and rejoining. We sequenced 133 different genomic rearrangements identified within four cancer amplicons involving the frequently amplified cancer genes MYC, MYCN, and ERBB2. The observed architectures of rearrangement were diverse and highly distinctive, with evidence for sister chromatid breakage-fusion-bridge cycles, formation and reinsertion of double minutes, and the presence of bizarre clusters of small genomic fragments. There were characteristic features of sequences at the breakage-fusion junctions, indicating roles for nonhomologous end joining and homologous recombination-mediated repair mechanisms together with nontemplated DNA synthesis. Evidence was also found for sequence-dependent variation in susceptibility of the genome to somatic rearrangement. The results therefore provide insights into the DNA breakage and repair processes operative in somatic genome rearrangement and illustrate how the evolutionary histories of individual cancers can be reconstructed from large-scale cancer genome sequencing.

193 citations


Journal Article
TL;DR: This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays.
Abstract: This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.

192 citations


Journal Article
TL;DR: The construction of the most highly continuous bacterial artificial chromosome (BAC) map of any mammalian genome, for the pig (Sus scrofa domestica) genome, is reported, which will enable immediate electronic positional cloning of genes.
Abstract: Background: The domestic pig is being increasingly exploited as a system for modeling human disease. It also has substantial economic importance for meat-based protein production. Physical clone maps have underpinned large-scale genomic sequencing and enabled focused cloning efforts for many genomes. Comparative genetic maps indicate that there is more structural similarity between pig and human than, for example, mouse and human, and we have used this close relationship between human and pig as a way of facilitating map construction. Results: Here we report the construction of the most highly continuous bacterial artificial chromosome (BAC) map of any mammalian genome, for the pig (Sus scrofa domestica) genome. The map provides a template for the generation and assembly of high-quality anchored sequence across the genome. The physical map integrates previous landmark maps with restriction fingerprints and BAC end sequences from over 260,000 BACs derived from 4 BAC libraries and takes advantage of alignments to the human genome to improve the continuity and local ordering of the clone contigs. We estimate that over 98% of the euchromatin of the 18 pig autosomes and the X chromosome along with localized coverage on Y is represented in 172 contigs, with chromosome 13 (218 Mb) represented by a single contig. The map is accessible through pre-Ensembl, where links to marker and sequence data can be found. Conclusion: The map will enable immediate electronic positional cloning of genes, benefiting the pig research community and further facilitating use of the pig as an alternative animal model for human disease. The clone map and BAC end sequence data can also help to support the assembly of maps and genome sequences of other artiodactyls.

144 citations


Journal Article
TL;DR: Results confirm the active role of fish skin in the immune response against infections, acting as an important site of expression of immune-related molecules.

71 citations


Journal Article
TL;DR: The chromosomal mapping of the 575 large-insert DNA clones allows these clones to be integrated into existing zebrafish mapping data and serves as a valuable resource for investigating the molecular basis of human diseases using zebrafish mutant models.
Abstract: The zebrafish (Danio rerio) is an important vertebrate model organism system for biomedical research. The syntenic conservation between the zebrafish and human genome allows one to investigate the function of human genes using the zebrafish model. To facilitate analysis of the zebrafish genome, genetic maps have been constructed and sequence annotation of a reference zebrafish genome is ongoing. However, the duplicative nature of teleost genomes, including the zebrafish, complicates accurate assembly and annotation of a representative genome sequence. Cytogenetic approaches provide "anchors" that can be integrated with accumulating genomic data. Here, we cytogenetically define the zebrafish genome by first estimating the size of each linkage group (LG) chromosome using flow cytometry, followed by the cytogenetic mapping of 575 bacterial artificial chromosome (BAC) clones onto metaphase chromosomes. Of the 575 BAC clones, 544 clones localized to apparently unique chromosomal locations. 93.8% of these clones were assigned to a specific LG chromosome location using fluorescence in situ hybridization (FISH) and compared to the LG chromosome assignment reported in the zebrafish genome databases. Thirty-one BAC clones localized to multiple chromosomal locations in several different hybridization patterns. From these data, a refined second generation probe panel for each LG chromosome was also constructed. The chromosomal mapping of the 575 large-insert DNA clones allows for these clones to be integrated into existing zebrafish mapping data. An accurately annotated zebrafish reference genome serves as a valuable resource for investigating the molecular basis of human diseases using zebrafish mutant models.

48 citations


Journal Article
TL;DR: The sequencing, annotation and comparative analysis of an 8 Mb region of pig chromosome 17, a useful test region for assessing coverage and quality, demonstrates the considerable advantages of sequencing at increased read depths; the implications of lower-coverage sequence for subsequent comparative and functional studies, particularly those involving complex loci such as GNAS, are also discussed.
Abstract: We describe here the sequencing, annotation and comparative analysis of an 8 Mb region of pig chromosome 17, which provides a useful test region to assess coverage and quality for the pig genome sequencing project. We report our findings comparing the annotation of draft sequence assembled at different depths of coverage. Within this region we annotated 71 loci, of which 53 are orthologous to human known coding genes. When compared to the syntenic regions in human (20q13.13-q13.33) and mouse (chromosome 2, 167.5 Mb-178.3 Mb), this region was found to be highly conserved with respect to gene order. The most notable difference between the three species is the presence of a large expansion of zinc finger coding genes and pseudogenes on mouse chromosome 2 between Edn3 and Phactr3 that is absent from pig and human. All of our annotation has been made publicly available in the Vertebrate Genome Annotation browser, VEGA. We assessed the impact of coverage on sequence assembly across this region and found, as expected, that increased sequence depth resulted in fewer, longer contigs. One-third of our annotated loci could not be fully re-aligned back to the low coverage version of the sequence, principally because the transcripts are fragmented over several contigs. We have demonstrated the considerable advantages of sequencing at increased read depths and discuss the implications that lower coverage sequence may have on subsequent comparative and functional studies, particularly those involving complex loci such as GNAS.

Journal Article
TL;DR: An initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p, is described, which contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb.
Abstract: The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.

Journal Article
TL;DR: The data suggest that common variants in RLIP76 are unlikely to contribute to epilepsy drug response, and a subgroup analysis of patients receiving carbamazepine suggested an association that should be investigated further.
Abstract: Introduction: Approximately 30% of patients with epilepsy are resistant to treatment with anti-epileptic drugs (AEDs). The ABC drug transporter proteins are hypothesized to mediate drug resistance in epilepsy. More recently, a non-ABC putative transporter, RLIP76, has also been proposed to be involved in the mechanism of pharmacoresistance. One previous association study of six polymorphisms in RLIP76 failed to find any association with drug resistance in a retrospective cohort of epilepsy patients. We aimed to look for an association with outcomes reflecting drug response in a larger prospective cohort, with gene-wide coverage. Patients and methods: We investigated the role of common polymorphisms in RLIP76 in epilepsy pharmacoresistance by genotyping 23 common RLIP76 polymorphisms in a prospective cohort of 503 epilepsy patients, from the standard and new anti-epileptic drugs (SANAD) prospective study of new and old AEDs. A total of 13 of these were tested for association with four outcomes reflecting r...

Journal Article
TL;DR: The weak T1D association that was detected in the association scan near the PAPD1 gene may be either false or due to a small genuine effect, and cannot explain linkage at the IDDM10 region.
Abstract: In an effort to locate susceptibility genes for type 1 diabetes (T1D) several genome-wide linkage scans have been undertaken. A chromosomal region designated IDDM10 retained genome-wide significance in a combined analysis of the main linkage scans. Here, we studied sequence polymorphisms in 23 Mb on chromosome 10p12-q11, including the putative IDDM10 region, to identify genes associated with T1D. Initially, we resequenced the functional candidate genes, CREM and SDF1, located in this region, genotyped 13 tag single nucleotide polymorphisms (SNPs) and found no association with T1D. We then undertook analysis of the whole 23 Mb region. We constructed and sequenced a contig tile path from two bacterial artificial clone libraries. By comparison with a clone library from an unrelated person used in the Human Genome Project, we identified 12,058 SNPs. We genotyped 303 SNPs and 25 polymorphic microsatellite markers in 765 multiplex T1D families and followed up 22 associated polymorphisms in up to 2,857 families. We found nominal evidence of association in six loci (P = 0.05 – 0.0026), located near the PAPD1 gene. Therefore, we resequenced 38.8 kb in this region, found 147 SNPs and genotyped 84 of them in the T1D families. We also tested 13 polymorphisms in the PAPD1 gene and in five other loci in 1,612 T1D patients and 1,828 controls from the UK. Overall, only the D10S193 microsatellite marker located 28 kb downstream of PAPD1 showed nominal evidence of association in both T1D families and in the case-control sample (P = 0.037 and 0.03, respectively). We conclude that polymorphisms in the CREM and SDF1 genes have no major effect on T1D. The weak T1D association that we detected in the association scan near the PAPD1 gene may be either false or due to a small genuine effect, and cannot explain linkage at the IDDM10 region.