Showing papers by "Barbara J. Wold published in 2012"


Journal ArticleDOI
Sarah Djebali, Carrie A. Davis, Angelika Merkel, Alexander Dobin, Timo Lassmann, Ali Mortazavi, Andrea Tanzer, Julien Lagarde, Wei Lin, Felix Schlesinger, Chenghai Xue, Georgi K. Marinov, Jainab Khatun, Brian A. Williams, Chris Zaleski, Joel Rozowsky, Marion S. Röder, Felix Kokocinski, Rehab F. Abdelhamid, Tyler Alioto, Igor Antoshechkin, Michael T. Baer, Nadav Bar, Philippe Batut, Kimberly Bell, Ian Bell, Sudipto K. Chakrabortty, Xian Chen, Jacqueline Chrast, Joao Curado, Thomas Derrien, Jorg Drenkow, Erica Dumais, Jacqueline Dumais, Radha Duttagupta, Emilie Falconnet, Meagan Fastuca, Kata Fejes-Toth, Pedro G. Ferreira, Sylvain Foissac, Melissa J. Fullwood, Hui Gao, David Gonzalez, Assaf Gordon, Harsha P. Gunawardena, Cédric Howald, Sonali Jha, Rory Johnson, Philipp Kapranov, Brandon King, Colin Kingswood, Oscar Junhong Luo, Eddie Park, Kimberly Persaud, Jonathan B. Preall, Paolo Ribeca, Brian A. Risk, Daniel Robyr, Michael Sammeth, Lorian Schaffer, Lei-Hoon See, Atif Shahab, Jørgen Skancke, Ana Maria Suzuki, Hazuki Takahashi, Hagen Tilgner, Diane Trout, Nathalie Walters, Huaien Wang, John A. Wrobel, Yanbao Yu, Xiaoan Ruan, Yoshihide Hayashizaki, Jennifer Harrow, Mark Gerstein, Tim Hubbard, Alexandre Reymond, Stylianos E. Antonarakis, Gregory J. Hannon, Morgan C. Giddings, Yijun Ruan, Barbara J. Wold, Piero Carninci, Roderic Guigó, Thomas R. Gingeras
06 Sep 2012-Nature
TL;DR: Evidence that three-quarters of the human genome is capable of being transcribed is reported, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs that prompt a redefinition of the concept of a gene.
Abstract: Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for understanding genome function. Here we report evidence that three-quarters of the human genome is capable of being transcribed, as well as observations about the range and levels of expression, localization, processing fates, regulatory regions and modifications of almost all currently annotated and thousands of previously unannotated RNAs. These observations, taken together, prompt a redefinition of the concept of a gene.

4,450 citations


01 Sep 2012
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.

2,767 citations


Journal ArticleDOI
TL;DR: This work discusses how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data and develops a set of working standards and guidelines for ChIP experiments that are updated routinely.
Abstract: Chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) has become a valuable and widely used approach for mapping the genomic location of transcription-factor binding and histone modifications in living cells. Despite its widespread use, there are considerable differences in how these experiments are conducted, how the results are scored and evaluated for quality, and how the data and metadata are archived for public use. These practices affect the quality and utility of any global ChIP experiment. Through our experience in performing ChIP-seq experiments, the ENCODE and modENCODE consortia have developed a set of working standards and guidelines for ChIP experiments that are updated routinely. The current guidelines address antibody validation, experimental replication, sequencing depth, data and metadata reporting, and data quality assessment. We discuss how ChIP quality, assessed in these ways, affects different uses of ChIP-seq data. All data sets used in the analysis have been deposited for public viewing and downloading at the ENCODE (http://encodeproject.org/ENCODE/) and modENCODE (http://www.modencode.org/) portals.

1,801 citations



Journal ArticleDOI
John A. Stamatoyannopoulos, Michael Snyder, Ross C. Hardison, Bing Ren, Thomas R. Gingeras, David M. Gilbert, Mark Groudine, M. A. Bender, Rajinder Kaul, Theresa K. Canfield, Erica Giste, Audra K. Johnson, Mia Zhang, Gayathri Balasundaram, Rachel Byron, Vaughan Roach, Peter J. Sabo, Richard Sandstrom, A Sandra Stehling, Robert E. Thurman, Sherman M. Weissman, Philip Cayting, Manoj Hariharan, Jin Lian, Yong Cheng, Stephen G. Landt, Zhihai Ma, Barbara J. Wold, Job Dekker, Gregory E. Crawford, Cheryl A. Keller, Weisheng Wu, Christopher T. Morrissey, Swathi Ashok Kumar, Tejaswini Mishra, Deepti Jain, Marta Byrska-Bishop, Daniel Blankenberg, Bryan R. Lajoie, Gaurav Jain, Amartya Sanyal, Kaun-Bei Chen, Olgert Denas, James Taylor, Gerd A. Blobel, Mitchell J. Weiss, Max Pimkin, Wulan Deng, Georgi K. Marinov, Brian A. Williams, Katherine I. Fisher-Aylor, Gilberto DeSalvo, Anthony Kiralusha, Diane Trout, Henry Amrhein, Ali Mortazavi, Lee Edsall, David McCleary, Samantha Kuan, Yin Shen, Feng Yue, Zhen Ye, Carrie A. Davis, Chris Zaleski, Sonali Jha, Chenghai Xue, Alexander Dobin, Wei Lin, Meagan Fastuca, Huaien Wang, Roderic Guigó, Sarah Djebali, Julien Lagarde, Tyrone Ryba, Takayo Sasaki, Venkat S. Malladi, Melissa S. Cline, Vanessa M. Kirkup, Katrina Learned, Kate R. Rosenbloom, W. James Kent, Elise A. Feingold, Peter J. Good, Michael J. Pazin, Rebecca F. Lowdon, Leslie B Adams
TL;DR: The Mouse ENCODE Consortium is applying the same experimental pipelines developed for human ENCODE to annotate the mouse genome to enable a broad range of mouse genomics efforts.
Abstract: To complement the human Encyclopedia of DNA Elements (ENCODE) project and to enable a broad range of mouse genomics efforts, the Mouse ENCODE Consortium is applying the same experimental pipelines developed for human ENCODE to annotate the mouse genome.

445 citations


Journal ArticleDOI
13 Apr 2012-Cell
TL;DR: Five stages spanning the commitment process are probed using RNA-seq and ChIP-seq to track genome-wide shifts in transcription, cohorts of active transcription factor genes, histone modifications at diverse classes of cis-regulatory elements, and binding repertoire of GATA-3 and PU.1, transcription factors with complementary roles in T cell development.

323 citations


Journal ArticleDOI
TL;DR: Since increased densities of microglia in two functionally and anatomically disparate cortical areas are observed, it is suggested that these immune cells are probably denser throughout cerebral cortex in brains of people with autism.
Abstract: We immunocytochemically identified microglia in fronto-insular (FI) and visual cortex (VC) in autopsy brains of well-phenotyped subjects with autism and matched controls, and stereologically quantified the microglial densities. Densities were determined blind to phenotype using an optical fractionator probe. In FI, individuals with autism had significantly more microglia compared to controls (p = 0.02). One such subject had a microglial density in FI within the control range and was also an outlier behaviorally with respect to other subjects with autism. In VC, microglial densities were also significantly greater in individuals with autism versus controls (p = 0.0002). Since we observed increased densities of microglia in two functionally and anatomically disparate cortical areas, we suggest that these immune cells are probably denser throughout cerebral cortex in brains of people with autism.

224 citations


Journal ArticleDOI
TL;DR: Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle data sets with deep coverage, and a chromatin-state bias was detected: open chromatin regions yielded higher coverage, which led to false positives if not corrected.
Abstract: We evaluated how variations in sequencing depth and other parameters influence interpretation of chromatin immunoprecipitation–sequencing (ChIP-seq) experiments. Using Drosophila melanogaster S2 cells, we generated ChIP-seq data sets for a site-specific transcription factor (Suppressor of Hairy-wing) and a histone modification (H3K36me3). We detected a chromatin-state bias: open chromatin regions yielded higher coverage, which led to false positives if not corrected. This bias had a greater effect on detection specificity than any base-composition bias. Paired-end sequencing revealed that single-end data underestimated ChIP-library complexity at high coverage. Removal of reads originating at the same base reduced false-positives but had little effect on detection sensitivity. Even at mappable-genome coverage depth of ~1 read per base pair, ~1% of the narrow peaks detected on a tiling array were missed by ChIP-seq. Evaluation of widely used ChIP-seq analysis tools suggests that adjustments or algorithm improvements are required to handle data sets with deep coverage.

159 citations


Journal ArticleDOI
TL;DR: In this paper, the authors measured genome-wide differential allelic occupancy of 24 transcription factors and EP300 in a human lymphoblastoid cell line GM12878 and found strong association between TF occupancy and expression within 100 bp of transcription start sites (TSSs), and weak association up to 100 kb from TSSs.
Abstract: A complex interplay between transcription factors (TFs) and the genome regulates transcription. However, connecting variation in genome sequence with variation in TF binding and gene expression is challenging due to environmental differences between individuals and cell types. To address this problem, we measured genome-wide differential allelic occupancy of 24 TFs and EP300 in a human lymphoblastoid cell line GM12878. Overall, 5% of human TF binding sites have an allelic imbalance in occupancy. At many sites, TFs clustered in TF-binding hubs on the same homolog in especially open chromatin. While genetic variation in core TF binding motifs generally resulted in large allelic differences in TF occupancy, most allelic differences in occupancy were subtle and associated with disruption of weak or noncanonical motifs. We also measured genome-wide differential allelic expression of genes with and without heterozygous exonic variants in the same cells. We found that genes with differential allelic expression were overall less expressed both in GM12878 cells and in unrelated human cell lines. Comparing TF occupancy with expression, we found strong association between allelic occupancy and expression within 100 bp of transcription start sites (TSSs), and weak association up to 100 kb from TSSs. Sites of differential allelic occupancy were significantly enriched for variants associated with disease, particularly autoimmune disease, suggesting that allelic differences in TF occupancy give functional insights into intergenic variants associated with disease. Our results have the potential to increase the power and interpretability of association studies by targeting functional intergenic variants in addition to protein coding sequences.

156 citations


Journal ArticleDOI
TL;DR: This study analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing events. It found a stronger association of editing with specific genes, suggesting that the editing of the transcript is more important than the editing of any individual site.
Abstract: RNA-seq data can be mined for sequence differences relative to the reference genome to identify both genomic SNPs and RNA editing events. We analyzed the long, polyA-selected, unstranded, deeply sequenced RNA-seq data from the ENCODE Project across 14 human cell lines for candidate RNA editing events. On average, 43% of the RNA sequencing variants that are not in dbSNP and are within gene boundaries are A-to-G(I) RNA editing candidates. The vast majority of A-to-G(I) edits are located in introns and 3′ UTRs, with only 123 located in protein-coding sequence. In contrast, the majority of non–A-to-G variants (60%–80%) map near exon boundaries and have the characteristics of splice-mapping artifacts. After filtering out all candidates with evidence of private genomic variation using genome resequencing or ChIP-seq data, we find that up to 85% of the high-confidence RNA variants are A-to-G(I) editing candidates. Genes with A-to-G(I) edits are enriched in Gene Ontology terms involving cell division, viral defense, and translation. The distribution and character of the remaining non–A-to-G variants closely resemble known SNPs. We find no reproducible A-to-G(I) edits that result in nonsynonymous substitutions in all three lymphoblastoid cell lines in our study, unlike RNA editing in the brain. The observations that only a fraction of sites are reproducibly edited in multiple cell lines and that we find a stronger association of editing and specific genes suggest that the editing of the transcript is more important than the editing of any individual site.

155 citations


Journal ArticleDOI
TL;DR: The investigation of a DNA-binding pyrrole-imidazole polyamide targeted to bind the DNA sequence 5′-WGGWWW-3′, with reference to its potency in a subcutaneous xenograft tumor model, shows that the molecule is capable of trafficking to the tumor site following subcutaneous injection and modulates transcription of select genes in vivo.
Abstract: Gene regulation by DNA binding small molecules could have important therapeutic applications. This study reports the investigation of a DNA-binding pyrrole-imidazole polyamide targeted to bind the DNA sequence 5′-WGGWWW-3′ with reference to its potency in a subcutaneous xenograft tumor model. The molecule is capable of trafficking to the tumor site following subcutaneous injection and modulates transcription of select genes in vivo. An FITC-labeled analogue of this polyamide can be detected in tumor-derived cells by confocal microscopy. RNA deep sequencing (RNA-seq) of tumor tissue allowed the identification of further affected genes, a representative panel of which was interrogated by quantitative reverse transcription-PCR and correlated with cell culture expression levels.

Journal ArticleDOI
TL;DR: It is concluded that these tissue-specific factors contribute much more broadly to the transcriptional output of muscle tissue than previously thought, offering a partial explanation for widespread HLH-1 occupancy.
Abstract: Two major transcriptional regulators of Caenorhabditis elegans bodywall muscle (BWM) differentiation, hlh-1 and unc-120, are expressed in muscle where they are known to bind and regulate several well-studied muscle-specific genes. Simultaneously mutating both factors profoundly inhibits formation of contractile BWM. These observations were consistent with a simple network model in which the muscle regulatory factors drive tissue-specific transcription by binding selectively near muscle-specific targets to activate them. We tested this model by measuring the number, identity, and tissue-specificity of functional regulatory targets for each factor. Some joint regulatory targets (218) are BWM-specific and enriched for nearby HLH-1 binding. However, contrary to the simple model, the majority of genes regulated by one or both muscle factors are also expressed significantly in non-BWM tissues. We also mapped global factor occupancy by HLH-1, and created a genetic interaction map that identifies hlh-1 collaborating transcription factors. HLH-1 binding did not predict proximate regulatory action overall, despite enrichment for binding among BWM-specific positive regulatory targets of hlh-1. We conclude that these tissue-specific factors contribute much more broadly to the transcriptional output of muscle tissue than previously thought, offering a partial explanation for widespread HLH-1 occupancy. We also identify a novel regulatory connection between the BWM-specific hlh-1 network and the hlh-8/twist nonstriated muscle network. Finally, our results suggest a molecular basis for synthetic lethality in which hlh-1 and unc-120 mutant phenotypes are mutually buffered by joint additive regulation of essential target genes, with additional buffering suggested via newly identified hlh-1 interacting factors.