Genome-Wide Mapping of in Vivo Protein-DNA Interactions

doi:10.1126/SCIENCE.1141319

Journal Article•DOI•

David S. Johnson¹, Ali Mortazavi¹,², Richard M. Myers¹,², Barbara J. Wold¹,²

Stanford University¹, California Institute of Technology²

08 Jun 2007-Science (American Association for the Advancement of Science)-Vol. 316, Iss: 5830, pp 1497-1502

TL;DR: A large-scale chromatin immunoprecipitation assay based on direct ultrahigh-throughput DNA sequencing was developed, which was then used to map in vivo binding of the neuron-restrictive silencer factor (NRSF; also known as REST) to 1946 locations in the human genome.

Abstract: In vivo protein-DNA interactions connect each transcription factor with its direct targets to form a gene network scaffold. To map these protein-DNA interactions comprehensively across entire mammalian genomes, we developed a large-scale chromatin immunoprecipitation assay (ChIPSeq) based on direct ultrahigh-throughput DNA sequencing. This sequence census method was then used to map in vivo binding of the neuron-restrictive silencer factor (NRSF; also known as REST, for repressor element–1 silencing transcription factor) to 1946 locations in the human genome. The data display sharp resolution of binding position [±50 base pairs (bp)], which facilitated our finding motifs and allowed us to identify noncanonical NRSF-binding motifs. These ChIPSeq data also have high sensitivity and specificity [ROC (receiver operator characteristic) area ≥ 0.96] and statistical confidence (P < 10⁻⁴), properties that were important for inferring new candidate interactions. These include key transcription factors in the gene network that regulates pancreatic islet cell development.
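
The abstract reports a ROC area of at least 0.96. As a rough illustration of what that number measures, the sketch below computes a ROC area as the probability that a randomly chosen true site outscores a randomly chosen non-site. The function name and the score lists are hypothetical; the record does not say how the paper defined its positive and negative sets, so this is not the authors' procedure.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive scores higher; tied pairs count as half a win.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical read-enrichment scores, for illustration only.
known_sites = [0.9, 0.8, 0.4]
background = [0.5, 0.3, 0.1]
print(roc_auc(known_sites, background))
```

An area of 1.0 means every positive outranks every negative; 0.5 is what random scoring would give.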

Citations

Journal Article•DOI•

Ultrafast and memory-efficient alignment of short DNA sequences to the human genome

Ben Langmead¹, Cole Trapnell¹, Mihai Pop¹, Steven L. Salzberg¹

University of Maryland, College Park¹

04 Mar 2009-Genome Biology

TL;DR: Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches; multiple processor cores can be used simultaneously to achieve even greater alignment speeds.

Abstract: Bowtie is an ultrafast, memory-efficient alignment program for aligning short DNA sequence reads to large genomes. For the human genome, Burrows-Wheeler indexing allows Bowtie to align more than 25 million reads per CPU hour with a memory footprint of approximately 1.3 gigabytes. Bowtie extends previous Burrows-Wheeler techniques with a novel quality-aware backtracking algorithm that permits mismatches. Multiple processor cores can be used simultaneously to achieve even greater alignment speeds. Bowtie is open source http://bowtie.cbcb.umd.edu.
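
To make the Burrows-Wheeler indexing idea concrete, here is a minimal sketch of exact-match backward search over a toy index. It is not Bowtie's implementation: it omits the quality-aware backtracking that permits mismatches, builds the suffix array naively, and stores full occurrence tables, which a real aligner would compress. All function names are hypothetical.

```python
def build_index(text):
    """Build a toy Burrows-Wheeler index (suffix array, C table, Occ table)."""
    text += "$"  # sentinel; sorts before A, C, G, T in ASCII
    # Naive suffix array: fine for a toy string, far too slow for a genome.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character for each sorted suffix is the character preceding it.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # C[c]: number of characters in the text strictly smaller than c.
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    # occ[c][k]: number of occurrences of c in bwt[:k].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for k, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][k + 1] = occ[c][k] + (1 if c == ch else 0)
    return sa, C, occ


def find_exact(pattern, index):
    """Return sorted start positions of exact matches of pattern."""
    sa, C, occ = index
    lo, hi = 0, len(sa)  # half-open range of matching suffix-array rows
    for ch in reversed(pattern):  # extend the match one base to the left
        if ch not in C:
            return []
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


index = build_index("ACGTACGTTACG")
print(find_exact("ACG", index))
```

Each pattern character narrows the suffix-array range with two table lookups, so the search cost depends on read length rather than genome length.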

20,335 citations

Cites background from "Genome-Wide Mapping of in Vivo Prot..."

...Technologies from Illumina (San Diego, CA, USA) and Applied Biosystems (Foster City, CA, USA) have been used to profile methylation patterns (MeDIP-Seq) [1], to map DNA-protein interactions (ChIP-Seq) [2], and to identify differentially expressed genes (RNA-Seq) [3] in the human genome and other species....
[...]

Changes in the switching rate of Plasmodium var genes lead to antigenic variation [in English] / Paul H, Robert P, Christodoulou Z, et al. // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

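TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.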

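Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte membrane protein 1 (PfEMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family in each haploid genome encodes about 60 members, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.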

18,940 citations

Journal Article•DOI•

Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation

Cole Trapnell¹,², Brian A. Williams³, Geo Pertea², Ali Mortazavi³, Gordon Kwan³, Marijke J. van Baren⁴, Steven L. Salzberg², Barbara J. Wold³, Lior Pachter¹

University of California, Berkeley¹, University of Maryland, College Park², California Institute of Technology³, Washington University in St. Louis⁴

01 May 2010-Nature Biotechnology

TL;DR: The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

Abstract: High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

13,337 citations

Cites methods from "Genome-Wide Mapping of in Vivo Prot..."

...To validate our novel observed 5′ exons, we conducted ChIP-Seq experiments as describe...
[...]

Journal Article•DOI•

Model-based Analysis of ChIP-Seq (MACS)

Yong Zhang¹, Tao Liu¹, Clifford A. Meyer¹, Jérôme Eeckhoute², David S. Johnson, Bradley E. Bernstein¹,³, Chad Nusbaum³, Richard M. Myers⁴, Myles Brown², Wei Li⁵, X. Shirley Liu¹

Harvard University¹, Brigham and Women's Hospital², Broad Institute³, Stanford University⁴, Baylor College of Medicine⁵

17 Sep 2008-Genome Biology

TL;DR: This work presents Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer, and uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions.

Abstract: We present Model-based Analysis of ChIP-Seq data, MACS, which analyzes data generated by short read sequencers such as Solexa's Genome Analyzer. MACS empirically models the shift size of ChIP-Seq tags, and uses it to improve the spatial resolution of predicted binding sites. MACS also uses a dynamic Poisson distribution to effectively capture local biases in the genome, allowing for more robust predictions. MACS compares favorably to existing ChIP-Seq peak-finding algorithms, and is freely available.
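
The "dynamic Poisson" idea can be sketched as follows: instead of one genome-wide background rate, the expected tag count for a candidate region is taken as the largest of several rates estimated from control data in windows of different widths around it, and enrichment is scored as a Poisson upper-tail probability. This is a simplified illustration, not the MACS code; the window sizes, tag counts, and function names are assumptions introduced here.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via one minus the lower tail.

    Adequate for illustration; for very small p-values this subtraction
    loses precision and a log-space method would be preferable.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(genome_rate_per_bp, control_tags_by_window, region_bp):
    """Expected tags in a region under the most conservative local rate.

    control_tags_by_window maps a window width in bp (hypothetical sizes)
    to the number of control tags observed in that window.
    """
    rates = [genome_rate_per_bp * region_bp]
    for window_bp, tags in control_tags_by_window.items():
        rates.append(tags * region_bp / window_bp)
    return max(rates)


# Hypothetical numbers: a 200 bp candidate region holding 12 ChIP tags.
lam = local_lambda(0.001, {1000: 5, 10000: 20}, 200)
print(lam, poisson_sf(12, lam))
```

Taking the maximum means a region sitting in a locally tag-rich stretch of the control must clear a higher bar, which is one way to guard against local biases.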

13,008 citations

Cites background or methods or result from "Genome-Wide Mapping of in Vivo Prot..."

...Figure 2: Comparison of MACS with ChIPSeq Peak Finder, FindPeaks and QuEST....
[...]
...and sequencing (ChIP-Seq) [5-8] have become popular tech-...
[...]
...Libraries were prepared as described in [8] using a PCR preamplification step and size selection for DNA fragments between 150 and 400 bp....
[...]
...When applied to three human ChIP-Seq datasets to identify binding sites of FoxA1 in MCF7 cells, NRSF (neuron-restrictive silencer factor) in Jurkat T cells [8], and CTCF (CCCTC-binding factor) in CD4+ T cells [5] (summarized in Table S1 in Additional data file 1), MACS gives results superior to those produced by other published ChIP-Seq peak finding algorithms [8,11,12]....
[...]
...For CTCF, since QuEST does not run on samples without controls, we only compared MACS to ChIPSeq Peak Finder and FindPeaks....
[...]

Journal Article•DOI•

Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks

Cole Trapnell¹, Adam Roberts², Loyal A. Goff¹,³,⁴, Geo Pertea⁵, Daehwan Kim⁶,⁷, David R. Kelley¹,⁴, Harold Pimentel², Steven L. Salzberg⁵, John L. Rinn¹,⁴, Lior Pachter²

Broad Institute¹, University of California, Berkeley², Massachusetts Institute of Technology³, Harvard University⁴, Johns Hopkins University⁵, University of Maryland, College Park⁶, Johns Hopkins University School of Medicine⁷

01 Mar 2012-Nature Protocols

TL;DR: This protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results, which takes less than 1 d of computer time for typical experiments and ∼1 h of hands-on time.

Abstract: Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ∼1 h of hands-on time.

10,913 citations

Cites methods from "Genome-Wide Mapping of in Vivo Prot..."

...Read alignment with TopHat Alignment of sequencing reads to a reference genome is a core step in the analysis workflows for many high-throughput sequencing assays, including ChIP-Se...
[...]

References

Changes in the switching rate of Plasmodium var genes lead to antigenic variation [in English] / Paul H, Robert P, Christodoulou Z, et al. // Proc Natl Acad Sci U S A

宁北芳, 朱淮民

28 Jul 2005

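TL;DR: PfEMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.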

18,940 citations

Introducing the special issue on “Bioinformatics”

장병탁, 김삼묘, 허철구

01 Aug 2000

TL;DR: Assessment of medical technology in the context of commercialization with Bioentrepreneur course, which addresses many issues unique to biomedical products.

Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations

Journal Article•DOI•

The ENCODE (ENCyclopedia of DNA elements) Project

Elise A. Feingold¹, Peter J. Good¹, Mark S. Guyer¹, S. Kamholz¹, +193 more (19 institutions)

22 Oct 2004-Science

TL;DR: The ENCyclopedia Of DNA Elements (ENCODE) Project is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function.

Abstract: The ENCyclopedia Of DNA Elements (ENCODE) Project aims to identify all functional elements in the human genome sequence. The pilot phase of the Project is focused on a specified 30 megabases (∼1%) of the human genome sequence and is organized as an international consortium of computational and laboratory-based scientists working to develop and apply high-throughput approaches for detecting all sequence elements that confer biological function. The results of this pilot phase will guide future efforts to analyze the entire human genome.

2,248 citations

Journal Article•DOI•

A global map of p53 transcription-factor binding sites in the human genome.

Chia-Lin Wei¹, Qiang Wu¹, Vinsensius B. Vega¹, Kuo Ping Chiu¹, Patrick Ng¹, Tao Zhang¹, Atif Shahab, How Choong Yong, Yutao Fu², Zhiping Weng², Jianjun Liu¹, Xiao Dong Zhao¹, Joon-Lin Chew¹,³, Yen Ling Lee¹, Vladimir A. Kuznetsov¹, Wing-Kin Sung¹, Lance D. Miller¹, Bing Lim¹,⁴, Edison T. Liu¹, Qiang Yu¹, Huck-Hui Ng¹,³, Yijun Ruan¹

Genome Institute of Singapore¹, Boston University², National University of Singapore³, Harvard University⁴

13 Jan 2006-Cell

TL;DR: A robust approach is described that couples chromatin immunoprecipitation (ChIP) with the paired-end ditag (PET) sequencing strategy for unbiased and precise global localization of transcription-factor binding sites (TFBS).

1,180 citations

Journal Article•DOI•

Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs.

Simon Cawley¹, Stefan Bekiranov¹, Huck H Ng²,³,⁴, Philipp Kapranov¹, Edward A. Sekinger², Dione Kampa¹, Antonio Piccolboni¹, Victor Sementchenko¹, Jill Cheng¹, Alan Williams¹, Raymond Wheeler¹, Brant Wong¹, Jorg Drenkow¹, Mark Yamanaka¹, Sandeep Patel¹, Shane Brubaker¹, Hari Tammana¹, Gregg Helt¹, Kevin Struhl², Thomas R. Gingeras¹

Affymetrix¹, Harvard University², National University of Singapore³, Genome Institute of Singapore⁴

20 Feb 2004-Cell

TL;DR: The human genome contains roughly comparable numbers of protein-coding and noncoding genes that are bound by common transcription factors and regulated by common environmental signals.

1,121 citations