Author

Alan Williams

Other affiliations: Life Technologies, Thermo Fisher Scientific

Bio: Alan Williams is an academic researcher from Affymetrix who has contributed to research on alternative splicing and exons. The author has an h-index of 17 and has co-authored 23 publications receiving 26,517 citations. Previous affiliations of Alan Williams include Life Technologies and Thermo Fisher Scientific.

Topics: Alternative splicing, Exon, RNA splicing, Gene, Genome ...

Papers

Journal Article

Initial sequencing and analysis of the human genome.

Eric S. Lander¹, Lauren Linton¹, Bruce W. Birren¹, Chad Nusbaum¹ +245 more•Institutions (29)

15 Feb 2001-Nature

TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.

Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article

An integrated semiconductor device enabling non-optical genome sequencing

Jonathan M. Rothberg¹, Wolfgang Hinz¹, Todd Rearick¹, Jonathan Schultz¹, William J. Mileski¹, Melville Davey¹, John H. Leamon¹, Kim L. Johnson¹, Mark James Milgrew¹, Matthew D. Edwards¹, Jeremy Hoon¹, Jan Fredrik Simons¹, David Marran¹, Jason W. Myers¹, John F. Davidson¹, Annika Branting¹, John Nobile¹, Bernard P. Puc¹, David Light¹, Travis A. Clark¹, Martin Huber¹, Jeffrey T. Branciforte¹, Isaac B. Stoner¹, Simon Cawley¹, Michael R. Lyons¹, Yutao Fu¹, Nils Homer¹, Marina Sedova¹, Xin Miao¹, Brian Reed¹, Jeffrey Sabina¹, Erika Feierstein¹, Michelle Schorn¹, Mohammad Alanjary¹, Eileen T. Dimalanta¹, Devin Dressman¹, Rachel Kasinskas¹, Tanya Sokolsky¹, Jacqueline A. Fidanza¹, Eugeni Namsaraev¹, Kevin McKernan¹, Alan Williams¹, G. Thomas Roth¹, James Bustillo¹•Institutions (1)

Life Technologies¹

21 Jul 2011-Nature

TL;DR: A DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes, showing its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

Abstract: The seminal importance of DNA sequencing to the life sciences, biotechnology and medicine has driven the search for more scalable and lower-cost solutions. Here we describe a DNA sequencing technology in which scalable, low-cost semiconductor manufacturing techniques are used to make an integrated circuit able to directly perform non-optical DNA sequencing of genomes. Sequence data are obtained by directly sensing the ions produced by template-directed DNA polymerase synthesis using all-natural nucleotides on this massively parallel semiconductor-sensing device or ion chip. The ion chip contains ion-sensitive, field-effect transistor-based sensors in perfect register with 1.2 million wells, which provide confinement and allow parallel, simultaneous detection of independent sequencing reactions. Use of the most widely used technology for constructing integrated circuits, the complementary metal-oxide semiconductor (CMOS) process, allows for low-cost, large-scale production and scaling of the device to higher densities and larger array sizes. We show the performance of the system by sequencing three bacterial genomes, its robustness and scalability by producing ion chips with up to 10 times as many sensors and sequencing a human genome.

2,246 citations

Journal Article

Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs.

Simon Cawley¹, Stefan Bekiranov¹, Huck H Ng²,³,⁴, Philipp Kapranov¹, Edward A. Sekinger³, Dione Kampa¹, Antonio Piccolboni¹, Victor Sementchenko¹, Jill Cheng¹, Alan Williams¹, Raymond Wheeler¹, Brant Wong¹, Jorg Drenkow¹, Mark Yamanaka¹, Sandeep Patel¹, Shane Brubaker¹, Hari Tammana¹, Gregg Helt¹, Kevin Struhl³, Thomas R. Gingeras¹•Institutions (4)

Affymetrix¹, National University of Singapore², Harvard University³, Genome Institute of Singapore⁴

20 Feb 2004-Cell

TL;DR: The human genome contains roughly comparable numbers of protein-coding and noncoding genes that are bound by common transcription factors and regulated by common environmental signals.

1,121 citations

Journal Article

Nova regulates brain-specific splicing to shape the synapse

Jernej Ule¹, Aljaž Ule², Joanna L. Spencer¹, Alan Williams³, Jing Shan Hu³, Melissa S. Cline³, Hui Wang³, Tyson A. Clark³, Claire E. Fraser¹, Matteo Ruggiu¹, Barry R. Zeeberg⁴, David W. Kane⁵, John N. Weinstein, John E. Blume³, Robert B. Darnell¹•Institutions (5)

Rockefeller University¹, University of Amsterdam², Affymetrix³, National Institutes of Health⁴, SRA International⁵

24 Jul 2005-Nature Genetics

TL;DR: Validating a large set of Nova RNA targets has led us to identify a multi-tiered network in which Nova regulates the exon content of RNAs encoding proteins that interact in the synapse, which may contribute to tissue-specific functions.

Abstract: Alternative RNA splicing greatly increases proteome diversity and may thereby contribute to tissue-specific functions. We carried out genome-wide quantitative analysis of alternative splicing using a custom Affymetrix microarray to assess the role of the neuronal splicing factor Nova in the brain. We used a stringent algorithm to identify 591 exons that were differentially spliced in the brain relative to immune tissues, and 6.6% of these showed major splicing defects in the neocortex of Nova2−/− mice. We tested 49 exons with the largest predicted Nova-dependent splicing changes and validated all 49 by RT-PCR. We analyzed the encoded proteins and found that all those with defined brain functions acted in the synapse (34 of 40, including neurotransmitter receptors, cation channels, adhesion and scaffold proteins) or in axon guidance (8 of 40). Moreover, of the 35 proteins with known interaction partners, 74% (26) interact with each other. Validating a large set of Nova RNA targets has led us to identify a multi-tiered network in which Nova regulates the exon content of RNAs encoding proteins that interact in the synapse.

498 citations

Journal Article

Alternative splicing and differential gene expression in colon cancer detected by a whole genome exon array

Paul Gardina¹, Tyson A. Clark¹, Brian Shimada¹, Michelle K. Staples¹, Qing Yang¹, James Veitch¹, Anthony C. Schweitzer¹, Tarif Awad¹, Charles W. Sugnet¹, Suzanne Dee¹, Christopher Davies¹, Alan Williams¹, Yaron Turpaz¹•Institutions (1)

Affymetrix¹

27 Dec 2006-BMC Genomics

TL;DR: A new exon-centric array is presented that allows genome-wide identification of differential splice variation, and concurrently provides a flexible and inclusive analysis of gene expression, suggesting that the more speculative transcripts, largely based solely on computational prediction, might be novel targets in colon cancer.

Abstract: Alternative splicing is a mechanism for increasing protein diversity by excluding or including exons during post-transcriptional processing. Alternatively spliced proteins are particularly relevant in oncology since they may contribute to the etiology of cancer, provide selective drug targets, or serve as a marker set for cancer diagnosis. While conventional identification of splice variants generally targets individual genes, we present here a new exon-centric array (GeneChip Human Exon 1.0 ST) that allows genome-wide identification of differential splice variation, and concurrently provides a flexible and inclusive analysis of gene expression. We analyzed 20 paired tumor-normal colon cancer samples using a microarray designed to detect over one million putative exons that can be virtually assembled into potential gene-level transcripts according to various levels of prior supporting evidence. Analysis of high confidence (empirically supported) transcripts identified 160 differentially expressed genes, with 42 genes occupying a network impacting cell proliferation and another twenty nine genes with unknown functions. A more speculative analysis, including transcripts based solely on computational prediction, produced another 160 differentially expressed genes, three-fourths of which have no previous annotation. We also present a comparison of gene signal estimations from the Exon 1.0 ST and the U133 Plus 2.0 arrays. Novel splicing events were predicted by experimental algorithms that compare the relative contribution of each exon to the cognate transcript intensity in each tissue. The resulting candidate splice variants were validated with RT-PCR. We found nine genes that were differentially spliced between colon tumors and normal colon tissues, several of which have not been previously implicated in cancer. 
Top scoring candidates from our analysis were also found to substantially overlap with EST-based bioinformatic predictions of alternative splicing in cancer. Differential expression of high confidence transcripts correlated extremely well with known cancer genes and pathways, suggesting that the more speculative transcripts, largely based solely on computational prediction and mostly with no previous annotation, might be novel targets in colon cancer. Five of the identified splicing events affect mediators of cytoskeletal organization (ACTN1, VCL, CALD1, CTTN, TPM1), two affect extracellular matrix proteins (FN1, COL6A3) and another participates in integrin signaling (SLC3A2). Altogether they form a pattern of colon-cancer specific alterations that may particularly impact cell motility.

376 citations

Cited by

Journal Article

Fast gapped-read alignment with Bowtie 2

Ben Langmead¹, Steven L. Salzberg¹,²,³•Institutions (3)

University of Maryland, College Park¹, Johns Hopkins University School of Medicine², Johns Hopkins University³

01 Apr 2012-Nature Methods

TL;DR: Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

Abstract: As the rate of sequencing increases, greater throughput is demanded from read aligners. The full-text minute index is often used to make alignment very fast and memory-efficient, but the approach is ill-suited to finding longer, gapped alignments. Bowtie 2 combines the strengths of the full-text minute index with the flexibility and speed of hardware-accelerated dynamic programming algorithms to achieve a combination of high speed, sensitivity and accuracy.

37,898 citations

Journal Article

STAR: ultrafast universal RNA-seq aligner

Alexander Dobin¹, Carrie A. Davis¹, Felix Schlesinger¹, Jorg Drenkow¹, Chris Zaleski¹, Sonali Jha¹, Philippe Batut¹, Mark Chaisson¹, Thomas R. Gingeras¹•Institutions (1)

Cold Spring Harbor Laboratory¹

01 Jan 2013-Bioinformatics

TL;DR: The Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by seed clustering and stitching procedure outperforms other aligners by a factor of >50 in mapping speed.

Abstract: Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, we experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy. Availability and implementation: STAR is implemented as a standalone C++ code. STAR is free open source software distributed under GPLv3 license and can be downloaded from http://code.google.com/p/rna-star/.

30,684 citations

Journal Article

Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors.

Kazutoshi Takahashi¹, Shinya Yamanaka¹•Institutions (1)

Kyoto University¹

25 Aug 2006-Cell

TL;DR: Induction of pluripotent stem cells from mouse embryonic or adult fibroblasts by introducing four factors, Oct3/4, Sox2, c-Myc, and Klf4, under ES cell culture conditions is demonstrated; the resulting cells, designated iPS (induced pluripotent stem) cells, exhibit the morphology and growth properties of ES cells and express ES cell marker genes.

23,959 citations

Journal Article

The Pfam protein families database

Marco Punta¹, Penny Coggill¹, Ruth Y. Eberhardt¹, Jaina Mistry¹, John Tate¹, Chris Boursnell¹, Ningze Pang¹, Kristoffer Forslund¹, Goran Ceric¹, Jody Clements¹, Andreas Heger¹, Liisa Holm¹, Erik L. L. Sonnhammer¹, Sean R. Eddy¹, Alex Bateman¹, Robert D. Finn¹•Institutions (1)

Wellcome Trust Sanger Institute¹

01 Jan 2000-Nucleic Acids Research

TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.

Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations

Journal Article

The sequence of the human genome.

J. Craig Venter¹, Mark Raymond Adams¹, Eugene W. Myers¹, Peter W. Li¹ +269 more•Institutions (12)

16 Feb 2001-Science

TL;DR: Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems.

Abstract: A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies-a whole-genome assembly and a regional chromosome assembly-were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional approximately 12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.

12,098 citations
