Author

Thomas D. Schneider

Bio: Thomas D. Schneider is an academic researcher at the National Institutes of Health. The author has contributed to research on the topics of Information theory and Promoter. The author has an h-index of 44 and has co-authored 110 publications that have received 11,719 citations. Previous affiliations of Thomas D. Schneider include Penn State Milton S. Hershey Medical Center and Jawaharlal Nehru University.


Papers
Journal Article
TL;DR: From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content at every position in a site or sequence.
Abstract: A graphical method is presented for displaying the patterns in a set of aligned sequences. The characters representing the sequence are stacked on top of each other for each position in the aligned sequences. The height of each letter is made proportional to its frequency, and the letters are sorted so the most common one is on top. The height of the entire stack is then adjusted to signify the information content of the sequences at that position. From these 'sequence logos', one can determine not only the consensus sequence but also the relative frequency of bases and the information content (measured in bits) at every position in a site or sequence. The logo displays both significant residues and subtle sequence patterns.

3,232 citations
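
The logo construction described in the abstract above (letter height proportional to frequency, stack height equal to the information content in bits, most common letter on top) can be sketched as follows. This is a minimal illustration, not the authors' program: the function name `logo_stacks`, the four-letter DNA alphabet, and the toy alignment are assumptions, and the sketch applies no small-sample correction and ignores characters outside the alphabet.

```python
import math
from collections import Counter


def logo_stacks(aligned_seqs, alphabet="ACGT"):
    """For each alignment column, return (information_bits, letters).

    `letters` is a list of (symbol, height_bits) pairs sorted by ascending
    height, so the last entry is the letter drawn on top of the stack.
    Hypothetical helper: assumes equal-length sequences and no
    small-sample correction.
    """
    max_bits = math.log2(len(alphabet))  # 2 bits for a 4-letter alphabet
    stacks = []
    for pos in range(len(aligned_seqs[0])):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        total = sum(counts[s] for s in alphabet)
        if total == 0:  # column holds only symbols outside the alphabet
            stacks.append((0.0, []))
            continue
        freqs = {s: counts[s] / total for s in alphabet if counts[s] > 0}
        # Shannon entropy (uncertainty) of the column, in bits
        entropy = -sum(f * math.log2(f) for f in freqs.values())
        info = max_bits - entropy  # height of the whole stack
        # Each letter's height is its frequency times the stack height
        letters = sorted(((s, f * info) for s, f in freqs.items()),
                         key=lambda pair: pair[1])
        stacks.append((info, letters))
    return stacks


if __name__ == "__main__":
    # Toy alignment, invented for illustration only
    for info, letters in logo_stacks(["ACGT", "ACGA", "ACCT", "AGGT"]):
        print(round(info, 3), letters)
```

A fully conserved column reaches the 2-bit maximum, while a column with all four bases at equal frequency has a stack height of zero.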

Journal Article
TL;DR: Comparisons between Rsequence and Rfrequency suggest that the information at binding sites is just sufficient for the sites to be distinguished from the rest of the genome.

975 citations

Journal Article
TL;DR: A conceptual framework for developing a theory of translational initiation and three approaches to a higher order approximation are described.
Abstract (table of contents): Introduction; Biochemistry of Translational Initiation; A Conceptual Framework: Ribosome Binding Site Strengths; The Nature of Ribosome Binding Sites (A First Approximation; Three Approaches to a Higher Order Approximation: Genetics and biochemistry, Statistics); Determinants: Sequence and/or Structure ... (Pathways to Initiation; Unstructured Signals; Structured Initiation Regions); Translational Regulation (Unstructured RNAs; RNAs Whose Structures Seem Obvious; Complex Translational Regulation).

754 citations

Journal Article
TL;DR: A "Perceptron" algorithm is used to find a weighting function which distinguishes E. coli translational initiation sites from all other sites in a library of over 78,000 nucleotides of mRNA sequence.
Abstract: We have used a "Perceptron" algorithm to find a weighting function which distinguishes E. coli translational initiation sites from all other sites in a library of over 78,000 nucleotides of mRNA sequence. The "Perceptron" examined sequences as linear representations. The "Perceptron" is more successful at finding gene beginnings than our previous searches using "rules" (see previous paper). We note that the weighting function can find translational initiation sites within sequences that were not included in the training set.

668 citations
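
The idea in the abstract above, a weighting function learned by a perceptron that scores a sequence window as site or non-site, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the one-hot encoding, the bias term, the function names, and the toy sequences are all hypothetical choices made for illustration.

```python
def one_hot(seq, alphabet="ACGT"):
    """Encode a sequence as a flat 0/1 vector, four entries per base (assumed encoding)."""
    return [1.0 if base == a else 0.0 for base in seq for a in alphabet]


def site_score(seq, weights, bias):
    """Weighted sum over the encoded sequence; positive means 'looks like a site'."""
    return sum(w * x for w, x in zip(weights, one_hot(seq))) + bias


def train_perceptron(sites, non_sites, epochs=100):
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    weights = [0.0] * len(one_hot(sites[0]))
    bias = 0.0
    data = [(s, 1) for s in sites] + [(s, -1) for s in non_sites]
    for _ in range(epochs):
        errors = 0
        for seq, label in data:
            if label * site_score(seq, weights, bias) <= 0:
                # Move the weights toward (label=+1) or away from (label=-1) the example
                x = one_hot(seq)
                weights = [w + label * xi for w, xi in zip(weights, x)]
                bias += label
                errors += 1
        if errors == 0:  # every training example is classified correctly
            break
    return weights, bias


if __name__ == "__main__":
    # Toy windows, invented for illustration; not taken from the paper's library
    sites = ["AGGAGG", "AGGAGA"]
    non_sites = ["TTTTTT", "CCCCCC", "ACACAC"]
    w, b = train_perceptron(sites, non_sites)
    for s in sites + non_sites:
        print(s, site_score(s, w, b))
```

Because the learned weights form a per-position, per-base table, the same `site_score` function can be applied to windows that were not in the training set, which is the property the abstract highlights.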

Journal Article
TL;DR: The Shine and Dalgarno sequence of 124 known gene beginnings is characterized, and this information is used to make "rules" which help distinguish gene beginnings from other sites in a library of over 78,000 bases of mRNA.
Abstract: We characterize the Shine and Dalgarno sequence of 124 known gene beginnings. This information is used to make "rules" which help distinguish gene beginnings from other sites in a library of over 78,000 bases of mRNA. Gene beginnings are found to have information besides the initiation codon and Shine and Dalgarno sequence which can be used to make better "rules".

641 citations


Cited by
Journal Article
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations

Journal Article
TL;DR: WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment that provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive.
Abstract: WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Each logo consists of stacks of letters, one stack for each position in the sequence. The overall height of each stack indicates the sequence conservation at that position (measured in bits), whereas the height of symbols within the stack reflects the relative frequency of the corresponding amino or nucleic acid at that position. WebLogo has been enhanced recently with additional features and options, to provide a convenient and highly configurable sequence logo generator. A command line interface and the complete, open WebLogo source code are available for local installation and customization.

10,721 citations

Journal Article
03 Aug 1990-Science
TL;DR: High-affinity nucleic acid ligands for a protein were isolated by a procedure that depends on alternate cycles of ligand selection from pools of variant sequences and amplification of the bound species.
Abstract: High-affinity nucleic acid ligands for a protein were isolated by a procedure that depends on alternate cycles of ligand selection from pools of variant sequences and amplification of the bound species. Multiple rounds exponentially enrich the population for the highest affinity species that can be clonally isolated and characterized. In particular, one eight-base region of an RNA that interacts with the T4 DNA polymerase was chosen and randomized. Two different sequences were selected by this procedure from the calculated pool of 65,536 species. One is the wild-type sequence found in the bacteriophage mRNA; one is varied from wild type at four positions. The binding constants of these two RNAs to T4 DNA polymerase are equivalent. These protocols with minimal modification can yield high-affinity ligands for any protein that binds nucleic acids as part of its function; high-affinity ligands could conceivably be developed for any target molecule.

9,367 citations

Journal Article
TL;DR: There is growing evidence that aging involves, in addition, progressive changes in free radical-mediated regulatory processes that result in altered gene expression.
Abstract: At high concentrations, free radicals and radical-derived, nonradical reactive species are hazardous for living organisms and damage all major cellular constituents. At moderate concentrations, how...

9,131 citations

Journal Article
15 Jul 1988-Gene
TL;DR: Plasmid expression vectors have been constructed that direct the synthesis of foreign polypeptides in Escherichia coli as fusions with the C terminus of Sj26, a 26-kDa glutathione S-transferase (GST; EC 2.5.1.18) encoded by the parasitic helminth Schistosoma japonicum.

6,003 citations