Author

Charles DeLisi

Bio: Charles DeLisi is an academic researcher at Boston University. The author has contributed to research on the topics of Antigen and Genome. The author has an h-index of 62 and has co-authored 231 publications receiving 13,219 citations. Previous affiliations of Charles DeLisi include Icahn School of Medicine at Mount Sinai & College of Engineering, Trivandrum.


Papers
Journal Article
TL;DR: Discriminant analysis can be used to precisely classify membrane proteins as integral or peripheral and to estimate the odds that the classification is correct, and it is found that discrimination between integral and peripheral membrane proteins can be achieved with 99% reliability.

819 citations

Journal Article
TL;DR: Although the scale is optimal only for predicting alpha-amphipathicity, it also ranks high in identifying beta-amphipathicity and in distinguishing interior from exterior residues in a protein.

624 citations

Journal Article
TL;DR: The finding that T-cell antigenic sites show a high correlation with amphipathicity greatly simplifies the search for such sites and is potentially important for vaccine development.
Abstract: We propose, on the basis of physical chemical and biological requirements for T-cell activation by antigen, that sites on a protein that can stimulate T lymphocytes will be capable of forming a stable amphipathic structure (i.e., one with separated hydrophobic and hydrophilic surfaces), displaying periodicity in hydrophobic residues. A spectral analysis of the 12 antigenic sites to which the method could be applied indicates that the amphipathic periodicity hypothesis is valid for 10 of them, generally with reliabilities that are well above 98%, with periodicities compatible with an alpha-helical structure. An 11th case manifests a different type of amphipathicity. The analyses require only a knowledge of amino acid sequence. The finding that T-cell antigenic sites show a high correlation with amphipathicity greatly simplifies the search for such sites and is potentially important for vaccine development.

532 citations

Journal Article
TL;DR: The hypothesis that stable amphipathic helices are fundamentally important in determining immunodominance is supported, and this approach may be of practical value in designing synthetic vaccines aimed at T cell immunity.
Abstract: We have used a data base of 23 known immunodominant helper T cell antigenic sites located on 12 proteins to systematically develop an optimized algorithm for predicting T cell antigenic sites. The algorithm is based on the amphipathic helix model in which antigenic sites are postulated to be helices with one face predominantly polar and the opposite face predominantly apolar. Such amphipathic structures can form when the polarity of residues along the sequence varies with a more or less regular period. Hence they can be identified by methods (so called power spectrum procedures) that detect periodic variations in properties of a sequence. The choice of power spectrum procedure, hydrophobicity scale, and model parameters are examined. An algorithm is tested by comparing the predicted amphipathic segments with the locations of the known T cell sites, counting the number of matches, and calculating the probability of getting this number by chance alone. The optimum algorithm, which predicts the largest number of sites with the lowest chance probability, uses the Fauchere-Pliska hydrophobicity scale and a least squares fit of a sinusoid as its power spectrum procedure. By applying this algorithm, 18 of the 23 known sites are identified (75% sensitivity) with a high degree of significance (p less than 0.001). The success of the algorithm supports the hypothesis that stable amphipathic helices are fundamentally important in determining immunodominance. This approach may be of practical value in designing synthetic vaccines aimed at T cell immunity.

522 citations
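
The periodicity search described in the abstract above can be illustrated with a small sketch. It uses a plain discrete Fourier power estimate, which is one kind of power spectrum procedure, and not the least-squares sinusoid fit that the abstract reports as optimal. The function names, the toy profile, and the 1-180 degree scan range are assumptions made for illustration; a real analysis would first map residues to values from a hydrophobicity scale such as Fauchere-Pliska.

```python
import cmath
import math

def periodicity_power(hydrophobicity, angle_deg):
    """Fourier power of a hydrophobicity profile at a given angle per residue.

    The mean is subtracted first so that a flat profile scores zero.
    """
    n = len(hydrophobicity)
    mean = sum(hydrophobicity) / n
    omega = math.radians(angle_deg)
    total = sum((h - mean) * cmath.exp(1j * omega * k)
                for k, h in enumerate(hydrophobicity))
    return abs(total) ** 2 / n

def dominant_angle(hydrophobicity, angles=range(1, 181)):
    """Angle (degrees per residue) with the largest power in the scan range."""
    return max(angles, key=lambda a: periodicity_power(hydrophobicity, a))

# Hypothetical 18-residue profile that repeats every 3.6 residues
# (100 degrees per residue, the spacing expected for an alpha helix).
toy_profile = [math.cos(math.radians(100 * k)) for k in range(18)]
print(dominant_angle(toy_profile))
```

A segment whose dominant angle falls near 100 degrees would be a candidate amphipathic alpha helix under this simplified scoring; the significance testing described in the abstract is not reproduced here.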

Journal Article
TL;DR: In this article, effective atomic contact energies (ACE) were estimated for 18 different atom types, which were resolved on the basis of the way their properties cluster in the 20 common amino acids.

494 citations


Cited by
Journal Article
TL;DR: The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis that includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software.
Abstract: Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA

14,243 citations
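
The network construction step mentioned in the abstract above can be sketched as follows. The abstract does not spell out the formula, so this sketch assumes the commonly used unsigned soft-threshold form, in which the adjacency between two genes is the absolute Pearson correlation raised to a power beta. It is written in plain Python rather than R, and the function names, the default beta of 6, and the toy expression profiles are illustrative assumptions, not the WGCNA package API.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def soft_threshold_adjacency(profiles, beta=6):
    """Map each ordered gene pair to |correlation| ** beta (assumed unsigned form)."""
    names = list(profiles)
    return {(g, h): abs(pearson(profiles[g], profiles[h])) ** beta
            for g in names for h in names if g != h}

# Hypothetical expression profiles across four samples.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [1, -1, 1, -1],
}
adjacency = soft_threshold_adjacency(profiles)
```

Raising the correlation to a power keeps strongly correlated pairs close to 1 while pushing weak correlations toward 0, which is what makes the network weighted rather than hard-thresholded.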

Journal Article
TL;DR: The objective of this web server is to provide easy access to RNA and DNA folding and hybridization software to the scientific community at large by making use of universally available web GUIs (Graphical User Interfaces).
Abstract: The abbreviated name, ‘mfold web server’, describes a number of closely related software applications available on the World Wide Web (WWW) for the prediction of the secondary structure of single stranded nucleic acids. The objective of this web server is to provide easy access to RNA and DNA folding and hybridization software to the scientific community at large. By making use of universally available web GUIs (Graphical User Interfaces), the server circumvents the problem of portability of this software. Detailed output, in the form of structure plots with or without reliability information, single strand frequency plots and ‘energy dot plots’, is available for the folding of single sequences. A variety of ‘bulk’ servers give less information, but in a shorter time and for up to hundreds of sequences at once. The portal for the mfold web server is http://www.bioinfo.rpi.edu/applications/mfold. This URL will be referred to as ‘MFOLDROOT’.

12,535 citations

Journal Article
TL;DR: A new membrane protein topology prediction method, TMHMM, based on a hidden Markov model is described and validated, and it is discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies.

11,453 citations

Journal Article
TL;DR: FLASH is a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short and when FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds.
Abstract: Motivation: Next-generation sequencing technologies generate very large numbers of short reads. Even with very deep genome coverage, short read lengths cause problems in de novo assemblies. The use of paired-end libraries with a fragment size shorter than twice the read length provides an opportunity to generate much longer reads by overlapping and merging read pairs before assembling a genome. Results: We present FLASH, a fast computational tool to extend the length of short reads by overlapping paired-end reads from fragment libraries that are sufficiently short. We tested the correctness of the tool on one million simulated read pairs, and we then applied it as a pre-processor for genome assemblies of Illumina reads from the bacterium Staphylococcus aureus and human chromosome 14. FLASH correctly extended and merged reads >99% of the time on simulated reads with an error rate of <1%. With adequately set parameters, FLASH correctly merged reads over 90% of the time even when the reads contained up to 5% errors. When FLASH was used to extend reads prior to assembly, the resulting assemblies had substantially greater N50 lengths for both contigs and scaffolds. Availability and Implementation: The FLASH system is implemented in C and is freely available as open-source code at http://www.cbcb.umd.edu/software/flash.

9,827 citations
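
The idea of overlapping and merging read pairs from the abstract above can be sketched in a few lines. This is a simplified illustration and not FLASH's implementation: it ignores base quality scores, and the function names, the minimum overlap, and the mismatch ratio cut-off are assumptions chosen for the example. The second read is reverse-complemented, every candidate overlap length is scored by its mismatch ratio, and the pair is merged at the best-scoring overlap.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_pair(read1, read2, min_overlap=10, max_mismatch_ratio=0.25):
    """Merge a read pair at the overlap with the lowest mismatch ratio.

    Returns the merged sequence, or None when no overlap of at least
    min_overlap bases stays within max_mismatch_ratio.
    """
    mate = reverse_complement(read2)
    best = None  # (mismatch ratio, overlap length)
    for olen in range(min_overlap, min(len(read1), len(mate)) + 1):
        mismatches = sum(a != b for a, b in zip(read1[-olen:], mate[:olen]))
        ratio = mismatches / olen
        if ratio <= max_mismatch_ratio and (best is None or ratio < best[0]):
            best = (ratio, olen)
    if best is None:
        return None
    # Keep read1 in full and append the part of the mate beyond the overlap.
    return read1 + mate[best[1]:]

# Hypothetical 31-base fragment sequenced with two 20-base reads,
# so the reads overlap by 9 bases.
fragment = "ACGTTGCAAGGCTTACCGATGGCATTCAGGA"
read1 = fragment[:20]
read2 = reverse_complement(fragment[-20:])
merged = merge_pair(read1, read2, min_overlap=5)
```

In this toy case the fragment (31 bases) is shorter than twice the read length (40 bases), which is the condition the abstract names for merging to be possible.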

Journal Article
TL;DR: It is shown that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three.
Abstract: A novel and robust automated docking method that predicts the bound conformations of flexible ligands to macromolecular targets has been developed and tested, in combination with a new scoring function that estimates the free energy change upon binding. Interestingly, this method applies a Lamarckian model of genetics, in which environmental adaptations of an individual's phenotype are reverse transcribed into its genotype and become heritable traits (sic). We consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein-ligand test systems having known three-dimensional structure. We show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three. The empirical free energy function was calibrated using a set of 30 structurally known protein-ligand complexes with experimentally determined binding constants. Linear regression analysis of the observed binding constants in terms of a wide variety of structure-derived molecular properties was performed. The final model had a residual standard error of …

9,322 citations