Author

Chris Sander

Bio: Chris Sander is an academic researcher from Harvard University. The author has contributed to research on the topics Large Hadron Collider and Protein structure. The author has an h-index of 178 and has co-authored 713 publications receiving 233,287 citations. Previous affiliations of Chris Sander include Purdue University and the University of Leeds.


Papers
Journal ArticleDOI
Katherine A Hoadley1, Christina Yau2, Christina Yau3, Toshinori Hinoue4 +735 more, Institutions (16)
05 Apr 2018-Cell
TL;DR: Molecular similarities among histologically or anatomically related cancer types provide a basis for focused pan-cancer analyses, such as pan-gastrointestinal, pan-gynecological, pan-kidney, and pan-squamous cancers, and those related by stemness features, which may inform strategies for future therapeutic development.

1,535 citations

Journal ArticleDOI
TL;DR: In a large-scale evaluation, miRanda-mirSVR is competitive with other target prediction methods in identifying target genes and predicting the extent of their downregulation at the mRNA or protein levels.
Abstract: mirSVR is a new machine learning method for ranking microRNA target sites by a down-regulation score. The algorithm trains a regression model on sequence and contextual features extracted from miRanda-predicted target sites. In a large-scale evaluation, miRanda-mirSVR is competitive with other target prediction methods in identifying target genes and predicting the extent of their downregulation at the mRNA or protein levels. Importantly, the method identifies a significant number of experimentally determined non-canonical and non-conserved sites.

1,506 citations

Journal ArticleDOI
01 May 1994-Proteins
TL;DR: This work extends the previous three-level system of neural networks with additional input information derived from multiple alignments, including a position-specific conservation weight, which increases performance and greatly increases accuracy in predicting the placement of secondary structure segments.
Abstract: Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments.

1,470 citations
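
The three-state accuracy quoted in the abstract above is the fraction of residues assigned to the correct one of three secondary structure states. A minimal sketch of that measure follows, assuming one-letter state strings (H for helix, E for strand, C for other); the function name `q3_accuracy` and the example strings are hypothetical and are not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of residues whose predicted state matches the observed one.

    Both arguments are strings over the three states H, E, and C
    (an assumed encoding, chosen for illustration).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the two state strings agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Made-up example strings, not data from the paper.
    observed = "CCHHHHHHCCEEEEECC"
    predicted = "CCHHHHHCCCEEEECCC"
    print(f"Q3 = {q3_accuracy(predicted, observed):.3f}")
```

Per-residue accuracy of this kind says nothing about whether whole segments are placed correctly, which is why the abstract reports segment placement separately.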

Journal ArticleDOI
Giovanni Ciriello1, Giovanni Ciriello2, Michael L. Gatza3, Michael L. Gatza4, Andrew H. Beck5, Matthew D. Wilkerson4, Suhn K. Rhie6, Alessandro Pastore2, Hailei Zhang7, Michael D. McLellan8, Christina Yau9, Cyriac Kandoth2, Reanne Bowlby10, Hui Shen11, Sikander Hayat2, Robert J. Fieldhouse2, Susan C. Lester5, Gary M. Tse12, Rachel E. Factor13, Laura C. Collins5, Kimberly H. Allison14, Yunn Yi Chen15, Kristin C. Jensen14, Kristin C. Jensen16, Nicole B. Johnson5, Steffi Oesterreich17, Gordon B. Mills18, Andrew D. Cherniack7, Gordon Robertson10, Christopher C. Benz9, Chris Sander2, Peter W. Laird11, Katherine A. Hoadley4, Tari A. King2, Rehan Akbani, J. Todd Auman4, Miruna Balasundaram, Saianand Balu, Thomas Barr, Stephen C. Benz, Mario Berrios, Rameen Beroukhim, Tom Bodenheimer, Lori Boice, Moiz S. Bootwalla, Jay Bowen, Denise Brooks, Lynda Chin, Juok Cho, Sudha Chudamani, Tanja M. Davidsen, John A. Demchok, Jennifer B. Dennison, Li Ding, Ina Felau, Martin L. Ferguson, Scott Frazer, Stacey Gabriel, Jianjiong Gao, Julie M. Gastier-Foster, Nils Gehlenborg, Mark Gerken, Gad Getz, William J. Gibson, D. Neil Hayes, David I. Heiman, Andrea Holbrook, Robert A. Holt, Alan P. Hoyle, Hai Hu, Mei Huang, Carolyn M. Hutter, E. Shelley Hwang, Stuart R. Jefferys, Steven J.M. Jones, Zhenlin Ju, Jaegil Kim, Phillip H. Lai, Michael S. Lawrence, Kristen M. Leraas, Tara M. Lichtenberg, Pei Lin, Shiyun Ling, Jia Liu, Wen-Bin Liu, Laxmi Lolla, Yiling Lu, Yussanne Ma, Dennis T. Maglinte, Elaine R. Mardis, Jeffrey R. Marks, Marco A. Marra, Cynthia McAllister, Shaowu Meng, Matthew Meyerson, Richard A. Moore, Lisle E. Mose, Andrew J. Mungall, Bradley A. Murray, Rashi Naresh, Michael S. Noble, Olufunmilayo I. Olopade, Joel S. Parker, Todd Pihl, Gordon Saksena, Steven E. Schumacher, Kenna R. Mills Shaw, Nilsa C. Ramirez, W. Kimryn Rathmell, Jeffrey Roach, A. Gordon Robertson19, Jacqueline E. Schein, Nikolaus Schultz, Margi Sheth, Yan Shi, Juliann Shih, Carl Simon Shelley, Craig D. Shriver, Janae V. Simons, Heidi J. Sofia, Matthew G. Soloway, Carrie Sougnez, Charlie Sun, Roy Tarnuzzer, Daniel Guimarães Tiezzi, David Van Den Berg, Doug Voet, Yunhu Wan, Zhining Wang, John N. Weinstein, Daniel J. Weisenberger, Rick K. Wilson, Lisa Wise, Maciej Wiznerowicz, Junyuan Wu, Ye Wu, Liming Yang, Travis I. Zack, Jean C. Zenklusen, Jiashan Zhang, Erik Zmuda, Charles M. Perou4
08 Oct 2015-Cell
TL;DR: This multidimensional molecular atlas sheds new light on the genetic bases of ILC and provides potential clinical options, suggesting differential modulation of ER activity in ILC and IDC.

1,414 citations


Cited by
Journal ArticleDOI
TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Abstract: The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.

70,111 citations

Journal ArticleDOI
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.

63,427 citations

Journal ArticleDOI
TL;DR: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W, providing an integrated system for performing multiple sequence and profile alignments and analysing the results.
Abstract: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W. The new system is easy to use, providing an integrated system for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include: the ability to cut-and-paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted. Quality analysis and realignment of selected residue ranges provide the user with a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.

38,522 citations

Journal ArticleDOI
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the log-expectation score, and refinement using tree-dependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5.com/muscle.

37,524 citations
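
The MUSCLE abstract above mentions fast distance estimation using kmer counting. The general idea, estimating how far apart two unaligned sequences are from the kmers they share, might be sketched as follows. This is an illustrative simplification and not MUSCLE's actual distance measure; the function names, the default `k=3`, and the normalisation by the shorter sequence are all assumptions made for this example.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 3) -> Counter:
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a: str, b: str, k: int = 3) -> float:
    """Return 1 minus the fraction of kmers shared by a and b.

    The shared count is normalised by the number of kmers in the shorter
    sequence (an assumption of this sketch), so the result lies in [0, 1].
    """
    denom = min(len(a), len(b)) - k + 1
    if denom <= 0:
        raise ValueError("both sequences must contain at least k residues")
    # Counter intersection keeps the minimum count of each shared kmer.
    shared = sum((kmer_counts(a, k) & kmer_counts(b, k)).values())
    return 1.0 - shared / denom


if __name__ == "__main__":
    # Made-up peptide fragments, used only to exercise the function.
    print(kmer_distance("MKVLAAGIVGLLLA", "MKVLAAGLVGLLLA"))
```

Because no alignment is computed, a distance of this kind is cheap enough to evaluate for every pair in a large set of sequences.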

Journal ArticleDOI
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations