scispace - formally typeset
Search or ask a question
Author

Martin Vingron

Bio: Martin Vingron is an academic researcher from Max Planck Society. The author has contributed to research in topics: Gene & Enhancer. The author has an hindex of 77, co-authored 359 publications receiving 33403 citations. Previous affiliations of Martin Vingron include CAS-MPG Partner Institute for Computational Biology & Center for Information Technology.


Papers
More filters
Journal ArticleDOI
TL;DR: The ultimate goal of this work is to establish a standard for recording and reporting microarray-based gene expression data, which will in turn facilitate the establishment of databases and public repositories and enable the development of data analysis tools.
Abstract: Microarray analysis has become a widely used tool for the generation of gene expression data on a genomic scale. Although many significant results have been derived from microarray studies, one limitation has been the lack of standards for presenting and exchanging such data. Here we present a proposal, the Minimum Information About a Microarray Experiment (MIAME), that describes the minimum information required to ensure that microarray data can be easily interpreted and that results derived from its analysis can be independently verified. The ultimate goal of this work is to establish a standard for recording and reporting microarray-based gene expression data, which will in turn facilitate the establishment of databases and public repositories and enable the development of data analysis tools. With respect to MIAME, we concentrate on defining the content and structure of the necessary information rather than the technical format for capturing it.

4,030 citations

Journal ArticleDOI
TL;DR: TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences to reduce waiting time for larger datasets.
Abstract: SUMMARY TREE-PUZZLE is a program package for quartet-based maximum-likelihood phylogenetic analysis (formerly PUZZLE, Strimmer and von Haeseler, Mol. Biol. Evol., 13, 964-969, 1996) that provides methods for reconstruction, comparison, and testing of trees and models on DNA as well as protein sequences. To reduce waiting time for larger datasets the tree reconstruction part of the software has been parallelized using message passing that runs on clusters of workstations as well as parallel computers. AVAILABILITY http://www.tree-puzzle.de. The program is written in ANSI C. TREE-PUZZLE can be run on UNIX, Windows and Mac systems, including Mac OS X. To run the parallel version of PUZZLE, a Message Passing Interface (MPI) library has to be installed on the system. Free MPI implementations are available on the Web (cf. http://www.lam-mpi.org/mpi/implementations/).

2,581 citations

Journal ArticleDOI
TL;DR: Examining the expression patterns of large gene families, it is found that they are often more similar than would be expected by chance, indicating that many gene families have been co-opted for specific developmental processes.
Abstract: Regulatory regions of plant genes tend to be more compact than those of animal genes, but the complement of transcription factors encoded in plant genomes is as large or larger than that found in those of animals. Plants therefore provide an opportunity to study how transcriptional programs control multicellular development. We analyzed global gene expression during development of the reference plant Arabidopsis thaliana in samples covering many stages, from embryogenesis to senescence, and diverse organs. Here, we provide a first analysis of this data set, which is part of the AtGenExpress expression atlas. We observed that the expression levels of transcription factor genes and signal transduction components are similar to those of metabolic genes. Examining the expression patterns of large gene families, we found that they are often more similar than would be expected by chance, indicating that many gene families have been co-opted for specific developmental processes.

2,510 citations

Journal ArticleDOI
01 Jul 2002
TL;DR: A statistical model for microarray gene expression data that comprises data calibration, the quantifying of differential expression, and the quantification of measurement error is introduced, and a difference statistic Deltah whose variance is approximately constant along the whole intensity range is derived.
Abstract: We introduce a statistical model for microarray gene expression data that comprises data calibration, the quantification of differential expression, and the quan- tification of measurement error. In particular, we derive a transformation h for intensity measurements, and a difference statistich whose variance is approximately constant along the whole intensity range. This forms a basis for statistical inference from microarray data, and provides a rational data pre-processing strategy for multi- variate analyses. For the transformation h, the parametric form h(x) = arsinh(a + bx) is derived from a model of the variance-versus-mean dependence for microarray intensity data, using the method of variance stabilizing transformations. For large intensities, h coincides with the logarithmic transformation, andh with the log-ratio. The parameters of h together with those of the calibration between experiments are estimated with a robust variant of maximum-likelihood estimation. We demonstrate our approach on data sets from different experimental plat- forms, including two-colour cDNA arrays and a series of Affymetrix oligonucleotide arrays. Availability: Software is freely available for academic use as an R package at http://www.dkfz.de/abt0840/whuber

2,323 citations

Journal ArticleDOI
Julie George1, Jing Shan Lim2, Se Jin Jang3, Yupeng Cun1, Luka Ozretić, Gu Kong4, Frauke Leenders1, Xin Lu1, Lynnette Fernandez-Cuesta1, Graziella Bosco1, Christian Müller1, Ilona Dahmen1, Nadine Jahchan2, Kwon-Sik Park2, Dian Yang2, Anthony N. Karnezis5, Dedeepya Vaka2, Ángela Torres2, Maia Segura Wang, Jan O. Korbel, Roopika Menon6, Sung-Min Chun3, Deokhoon Kim3, Matthew D. Wilkerson7, Neil Hayes7, David Engelmann8, Brigitte M. Pützer8, Marc Bos1, Sebastian Michels6, Ignacija Vlasic, Danila Seidel1, Berit Pinther1, Philipp Schaub1, Christian Becker1, Janine Altmüller1, Jun Yokota9, Takashi Kohno, Reika Iwakawa, Koji Tsuta, Masayuki Noguchi10, Thomas Muley11, Hans Hoffmann11, Philipp A. Schnabel12, Iver Petersen13, Yuan Chen13, Alex Soltermann14, Verena Tischler14, Chang-Min Choi3, Yong-Hee Kim3, Pierre P. Massion15, Yong Zou15, Dragana Jovanovic16, Milica Kontic16, Gavin M. Wright17, Prudence A. Russell17, Benjamin Solomon17, Ina Koch, Michael Lindner, Lucia Anna Muscarella18, Annamaria la Torre18, John K. Field19, Marko Jakopović20, Jelena Knezevic, Esmeralda Castaños-Vélez21, Luca Roz, Ugo Pastorino, O.T. Brustugun22, Marius Lund-Iversen22, Erik Thunnissen23, Jens Köhler, Martin Schuler, Johan Botling24, Martin Sandelin24, Montserrat Sanchez-Cespedes, Helga B. Salvesen25, Viktor Achter1, Ulrich Lang1, Magdalena Bogus1, Peter M. Schneider1, Thomas Zander, Sascha Ansén6, Michael Hallek1, Jürgen Wolf6, Martin Vingron26, Yasushi Yatabe, William D. Travis27, Peter Nürnberg1, Christian Reinhardt, Sven Perner3, Lukas C. Heukamp, Reinhard Büttner, Stefan A. Haas26, Elisabeth Brambilla28, Martin Peifer1, Julien Sage2, Roman K. Thomas1 
06 Aug 2015-Nature
TL;DR: This first comprehensive study of somatic genome alterations in SCLC uncovers several key biological processes and identifies candidate therapeutic targets in this highly lethal form of cancer.
Abstract: We have sequenced the genomes of 110 small cell lung cancers (SCLC), one of the deadliest human cancers. In nearly all the tumours analysed we found bi-allelic inactivation of TP53 and RB1, sometimes by complex genomic rearrangements. Two tumours with wild-type RB1 had evidence of chromothripsis leading to overexpression of cyclin D1 (encoded by the CCND1 gene), revealing an alternative mechanism of Rb1 deregulation. Thus, loss of the tumour suppressors TP53 and RB1 is obligatory in SCLC. We discovered somatic genomic rearrangements of TP73 that create an oncogenic version of this gene, TP73Δex2/3. In rare cases, SCLC tumours exhibited kinase gene mutations, providing a possible therapeutic opportunity for individual patients. Finally, we observed inactivating mutations in NOTCH family genes in 25% of human SCLC. Accordingly, activation of Notch signalling in a pre-clinical SCLC mouse model strikingly reduced the number of tumours and extended the survival of the mutant mice. Furthermore, neuroendocrine gene expression was abrogated by Notch activity in SCLC cells. This first comprehensive study of somatic genome alterations in SCLC uncovers several key biological processes and identifies candidate therapeutic targets in this highly lethal form of cancer.

1,504 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved and modifications are incorporated into a new program, CLUSTAL W, which is freely available.
Abstract: The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W which is freely available.

63,427 citations

Journal ArticleDOI
TL;DR: This work presents DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates, which enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression.
Abstract: In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html .

47,038 citations

Journal ArticleDOI
TL;DR: ClUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W, providing an integrated system for performing multiple sequence and profile alignments and analysing the results.
Abstract: CLUSTAL X is a new windows interface for the widely-used progressive multiple sequence alignment program CLUSTAL W. The new system is easy to use, providing an integrated system for performing multiple sequence and profile alignments and analysing the results. CLUSTAL X displays the sequence alignment in a window on the screen. A versatile sequence colouring scheme allows the user to highlight conserved features in the alignment. Pull-down menus provide all the options required for traditional multiple sequence and profile alignment. New features include: the ability to cut-and-paste sequences to change the order of the alignment, selection of a subset of the sequences to be realigned, and selection of a sub-range of the alignment to be realigned and inserted back into the original alignment. Alignment quality analysis can be performed and low-scoring segments or exceptional residues can be highlighted. Quality analysis and realignment of selected residue ranges provide the user with a powerful tool to improve and refine difficult alignments and to trap errors in input sequences. CLUSTAL X has been compiled on SUN Solaris, IRIX5.3 on Silicon Graphics, Digital UNIX on DECstations, Microsoft Windows (32 bit) for PCs, Linux ELF for x86 PCs, and Macintosh PowerMac.

38,522 citations

Journal ArticleDOI
TL;DR: MUSCLE is a new computer program for creating multiple alignments of protein sequences that includes fast distance estimation using kmer counting, progressive alignment using a new profile function the authors call the log-expectation score, and refinement using tree-dependent restricted partitioning.
Abstract: We describe MUSCLE, a new computer program for creating multiple alignments of protein sequences. Elements of the algorithm include fast distance estimation using kmer counting, progressive alignment using a new profile function we call the logexpectation score, and refinement using treedependent restricted partitioning. The speed and accuracy of MUSCLE are compared with T-Coffee, MAFFT and CLUSTALW on four test sets of reference alignments: BAliBASE, SABmark, SMART and a new benchmark, PREFAB. MUSCLE achieves the highest, or joint highest, rank in accuracy on each of these sets. Without refinement, MUSCLE achieves average accuracy statistically indistinguishable from T-Coffee and MAFFT, and is the fastest of the tested methods for large numbers of sequences, aligning 5000 sequences of average length 350 in 7 min on a current desktop computer. The MUSCLE program, source code and PREFAB test data are freely available at http://www.drive5. com/muscle.

37,524 citations

01 May 1993
TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Abstract: Three parallel algorithms for classical molecular dynamics are presented. The first assigns each processor a fixed subset of atoms; the second assigns each a fixed subset of inter-atomic forces to compute; the third assigns each a fixed spatial region. The algorithms are suitable for molecular dynamics models which can be difficult to parallelize efficiently—those with short-range forces where the neighbors of each atom change rapidly. They can be implemented on any distributed-memory parallel machine which allows for message-passing of data between independently executing processors. The algorithms are tested on a standard Lennard-Jones benchmark problem for system sizes ranging from 500 to 100,000,000 atoms on several parallel supercomputers--the nCUBE 2, Intel iPSC/860 and Paragon, and Cray T3D. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems. For large problems, the spatial algorithm achieves parallel efficiencies of 90% and a 1840-node Intel Paragon performs up to 165 faster than a single Cray C9O processor. Trade-offs between the three algorithms and guidelines for adapting them to more complex molecular dynamics simulations are also discussed.

29,323 citations