Author
Marcel Martin
Other affiliations: Max Planck Society, University of Duisburg-Essen, Technical University of Dortmund
Bio: Marcel Martin is an academic researcher from Science for Life Laboratory. The author has contributed to research on the topics Exome sequencing and Population. The author has an h-index of 24 and has co-authored 42 publications receiving 15,979 citations. Previous affiliations of Marcel Martin include Max Planck Society and University of Duisburg-Essen.
Topics: Exome sequencing, Population, Exome
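For readers unfamiliar with the h-index quoted in the bio, the following is a minimal sketch of how the metric is usually defined: the largest h such that h publications each have at least h citations. The function name `h_index` is a hypothetical helper, and the example uses only the five citation counts visible in the Papers list on this page, not the author's full record of 42 publications, so it does not reproduce the value of 24.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank      # the top `rank` papers all have >= rank citations
        else:
            break         # counts only decrease from here on
    return h


# Citation counts of the five papers listed on this page (a partial record).
listed_counts = [13576, 338, 184, 180, 168]
partial_h = h_index(listed_counts)
```

With only five papers in the input, the result cannot exceed 5; the full publication list would be needed to arrive at the h-index stated in the bio.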
Papers
Abstract: When small RNA is sequenced on current sequencing machines, the resulting reads are usually longer than the RNA and therefore contain parts of the 3' adapter. That adapter must be found and removed error-tolerantly from each read before read mapping. Previous solutions are either hard to use or do not offer required features, in particular support for color space data. As an easy to use alternative, we developed the command-line tool cutadapt, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features. Cutadapt, including its MIT-licensed source code, is available for download at http://code.google.com/p/cutadapt/
Topics: Adapter (genetics) (50%)
13,576 Citations
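The idea of error-tolerant 3' adapter removal described in the abstract above can be illustrated with a minimal sketch. This is not cutadapt's implementation: it tolerates substitutions only (no insertions or deletions), and the function name `trim_3prime_adapter`, its parameters `max_error_rate` and `min_overlap`, and the example sequences are all assumptions made for illustration.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Remove a 3' adapter (or a prefix of it at the read end) from a read.

    Simplified sketch: only mismatches are tolerated, not indels.
    Returns the read up to the leftmost acceptable adapter match,
    or the unchanged read if no match is found.
    """
    for start in range(len(read)):
        # The adapter may run past the end of the read (partial adapter).
        overlap = min(len(adapter), len(read) - start)
        if overlap < min_overlap:
            break  # remaining suffix too short to call a match
        window = read[start:start + overlap]
        mismatches = sum(1 for r, a in zip(window, adapter) if r != a)
        # Allowed errors scale with the length of the compared region.
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read


ADAPTER = "AGATCGGAAGAGC"   # hypothetical adapter sequence
INSERT = "TTTTCCCCTTTT"     # hypothetical small-RNA insert

exact = trim_3prime_adapter(INSERT + ADAPTER, ADAPTER)
one_error = trim_3prime_adapter(INSERT + "AGATCGGTAGAGC", ADAPTER)
partial = trim_3prime_adapter(INSERT + "AGATCG", ADAPTER)
untouched = trim_3prime_adapter(INSERT, ADAPTER)
```

Scaling the allowed mismatches with the overlap length keeps short, partial adapter matches at the read end strict while still tolerating a sequencing error in a full-length match.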
Abstract: Michael Zeschnigk and colleagues identify recurrent somatic mutations of EIF1AX and SF3B1 in uveal melanomas with disomy 3. The EIF1AX mutations specifically alter the N-terminal tail of the protein and were found exclusively in tumors lacking SF3B1 mutations.
Topics: Exome (59%), Uveal Neoplasm (56%), Exome sequencing (54%)
338 Citations
Abstract: Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms are needed. We will witness the rapid extension of computational pan-genomics, a new sub-area of research in computational biology. In this article, we generalize existing definitions and understand a pan-genome as any collection of genomic sequences to be analyzed jointly or to be used as a reference. We examine already available approaches to construct and use pan-genomes, discuss the potential benefits of future technologies and methodologies and review open challenges from the vantage point of the above-mentioned biological disciplines. As a prominent example for a computational paradigm shift, we particularly highlight the transition from the representation of reference genomes as strings to representations as graphs. We outline how this and other challenges from different application domains translate into common computational problems, point out relevant bioinformatics techniques and identify open problems in computer science. With this review, we aim to increase awareness that a joint approach to computational pan-genomics can help address many of the problems currently faced in various domains.
Topics: Computational problem (52%), Population (51%)
184 Citations
Open access • Journal Article
Abstract: Small non-coding RNAs, in particular microRNAs (miRNAs), regulate fine-tuning of gene expression and can act as oncogenes or tumor suppressor genes. Differential miRNA expression has been reported to be of functional relevance for tumor biology. Using next-generation sequencing, the unbiased and absolute quantification of the small RNA transcriptome is now feasible. Neuroblastoma (NB) is an embryonal tumor with highly variable clinical course. We analyzed the small RNA transcriptomes of five favorable and five unfavorable NBs using SOLiD next-generation sequencing, generating a total of >188 000 000 reads. MiRNA expression profiles obtained by deep sequencing correlated well with real-time PCR data. Cluster analysis differentiated between favorable and unfavorable NBs, and the miRNA transcriptomes of these two groups were significantly different. Oncogenic miRNAs of the miR17-92 cluster and the miR-181 family were overexpressed in unfavorable NBs. In contrast, the putative tumor suppressive microRNAs, miR-542-5p and miR-628, were expressed in favorable NBs and virtually absent in unfavorable NBs. In-depth sequence analysis revealed extensive post-transcriptional miRNA editing. Of 13 identified novel miRNAs, three were further analyzed, and expression could be confirmed in a cohort of 70 NBs.
Topics: Deep sequencing (61%)
180 Citations
Johannes H. Schulte, Tobias Marschall, Marcel Martin, Philipp Rosenstiel +8 more • Institutions (2)
Abstract: Small non-coding RNAs, in particular microRNAs (miRNAs), regulate fine-tuning of gene expression and can act as oncogenes or tumor suppressor genes. Differential miRNA expression has been reported to be of functional relevance for tumor biology. Using next-generation sequencing, the unbiased and absolute quantification of the small RNA transcriptome is now feasible. Neuroblastoma (NB) is an embryonal tumor with highly variable clinical course. We analyzed the small RNA transcriptomes of five favorable and five unfavorable NBs using SOLiD next-generation sequencing, generating a total of >188 000 000 reads. MiRNA expression profiles obtained by deep sequencing correlated well with real-time PCR data. Cluster analysis differentiated between favorable and unfavorable NBs, and the miRNA transcriptomes of these two groups were significantly different. Oncogenic miRNAs of the miR17-92 cluster and the miR-181 family were overexpressed in unfavorable NBs. In contrast, the putative tumor suppressive microRNAs, miR-542-5p and miR-628, were expressed in favorable NBs and virtually absent in unfavorable NBs. In-depth sequence analysis revealed extensive post-transcriptional miRNA editing. Of 13 identified novel miRNAs, three were further analyzed, and expression could be confirmed in a cohort of 70 NBs.
Topics: Deep sequencing (58%), Small RNA (52%), Gene expression profiling (51%)
168 Citations
Cited by
Abstract: Motivation: Although many next-generation sequencing (NGS) read preprocessing tools already existed, we could not find any tool or combination of tools that met our requirements in terms of flexibility, correct handling of paired-end data and high performance. We have developed Trimmomatic as a more flexible and efficient preprocessing tool, which could correctly handle paired-end data.
Results: The value of NGS read preprocessing is demonstrated for both reference-based and reference-free tasks. Trimmomatic is shown to produce output that is at least competitive with, and in many cases superior to, that produced by other tools, in all scenarios tested.
Availability and implementation: Trimmomatic is licensed under GPL V3. It is cross-platform (Java 1.5+ required) and available at http://www.usadellab.org/cms/index.php?page=trimmomatic
Contact: usadel@bio1.rwth-aachen.de
Supplementary information: Supplementary data are available at Bioinformatics online.
26,464 Citations
Abstract: When small RNA is sequenced on current sequencing machines, the resulting reads are usually longer than the RNA and therefore contain parts of the 3' adapter. That adapter must be found and removed error-tolerantly from each read before read mapping. Previous solutions are either hard to use or do not offer required features, in particular support for color space data. As an easy to use alternative, we developed the command-line tool cutadapt, which supports 454, Illumina and SOLiD (color space) data, offers two adapter trimming algorithms, and has other useful features. Cutadapt, including its MIT-licensed source code, is available for download at http://code.google.com/p/cutadapt/
Topics: Adapter (genetics) (50%)
13,576 Citations
Abstract: De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.
Topics: De novo transcriptome assembly (56%), Bioconductor (55%), Sequence assembly (53%)
5,056 Citations
Open access • 01 Aug 2000
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.
4,833 Citations
Abstract: Genome-wide, targeted loss-of-function pooled screens using the CRISPR (clustered regularly interspaced short palindromic repeats)–associated nuclease Cas9 in human and mouse cells provide an alternative screening system to RNA interference (RNAi) and have been used to reveal new mechanisms in diverse biological models [1-4]. Previously, we used a Genome-scale CRISPR Knock-Out (GeCKO) library to identify loss-of-function mutations conferring vemurafenib resistance in a melanoma model [1]. However, initial lentiviral delivery systems for CRISPR screening had low viral titer or required a cell line already expressing Cas9, limiting the range of biological systems amenable to screening.
Here, we sought to improve both the lentiviral packaging and choice of guide sequences in our original GeCKO library [1], where a pooled library of synthesized oligonucleotides was cloned into a lentiviral backbone containing both the Streptococcus pyogenes Cas9 nuclease and the single guide RNA (sgRNA) scaffold. To create a new vector capable of producing higher-titer virus (lentiCRISPRv2), we made several modifications, including removal of one of the nuclear localization signals (NLS), human codon-optimization of the remaining NLS and P2A bicistronic linker sequences, and repositioning of the U6-driven sgRNA cassette (Fig. 1a). These changes resulted in a ~10-fold increase in functional viral titer over lentiCRISPRv1 [1] (Fig. 1b).
Figure 1
New lentiviral CRISPR designs produce viruses with higher functional titer.
To further increase viral titer, we also cloned a two-vector system, in which Cas9 (lentiCas9-Blast) and sgRNA (lentiGuide-Puro) are delivered using separate viral vectors with distinct antibiotic selection markers (Fig. 1a). LentiGuide-Puro has a ~100-fold increase in functional viral titer over the original lentiCRISPRv1 (Fig. 1b). Both single and dual-vector systems mediate efficient knock-out of a genomically-integrated copy of EGFP in human cells (Supplementary Fig. 1). Whereas the dual vector system enables generation of Cas9-expressing cell lines which can be subsequently used for screens using lentiGuide-Puro, the single vector lentiCRISPRv2 may be better suited for in vivo or primary cell screening applications.
In addition to the vector improvements, we designed and synthesized new human and mouse GeCKOv2 sgRNA libraries (Supplementary Methods) with several improvements (Table 1): First, for both human and mouse libraries, to target all genes with a uniform number of sgRNAs, we selected 6 sgRNAs per gene distributed over 3-4 constitutively expressed exons. Second, to further minimize off-target genome modification, we improved the calculation of off-target scores based on specificity analysis [5]. Third, to inactivate microRNAs (miRNAs) which play a key role in transcriptional regulation, we added sgRNAs to direct mutations to the pre-miRNA hairpin structure [6]. Finally, we targeted ~1000 additional genes not included in the original GeCKO library.
Table 1
Comparison of new GeCKO v2 human and mouse sgRNA libraries with existing CRISPR libraries.
Both libraries, mouse and human, are divided into 2 sub-libraries, each containing 3 sgRNAs targeting each gene in the genome, as well as 1000 non-targeting control sgRNAs. Screens can be performed by combining both sub-libraries, yielding 6 sgRNAs per gene, for higher coverage. Alternatively, individual sub-libraries can be used in situations where cell numbers are limiting (e.g., primary cells, in vivo screens). The human and mouse libraries have been cloned into lentiCRISPRv2 and into lentiGuide-Puro and deep sequenced to ensure uniform representation (Supplementary Fig. 2, 3). These new lentiCRISPR vectors and human and mouse libraries further improve the GeCKO reagents for diverse screening applications. Reagents are available to the academic community through Addgene, and associated protocols, support forums, and computational tools are available via the Zhang lab website (www.genome-engineering.org).
Topics: CRISPR interference (56%), Cas9 (55%), CRISPR (55%)
2,826 Citations