Institution
Science for Life Laboratory
Facility • Stockholm, Sweden
About: Science for Life Laboratory is a research facility based in Stockholm, Sweden. It is known for its research contributions in the topics of Population and Gene. The organization has 2,811 authors who have published 5,180 publications receiving 231,686 citations. The organization is also known as SciLifeLab.
Topics: Population, Gene, Genome, Cancer, Genome-wide association study
Papers
TL;DR: The Trinity method enables de novo assembly of full-length transcripts; evaluated on samples from fission yeast, mouse and whitefly (whose reference genome is not yet available), it provides a unified solution for transcriptome reconstruction in any sample.
Abstract: Massively parallel sequencing of cDNA has enabled deep and efficient probing of transcriptomes. Current approaches for transcript reconstruction from such data often rely on aligning reads to a reference genome, and are thus unsuitable for samples with a partial or missing reference genome. Here we present the Trinity method for de novo assembly of full-length transcripts and evaluate it on samples from fission yeast, mouse and whitefly, whose reference genome is not yet available. By efficiently constructing and analyzing sets of de Bruijn graphs, Trinity fully reconstructs a large fraction of transcripts, including alternatively spliced isoforms and transcripts from recently duplicated genes. Compared with other de novo transcriptome assemblers, Trinity recovers more full-length transcripts across a broad range of expression levels, with a sensitivity similar to methods that rely on genome alignments. Our approach provides a unified solution for transcriptome reconstruction in any sample, especially in the absence of a reference genome.
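The de Bruijn graph at the heart of this approach can be illustrated with a toy example. The sketch below is not Trinity's actual implementation (which partitions reads into per-gene graphs and handles branching, sequencing errors and coverage), only a minimal demonstration of the data structure: nodes are (k-1)-mers, edges are k-mers, and an unbranched walk spells out a reconstructed sequence.

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # prefix node -> suffix node
    return graph

def walk(graph, start):
    """Follow an unbranched path from `start`; each step appends one base.
    (A toy traversal: it stops at any branch or dead end, and does not
    guard against cycles.)"""
    path = [start]
    node = start
    while len(graph.get(node, ())) == 1:
        node = next(iter(graph[node]))
        path.append(node)
    return path[0] + "".join(n[-1] for n in path[1:])
```

For example, six overlapping 5-base reads from the transcript `ATGGCGTGCA` yield a linear graph with k = 4, and walking it from `ATG` recovers the full transcript.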
15,665 citations
TL;DR: In this paper, a map of the human tissue proteome is presented, based on an integrated omics approach that combines quantitative transcriptomics at the tissue and organ level with tissue microarray-based immunohistochemistry to achieve spatial localization of proteins down to the single-cell level.
Abstract: Resolving the molecular details of proteome variation in the different tissues and organs of the human body will greatly increase our knowledge of human biology and disease. Here, we present a map of the human tissue proteome based on an integrated omics approach that involves quantitative transcriptomics at the tissue and organ level, combined with tissue microarray-based immunohistochemistry, to achieve spatial localization of proteins down to the single-cell level. Our tissue-based analysis detected more than 90% of the putative protein-coding genes. We used this approach to explore the human secretome, the membrane proteome, the druggable proteome, the cancer proteome, and the metabolic functions in 32 different tissues and organs. All the data are integrated in an interactive Web-based database that allows exploration of individual proteins, as well as navigation of global expression patterns, in all major tissues and organs in the human body.
9,745 citations
TL;DR: SignalP 4.0 was the best signal-peptide predictor for all three organism types, but was not in all cases as good as SignalP 3.0 in cleavage-site sensitivity or signal-peptide correlation when no transmembrane proteins are present.
Abstract: We benchmarked SignalP 4.0 against SignalP 3.0 and ten other signal peptide prediction algorithms (Fig. 1). We compared prediction performance using the Matthews correlation coefficient, for which each sequence was counted as a true or false positive or negative. To test SignalP 4.0 performance, we did not use data that had been used in training the networks or selecting the optimal architecture, and the test data did not contain homologs to the training and optimization data (Supplementary Methods). The test set for SignalP 3.0 was also independent of the training set because we removed sequences used to construct SignalP 3.0 and their homologs from the benchmark data. For other algorithms more recent than SignalP 3.0, the benchmark data may include data used to train the methods, possibly leading to slight overestimations of their performance. Our results show that SignalP 4.0 was the best signal-peptide predictor for all three organism types (Fig. 1). This comes at a price, however, because SignalP 4.0 was not in all cases as good as SignalP 3.0 according to cleavage-site sensitivity or signal-peptide correlation when there are no transmembrane proteins present (Supplementary Results).
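The Matthews correlation coefficient used in this benchmark is a single summary statistic over the confusion matrix, ranging from -1 (total disagreement) to +1 (perfect prediction). A minimal sketch (function name and counts are illustrative, not from the paper):

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.
    Returns 0.0 when any marginal sum is zero (the conventional fallback)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

A perfect predictor (no false positives or negatives) scores 1.0, a fully inverted one scores -1.0, and a predictor no better than chance scores near 0.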
8,370 citations
Affiliations: Technical University of Madrid, Stanford University, Elsevier, VU University Amsterdam, National Institutes of Health, University of Leicester, Harvard University, Beijing Genomics Institute, Maastricht University, Wageningen University and Research Centre, University of Oxford, Heriot-Watt University, University of Manchester, University of California, San Diego, Leiden University Medical Center, Leiden University, Federal University of São Paulo, Science for Life Laboratory, Bayer, Swiss Institute of Bioinformatics, Cray, University Medical Center Groningen, Erasmus University Rotterdam
TL;DR: The FAIR Data Principles are a set of data-reuse principles that put specific emphasis on enhancing the ability of machines to automatically find and use data, in addition to supporting its reuse by individuals.
Abstract: There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders—representing academia, industry, funding agencies, and scholarly publishers—have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
7,602 citations
Affiliations: Broad Institute, Commonwealth Scientific and Industrial Research Organisation, Massachusetts Institute of Technology, Hebrew University of Jerusalem, Science for Life Laboratory, Pittsburgh Supercomputing Center, Oklahoma State University–Stillwater, Griffith University, University of Wisconsin-Madison, Dresden University of Technology, California Institute for Quantitative Biosciences, Flanders Institute for Biotechnology, Parco Tecnologico Padano, United States Department of Agriculture, Purdue University, Indiana University
TL;DR: This protocol provides a workflow for genome-independent transcriptome analysis leveraging the Trinity platform and presents Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes.
Abstract: De novo assembly of RNA-seq data enables researchers to study transcriptomes without the need for a genome sequence; this approach can be usefully applied, for instance, in research on 'non-model organisms' of ecological and evolutionary importance, cancer samples or the microbiome. In this protocol we describe the use of the Trinity platform for de novo transcriptome assembly from RNA-seq data in non-model organisms. We also present Trinity-supported companion utilities for downstream applications, including RSEM for transcript abundance estimation, R/Bioconductor packages for identifying differentially expressed transcripts across samples and approaches to identify protein-coding genes. In the procedure, we provide a workflow for genome-independent transcriptome analysis leveraging the Trinity platform. The software, documentation and demonstrations are freely available from http://trinityrnaseq.sourceforge.net. The run time of this protocol is highly dependent on the size and complexity of data to be analyzed. The example data set analyzed in the procedure detailed herein can be processed in less than 5 h.
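As an illustration of the abundance-estimation step in this workflow, the sketch below shows only the TPM (transcripts per million) normalization commonly applied to transcript-level read counts. RSEM itself first resolves multi-mapped reads with an expectation-maximization procedure, which is omitted here; this is a simplified sketch, not RSEM's implementation.

```python
def tpm(counts, lengths):
    """Transcripts Per Million from read counts and transcript lengths (bp).

    Normalize each count by transcript length (reads per kilobase),
    then scale so the values sum to one million."""
    rates = [c / (l / 1000.0) for c, l in zip(counts, lengths)]
    scale = sum(rates)
    return [r / scale * 1e6 for r in rates]
```

Because TPM divides by length before scaling, a transcript with twice the reads but twice the length of another ends up with the same abundance, and the values for any sample always sum to one million.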
6,369 citations
Authors
Showing all 2841 results
| Name | H-index | Papers | Citations |
|---|---|---|---|
| André G. Uitterlinden | 199 | 1229 | 156747 |
| George M. Church | 172 | 900 | 120514 |
| Jens Nielsen | 149 | 1752 | 104005 |
| Vijay K. Kuchroo | 144 | 525 | 86936 |
| Kohei Miyazono | 135 | 515 | 68706 |
| Carl-Henrik Heldin | 131 | 520 | 67528 |
| David P. Lane | 129 | 568 | 90787 |
| Erik Ingelsson | 124 | 538 | 85407 |
| Elisabetta Dejana | 122 | 430 | 48254 |
| Mathias Uhlén | 117 | 861 | 68387 |
| Clive Ballard | 117 | 736 | 61663 |
| Christer Betsholtz | 104 | 357 | 56771 |
| Kjell Öberg | 102 | 518 | 38262 |
| Peter ten Dijke | 101 | 286 | 40776 |
| Ulf Gyllensten | 100 | 368 | 59219 |