Author

Chris Boursnell

Other affiliations: Wellcome Trust Sanger Institute
Bio: Chris Boursnell is an academic researcher from University of Cambridge. The author has contributed to research in topics: Breast cancer & Transcriptome. The author has an h-index of 7 and has co-authored 11 publications receiving 14,500 citations. Previous affiliations of Chris Boursnell include Wellcome Trust Sanger Institute.

Papers
Journal Article
TL;DR: The definition and use of family-specific, manually curated gathering thresholds are explained and some of the features of domains of unknown function (also known as DUFs) are discussed, which constitute a rapidly growing class of families within Pfam.
Abstract: Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).

14,075 citations

Journal Article
TL;DR: TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies that uses only the sequenced reads and the assembly as input. Applied to published assemblies, it reveals that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies.
Abstract: TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only the sequenced reads and the assembly as input, we show that multiple common artifacts of de novo transcriptome assembly can be readily detected. These include chimeras, structural errors, incomplete assembly, and base errors. TransRate evaluates these errors to produce a diagnostic quality score for each contig, and these contig scores are integrated to evaluate whole assemblies. Thus, TransRate can be used for de novo assembly filtering and optimization as well as comparison of assemblies generated using different methods from the same input reads. Applying the method to a data set of 155 published de novo transcriptome assemblies, we deconstruct the contribution that assembly method, read length, read quantity, and read quality make to the accuracy of de novo transcriptome assemblies and reveal that variance in the quality of the input data explains 43% of the variance in the quality of published de novo transcriptome assemblies. Because TransRate is reference-free, it is suitable for assessment of assemblies of all types of RNA, including assemblies of long noncoding RNA, rRNA, mRNA, and mixed RNA samples.

585 citations

Posted Content
27 Jun 2015 - bioRxiv
TL;DR: TransRate can accurately evaluate assemblies of conserved and novel RNA molecules of any kind in any species. The authors show that it is more accurate than comparable methods and demonstrate its use on a variety of data.
Abstract: TransRate is a tool for reference-free quality assessment of de novo transcriptome assemblies. Using only sequenced reads as the input, TransRate measures the quality of individual contigs and whole assemblies, enabling assembly optimization and comparison. TransRate can accurately evaluate assemblies of conserved and novel RNA molecules of any kind in any species. We show that it is more accurate than comparable methods and demonstrate its use on a variety of data.

118 citations

Journal Article
TL;DR: Imaging of hyperpolarized [1-13C]pyruvate metabolism in breast cancer is feasible and demonstrated significant intertumoral and intratumoral metabolic heterogeneity, where lactate labeling correlated with MCT1 expression and hypoxia.
Abstract: Our purpose is to investigate the feasibility of imaging tumor metabolism in breast cancer patients using 13C magnetic resonance spectroscopic imaging (MRSI) of hyperpolarized 13C label exchange between injected [1-13C]pyruvate and the endogenous tumor lactate pool. Treatment-naive breast cancer patients were recruited: four triple-negative grade 3 cancers; two invasive ductal carcinomas that were estrogen and progesterone receptor-positive (ER/PR+) and HER2/neu-negative (HER2-), one grade 2 and one grade 3; and one grade 2 ER/PR+ HER2- invasive lobular carcinoma (ILC). Dynamic 13C MRSI was performed following injection of hyperpolarized [1-13C]pyruvate. Expression of lactate dehydrogenase A (LDHA), which catalyzes 13C label exchange between pyruvate and lactate, hypoxia-inducible factor-1 (HIF1α), and the monocarboxylate transporters MCT1 and MCT4 was quantified using immunohistochemistry and RNA sequencing. We have demonstrated the feasibility and safety of hyperpolarized 13C MRI in early breast cancer. Both intertumoral and intratumoral heterogeneity of the hyperpolarized pyruvate and lactate signals were observed. The lactate-to-pyruvate signal ratio (LAC/PYR) ranged from 0.021 to 0.473 across the tumor subtypes (mean ± SD: 0.145 ± 0.164), and a lactate signal was observed in all of the grade 3 tumors. The LAC/PYR was significantly correlated with tumor volume (R = 0.903, P = 0.005), MCT1 expression (R = 0.85, P = 0.032), and HIF1α expression (R = 0.83, P = 0.043). Imaging of hyperpolarized [1-13C]pyruvate metabolism in breast cancer is feasible and demonstrated significant intertumoral and intratumoral metabolic heterogeneity, where lactate labeling correlated with MCT1 expression and hypoxia.

118 citations

Journal Article
TL;DR: Insight is provided into the molecular function of the bundle sheath (BS) in the C3 species Arabidopsis thaliana, and characteristics of BS cells that are probably ancestral to both C3 and C4 plants are identified.
Abstract: Leaves of angiosperms are made up of multiple distinct cell types. While the function of mesophyll cells, guard cells, phloem companion cells and sieve elements are clearly described, this is not the case for the bundle sheath (BS). To provide insight into the role of the BS in the C3 species Arabidopsis thaliana, we labelled ribosomes in this cell type with a FLAG tag. We then used immunocapture to isolate these ribosomes, followed by sequencing of resident mRNAs. This showed that 5% of genes showed specific splice forms in the BS, and that 15% of genes were preferentially expressed in these cells. The BS translatome strongly implies that the BS plays specific roles in sulfur transport and metabolism, glucosinolate biosynthesis and trehalose metabolism. Much of the C4 cycle is differentially expressed between the C3 BS and the rest of the leaf. Furthermore, the global patterns of transcript residency on BS ribosomes overlap to a greater extent with cells of the root pericycle than any other cell type. This analysis provides the first insight into the molecular function of this cell type in C3 species, and also identifies characteristics of BS cells that are probably ancestral to both C3 and C4 plants.

80 citations


Cited by
Journal Article
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal Article
TL;DR: This version of MAFFT has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update.
Abstract: We report a major update of the MAFFT multiple sequence alignment program. This version has several new features, including options for adding unaligned sequences into an existing alignment, adjustment of direction in nucleotide alignment, constrained alignment and parallel processing, which were implemented after the previous major update. This report shows actual examples to explain how these features work, alone and in combination. Some examples incorrectly aligned by MAFFT are also shown to clarify its limitations. We discuss how to avoid misalignments, and our ongoing efforts to overcome such limitations.

27,771 citations

Journal Article
Eric S. Lander, Lauren Linton, Bruce W. Birren, Chad Nusbaum, and 245 more authors (29 institutions)
15 Feb 2001 - Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal Article
TL;DR: UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters and offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.
Abstract: Motivation: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification. Results: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets. Availability: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch. Supplementary information: Supplementary data are available at Bioinformatics online.

17,301 citations

Journal Article
TL;DR: A new program called Clustal Omega is described, which can align virtually any number of protein sequences quickly and that delivers accurate alignments, and which outperforms other packages in terms of execution time and quality.
Abstract: Multiple sequence alignments are fundamental to many sequence analysis methods. Most alignments are computed using the progressive alignment heuristic. These methods are starting to become a bottleneck in some analysis pipelines when faced with data sets of the size of many thousands of sequences. Some methods allow computation of larger data sets while sacrificing quality, and others produce high-quality alignments, but scale badly with the number of sequences. In this paper, we describe a new program called Clustal Omega, which can align virtually any number of protein sequences quickly and that delivers accurate alignments. The accuracy of the package on smaller test cases is similar to that of the high-quality aligners. On larger data sets, Clustal Omega outperforms other packages in terms of execution time and quality. Clustal Omega also has powerful features for adding sequences to and exploiting information in existing alignments, making use of the vast amount of precomputed information in public databases like Pfam.

12,489 citations