Author

Lye Meng Markillie

Bio: Lye Meng Markillie is an academic researcher from the Environmental Molecular Sciences Laboratory. The author has contributed to research in the topics Biology and Medicine. The author has an h-index of 21 and has co-authored 49 publications receiving 2036 citations. Previous affiliations of Lye Meng Markillie include Pacific Northwest National Laboratory.
Topics: Biology, Medicine, Transcriptome, Proteome, Receptor


Papers
Journal Article
TL;DR: This study sequenced DNA from complex sediment and planktonic consortia from an aquifer adjacent to the Colorado River and reconstructed the first complete genomes for Archaea using cultivation-independent methods, which dramatically expand genomic sampling of the domain Archaea and clarify taxonomic designations within a major superphylum.

463 citations

Journal Article
TL;DR: A high-throughput methodology to characterize an organism's dynamic proteome based on the combination of global enzymatic digestion, high-resolution liquid chromatographic separations, and analysis by Fourier transform ion cyclotron resonance mass spectrometry is developed.
Abstract: Understanding biological systems and the roles of their constituents is facilitated by the ability to make quantitative, sensitive, and comprehensive measurements of how their proteome changes, e.g., in response to environmental perturbations. To this end, we have developed a high-throughput methodology to characterize an organism's dynamic proteome based on the combination of global enzymatic digestion, high-resolution liquid chromatographic separations, and analysis by Fourier transform ion cyclotron resonance mass spectrometry. The peptides produced serve as accurate mass tags for the proteins and have been used to identify with high confidence >61% of the predicted proteome for the ionizing radiation-resistant bacterium Deinococcus radiodurans. This fraction represents the broadest proteome coverage for any organism to date and includes 715 proteins previously annotated as either hypothetical or conserved hypothetical.

409 citations

Journal Article
01 Jul 2011
TL;DR: A new approach for mapping and analysing sequencing reads is introduced that yields substantially improved performance in gene expression profiling, increasing the number of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps.
Abstract: Motivation: Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not. With the compilation of large-scale RNA-Seq datasets with technical replicate samples, however, we can now, for the first time, perform a systematic analysis of the precision of expression level estimates from massively parallel sequencing technology. This then allows considerations for its improvement by computational or experimental means. Results: We report on a comprehensive study of target identification and measurement precision, including their dependence on transcript expression levels, read depth and other parameters. In particular, an impressive recall of 84% of the estimated true transcript population could be achieved with 331 million 50 bp reads, with diminishing returns from longer read lengths and even less gains from increased sequencing depths. Most of the measurement power (75%) is spent on only 7% of the known transcriptome, however, making less strongly expressed transcripts harder to measure. Consequently, <30% of all transcripts could be quantified reliably with a relative error <20%. Based on established tools, we then introduce a new approach for mapping and analysing sequencing reads that yields substantially improved performance in gene expression profiling, increasing the number of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps. In discussion we outline possible experimental and computational strategies for further improvements in quantification precision. Contact: rnaseq10@boku.ac.at Supplementary information: Supplementary data are available at Bioinformatics online.

141 citations

Journal Article
TL;DR: A simple and general targeted mutagenesis method was developed to generate catalase (katA) and superoxide dismutase (sodA) mutants that were shown to be more sensitive to ionizing radiation than the wild type.
Abstract: Deinococcus radiodurans R1 is extremely resistant to both oxidative stress and ionizing radiation. A simple and general targeted mutagenesis method was developed to generate catalase (katA) and superoxide dismutase (sodA) mutants. Both mutants were shown to be more sensitive to ionizing radiation than the wild type.

138 citations

Journal Article
TL;DR: Results indicate that CST1 functions as a key structural component that confers essential sturdiness to the T. gondii tissue cyst critical for persistence of bradyzoite forms.
Abstract: Toxoplasma gondii infects up to one third of the world's population. A key to the success of T. gondii as a parasite is its ability to persist for the life of its host as bradyzoites within tissue cysts. The glycosylated cyst wall is the key structural feature that facilitates persistence and oral transmission of this parasite. Because most of the antibodies and reagents that recognize the cyst wall recognize carbohydrates, identification of the components of the cyst wall has been technically challenging. We have identified CST1 (TGME49_064660) as a 250 kDa SRS (SAG1 related sequence) domain protein with a large mucin-like domain. CST1 is responsible for the Dolichos biflorus Agglutinin (DBA) lectin binding characteristic of T. gondii cysts. Deletion of CST1 results in reduced cyst number and a fragile brain cyst phenotype characterized by a thinning and disruption of the underlying region of the cyst wall. These defects are reversed by complementation of CST1. Additional complementation experiments demonstrate that the CST1-mucin domain is necessary for the formation of a normal cyst wall structure, the ability of the cyst to resist mechanical stress, and binding of DBA to the cyst wall. RNA-seq transcriptome analysis demonstrated dysregulation of bradyzoite genes within the various cst1 mutants. These results indicate that CST1 functions as a key structural component that confers essential sturdiness to the T. gondii tissue cyst critical for persistence of bradyzoite forms.

131 citations


Cited by
Journal Article
13 Mar 2003-Nature
TL;DR: The ability of mass spectrometry to identify and, increasingly, to precisely quantify thousands of proteins from complex samples can be expected to impact broadly on biology and medicine.
Abstract: Recent successes illustrate the role of mass spectrometry-based proteomics as an indispensable tool for molecular and cellular biology and for the emerging field of systems biology. These include the study of protein-protein interactions via affinity-based isolations on a small and proteome-wide scale, the mapping of numerous organelles, the concurrent description of the malaria parasite genome and proteome, and the generation of quantitative protein profiles from diverse species. The ability of mass spectrometry to identify and, increasingly, to precisely quantify thousands of proteins from complex samples can be expected to impact broadly on biology and medicine.

6,597 citations

Journal Article
TL;DR: A statistical model is presented for computing probabilities that proteins are present in a sample on the basis of peptides assigned to tandem mass (MS/MS) spectra acquired from a proteolytic digest of the sample, and it is shown to produce probabilities that are accurate and have high power to discriminate correct from incorrect protein identifications.
Abstract: A statistical model is presented for computing probabilities that proteins are present in a sample on the basis of peptides assigned to tandem mass (MS/MS) spectra acquired from a proteolytic digest of the sample. Peptides that correspond to more than a single protein in the sequence database are apportioned among all corresponding proteins, and a minimal protein list sufficient to account for the observed peptide assignments is derived using the expectation−maximization algorithm. Using peptide assignments to spectra generated from a sample of 18 purified proteins, as well as complex H. influenzae and Halobacterium samples, the model is shown to produce probabilities that are accurate and have high power to discriminate correct from incorrect protein identifications. This method allows filtering of large-scale proteomics data sets with predictable sensitivity and false positive identification error rates. Fast, consistent, and transparent, it provides a standard for publishing large-scale protein identif...

4,544 citations
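
The abstract above describes apportioning shared peptides among proteins and computing a probability that each protein is present. A minimal sketch of that idea follows. It assumes, as an illustration only, that a protein is present if at least one of its weighted peptide assignments is correct; the function name `protein_probability` and the fixed `weights` are hypothetical, and the published model additionally estimates the weights with the expectation-maximization algorithm, which is not shown here.

```python
from math import prod


def protein_probability(peptide_probs, weights=None):
    """Probability that at least one supporting peptide assignment is correct.

    peptide_probs: probability that each peptide assignment is correct.
    weights: share of each peptide apportioned to this protein (assumed given;
             the source derives such apportionment with EM, not shown here).
    """
    if weights is None:
        # Assumption: unshared peptides contribute fully to this protein.
        weights = [1.0] * len(peptide_probs)
    if len(weights) != len(peptide_probs):
        raise ValueError("weights and peptide_probs must have the same length")
    for value in list(peptide_probs) + list(weights):
        if not 0.0 <= value <= 1.0:
            raise ValueError("probabilities and weights must lie in [0, 1]")
    # P(protein) = 1 - product over peptides of (1 - weight * probability)
    return 1.0 - prod(1.0 - w * p for w, p in zip(weights, peptide_probs))


# Two unshared peptides, then two peptides each half-apportioned to this protein.
print(protein_probability([0.9, 0.5]))
print(protein_probability([0.8, 0.8], weights=[0.5, 0.5]))
```

Under this assumption, a peptide shared equally between two proteins supports each of them less strongly than an unshared peptide with the same assignment probability.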

Journal Article
TL;DR: This work speculates on the reasons behind this large discrepancy between the expectations arising from proteomics and the realities of clinical diagnostics and suggests approaches by which protein-disease associations may be more effectively translated into diagnostic tools in the future.

4,062 citations

Journal Article
TL;DR: A linear dynamic range over 2 orders of magnitude is demonstrated by using the number of spectra (spectral sampling) acquired for each protein by the data-dependent acquisition of peptides eluting into the mass spectrometer.
Abstract: Proteomic analysis of complex protein mixtures using proteolytic digestion and liquid chromatography in combination with tandem mass spectrometry is a standard approach in biological studies. Data-dependent acquisition is used to automatically acquire tandem mass spectra of peptides eluting into the mass spectrometer. In more complicated mixtures, for example, whole cell lysates, data-dependent acquisition incompletely samples among the peptide ions present rather than acquiring tandem mass spectra for all ions available. We analyzed the sampling process and developed a statistical model to accurately predict the level of sampling expected for mixtures of a specific complexity. The model also predicts how many analyses are required for saturated sampling of a complex protein mixture. For a yeast-soluble cell lysate 10 analyses are required to reach a 95% saturation level on protein identifications based on our model. The statistical model also suggests a relationship between the level of sampling observed for a protein and the relative abundance of the protein in the mixture. We demonstrate a linear dynamic range over 2 orders of magnitude by using the number of spectra (spectral sampling) acquired for each protein.

2,506 citations
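
The abstract above suggests a relationship between the number of spectra acquired for a protein and its relative abundance in the mixture. The sketch below illustrates one simple way to use that relationship. It assumes plain proportionality between spectral count and abundance, which is an illustrative simplification and not the statistical model of the paper; the function name `relative_abundance` and the example counts are hypothetical.

```python
def relative_abundance(spectral_counts):
    """Estimate relative abundance as each protein's share of all spectra.

    spectral_counts: mapping of protein name to number of tandem mass spectra
    acquired for that protein (illustrative input, not data from the source).
    """
    if any(count < 0 for count in spectral_counts.values()):
        raise ValueError("spectral counts must be non-negative")
    total = sum(spectral_counts.values())
    if total == 0:
        raise ValueError("at least one spectrum is required")
    # Assumption: abundance is directly proportional to spectral count.
    return {name: count / total for name, count in spectral_counts.items()}


# Hypothetical counts for two proteins in one analysis.
print(relative_abundance({"protein_A": 30, "protein_B": 10}))
```

A real analysis would typically also normalize for protein length and pool counts over repeated analyses, since the abstract notes that a single data-dependent run samples a complex mixture incompletely.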

Journal Article
TL;DR: All of the major steps in RNA-seq data analysis are reviewed, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping.
Abstract: RNA-sequencing (RNA-seq) has a wide variety of applications, but no single analysis pipeline can be used in all cases. We review all of the major steps in RNA-seq data analysis, including experimental design, quality control, read alignment, quantification of gene and transcript levels, visualization, differential gene expression, alternative splicing, functional analysis, gene fusion detection and eQTL mapping. We highlight the challenges associated with each step. We discuss the analysis of small RNAs and the integration of RNA-seq with other functional genomics techniques. Finally, we discuss the outlook for novel technologies that are changing the state of the art in transcriptomics.

1,963 citations