Author
Zhiqi Hao
Bio: Zhiqi Hao is an academic researcher from Thermo Fisher Scientific. The author has contributed to research in topics: Mass spectrometry & Electron-transfer dissociation. The author has an h-index of 4, co-authored 5 publications receiving 575 citations.
Papers
••
TL;DR: pLink is software for analyzing mass spectrometry data from cross-linked proteins; it is compatible with multiple homo- or hetero-bifunctional cross-linkers.
Abstract: pLink, software for data analysis of cross-linked proteins coupled with mass spectrometry, estimates false discovery rate and enables analysis of protein complexes without extensive purification. We have developed pLink, software for data analysis of cross-linked proteins coupled with mass-spectrometry analysis. pLink reliably estimates false discovery rate in cross-link identification and is compatible with multiple homo- or hetero-bifunctional cross-linkers. We validated the program with proteins of known structures, and we further tested it on protein complexes, crude immunoprecipitates and whole-cell lysates. We show that it is a robust tool for protein-structure and protein-protein–interaction studies.
528 citations
••
TL;DR: It is shown that As_SRP-1 has two major functions: in cis, it supports major sperm protein (MSP)-based cytoskeletal assembly in the spermatid that releases it, thereby facilitating sperm motility acquisition; in trans, As_SRP-1 released from an activated sperm inhibits the activation of surrounding spermatids.
Abstract: Spermiogenesis is a series of poorly understood morphological, physiological and biochemical processes that occur during the transition of immotile spermatids into motile, fertilization-competent spermatozoa. Here, we identified a Serpin (serine protease inhibitor) family protein (As_SRP-1) that is secreted from spermatids during nematode Ascaris suum spermiogenesis (also called sperm activation) and we showed that As_SRP-1 has two major functions. First, As_SRP-1 functions in cis to support major sperm protein (MSP)-based cytoskeletal assembly in the spermatid that releases it, thereby facilitating sperm motility acquisition. Second, As_SRP-1 released from an activated sperm inhibits, in trans, the activation of surrounding spermatids by inhibiting vas deferens-derived As_TRY-5, a trypsin-like serine protease necessary for sperm activation. Because vesicular exocytosis is necessary to create fertilization-competent sperm in many animal species, components released during this process might be more important modulators of the physiology and behavior of surrounding sperm than was previously appreciated.
50 citations
••
TL;DR: A combined ETD/CAD approach identifies more peptides and proteins than CAD- or ETD-based tandem mass spectrometry alone.
38 citations
••
TL;DR: This work has developed an approach for charge-state determination of peptides from their tandem mass spectra obtained in fragmentations via electron-transfer dissociation (ETD) reactions, and discusses the cost associated with the possible misclassification of charge states.
Abstract: Tandem mass spectrometry in combination with liquid chromatography has emerged as a powerful tool for characterization of complex protein mixtures in a high-throughput manner. One of the bioinformatics challenges posed by mass spectral data analysis is the determination of precursor charge when unit mass resolution is used for detecting fragment ions. The charge-state information is used to filter database sequences before they are correlated to experimental data. In the absence of the accurate charge state, several charge states are assumed. This dramatically increases database search times. To address this problem, we have developed an approach for charge-state determination of peptides from their tandem mass spectra obtained in fragmentations via electron-transfer dissociation (ETD) reactions. Protein analysis by ETD is thought to enhance the range of amino acid sequences that can be analyzed by mass spectrometry-based proteomics. One example is the improved capability to characterize phosphorylated peptides. Our approach to charge-state determination uses a combination of signal processing and statistical machine learning. The signal processing employs correlation and convolution analyses to determine precursor masses and charge states of peptides. We discuss the applicability of these methods to spectra of different charge states. We note that in our applications correlation analysis outperforms convolution in determining peptide charge states. The correlation analysis is best suited for spectra with a prevalence of complementary ions. It is highly specific but depends on the quality of the spectra. The linear discriminant analysis (LDA) approach uses a number of other spectral features to predict charge states. We train an LDA classifier on a set of manually curated spectral data from a mixture of proteins of known identity. There are over 5000 spectra in the training set. A number of features, pertinent to spectra of peptides obtained via ETD reactions, have been used in the training. The loading coefficients of LDA indicate the relative importance of different features for charge-state determination. We have applied our model to a test data set generated from a mixture of 49 proteins. We search the spectra with and without use of the charge-state determination. The charge-state determination significantly reduces database search times. We discuss the cost associated with the possible misclassification of charge states.
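The idea of using complementary ions to pick a precursor charge can be illustrated with a small sketch. This is not the correlation analysis from the paper. It is a simplified pair-counting stand-in that assumes singly charged fragments and a caller-supplied `pair_offset` for the ion type (0.0 corresponds to b/y-like pairs; ETD c/z pairs would need their own offset). The function names, tolerance and candidate charges are all hypothetical.

```python
from itertools import combinations

PROTON = 1.007276  # proton mass in Da


def count_complementary_pairs(fragment_mz, precursor_mz, charge,
                              tol=0.5, pair_offset=0.0):
    """Count fragment pairs whose m/z values sum to the value expected
    for the given precursor charge (fragments assumed singly charged)."""
    # Neutral precursor mass implied by this candidate charge.
    neutral_mass = charge * (precursor_mz - PROTON)
    # Two singly protonated complementary fragments together carry the
    # whole peptide plus two protons (plus an ion-type specific offset).
    target = neutral_mass + 2 * PROTON + pair_offset
    return sum(1 for a, b in combinations(fragment_mz, 2)
               if abs(a + b - target) <= tol)


def guess_charge(fragment_mz, precursor_mz, candidates=(2, 3, 4, 5),
                 tol=0.5, pair_offset=0.0):
    """Return the candidate charge supported by the most complementary
    pairs, together with the per-charge pair counts."""
    scores = {z: count_complementary_pairs(fragment_mz, precursor_mz, z,
                                           tol, pair_offset)
              for z in candidates}
    best = max(scores, key=scores.get)  # ties resolve to the first candidate
    return best, scores
```

As the abstract notes for the correlation analysis, a score of this kind is only informative when the spectrum actually contains complementary ions; a spectrum without them gives zero counts for every candidate charge.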
19 citations
••
TL;DR: In this paper, a new peak detection (NPD) method is proposed for sensitive and unbiased detection of new or changing site-specific attributes between a sample and a reference, which is not possible with conventional UV or fluorescence detection-based methods.
3 citations
Cited by
••
TL;DR: The progress of proteomics has been driven by the development of new technologies for peptide/protein separation, mass spectrometry analysis, isotope labeling for quantification, and bioinformatics data analysis.
Abstract: According to Genome Sequencing Project statistics (http://www.ncbi.nlm.nih.gov/genomes/static/gpstat.html), as of Feb 16, 2012, complete gene sequences have become available for 2816 viruses, 1117 prokaryotes, and 36 eukaryotes.1–2 The availability of full genome sequences has greatly facilitated biological research in many fields, and has greatly contributed to the growth of proteomics.
Proteins are important because they are the direct bio-functional molecules in living organisms. The term “proteomics” was coined from merging “protein” and “genomics” in the 1990s.3–4 As a post-genomic discipline, proteomics encompasses efforts to identify and quantify all the proteins of a proteome, including expression, cellular localization, interactions, post-translational modifications (PTMs), and turnover as a function of time, space and cell type, thus making the full investigation of a proteome more challenging than sequencing a genome. There are possibly 100,000 protein forms encoded by the approximately 20,235 genes of the human genome,5 and determining the explicit function of each form will be a challenge.
The progress of proteomics has been driven by the development of new technologies for peptide/protein separation, mass spectrometry analysis, isotope labeling for quantification, and bioinformatics data analysis. Mass spectrometry has emerged as a core tool for large-scale protein analysis. In the past decade, there has been a rapid advance in the resolution, mass accuracy, sensitivity and scan rate of mass spectrometers used to analyze proteins. In addition, hybrid mass analyzers have been introduced recently (e.g. Linear Ion Trap-Orbitrap series6–7) which have significantly improved proteomic analysis.
“Bottom-up” protein analysis refers to the characterization of proteins by analysis of peptides released from the protein through proteolysis. When bottom-up is performed on a mixture of proteins it is called shotgun proteomics,8–10 a name coined by the Yates lab because of its analogy to shotgun genomic sequencing.11 Shotgun proteomics provides an indirect measurement of proteins through peptides derived from proteolytic digestion of intact proteins. In a typical shotgun proteomics experiment, the peptide mixture is fractionated and subjected to LC-MS/MS analysis. Peptide identification is achieved by comparing the tandem mass spectra derived from peptide fragmentation with theoretical tandem mass spectra generated from in silico digestion of a protein database. Protein inference is accomplished by assigning peptide sequences to proteins. Because peptides can be either uniquely assigned to a single protein or shared by more than one protein, the identified proteins may be further scored and grouped based on their peptides. In contrast, another strategy, termed ‘top-down’ proteomics, is used to characterize intact proteins (Figure 1). The top-down approach has some potential advantages for PTM and protein isoform determination and has achieved notable success. Intact proteins have been measured up to 200 kDa,12 and a large scale study has identified more than 1,000 proteins by multi-dimensional separations from complex samples.13 However, the top-down method has significant limitations compared with shotgun proteomics due to difficulties with protein fractionation, protein ionization and fragmentation in the gas phase. By relying on the analysis of peptides, which are more easily fractionated, ionized and fragmented, shotgun proteomics can be more universally adopted for protein analysis. 
In fact, a hybrid of bottom-up and top-down methodologies and instrumentation has been introduced as middle-down proteomics.14 Essentially, middle-down proteomics analyzes larger peptide fragments than bottom-up proteomics, minimizing peptide redundancy between proteins. Additionally, the large peptide fragments offer advantages similar to those of top-down proteomics, such as further insight into post-translational modifications, without the analytical challenges of analyzing intact proteins. Shotgun proteomics has become a workhorse for the analysis of proteins and their modifications and will be increasingly combined with top-down methods in the future.
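The in silico digestion step mentioned above can be sketched as follows. The abstract does not name a protease, so this example assumes the commonly used trypsin rule (cleave after K or R unless the next residue is P). The function name and its parameters are hypothetical.

```python
import re


def tryptic_digest(sequence, missed_cleavages=0, min_length=1):
    """Return peptides from an in silico tryptic digest of one protein.

    Assumed rule: cleave C-terminal to K or R, but not before P.
    """
    # Zero-width split points: preceded by K/R and not followed by P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]
    peptides = []
    for i in range(len(pieces)):
        # Join up to `missed_cleavages` additional neighbouring pieces.
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    for peptide in tryptic_digest("MKRPAKG", missed_cleavages=1):
        print(peptide)
```

Allowing missed cleavages and raising `min_length` changes how many peptides are shared between proteins, which is the redundancy issue that middle-down proteomics tries to reduce by working with larger fragments.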
Figure 1
Proteomic strategies: bottom-up vs. top-down vs. middle-down. The bottom-up approach analyzes proteolytic peptides. The top-down method measures the intact proteins. The middle-down strategy analyzes larger peptides resulting from limited digestion or ...
In the past decade shotgun proteomics has been widely used by biologists for many different research experiments, advancing biological discoveries. Some applications include, but are not limited to, proteome profiling, protein quantification, protein modification, and protein-protein interaction. There have been several reviews nicely summarizing mass spectrometry history,15 protein quantification with mass spectrometry,16 its biological applications,5,17–26 and many recent advances in methodology.27–32 In this review, we try to provide a full and updated survey of shotgun proteomics, including the fundamental techniques and applications that laid the foundation along with those developed and greatly improved in the past several years.
1,184 citations
••
TL;DR: This manuscript provides a comprehensive review of the peptide and protein identification process using tandem mass spectrometry (MS/MS) data generated in shotgun proteomic experiments, and includes a detailed analysis of the issues affecting the interpretation of data at the protein level.
500 citations
••
TL;DR: A strategy for forming and purifying a functional human β2AR–β-arrestin-1 complex is devised that provides a framework for better understanding the basis of GPCR regulation by arrestins.
Abstract: Single-particle electron microscopy and hydrogen–deuterium exchange mass spectrometry are used to characterize the structure and dynamics of a G-protein-coupled receptor–arrestin complex. Much has been learned about the structure of G-protein-coupled receptors (GPCRs) over the past seven years, but we still don't know what an activated GPCR looks like when it is bound to a β-arrestin. (Arrestins are cellular mediators with a broad range of functions, many of them involving GPCRs.) In this study the authors use single-particle electron microscopy and hydrogen–deuterium exchange mass spectrometry to characterize the structure and dynamics of a GPCR–arrestin complex. Their data support a 'biphasic' mechanism, in which the arrestin initially interacts with the phosphorylated carboxy terminus of the GPCR before re-arranging to more fully engage the membrane protein in a signalling-competent conformation. G-protein-coupled receptors (GPCRs) are critically regulated by β-arrestins, which not only desensitize G-protein signalling but also initiate a G-protein-independent wave of signalling1,2,3,4,5. A recent surge of structural data on a number of GPCRs, including the β2 adrenergic receptor (β2AR)–G-protein complex, has provided novel insights into the structural basis of receptor activation6,7,8,9,10,11. However, complementary information has been lacking on the recruitment of β-arrestins to activated GPCRs, primarily owing to challenges in obtaining stable receptor–β-arrestin complexes for structural studies. Here we devised a strategy for forming and purifying a functional human β2AR–β-arrestin-1 complex that allowed us to visualize its architecture by single-particle negative-stain electron microscopy and to characterize the interactions between β2AR and β-arrestin 1 using hydrogen–deuterium exchange mass spectrometry (HDX-MS) and chemical crosslinking. 
Electron microscopy two-dimensional averages and three-dimensional reconstructions reveal bimodal binding of β-arrestin 1 to the β2AR, involving two separate sets of interactions, one with the phosphorylated carboxy terminus of the receptor and the other with its seven-transmembrane core. Areas of reduced HDX together with identification of crosslinked residues suggest engagement of the finger loop of β-arrestin 1 with the seven-transmembrane core of the receptor. In contrast, focal areas of raised HDX levels indicate regions of increased dynamics in both the N and C domains of β-arrestin 1 when coupled to the β2AR. A molecular model of the β2AR–β-arrestin signalling complex was made by docking activated β-arrestin 1 and β2AR crystal structures into the electron microscopy map densities with constraints provided by HDX-MS and crosslinking, allowing us to obtain valuable insights into the overall architecture of a receptor–arrestin complex. The dynamic and structural information presented here provides a framework for better understanding the basis of GPCR regulation by arrestins.
424 citations
••
TL;DR: This work elucidates the architecture and assembly pathway across three classes of mSWI/SNF complexes: canonical BRG1/BRM-associated factor (BAF), polybromo-associated BAF (PBAF), and newly defined ncBAF complexes. It also defines the requirement of each subunit for complex formation and stability.
418 citations
••
TL;DR: An integrated workflow that robustly identifies cross-links from endogenous protein complexes in human cellular lysates is described, based on the application of mass spectrometry (MS)-cleavable cross-linkers, sequential collision-induced dissociation (CID)–tandem MS (MS/MS) and electron-transfer dissociation (ETD)-MS/MS acquisitions, and a dedicated search engine, XlinkX.
Abstract: A crosslinking-mass spectrometry strategy, including a new proteome database search engine called XlinkX, enables the identification of inter- and intra-protein cross-links in cell lysates on a proteome-wide scale. We describe an integrated workflow that robustly identifies cross-links from endogenous protein complexes in human cellular lysates. Our approach is based on the application of mass spectrometry (MS)-cleavable cross-linkers, sequential collision-induced dissociation (CID)–tandem MS (MS/MS) and electron-transfer dissociation (ETD)-MS/MS acquisitions, and a dedicated search engine, XlinkX, which allows rapid cross-link identification against a complete human proteome database. This approach allowed us to detect 2,426 unique cross-links at a 5% FDR (2,013 intraprotein and 413 interprotein cross-links) or 1,822 cross-links at a 1% FDR (1,622 intraprotein and 200 interprotein cross-links), indicating the detection of 326 or 134 protein-protein interactions at 5% FDR or 1% FDR, respectively, in HeLa cell lysates. We validated the confidence of our cross-linking results by using a target-decoy strategy and mapping the observed cross-link distances onto existing high-resolution structures. Our data provided new structural information about many protein assemblies and captured dynamic interactions of the ribosome in contact with different elongation factors.
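The target-decoy strategy mentioned in this abstract can be illustrated with a generic sketch. It uses the simple estimate FDR ≈ decoys / targets among matches at or above a score threshold. Cross-link searches generally need a more involved estimate because each match involves two peptides, and nothing here reproduces XlinkX's actual procedure. The function names and the `(score, is_decoy)` data layout are hypothetical.

```python
def fdr_at_threshold(matches, threshold):
    """Estimate FDR as decoys / targets among matches scoring >= threshold.

    `matches` is an iterable of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    if targets == 0:
        # No target hits: the estimate is undefined unless nothing passed.
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def lowest_threshold_for_fdr(matches, max_fdr):
    """Return the lowest observed score whose estimated FDR is <= max_fdr,
    or None if no threshold qualifies."""
    matches = list(matches)
    best = None
    # The estimate is not monotone in the threshold, so check every score.
    for threshold in sorted({score for score, _ in matches}, reverse=True):
        if fdr_at_threshold(matches, threshold) <= max_fdr:
            best = threshold
    return best
```

Calling `lowest_threshold_for_fdr` once with `max_fdr=0.05` and once with `max_fdr=0.01` on the same scored matches mirrors how a result set is reported at both a 5% and a 1% FDR, with the stricter cut-off retaining fewer identifications.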
394 citations