
Showing papers in "Journal of biomolecular techniques in 2013"


Journal ArticleDOI
TL;DR: Hair samples proved to be a good source of genomic DNA for PCR-based methods and, given the ease of noninvasive sample collection, lower sample-volume requirements, and good storage capability, can also be used for genomic disorder analysis in addition to forensic analysis.
Abstract: Isolation of DNA from blood and buccal swabs in adequate quantities is an integral part of forensic research and analysis. The present study was performed to determine the quality and quantity of DNA extracted from four commonly available samples and to estimate the time duration of the ensuing PCR amplification. Here, we demonstrate that hair and urine samples can also become an alternate source for reliably obtaining a small quantity of PCR-ready DNA. We developed a rapid, cost-effective, and noninvasive method of sample collection and simple DNA extraction from buccal swabs, urine, and hair using the phenol-chloroform method. Buccal samples were subjected to DNA extraction, immediately or after refrigeration (4–6°C) for 3 days. The purity and concentration of the extracted DNA were determined spectrophotometrically, and the adequacy of DNA extracts for the PCR-based assay was assessed by amplifying a 1030-bp region of the mitochondrial D-loop. Although DNA from all the samples was suitable for PCR, the blood and hair samples provided good-quality DNA for restriction analysis of the PCR product compared with the buccal swab and urine samples. In the present study, hair samples proved to be a good source of genomic DNA for PCR-based methods. Hence, DNA from hair samples can also be used for genomic disorder analysis in addition to forensic analysis, given the ease of noninvasive sample collection, lower sample-volume requirements, and good storage capability.
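The spectrophotometric purity and concentration checks described above follow standard conversions; a minimal sketch (function names and readings are illustrative, not values from the study):

```python
# Hypothetical helpers (not from the paper) illustrating the standard
# spectrophotometric conversions used to assess DNA extracts: dsDNA
# concentration from A260 (50 ng/uL per absorbance unit) and purity
# from the A260/A280 ratio (~1.8 for protein-free DNA).

def dna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """dsDNA concentration in ng/uL from absorbance at 260 nm."""
    return a260 * 50.0 * dilution_factor

def purity_ratio(a260, a280):
    """A260/A280 ratio; ~1.8 suggests DNA largely free of protein."""
    return a260 / a280

print(dna_concentration_ng_per_ul(0.05, dilution_factor=10))  # 25.0 ng/uL
print(round(purity_ratio(0.05, 0.028), 2))  # 1.79
```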

156 citations


Journal ArticleDOI
TL;DR: This study compared outcomes from two leading companies, Agilent Technologies and Roche NimbleGen, which offer custom-targeted genomic enrichment methods, and assessed each platform's ability to detect single-nucleotide polymorphisms (SNPs) as a test of capturing both chromosomes from the sample.
Abstract: Isolating high-priority segments of genomes greatly enhances the efficiency of next-generation sequencing (NGS) by allowing researchers to focus on their regions of interest. For the 2010-11 DNA Sequencing Research Group (DSRG) study, we compared outcomes from two leading companies, Agilent Technologies (Santa Clara, CA, USA) and Roche NimbleGen (Madison, WI, USA), which offer custom-targeted genomic enrichment methods. Both companies were provided with the same genomic sample and challenged to capture identical genomic locations for DNA NGS. The target region totaled 3.5 Mb and included 31 individual genes and a 2-Mb contiguous interval. Each company was asked to design its best assay, perform the capture in replicates, and return the captured material to the DSRG-participating laboratories. Sequencing was performed in two different laboratories on Genome Analyzer IIx systems (Illumina, San Diego, CA, USA). Sequencing data were analyzed for sensitivity, specificity, and coverage of the desired regions. The success of the enrichment was highly dependent on the design of the capture probes. Overall, coverage variability was higher for the Agilent samples. As variant discovery is the ultimate goal for a typical targeted sequencing project, we compared samples for their ability to sequence single-nucleotide polymorphisms (SNPs) as a test of the ability to capture both chromosomes from the sample. In the targeted regions, we detected 2546 SNPs with the NimbleGen samples and 2071 with Agilent's. When limited to the regions that both companies included as baits, the number of SNPs was ∼1000 for each, with Agilent and NimbleGen finding a small number of unique SNPs not found by the other.
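The variant-overlap comparison above amounts to set operations on SNP calls; a toy illustration (positions are invented, not the study's data), restricted in practice to regions both designs included as baits:

```python
# Hypothetical SNP call sets as (chromosome, position) tuples; in the study,
# the comparison was limited to intervals baited by both capture designs.
nimblegen = {("chr1", 100), ("chr1", 250), ("chr2", 40)}
agilent = {("chr1", 100), ("chr2", 40), ("chr2", 90)}

shared = nimblegen & agilent          # SNPs found by both platforms
only_nimblegen = nimblegen - agilent  # unique to NimbleGen
only_agilent = agilent - nimblegen    # unique to Agilent

print(len(shared), len(only_nimblegen), len(only_agilent))  # 2 1 1
```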

138 citations


Journal ArticleDOI
TL;DR: It is demonstrated that reads shorter than the theoretical minimum length are of lower overall quality and not simply truncated reads, as well as some variation detected in loading conditions, sequencing yield, and homopolymer length accuracy.
Abstract: As part of the DNA Sequencing Research Group of the Association of Biomolecular Resource Facilities, we have tested the reproducibility of the Roche/454 GS-FLX Titanium System at five core facilities. Experience with the Roche/454 system varied among the participating sites, with as many as 340 sequencing runs performed. All participating sites were supplied with an aliquot of a common DNA preparation and were requested to conduct sequencing at a common loading condition. Sequencing yield and accuracy metrics were evaluated at a single site. The study was conducted using a laboratory strain of the Dutch elm disease fungus Ophiostoma novo-ulmi strain H327, an ascomycete, vegetatively haploid fungus with an estimated genome size of 30-50 Mb. We show that the Titanium System is reproducible, with some variation detected in loading conditions, sequencing yield, and homopolymer length accuracy. We demonstrate that reads shorter than the theoretical minimum length are of lower overall quality and not simply truncated reads. The O. novo-ulmi H327 genome assembly is 31.8 Mb and comprises eight chromosome-length linear scaffolds, a circular mitochondrial contig of 66.4 kb, and a putative 4.2-kb linear plasmid. We estimate that the nuclear genome encodes 8613 protein-coding genes, and the mitochondrion encodes 15 genes and 26 tRNAs.
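The short-read quality claim above can be checked by binning reads at a minimum-length threshold and comparing mean quality; a sketch with made-up reads and an arbitrary 50 bp cutoff (neither taken from the study):

```python
# Each tuple is (sequence, mean base quality); all values are illustrative.
reads = [("ACGT" * 30, 32.1), ("ACGTACGT", 18.4), ("ACGT" * 25, 30.7)]
MIN_LEN = 50  # hypothetical theoretical minimum read length

short_q = [q for seq, q in reads if len(seq) < MIN_LEN]
full_q = [q for seq, q in reads if len(seq) >= MIN_LEN]

mean_short = sum(short_q) / len(short_q)
mean_full = sum(full_q) / len(full_q)

# Reads under the length cutoff show lower mean quality in this toy data,
# mirroring the pattern the study reports for sub-minimum-length reads.
print(mean_short < mean_full)  # True
```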

60 citations


Journal ArticleDOI
TL;DR: The optimal methods and results for embedding and cryosectioning whole-body ZF for MALDI-MSI are described and a supportive hydrogel with the consistency of cartilage is found to be the optimal embedding medium.
Abstract: Mass spectrometry imaging (MSI) methods and protocols have become widely adapted to a variety of tissues and species. However, the MSI literature lacks information on whole-body cryosection preparation for the zebrafish (ZF; Danio rerio), a model organism routinely used in developmental and toxicity studies. The optimal medium for embedding and cryosectioning a whole-body or soft tissue specimen for traditional histological studies is a synthetic polymer mixture that is incompatible with MSI due to peak interference and ion suppression. We describe methods and results for embedding and cryosectioning whole-body ZF from multiple trials to optimize medium compatibility with MSI. A natural polymer commonly used in the food processing industry was determined to be the best embedding medium. The natural polymer medium does not interfere with MSI data collection, aids in tissue stability, is readily available for purchase, and is easy to prepare and handle during cryosectioning. Additionally, we optimized the matrix application step by decreasing the matrix cluster interference commonly caused by α-cyano-4-hydroxycinnamic acid (CHCA) by adding a salt to the solvent spray solution. The optimized methods developed in our laboratory produced high-quality cryosections as well as high-quality MS images of sectioned ZF.

59 citations


Journal ArticleDOI
TL;DR: The results show that the Edwards method works better than the CTAB method for extracting DNA from tissues of Petunia hybrida and that buds and reproductive tissue, in general, yielded higher DNA concentrations than other tissues.
Abstract: Extraction of DNA from plant tissue is often problematic, as many plants contain high levels of secondary metabolites that can interfere with downstream applications, such as the PCR. Removal of these secondary metabolites usually requires further purification of the DNA using organic solvents or other toxic substances. In this study, we have compared two methods of DNA purification: the cetyltrimethylammonium bromide (CTAB) method that uses the ionic detergent hexadecyltrimethylammonium bromide and chloroform-isoamyl alcohol and the Edwards method that uses the anionic detergent SDS and isopropyl alcohol. Our results show that the Edwards method works better than the CTAB method for extracting DNA from tissues of Petunia hybrida. For six of the eight tissues, the Edwards method yielded more DNA than the CTAB method. In four of the tissues, this difference was statistically significant, and the Edwards method yielded 27-80% more DNA than the CTAB method. Among the different tissues tested, we found that buds, 4 days before anthesis, had the highest DNA concentrations and that buds and reproductive tissue, in general, yielded higher DNA concentrations than other tissues. In addition, DNA extracted using the Edwards method was more consistently PCR-amplified than CTAB-extracted DNA. Based on these results, we recommend using the Edwards method to extract DNA from plant tissues and using buds and reproductive structures for the highest DNA yields.

43 citations


Journal ArticleDOI
TL;DR: The use of FA/AF improved online RP-LC separations and led to significant increases in peptide identifications with improved protein sequence coverage.
Abstract: A major challenge facing current mass spectrometry (MS)-based proteomics research is the large concentration range displayed in biological systems, which far exceeds the dynamic range of commonly available mass spectrometers. One approach to overcome this limitation is to improve online reversed-phase liquid chromatography (RP-LC) separation methodologies. LC mobile-phase modifiers are used to improve peak shape and increase sample load tolerance. Trifluoroacetic acid (TFA) is a commonly used mobile-phase modifier, as it produces peptide separations that are far superior to other additives. However, TFA leads to signal suppression when incorporated with electrospray ionization (ESI), and thus, other modifiers, such as formic acid (FA), are used for LC-MS applications. FA exhibits significantly less signal suppression but is not as effective a modifier as TFA. An alternative mobile-phase modifier is the combination of FA and ammonium formate (AF), which has been shown to improve peptide separations. The ESI-MS compatibility of this modifier has not been investigated, particularly for proteomic applications. This work compares the separation metrics of mobile phases modified with FA and FA/AF and explores the use of FA/AF for the LC-MS analysis of tryptic digests. Standard tryptic-digest peptides were used for comparative analysis of peak capacity and sample load tolerance. The compatibility of FA/AF in proteomic applications was examined with the analysis of soluble proteins from canine prostate carcinoma tissue. Overall, the use of FA/AF improved online RP-LC separations and led to significant increases in peptide identifications with improved protein sequence coverage.
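A common way to quantify the separation improvements compared above is conditional peak capacity: one plus the gradient time divided by the average (4-sigma) peak width. A minimal sketch with illustrative numbers, not values from this study:

```python
def peak_capacity(gradient_time_s, mean_peak_width_s):
    """Conditional peak capacity of a gradient separation:
    1 + (gradient time) / (average 4-sigma peak width)."""
    return 1.0 + gradient_time_s / mean_peak_width_s

# A 60-min gradient with 12 s average peak widths (hypothetical values):
print(peak_capacity(3600.0, 12.0))  # 301.0

# Narrower peaks at the same gradient length raise the capacity,
# which is why mobile-phase modifiers that sharpen peaks matter.
print(peak_capacity(3600.0, 8.0))   # 451.0
```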

36 citations


Journal ArticleDOI
TL;DR: This work reports on the use of qPCR, combined with two controls, to identify and quantify indoor fungal contaminants with a greater degree of confidence than has been achieved previously.
Abstract: The goal of this project is to improve the quantification of indoor fungal pollutants via the specific application of quantitative PCR (qPCR). Improvement will be made in the controls used in current qPCR applications. This work focuses on the use of two separate controls within a standard qPCR reaction. The first control developed was the internal standard control gene, benA. This gene encodes β-tubulin and was selected based on its single-copy nature. The second control developed was the standard control plasmid, which contained a fragment of the ribosomal RNA (rRNA) gene and produced a specific PCR product. The results confirm the multicopy nature of the rRNA region in several filamentous fungi and show that we can quantify fungi of unknown genome size over a range of spore extractions by inclusion of these two standard controls. Advances in qPCR have led to extremely sensitive and quantitative methods for single-copy genes; however, it has not been well established that rRNA can be used to quantify fungal contamination. We report on the use of qPCR, combined with two controls, to identify and quantify indoor fungal contaminants with a greater degree of confidence than has been achieved previously. Advances in indoor environmental health have demonstrated that contamination of the built environment by the filamentous fungi has adverse impacts on the health of building occupants. This study meets the need for more accurate and reliable methods for fungal identification and quantitation in the indoor environment.
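Quantification in qPCR studies like this one typically runs through a linear standard curve, Cq = slope x log10(copies) + intercept; a sketch with typical illustrative slope and intercept values, not the study's calibration:

```python
# Hypothetical standard-curve parameters: a slope of -3.32 corresponds to
# 100% amplification efficiency; the intercept is the Cq of a single copy.

def copies_from_cq(cq, slope=-3.32, intercept=37.0):
    """Copy number from the quantification cycle via the standard curve."""
    return 10 ** ((cq - intercept) / slope)

def amplification_efficiency(slope=-3.32):
    """PCR efficiency implied by the slope (1.0 = 100%)."""
    return 10 ** (-1.0 / slope) - 1.0

print(round(copies_from_cq(30.36), 1))        # ~100 copies
print(round(amplification_efficiency(), 2))   # ~1.0 (100% efficient)
```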

33 citations


Journal ArticleDOI
TL;DR: Optimized DDA settings were applied to the analysis of Trypanosoma brucei peptides, yielding peptide identifications at a rate almost five times faster than previously used methodologies, significantly improving protein identification workflows that use typically available instrumentation.
Abstract: Recent developments in chromatography, such as ultra-HPLC and superficially porous particles, offer significantly improved peptide separation. The narrow peak widths, often only several seconds, can permit a 15-min liquid chromatography run to have a similar peak capacity as a 60-min run using traditional HPLC approaches. In theory, these larger peak capacities should provide higher protein coverage and/or more protein identifications when incorporated into a proteomic workflow. We initially observed a decrease in protein coverage when implementing these faster chromatographic approaches, due to data-dependent acquisition (DDA) settings that were not properly set to match the narrow peak widths resulting from newly implemented, fast separation techniques. Oversampling of high-intensity peptides led to low protein-sequence coverage, and tandem mass spectra (MS/MS) from lower-intensity peptides were of poor quality, as automated MS/MS events were occurring late on chromatographic peaks. These observations led us to optimize DDA settings for these fast separations. Optimized DDA settings were applied to the analysis of Trypanosoma brucei peptides, yielding peptide identifications at a rate almost five times faster than previously used methodologies. The described approach significantly improves protein identification workflows that use typically available instrumentation.
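The mismatch described above is easy to see as arithmetic: the instrument's full survey-plus-MS/MS cycle must repeat several times across each chromatographic peak, or MS/MS triggers land late on the peak. A back-of-the-envelope sketch with illustrative numbers, not the study's settings:

```python
def sampling_points_per_peak(peak_width_s, cycle_time_s):
    """How many full DDA cycles (survey scan + MS/MS events) fit
    across one chromatographic peak."""
    return peak_width_s / cycle_time_s

# A hypothetical 6 s peak from a fast separation with a 3 s DDA cycle is
# sampled only twice; shortening the cycle (e.g., fewer MS/MS events per
# cycle or shorter fill times) doubles the sampling density.
print(sampling_points_per_peak(6.0, 3.0))  # 2.0
print(sampling_points_per_peak(6.0, 1.5))  # 4.0
```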

23 citations


Journal Article
TL;DR: The SomaLogic proteomics platform is ideally suited to triaging biomarker candidates against large clinical sample collections, and has a large number of additional studies completed, in design or sample accrual.
Abstract: Biomarkers are fundamental to nearly every step in the drug discovery and development process, from target validation in the laboratory, to patient stratification in the clinic. Recently, genomic discovery tools have had success in this area, but MS-based proteomic methods less so. This is due to difficulties in developing high throughput assays (ELISA, etc.) for triaging biomarker candidates against large clinical sample collections. However, the SomaLogic proteomics platform is ideally suited to this task. At the heart of the detection technology are SOMAmers (Slow Off-rate Modified Aptamers). They are modified DNA aptamers with high affinity (10⁻⁹ to 10⁻¹² M) and high specificity for their cognate analytes. The assay is highly multiplexed, quantifying >1100 proteins simultaneously from a single 65 µL sample. Sensitivity of the array is generally comparable to sandwich ELISA performance (median LLOQ 100 fM, LoD 40 fM). Samples from a wide variety of sources are amenable to analysis – from serum to CSF, cell/tumor extracts, synovial fluid, etc. Biomarker signatures can be defined in as little as 5 weeks and clinically actionable diagnostics in as little as 6 months. We are currently engaged in a number of clinical discovery applications (Phase 0–4) and have a large number of additional studies completed, in design or sample accrual. The technology and these applications will be discussed.

16 citations


Journal ArticleDOI
TL;DR: Microwave irradiation was investigated as a simple, unbiased, and easy-to-multiplex way to fragment genomic DNA randomly, and the result was improved by amplification prior to emPCR.
Abstract: An unconventional approach for DNA fragmentation was investigated to explore its feasibility as an alternative to existing DNA fragmentation techniques for next-generation DNA sequencing applications. Current methods are based on strong-force liquid shearing or specialized enzymatic treatments. These platforms have shortcomings yet to be addressed, including aerosolization of genomic material, which may result in cross-contamination and biohazards; difficulty in multiplexing; and potential sequence biases. In this proof-of-concept study, we investigated microwave irradiation as a simple, unbiased, and easy-to-multiplex way to fragment genomic DNA randomly. In addition, heating DNA at high temperature was attempted for the same purpose and for comparison. Adaptive focused acoustic sonication was used as the control. The yield and functionality of the DNA fragments and DNA fragment libraries were analyzed to assess the feasibility and use of the proposed approach. Both microwave irradiation and thermal heating can fragment genomic DNA to size ranges suitable for next-generation sequencing (NGS) shotgun library preparation. However, both treatments caused severe reduction in PCR amplification efficiency, which led to low production in emulsion PCR (emPCR). The result was improved by amplification prior to emPCR. Further improvements, such as DNA strand repair, are needed for the method to be applied practically in NGS.

14 citations


Journal ArticleDOI
TL;DR: High-Resolution Melt (HRM) analysis is simpler to use than most other methods and provides comparable or more accurate discrimination between the two sibling species but requires a specialized melt-analysis instrument and software.
Abstract: There is a need for more cost-effective options to more accurately discriminate among members of the Anopheles gambiae complex, particularly An. gambiae and Anopheles arabiensis. These species are morphologically indistinguishable in the adult stage, have overlapping distributions, but are behaviorally and ecologically different, yet both are efficient vectors of malaria in equatorial Africa. The method described here, High-Resolution Melt (HRM) analysis, takes advantage of minute differences in DNA melting characteristics, depending on the number of incongruent single nucleotide polymorphisms in an intragenic spacer region of the X-chromosome-based ribosomal DNA. The two species in question differ by an average of 13 single-nucleotide polymorphisms giving widely divergent melting curves. A real-time PCR system, Bio-Rad CFX96, was used in combination with a dsDNA-specific dye, EvaGreen, to detect and measure the melting properties of the amplicon generated from leg-extracted DNA of selected mosquitoes. Results with seven individuals from pure colonies of known species, as well as 10 field-captured individuals unambiguously identified by DNA sequencing, demonstrated that the method provided a high level of accuracy. The method was used to identify 86 field mosquitoes through the assignment of each to the two common clusters with a high degree of certainty. Each cluster was defined by individuals from pure colonies. HRM analysis is simpler to use than most other methods and provides comparable or more accurate discrimination between the two sibling species but requires a specialized melt-analysis instrument and software.
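The HRM principle above reduces to finding the melt peak (Tm): the maximum of the negative first derivative of fluorescence with respect to temperature. A minimal sketch on a synthetic sigmoidal melt curve, not instrument data:

```python
import math

# Synthetic melt curve: fluorescence drops sigmoidally around 80 C, the
# hypothetical melting temperature of the amplicon.
temps = [75.0 + 0.5 * i for i in range(21)]                  # 75.0-85.0 C
fluor = [1.0 / (1.0 + math.exp(t - 80.0)) for t in temps]

# -dF/dT by finite differences; each value sits between two temperatures.
deriv = [-(fluor[i + 1] - fluor[i]) / (temps[i + 1] - temps[i])
         for i in range(len(temps) - 1)]

# Tm = midpoint of the interval where -dF/dT peaks. Species differing by
# several SNPs in the amplicon shift this peak, separating the clusters.
peak = max(range(len(deriv)), key=deriv.__getitem__)
tm = (temps[peak] + temps[peak + 1]) / 2.0
print(tm)  # close to 80.0
```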

Journal ArticleDOI
TL;DR: If handled appropriately, core facilities at universities can contribute to this partnership by offering services and access to high-end instrumentation to both nonprofit organizations and commercial organizations, which can be a win-win situation for both organizations that will support research and bolster the American economy.
Abstract: This article addresses the growing interest among U.S. scientific organizations and federal funding agencies in strengthening research partnerships between American universities and the private sector. It outlines how core facilities at universities can contribute to this partnership by offering services and access to high-end instrumentation to both nonprofit organizations and commercial organizations. We describe institutional policies (best practices) and procedures (terms and conditions) that are essential for facilitating and enabling such partnerships. In addition, we provide an overview of the relevant federal regulations that apply to external use of academic core facilities and offer a set of guidelines for handling them. We conclude by encouraging directors and managers of core facilities to work with the relevant organizational offices to promote and nurture such partnerships. If handled appropriately, we believe such partnerships can be a win-win situation for both organizations that will support research and bolster the American economy.

Journal ArticleDOI
TL;DR: UBC and B2M represent reliable reference genes for RT-qPCR studies in the rat ischemic wound model and are unaffected by sustained tissue ischemia, and the geometric mean of these two stable genes provides an accurate normalization factor.
Abstract: Reference genes are often used in RT-quantitative PCR (qPCR) analysis to normalize gene expression levels to a gene that is expressed stably across study groups. They ultimately serve as a control in RT-qPCR analysis, producing more accurate interpretation of results. Whereas many reference genes have been used in various wound-healing studies, the most stable reference gene for ischemic wound-healing analysis has yet to be identified. The goal of this study was to determine systematically the most stable reference gene for studying gene expression in a rat ischemic wound-healing model using RT-qPCR. Twelve commonly used reference genes were analyzed using RT-qPCR and geNorm data analysis to determine stability across normal and ischemic skin tissue. It was ultimately determined that Ubiquitin C (UBC) and β-2 Microglobulin (B2M) are the most stably conserved reference genes across normal and ischemic skin tissue. UBC and B2M represent reliable reference genes for RT-qPCR studies in the rat ischemic wound model and are unaffected by sustained tissue ischemia. The geometric mean of these two stable genes provides an accurate normalization factor. These results provide insight on dependence of reference-gene stability on experimental parameters and the importance of such reference-gene investigations.
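The normalization step described above is a small computation: scale each target gene's relative quantity by the geometric mean of the two stable reference genes. A sketch with made-up relative quantities, not data from the study:

```python
import math

def normalization_factor(ubc, b2m):
    """Geometric mean of the two reference-gene quantities (UBC, B2M)."""
    return math.sqrt(ubc * b2m)

def normalized_expression(target, ubc, b2m):
    """Target-gene quantity scaled by the reference-gene normalization factor."""
    return target / normalization_factor(ubc, b2m)

# Illustrative relative quantities for one sample:
print(normalization_factor(4.0, 9.0))          # 6.0
print(normalized_expression(12.0, 4.0, 9.0))   # 2.0
```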

Journal ArticleDOI
TL;DR: Solid-phase enzymatic dephosphorylation proved to be a viable tool to condition O-GlcNAcylated peptide in mixtures with phosphopeptides for selective affinity purification and is expected to accommodate additional chemistries to expand the scope of solid-phase serial derivatization for protein structural characterization.
Abstract: A rugged sample-preparation method for comprehensive affinity enrichment of phosphopeptides from protein digests has been developed. The method uses a series of chemical reactions to incorporate efficiently and specifically a thiol-functionalized affinity tag into the analyte by barium hydroxide catalyzed β-elimination with Michael addition using 2-aminoethanethiol as nucleophile and subsequent thiolation of the resulting amino group with sulfosuccinimidyl-2-(biotinamido) ethyl-1,3-dithiopropionate. Gentle oxidation of cysteine residues, followed by acetylation of α- and ε-amino groups before these reactions, ensured selectivity of reversible capture of the modified phosphopeptides by covalent chromatography on activated thiol-Sepharose. The use of C18 reversed-phase supports as a miniaturized reaction bed facilitated optimization of the individual modification steps for throughput and completeness of derivatization. Reagents were exchanged directly on the supports, eliminating sample transfer between the reaction steps and thus, allowing the immobilized analyte to be carried through the multistep reaction scheme with minimal sample loss. The use of this sample-preparation method for phosphopeptide enrichment was demonstrated with low-level amounts of in-gel-digested protein. As applied to tryptic digests of α-S1- and β-casein, the method enabled the enrichment and detection of the phosphorylated peptides contained in the mixture, including the tetraphosphorylated species of β-casein, which has escaped chemical procedures reported previously. The isolates proved highly suitable for mapping the sites of phosphorylation by collisionally induced dissociation.
β-Elimination, with consecutive Michael addition, expanded the use of the solid-phase-based enrichment strategy to phosphothreonyl peptides and to phosphoseryl/phosphothreonyl peptides derived from proline-directed kinase substrates and to their O-sulfono- and O-linked β-N-acetylglucosamine (O-GlcNAc)-modified counterparts. Solid-phase enzymatic dephosphorylation proved to be a viable tool to condition O-GlcNAcylated peptide in mixtures with phosphopeptides for selective affinity purification. Acetylation, as an integral step of the sample-preparation method, precluded reduction in recovery of the thiolation substrate caused by intrapeptide lysine-dehydroalanine cross-link formation. The solid-phase analytical platform provides robustness and simplicity of operation using equipment readily available in most biological laboratories and is expected to accommodate additional chemistries to expand the scope of solid-phase serial derivatization for protein structural characterization.

Journal ArticleDOI
TL;DR: The C-terminal peptides were selectively retrieved from the affinity support and proved highly suitable for structural characterization by collisionally induced dissociation and is expected to be readily expanded to gel-separated proteins.
Abstract: A sample preparation method for protein C-terminal peptide isolation has been developed. In this strategy, protein carboxylate glycinamidation was preceded by carboxyamidomethylation and optional α- and ϵ-amine acetylation in a one-pot reaction, followed by tryptic digestion of the modified protein. The digest was adsorbed on ZipTipC18 pipette tips for sequential peptide α- and ϵ-amine acetylation and 1-ethyl-(3-dimethylaminopropyl) carbodiimide-mediated carboxylate condensation with ethylenediamine. Amino group-functionalized peptides were scavenged on N-hydroxysuccinimide-activated agarose, leaving the C-terminal peptide in the flow-through fraction. The use of reversed-phase supports as a venue for peptide derivatization enabled facile optimization of the individual reaction steps for throughput and completeness of reaction. Reagents were exchanged directly on the support, eliminating sample transfer between the reaction steps. By this sequence of solid-phase reactions, the C-terminal peptide could be uniquely recognized in mass spectra of unfractionated digests of moderate complexity. The use of the sample preparation method was demonstrated with low-level amounts of a model protein. The C-terminal peptides were selectively retrieved from the affinity support and proved highly suitable for structural characterization by collisionally induced dissociation. The sample preparation method provides for robustness and simplicity of operation using standard equipment readily available in most biological laboratories and is expected to be readily expanded to gel-separated proteins.

Journal ArticleDOI
TL;DR: An automated phosphopeptide enrichment strategy is described using titanium dioxide (TiO2)-packed, fused silica capillaries for use with liquid chromatography (LC)-mass spectrometry (MS)/MS-based, label-free proteomics workflows, providing high degree of analytical reproducibility over large sample sets with complex experimental designs.
Abstract: An automated phosphopeptide enrichment strategy is described using titanium dioxide (TiO2)-packed, fused silica capillaries for use with liquid chromatography (LC)-mass spectrometry (MS)/MS-based, label-free proteomics workflows. To correlate an optimum peptide:TiO2 loading ratio between different particle types, the ratio of phenyl phosphate-binding capacities was used. The optimum loading for the column was then verified through replicate enrichments of a range of quantities of digested rat brain tissue cell lysate. Fractions were taken during sample loading, multiple wash steps, and the elution steps and analyzed by LC-MS/MS to gauge the efficiency and reproducibility of the enrichment. Greater than 96% of the total phosphopeptides were detected in the elution fractions, indicating efficient trapping of the phosphopeptides on the first pass of enrichment. The quantitative reproducibility of the automated setup was also improved greatly with phosphopeptide intensities from replicate enrichments exhibiting a median coefficient of variation (CV) of 5.8%, and 80% of the identified phosphopeptides had CVs below 11.1%, while maintaining >85% specificity. By providing this high degree of analytical reproducibility, this method allows for label-free phosphoproteomics over large sample sets with complex experimental designs (multiple biological conditions, multiple biological replicates, multiple time-points, etc.), including large-scale clinical cohorts.
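The reproducibility metric reported above is the percent coefficient of variation per phosphopeptide across replicate enrichments, summarized as a median across peptides; a sketch with made-up intensities, not the study's data:

```python
import statistics

def cv_percent(values):
    """Sample standard deviation over mean, as a percentage."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0

# Hypothetical intensities for two phosphopeptides across three replicates:
replicate_intensities = {
    "pep1": [9.0, 10.0, 11.0],
    "pep2": [100.0, 100.0, 100.0],
}

cvs = [cv_percent(v) for v in replicate_intensities.values()]
print(round(cv_percent([9.0, 10.0, 11.0]), 1))  # 10.0 (% CV for pep1)
print(round(statistics.median(cvs), 1))         # 5.0 (median % CV)
```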

Journal ArticleDOI
TL;DR: This work compared three published extraction procedures for subsequent applications to 2DE and found only the one involving grinding in liquid N2 and TCA-acetone precipitation led to proper resolution after 2DE, showing a good level of reproducibility at technical and biological levels.
Abstract: As it is well-established that protein extraction constitutes a crucial step for two-dimensional electrophoresis (2DE), this work was done as a prerequisite to further the study of alterations in the proteome in gills of the shore crab Carcinus maenas under contrasted environmental conditions. Because of the presence of a chitin layer, shore crab gills have an unusual structure. Consequently, they are considered as a hard tissue and represent a challenge for optimal protein extraction. In this study, we compared three published extraction procedures for subsequent application to 2DE: the first used a homogenization process, the second included an additional TCA-acetone precipitation step, and the third combined grinding in liquid nitrogen (N2) with TCA-acetone precipitation. Extracted proteins were then resolved using 1DE and 2DE. Although interesting patterns were obtained using 1DE with the three methods, only the one involving grinding in liquid N2 and TCA-acetone precipitation led to proper resolution after 2DE, showing a good level of reproducibility at technical (85%) and biological (84%) levels. This last method is therefore proposed for analysis of gill proteomes in the shore crab.

Journal ArticleDOI
TL;DR: The solid-phase approach proved highly suitable to prepare substrates from low-level amounts of protein digests for phosphorylation-site determination by chemical-targeted proteolysis and was used as an orthogonal method to confirm the identity of phosphopeptides in proteolytic mixtures.
Abstract: We previously adapted the β-elimination/Michael addition chemistry to solid-phase derivatization on reversed-phase supports, and demonstrated the utility of this reaction format to prepare phosphoseryl peptides in unfractionated protein digests for mass spectrometric identification and facile phosphorylation-site determination. Here, we have expanded the use of this technique to β-N-acetylglucosamine peptides, modified at serine/threonine, phosphothreonyl peptides, and phosphoseryl/phosphothreonyl peptides, followed in sequence by proline. The consecutive β-elimination with Michael addition was adapted to optimize the solid-phase reaction conditions for throughput and completeness of derivatization. The analyte remained intact during derivatization and was recovered efficiently from the silica-based, reversed-phase support with minimal sample loss. The general use of the solid-phase approach for enzymatic dephosphorylation was demonstrated with phosphoseryl and phosphothreonyl peptides and was used as an orthogonal method to confirm the identity of phosphopeptides in proteolytic mixtures. The solid-phase approach proved highly suitable to prepare substrates from low-level amounts of protein digests for phosphorylation-site determination by chemical-targeted proteolysis. The solid-phase protocol provides for a simple, robust, and efficient tool to prepare samples for phosphopeptide identification in MALDI mass maps of unfractionated protein digests, using standard equipment available in most biological laboratories. The use of a solid-phase analytical platform is expected to be readily expanded to prepare digest from O-glycosylated- and O-sulfonated proteins for mass spectrometry-based structural characterization.

Journal Article
TL;DR: The long read-length capabilities of the PacBio® RS to sequence full-length cDNA molecules derived from human polyA RNA are demonstrated, and a comparison of genome-based alignment approaches using existing aligners, GMAP and BLAT, for identifying novel transcripts is presented.
Abstract: Transcriptome sequencing using short read technologies (RNA-seq) provides valuable information on transcript abundance and rare transcripts. While short reads can be used to infer alternative splicing and variable transcription start sites, the use of short reads for these research questions creates computational problems due to uneven read coverage, complex splicing, and potential sequencing bias. Here, we demonstrate the long read-length capabilities of the PacBio® RS to sequence full-length cDNA molecules derived from human polyA RNA. Our library preparation method generates sequencing libraries highly enriched in full-length cDNA molecules. Because the PacBio® RS uses a circular sequencing structure, reads are putatively full-length if either both ends of the SMRT adapter or both the 5′ and 3′ cDNA library adaptor primers are seen. By mapping putative full-length reads against the Gencode database, we show that we recovered many full-length transcripts spanning a range of 500–6,000 bp in length. In addition, we identified potential alternative isoforms of known genes. We also present a comparison of genome-based alignment approaches using existing aligners, GMAP and BLAT, for identifying novel transcripts. Finally, we describe two published error correction methods, PacBioToCA and LSC, for improving PacBio read accuracy using Illumina short reads. We report our findings on detecting novel splicing events and full-length transcript characterization in a human sample, showing that PacBio® RS sequencing technology can assist researchers in better characterizing the transcriptome in its native, full-length form and help unlock combinatorial RNA processing regulation not observed in previous RNA-seq experiments.
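The full-length criterion described in this abstract reduces to a simple boolean rule; a minimal sketch follows (the function and argument names are illustrative, not the authors' code):

```python
def is_putative_full_length(saw_adapter_5p, saw_adapter_3p,
                            saw_primer_5p, saw_primer_3p):
    """A read counts as putatively full-length if both ends of the
    SMRT adapter were observed, or both the 5' and 3' cDNA library
    adaptor primers were found in the read."""
    return (saw_adapter_5p and saw_adapter_3p) or \
           (saw_primer_5p and saw_primer_3p)

print(is_putative_full_length(True, True, False, False))   # True
print(is_putative_full_length(False, True, True, True))    # True
print(is_putative_full_length(True, False, True, False))   # False
```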

Journal ArticleDOI
TL;DR: A microarray developed for high-resolution genotyping of genes that are candidates for involvement in environmentally driven aspects of lung cancer oncogenesis and/or tumor growth illustrates techniques for managing large panels of candidate genes and optimizing marker selection, aided by a new bioinformatics pipeline component, Tagger Batch Assistant.
Abstract: A microarray (LungCaGxE), based on Illumina BeadChip technology, was developed for high-resolution genotyping of genes that are candidates for involvement in environmentally driven aspects of lung cancer oncogenesis and/or tumor growth. The iterative array design process illustrates techniques for managing large panels of candidate genes and optimizing marker selection, aided by a new bioinformatics pipeline component, Tagger Batch Assistant. The LungCaGxE platform targets 298 genes and the proximal genetic regions in which they are located, using ∼13,000 single-nucleotide polymorphisms (SNPs), which include haplotype linkage markers with a minor allele frequency of at least 1% and additional specifically targeted SNPs, for which published reports have indicated functional consequences or associations with lung cancer or other smoking-related diseases. The overall assay conversion rate was 98.9%; 99.0% of markers with a minimum Illumina design score of 0.6 successfully generated allele calls using genomic DNA from a study population of 1,873 lung-cancer patients and controls.
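Marker selection of this kind, filtering candidates by design score and allele frequency, can be sketched as follows (the thresholds of 0.6 and 1% come from the abstract; the function and record layout are a hypothetical illustration, not the Tagger Batch Assistant pipeline):

```python
def select_markers(candidates, min_design_score=0.6, min_maf=0.01):
    """Keep SNPs meeting the minimum Illumina design score and minor
    allele frequency thresholds. Each candidate is a dict with
    'rsid', 'design_score', and 'maf' keys (an assumed layout)."""
    return [snp for snp in candidates
            if snp["design_score"] >= min_design_score
            and snp["maf"] >= min_maf]

candidates = [
    {"rsid": "rs0001", "design_score": 0.9, "maf": 0.25},
    {"rsid": "rs0002", "design_score": 0.4, "maf": 0.30},   # low design score
    {"rsid": "rs0003", "design_score": 0.8, "maf": 0.005},  # too rare
]
print([s["rsid"] for s in select_markers(candidates)])  # ['rs0001']
```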

Journal Article
TL;DR: The ability of Boreal Genomics' Aurora instrument to provide pure, high molecular weight (HMW) DNA 250-1,100 kb in length, ideally suited for optical mapping is demonstrated.
Abstract: Optical mapping generates an ordered restriction map from single, long DNA molecules. By overlapping restriction maps from multiple molecules, a physical map of entire chromosomes and genomes is constructed, greatly facilitating genome assembly in next generation sequencing projects, comparative genomics and strain typing. However, optical mapping relies on a method of preparing high quality DNA >250 kb in length, which can be challenging from some organisms and sample types. Here we demonstrate the ability of Boreal Genomics' Aurora instrument to provide pure, high molecular weight (HMW) DNA 250-1,100 kb in length, ideally suited for optical mapping. The Aurora performs electrophoretic DNA purification within an agarose gel in reusable cartridges, protecting long DNA molecules from shearing forces associated with liquid handling steps common to other purification methods. DNA can be purified directly from intact cells embedded and lysed within an agarose gel, preserving the highest molecular weight DNA possible while achieving exceptional levels of purity. The Aurora delivers DNA in a buffer solution, where DNA can be condensed and protected from shearing during recovery with a pipette. DNA is then returned to its regular coiled state by simple dilution prior to optical mapping. Here we present images showing HMW DNA purification taking place in the Aurora and subsequent images of single DNA molecules on OpGen's Argus® Optical Mapping System. Future work will focus on further optimizing Aurora HMW DNA purification to bias DNA recovery in favor of only the longest molecules in a sample, maximizing the benefits of optical mapping.
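The ordered restriction map underlying optical mapping is, at its simplest, a list of fragment lengths between cut sites along a single molecule; a toy sketch (coordinates and units are invented for illustration, not data from this study):

```python
def restriction_fragments(molecule_length, cut_sites):
    """Fragment lengths (same units as the inputs) of an ordered
    restriction map: distances between consecutive cut sites,
    including the two molecule ends."""
    bounds = [0] + sorted(cut_sites) + [molecule_length]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# A hypothetical 400 kb molecule cut at three sites:
print(restriction_fragments(400, [55, 180, 310]))  # [55, 125, 130, 90]
```

Overlapping such fragment-length signatures across many molecules is what allows assembly of a chromosome-scale physical map.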

Journal Article
TL;DR: There have been a number of recent improvements to further extend the length of PacBio® RS reads, allowing greatly improved and, in some cases, completed assemblies for genomes that have been considered impossible to assemble in the past.
Abstract: PacBio's SMRT® Sequencing produces the longest read lengths of any sequencing technology currently available. There have been a number of recent improvements to further extend the length of PacBio® RS reads. With an exponential read length distribution, there are many reads greater than 10 kb, and some reads at or beyond 20 kb. These improvements include library prep methods for generating >10 kb libraries, a new XL polymerase, magnetic bead loading, stage start, new XL sequencing kits, and increasing data collection time to 120 minutes per SMRT Cell. Each of these features will be described, with data illustrating the associated gains in performance. With these developments, we are able to obtain greatly improved and, in some cases, completed assemblies for genomes that have been considered impossible to assemble in the past, because they include repeats or low complexity regions spanning many kilobases. Long read lengths are valuable in other areas as well. In a single read, we can obtain sequence covering an entire viral segment, read through multi-kilobase amplicons with expanded repeats, and identify splice variants in long, full-length cDNA sequences. Examples of these applications will be shown.
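The exponential read-length claim can be made concrete: for an exponential distribution with mean m, the fraction of reads longer than t is e^(-t/m). A sketch under that idealized model (the 5 kb mean is a made-up example, not a PacBio specification, and real distributions are truncated at the polymerase lifetime):

```python
from math import exp

def fraction_longer_than(threshold_kb, mean_kb):
    """Under an idealized exponential read-length distribution with
    the given mean, the fraction of reads exceeding a threshold is
    e^(-threshold/mean)."""
    return exp(-threshold_kb / mean_kb)

# With a hypothetical 5 kb mean read length:
print(round(fraction_longer_than(10, 5), 3))   # 0.135 (reads > 10 kb)
print(round(fraction_longer_than(20, 5), 4))   # 0.0183 (reads > 20 kb)
```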

Journal Article
TL;DR: It is found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in non-expressing lineages, while promoters for genes expressed preferentially at later stages are often CG poor and employ DNA methylation upon repression.
Abstract: Epigenetic mechanisms have been proposed as crucial for regulating mammalian development, but their precise function is only partially understood. To investigate the epigenetic control of embryonic development, we differentiated human embryonic stem cells into mesendoderm, neural progenitor cells, trophoblast-like cells, and mesenchymal stem cells and systematically characterized DNA methylation, chromatin modifications, and the transcriptome in each lineage. Strikingly, we found that promoters that are active in early developmental stages tend to be CG rich and mainly engage H3K27me3 upon silencing in non-expressing lineages. By contrast, promoters for genes expressed preferentially at later stages are often CG poor and employ DNA methylation upon repression. Interestingly, the early developmental regulatory genes are often located in large genomic domains that are generally devoid of DNA methylation in most lineages, as we termed DNA methylation valleys (DMVs). Our results suggest that distinct epigenetic mechanisms regulate early and late stages of ES cell differentiation.
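The CG-rich versus CG-poor promoter distinction drawn above is commonly quantified by the observed/expected CpG ratio of the promoter sequence; a minimal sketch (the formula is the standard obs/exp calculation; the example sequences and any classification threshold are illustrative, not this study's exact criteria):

```python
def cpg_oe_ratio(seq):
    """Observed/expected CpG ratio: (#CpG dinucleotides * length) /
    (#C * #G). High values indicate CG-rich promoters; low values,
    CG-poor ones."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    return cpg * len(seq) / (c * g)

print(round(cpg_oe_ratio("CGCGATCGCG"), 2))   # 2.5 (CpG dense)
print(round(cpg_oe_ratio("CATTAGGATC"), 2))   # 0.0 (no CpG)
```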

Journal Article
TL;DR: The Proton system delivers high-quality individual exome datasets rapidly and can be used for trio analysis to detect shared germline SNPs with high confidence; the system is for research use only and not for use in diagnostic procedures.
Abstract: Rapid, accurate, and inexpensive sequencing of exomes is critical to understand DNA variation in human disease. Ion Torrent has developed a benchtop research semiconductor sequencer, the Ion Proton™, that uses a novel CMOS chip with 165 million 1.3-μm-diameter microwells, automatically templated sub-micron particles, and integrated hardware and software that enables acquisition of ~5 billion data points per second over a 2-4 hour runtime with on-instrument signal processing. To illustrate the speed, accuracy, and ease-of-use of the Proton system, analysis of a HapMap familial trio of exomes will be presented. Exome libraries are obtained with high-specificity hybridization probes targeting ~50 Mb of human exons that span 21,700 annotated protein-coding genes, microRNA, key non-coding RNA genes, and 44,000 predicted microRNA binding sites. Exome reads map on-target at 75-83% between runs, and 10.6 Gb of aligned data obtained from a single P1 chip yielded 141X average depth with 30X coverage of 90% of targeted bases. Read mapping, coverage analysis, variant calling, and annotation are done with Torrent Suite and Ion Reporter™ software. Each trio dataset yielded ~30,000 SNP calls from single runs that exceeded 9 Gb of aligned data. The observed Het:Hom ratio of 1.4-1.5 matches the published range of 1.25-1.7 for European ethnicity, and the observed Ts:Tv ratio of 2.9 agrees well with the published range of 2.8-3.1 for human exomes. The SNP concordance with dbSNP137 is greater than 98%, and Het and Hom concordances with Complete Genomics data are 98% and 96%, respectively. Mendelian inheritance analysis indicates that the error rate for Hets is 0.6%, with no errors for homozygous SNPs. The Proton system delivers high-quality individual exome datasets rapidly and can be used for trio analysis to detect shared germline SNPs with high confidence. The Ion Proton™ System is for research use only and not for use in diagnostic procedures.
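Two of the quality checks cited above, Mendelian inheritance consistency and the Ts:Tv ratio, are straightforward to compute; a hedged sketch (the genotype encoding and function names are assumptions for illustration, not Ion Reporter's implementation):

```python
def mendelian_consistent(child, mother, father):
    """True if the child's diploid genotype can be formed by taking
    one allele from each parent (genotypes are 2-tuples of alleles)."""
    c1, c2 = child
    return (c1 in mother and c2 in father) or \
           (c2 in mother and c1 in father)

def ts_tv_ratio(snvs):
    """Transition/transversion ratio for a list of (ref, alt) SNVs;
    human exomes are expected near 2.8-3.1, as cited above."""
    transitions = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
    ts = sum(1 for s in snvs if s in transitions)
    tv = len(snvs) - ts
    return ts / tv if tv else float("inf")

print(mendelian_consistent(("A", "G"), ("A", "A"), ("G", "G")))  # True
print(mendelian_consistent(("G", "G"), ("A", "A"), ("G", "G")))  # False
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))         # 2.0
```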

Journal Article
TL;DR: IGOR is a cloud-based platform for researchers and facilities to manage NGS data, design and run complex analysis pipelines, and efficiently collaborate on projects, and frees up the time of core laboratories to emphasize and focus on the research questions that ultimately guide them.
Abstract: Technical challenges facing researchers performing next-generation sequencing (NGS) analysis threaten to slow the pace of discovery and delay clinical applications of genomics data. Particularly for core laboratories, these challenges include: (1) Computation and storage have to scale with the vast amount of data generated. (2) Analysis pipelines are complex to design, set up, and share. (3) Collaboration, reproducibility, and sharing are hampered by privacy concerns and the sheer volume of data involved. Based on hands-on experience from large-scale NGS projects such as the 1000 Genomes Project, Seven Bridges Genomics has developed IGOR, a comprehensive cloud platform for NGS data analysis that fully addresses these challenges: IGOR is a cloud-based platform for researchers and facilities to manage NGS data, design and run complex analysis pipelines, and efficiently collaborate on projects. Over a dozen curated and peer-reviewed NGS data analysis pipelines are publicly available for free, including alignment, variant calling, and RNA-Seq. All pipelines are based on open source tools and built to peer-reviewed specifications in close collaboration with researchers at leading institutions such as the Harvard Stem Cell Institute. Without any command-line knowledge, NGS pipelines can be built and customized in an intuitive graphical editor, choosing from over 50 open source tools. When executing pipelines, IGOR automatically takes care of all resource management. Resources are seamlessly and automatically made available from Amazon Web Services and optimized for time and cost. Collaboration is facilitated through a project structure that allows researchers working in and across institutions to share files and pipelines. Fine-grained permissions allow detailed access control on a user-by-user basis for each project. Pipelines can be embedded and accessed through web pages akin to YouTube videos.
Extensive batch processing and parallelization capabilities mean that hundreds of samples can be analyzed in the same amount of time that a single sample can be processed. Using file metadata, batch processing can be automated, e.g., by file, library, sample or lane. The IGOR platform enables NGS research as a “turnkey” solution: Researchers can set up and run complex pipelines without expertise in command-line utilities or cloud computing. From a lab and facility perspective, the cloud-based architecture also eliminates the need to set up and maintain a large-scale infrastructure, typically resulting in at least 50% cost savings on infrastructure. By facilitating collaboration and easing analysis replication, the IGOR platform frees up the time of core laboratories to emphasize and focus on the research questions that ultimately guide them.

Journal Article
TL;DR: Data interpretation by means of clustering, statistical, and data analysis approaches has shown protein, lipid, and metabolite data to be complementary and confirmatory, which is further supported by the resulting pathway analysis output.
Abstract: Drug toxicity is a major reason for the failure of candidate pharmaceuticals during their development. It is therefore important to recognize the potential for toxicity in a timely fashion. Many xenobiotics are bioactivated into toxic metabolites by cytochromes P450 (CYP). However, the activity of these enzymes typically declines in in vitro systems. Recently, a transformed human hepatocyte cell line (THLE) became available in which the metabolic activity of specific CYP isoforms is maintained. THLE cells could be an ideal system in which to examine the potential toxicity of candidate pharmaceuticals. The baseline effect of the addition of CYP2E1 into THLE hepatocytes has been characterized to better understand the biochemistry of this model system. Dedicated and independent sample preparation protocols were applied in order to isolate metabolites, lipids, and proteins. Three independent replicates of THLE null or THLE +2E1 cells were investigated for all analyte classes. Proteins were recovered and digested with trypsin overnight. The same LC-MS Omics Research Platform was used for all experiments, and generic, application-dependent LC conditions were applied throughout. In all instances, MS data were acquired using a data-independent analysis (DIA) approach, whereby the energy applied to the collision cell was switched between a low and elevated energy state during alternate scans. For the proteomics experiments, ion mobility separation (IM) was incorporated into the analytical schema (IM-DIA). Multi-omic data were processed and searched using TransOmics software, allowing for normalized label-free quantitation. Pathway analysis and systems biology experiments were conducted to interrogate the datasets further using various bioinformatics tools. Comparison of the correlation, variance, and fold change between the two groups reveals significant differences in analyte expression.
Data interpretation by means of clustering, statistical, and data analysis approaches has shown protein, lipid, and metabolite data to be complementary and confirmatory, which is further supported by the resulting pathway analysis output.

Journal Article
TL;DR: New technology that allows for RNA-seq from a panel of directed amplicons using an AmpliSeq™ approach with Ion Torrent semiconductor sequencing is demonstrated, showing that the technique produces results that are technically reproducible, quantitative, and well correlated with qPCR using TaqMan® assays.
Abstract: As next-generation sequencing matures, it is quickly moving into translational research applications, where it promises to be a useful tool for diagnosis and treatment decisions in a clinical setting. RNA profiling using NGS (RNA-seq) is one of the applications where this potential is currently being realized. RNA-seq experiments have traditionally started with a whole-transcriptome library preparation that produces a sequencing template from all RNA species in a sample. However, in many cases, only a handful of the genes present are necessary to make a clinically relevant diagnosis. We have demonstrated new technology that allows for RNA-seq from a panel of directed amplicons using an AmpliSeq™ approach with Ion Torrent semiconductor sequencing. This approach offers many advantages over microarray or qPCR, such as faster turnaround and data analysis, sample multiplexing, lower RNA inputs, and the ability to use degraded or FFPE-derived samples. In addition, the technique simultaneously provides quantitative gene expression information and gene sequence at the single-nucleotide level. We have compiled three gene panels for testing the method, including a cancer panel, an apoptosis panel, and a panel derived from the MicroArray Quality Control (MAQC) consortium. Starting with 10 ng of total RNA, cDNA is made, followed by amplification using primers designed for targeted genes. Resulting amplicons are prepared for sequencing using the AmpliSeq™ technology and sequenced on the Ion Torrent PGM. We demonstrate that the technique produces results that are technically reproducible, quantitative, and have excellent correlation with qPCR using TaqMan® assays. Employing barcodes, we have also tested multiple samples on a single chip, thereby increasing the cost-effectiveness of the tool for clinical and research use.
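Correlation with TaqMan qPCR is typically assessed on a log scale, comparing log2 read counts against relative quantities derived from Ct values; a sketch with invented numbers (not data from this study):

```python
from math import log2, sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# log2 amplicon read counts vs. relative quantities (2^-dCt) from
# hypothetical TaqMan Ct values; all numbers are invented.
reads = [5210, 880, 13600, 240]
rel_q = [2 ** -2.1, 2 ** -4.8, 2 ** -0.9, 2 ** -6.5]
r = pearson([log2(c) for c in reads], [log2(q) for q in rel_q])
print(round(r, 3))
```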

Journal Article
TL;DR: Samples amplified with the Clontech and NuGEN methods performed well across all criteria, whereas the Sigma kit-derived samples showed overrepresentation of genes categorized as snoRNAs, snRNAs, and pseudogenes, and the current Miltenyi Biotec protocol yielded low library complexity and an increased fraction of multi-mapped reads.
Abstract: Library construction for whole transcriptome sequencing (RNA-Seq) from low-quantity RNA samples requires additional amplification after reverse transcription into cDNA. Several approaches have been developed to minimize variability and biases. This study aimed at quantifying and characterizing transcripts amplified using four commercial kits: (1) NuGEN Ovation RNA-Seq System V2, (2) Clontech SMARTer Ultra Low RNA Kit for Illumina Sequencing, (3) Sigma Transplex WTA2-SEQ Kit, and (4) Miltenyi Biotec μMACS SuperAmp Kit II for NGS. The amplification reactions for each method were started with input amounts of Universal Mouse Reference RNA (UMRR) equivalent to approximately 10 to 300 cells. Resulting libraries, built according to Illumina's TruSeq procedure, were compared to unamplified references prepared from 1 μg of UMRR total RNA that was either enriched for poly-A RNA or depleted of rRNA. Sequencing data were evaluated for read alignment, library complexity, transcript coverage, and gene expression with regard to sensitivity and dynamic range. The Sigma kit-derived samples showed overrepresentation of genes categorized as snoRNAs, snRNAs, and pseudogenes. The current Miltenyi Biotec protocol yielded low library complexity and an increased fraction of multi-mapped reads. Samples amplified with the Clontech and NuGEN methods performed well across all criteria. These kits will be tested further using cells as starting material.
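Library complexity, one of the evaluation criteria above, is often estimated as the fraction of mapped reads with distinct alignment positions; a simplified sketch (the duplicate definition and record layout are assumptions for illustration, not the study's exact metric):

```python
def library_complexity(alignments):
    """Fraction of mapped reads at distinct alignment positions.
    Reads sharing (chrom, strand, start) are treated as PCR
    duplicates of one original cDNA fragment."""
    unique = {(a["chrom"], a["strand"], a["start"]) for a in alignments}
    return len(unique) / len(alignments)

alns = [
    {"chrom": "chr1", "strand": "+", "start": 100},
    {"chrom": "chr1", "strand": "+", "start": 100},  # PCR duplicate
    {"chrom": "chr1", "strand": "-", "start": 250},
    {"chrom": "chr2", "strand": "+", "start": 7},
]
print(library_complexity(alns))  # 0.75
```

Over-amplified libraries score low on this metric because many reads collapse onto the same original fragments.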

Journal ArticleDOI
TL;DR: B bis(2,2,6,6-tetramethyl-4-piperidyl) sebacate (marketed under the name Tinuvin 770) is identified as a major contaminant in applications using liquid chromatography coupled with mass spectrometry (LC-MS).
Abstract: The superior sensitivity of current mass spectrometers makes them prone to contamination issues, which can have deleterious effects on sample analysis. Here, bis(2,2,6,6-tetramethyl-4-piperidyl) sebacate (marketed under the name Tinuvin 770) is identified as a major contaminant in applications using liquid chromatography coupled with mass spectrometry (LC-MS). Tinuvin 770 is often added to laboratory and medical plastics as a UV stabilizer. One particular lot of microcentrifuge tubes was found to have an excess of this compound that would leach into samples and drastically interfere with LC-MS data acquisition. Further analysis found that Tinuvin 770 readily leached into polar and nonpolar solvents from the contaminated tube lot. Efforts to remove Tinuvin 770 from contaminated samples were unsuccessful. A prescreening method using MALDI-TOF MS is presented to prevent system contamination and sample loss.

Journal Article
TL;DR: It is found that the majority of the human transcriptome can be detected with each method and platform, and thousands of transcriptionally active regions (TARs) beyond existing gene annotations are discovered, suggesting that conservative annotation sets are less appropriate for analysis than larger annotation sets.
Abstract: RNA sequencing is a rich assay for delineating the transcriptome, but few RNA-Seq standard data sets exist to help quantification of gene or splice-form expression. Moreover, each next-generation sequencing (NGS) platform has unique aspects of library synthesis, sequencing, alignment, and data processing. Little is known about cross-site reproducibility, technical variance, and interoperability of NGS platforms for RNA-Seq. The goals of the ABRF-NGS study are to evaluate the performance of NGS platforms and to identify optimal methods and best practices. The study includes five ABRF Research Groups and over 20 core facility laboratories. To address RNA-Seq issues, we performed sequencing on five NGS platforms at multiple sites using two standardized RNA samples with synthetic RNA spike-ins. Platforms tested included Illumina HiSeq 2000/2500, Roche 454 GS FLX, Life Technologies Ion PGM and Ion Proton, and PacBio. We evaluated a wide range of variables, including varying input amount (1-1000 ng), alternate library preparation methods, specific size fractionation (1, 2, and 3 kb), and performance on degraded RNA (using heat, sonication, and RNase A). We used a set of 18,250 RT-PCR reactions as an orthogonal tool to gauge the linear and dynamic range of the RNA-Seq results. Our results show that unique transcripts and isoforms are revealed by each method and NGS platform. We found that the majority of the human transcriptome can be detected with each method and platform. We also discovered thousands of transcriptionally active regions (TARs) beyond existing gene annotations, which suggests that conservative annotation sets are less appropriate for analysis than larger annotation sets. Moreover, while we see high correlation of RNA-Seq within sites, we observed that “site effect” is the largest variance factor outside of biological sources.
Additionally, we observed that the “bioinformatics noise” of aligners and annotations contributes substantial variance, underscoring the need for data provenance for long-term studies.