
Journal Article

A comparison of multivariate analysis techniques and variable selection strategies in a laser-induced breakdown spectroscopy bacterial classification

01 Sep 2013-Spectrochimica Acta Part B: Atomic Spectroscopy (Elsevier)-Vol. 87, pp 161-167

TL;DR: The partial least squares discriminant analysis was more effective at distinguishing between highly similar spectra from closely related bacterial genera and may be the preferred multivariate technique in future species-level or strain-level classifications.

Abstract: Laser-induced breakdown spectroscopy has been used to obtain spectral fingerprints from live bacterial specimens from thirteen distinct taxonomic bacterial classes representative of five bacterial genera. By taking sums, ratios, and complex ratios of measured atomic emission line intensities, three unique sets of independent variables (models) were constructed to determine which choice of independent variables provided optimal genus-level classification of unknown specimens utilizing a discriminant function analysis. A model composed of 80 independent variables constructed from simple and complex ratios of the measured emission line intensities was found to provide the greatest sensitivity and specificity. This model was then used in a partial least squares discriminant analysis to compare the performance of this multivariate technique with a discriminant function analysis. The partial least squares discriminant analysis possessed a higher true positive rate, possessed a higher false positive rate, and was more effective at distinguishing between highly similar spectra from closely related bacterial genera. This suggests it may be the preferred multivariate technique in future species-level or strain-level classifications.

Summary (2 min read)

1. Introduction

  • Since the initial demonstrations of bacterial identification with laser-induced breakdown spectroscopy (LIBS) in 2003, significant progress has been made in the use of multivariate chemometric analyses to classify unknown bacterial LIBS spectra [1-4].
  • Over the last five years the authors and others have demonstrated a sensitive and specific identification of live bacterial biospecimens utilizing a discriminant function analysis (DFA) to classify LIBS spectra [5-8].
  • The intensities of strong specific elemental atomic emission lines normalized by the total observed spectral power have been utilized as independent variables in this multivariate analysis [9].
  • This is an ongoing area of investigation.
  • Model performance was quantified by calculating truth tables (and the resulting sensitivity and specificity) from the external validation tests.
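
The truth-table calculation mentioned in the last point can be sketched as follows. This is a minimal illustration, assuming one genus at a time is treated as the positive class; the function names and the example labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def truth_table(true_labels, predicted_labels, target):
    """Count TP, FN, FP, TN with `target` treated as the positive class."""
    counts = Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target:
            counts["tp" if pred == target else "fn"] += 1
        else:
            counts["fp" if pred == target else "tn"] += 1
    return counts


def sensitivity_specificity(counts):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    positives = counts["tp"] + counts["fn"]
    negatives = counts["tn"] + counts["fp"]
    sensitivity = counts["tp"] / positives if positives else float("nan")
    specificity = counts["tn"] / negatives if negatives else float("nan")
    return sensitivity, specificity


# Hypothetical labels, for illustration only (not data from the paper).
truth = ["Mycobacterium", "Mycobacterium", "Staphylococcus", "Staphylococcus"]
predicted = ["Mycobacterium", "Staphylococcus", "Staphylococcus", "Staphylococcus"]
table = truth_table(truth, predicted, "Mycobacterium")
sens, spec = sensitivity_specificity(table)
```

Here sensitivity is the true positive rate and specificity is the true negative rate, which are the quantities the later sections report per genus.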

2.1. Experimental Setup

  • The LIBS apparatus used to obtain the bacterial spectra and the bacterial sample preparation and mounting protocols have been described at length elsewhere.
  • Five spectra were acquired at each sampling location, thus twenty-five laser pulses were used to obtain this spectrum.
  • The bacteria were chosen to represent a fairly wide taxonomic range.
  • The 32 distinct experiments that were performed yielded the 32 data sets shown in column three of Table 1.
  • No data “outliers” were omitted from their data sets and efforts were made to maximize the number of spectra from any one bacterial deposition rather than to standardize the number of spectra taken.

2.2 Models for Chemometric Analysis (Lines, RM1, and RM2)

  • The three independent variable models that were tested are referred to here as the “lines” model, ratio model one (RM1), and ratio model two (RM2).
  • The lines model was the simplest of the three, having been used in all their previous work.
  • This approach has been utilized with success by Gottfried et al. to discriminate LIBS spectra obtained from explosives residues.
  • The first thirteen variables were merely the intensities of the thirteen strong emission lines used in the lines model (indicated by an asterisk).
  • It was decided that when the dimensionality of the original data was not reduced significantly then the benefits of performing a down-selection were reduced and the more appropriate model would be to use the entire spectrum.
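
The kinds of independent variables described here (line intensities normalized by total spectral power, sums, and ratios) can be sketched as below. This is only an illustration under stated assumptions: the summary does not give the exact definitions of RM1 or RM2, so the line names, the element mapping, the numbers, and the choice to form every pairwise ratio are all hypothetical.

```python
from itertools import combinations


def normalize_by_total_power(line_intensities, total_power):
    """Divide each line intensity by the total observed spectral power."""
    return {name: value / total_power for name, value in line_intensities.items()}


def element_sums(line_intensities, element_of):
    """Sum the intensities of all lines that belong to the same element."""
    sums = {}
    for name, value in line_intensities.items():
        element = element_of[name]
        sums[element] = sums.get(element, 0.0) + value
    return sums


def simple_ratios(line_intensities):
    """Form one ratio per unordered pair of lines, skipping zero denominators."""
    ratios = {}
    for a, b in combinations(sorted(line_intensities), 2):
        if line_intensities[b] != 0:
            ratios[f"{a}/{b}"] = line_intensities[a] / line_intensities[b]
    return ratios


# Hypothetical line names and intensities (arbitrary units), not from the paper.
lines = {"Ca_1": 40.0, "Ca_2": 10.0, "Mg_1": 25.0, "P_1": 5.0}
element_of = {"Ca_1": "Ca", "Ca_2": "Ca", "Mg_1": "Mg", "P_1": "P"}

normalized = normalize_by_total_power(lines, total_power=200.0)
sums = element_sums(lines, element_of)
ratios = simple_ratios(lines)
```

With n lines this produces n(n-1)/2 simple ratios, which shows how quickly a ratio-based model can grow relative to the number of measured lines.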

2.3 Chemometric Analysis Techniques

  • Two multivariate chemometric analysis techniques were compared for discrimination between different bacterial genera based on the LIBS emission spectra.
  • This is known as external validation, because each spectrum was tested against a library where no other spectra acquired at the same time or under the same conditions were present.
  • PLS-DA takes a set of independent variables as determined by their models and constructs latent variables to maximize the variance between the two groups.
  • The identity of unknown spectra was then predicted based on this discrimination line in the pre-compiled library.
  • All unknown samples were classified in a PLS-DA test specific for each genus, and if the test group was classified as belonging to the “no group” for each model, it remained unknown and was not classified as belonging to any genus.
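
The external validation scheme and the per-genus "no group" rule can be sketched as below. This is a hedged illustration, not the authors' implementation: the tuple layout, the 0.5 threshold, and the tie-breaking rule (taking the highest score when more than one genus test accepts a spectrum) are assumptions, since the summary does not specify them.

```python
def external_validation_splits(spectra):
    """Yield (held_out_id, library, unknowns), withholding one data set at a time.

    `spectra` is a list of (dataset_id, genus, features) tuples. Every spectrum
    from the held-out data set is removed from the library before testing.
    """
    dataset_ids = sorted({dataset_id for dataset_id, _, _ in spectra})
    for held_out in dataset_ids:
        library = [s for s in spectra if s[0] != held_out]
        unknowns = [s for s in spectra if s[0] == held_out]
        yield held_out, library, unknowns


def classify_one_vs_rest(scores, threshold=0.5):
    """Combine per-genus two-class test outputs into a single call.

    `scores` maps genus -> output of that genus's "genus vs. no group" test.
    A spectrum placed in "no group" by every test remains "unknown".
    Assumption: if several tests accept it, the highest score wins.
    """
    accepted = {genus: s for genus, s in scores.items() if s >= threshold}
    if not accepted:
        return "unknown"
    return max(accepted, key=accepted.get)


# Hypothetical spectra: (dataset_id, genus, feature vector).
spectra = [
    ("set_A", "Mycobacterium", [0.1, 0.4]),
    ("set_A", "Mycobacterium", [0.2, 0.3]),
    ("set_B", "Staphylococcus", [0.7, 0.1]),
    ("set_C", "Mycobacterium", [0.1, 0.5]),
]
splits = list(external_validation_splits(spectra))
call_1 = classify_one_vs_rest({"Mycobacterium": 0.9, "Staphylococcus": 0.2})
call_2 = classify_one_vs_rest({"Mycobacterium": 0.3, "Staphylococcus": 0.1})
```

Splitting by data set rather than by individual spectrum is what keeps spectra acquired at the same time or under the same conditions out of the library.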

3. Results and Discussion

  • In each of the DFA results, four discriminant functions (DF1 through DF4) were constructed to determine the classification of each spectrum.
  • The “unknown” bacterial spectra are represented by the “x” symbols and 34 of 34 unknown spectra were correctly classified as Mycobacterium, even though the model contained no other spectra from strain TA.
  • An investigation of the PLS-DA was conducted to compare the number of latent variables (LVs) and the corresponding rates of true positives and true negatives.

4. Discussion

  • A comparison of the DFA performed with the three different models consisting of lines, RM1, and RM2 showed that RM2 yielded the overall highest true positive and true negative rates with true positive rates of 95%, 54%, 95%, and 88% for the four genera and true negative rates of 91%, 99%, 99%, and 99%.
  • The sensitivity and specificity were obtained by averaging the results from the 31 tests and the standard deviation is reported as the uncertainty.
  • This was merely a result of there being only two representative Staphylococci data sets to include in the analysis, as can be seen in Table 1, with one of these data sets being among the earliest experiments performed in the construction of the spectral library.
  • Therefore both analyses can perform both functions, if necessary.
  • It may therefore be true that a DFA is more effective in genus-level discrimination on bacterial specimens with a wide range of potential identities, but discrimination at the species- or strain-level once the genus is accurately identified may require the use of PLS-DA.
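
The averaging step described above (a mean over the individual tests, with the standard deviation reported as the uncertainty) can be sketched as follows. The per-test values are hypothetical, and the use of the sample standard deviation rather than the population form is an assumption, since the summary does not say which was used.

```python
from statistics import mean, stdev


def summarize(rates):
    """Return (mean, sample standard deviation) of per-test rates."""
    return mean(rates), stdev(rates)


# Hypothetical per-test sensitivities, for illustration only.
per_test_sensitivity = [0.90, 1.00, 0.80, 0.90]
avg, spread = summarize(per_test_sensitivity)
```

`statistics.pstdev` could be substituted if the population standard deviation is wanted instead.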

5. Conclusion

  • The authors have shown that a sensitive and specific genus level classification of LIBS spectra from live bacterial specimens can be performed with a DFA or a PLS-DA using several different independent variable models.
  • All results were obtained using external-validation tests.
  • The number of latent variables required for efficient classification using this model was investigated, and chosen to be 20 in all subsequent tests.
  • More precise identification at the species-level or strain-level may be subsequently performed with a PLS-DA, which demonstrated improved performance at discriminating highly similar spectra.
  • It is likely that computational processing power would easily allow such a verification, as the classification of one unknown spectrum against a pre-compiled library model is performed rapidly by both techniques.


Citations

Journal Article
TL;DR: This review describes and compares the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures, and presents examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world.
Abstract: Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. In particular, noticeable effect has been attained in the field of microbial ecology, where new experimental approaches provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure.

217 citations


Cites methods from "A comparison of multivariate analys..."

  • ...Such model can be used to provide predictions for a new (‘unknown’) object based on the values of measured variables in that object (Putnam et al. 2013)....



Journal Article
Abstract: One of the most widely cited advantages of laser-induced breakdown spectroscopy (LIBS) is that it does not require sample preparation, but this may also be the biggest factor holding it back from becoming a mature analytical technique like LA-ICP-MS, ICP-OES, or XRF. While there are certain specimen types that have enjoyed excellent LIBS results without any sample treatment (mostly homogeneous solids such as metals, glass, and polymers), the possible applications of LIBS have been greatly expanded through the use of sample preparation techniques that have resulted in analytical performance (i.e., limits of detection, accuracy, and repeatability) on par with XRF, ICP-OES, and often ICP-MS. This review highlights the work of many LIBS researchers who have developed, adapted, and improved upon sample preparation techniques for various specimen types in order to improve the quality of the analytical data that LIBS can produce in a large number of research domains. Strategies, not only for solids, but also liquids, gases, and aerosols are discussed, including newly developed nanoparticle enhancement and biological imaging and tagging techniques.

117 citations


Journal Article
TL;DR: This review attempts to give a critical overview of the diverse progress of the field, focusing on the results of the last five years, of laser-induced breakdown spectroscopy.
Abstract: Laser-induced breakdown spectroscopy (LIBS) has become an established analytical atomic spectrometry technique and is valued for its very compelling set of advantageous analytical and technical characteristics. It is a rapid, versatile, non-contact technique, which is capable of providing qualitative and quantitative analytical information for practically any sample, in a virtually non-destructive way, without any substantial sample preparation. The instrumentation is simple, robust, compact, and even enables remote analysis. This review attempts to give a critical overview of the diverse progress of the field, focusing on the results of the last five years. The advancement of LIBS instrumentation and data evaluation is discussed in detail and selected results of some prominent applications are also described.

109 citations


Cites background from "A comparison of multivariate analys..."

  • ...Successful discrimination of pathogenic from non-pathogenic bacteria has been achieved, including some multi-drug-resistant strains of bacteria including Staphylococcus aureus and other strains causing hospital-acquired infections (HAI) [112, 132]....



Journal Article
TL;DR: This work critically assess and elaborate on the approaches to utilize PCA in LIBS data processing, and derives some implications and suggests advice in data preprocessing, visualization, dimensionality reduction, model building, classification, quantification and non-conventional multivariate mapping.
Abstract: An implementation of a fast, robust, and effective algorithm is inevitable in modern multivariate data analysis (MVDA). The principal component analysis (PCA) algorithm is becoming popular not only in the spectroscopic community because it complies with the qualities mentioned above. PCA is, therefore, often used for the processing of detected multivariate signal (characteristic spectra). Over the past decade, PCA has been adopted by the Laser-Induced Breakdown Spectroscopy (LIBS) community and the number of scientific articles referring to PCA steadily increases. The interest in PCA is not caused only by the basic need to obtain a fast data visualization on a lower dimensional scale and to inspect the most prominent variables. Most recently, PCA has also been applied to yield unconventional data analyses, i.e. processing of large scale LIBS maps. However, a rapid development of LIBS-related instrumentation and applications has led to some non-uniform methodologies in the implementation and utilization of MVDA, including PCA. Thus, in this work, we critically assess and elaborate on the approaches to utilize PCA in LIBS data processing. The aim of this article is also to derive some implications and to suggest advice in data preprocessing, visualization, dimensionality reduction, model building, classification, quantification and non-conventional multivariate mapping. This review reflects also other MVDA algorithms than PCA and consequently, presented conclusions and recommendations can be generalized.

74 citations


Journal Article
Abstract: Laser‐induced breakdown spectroscopy (LIBS) is a new type of elemental analytical technology with the advantages of real‐time, online, and noncontact as well as enabling the simultaneous analysis of multiple elements. It has become a frontier analytical technique in spectral analysis. However, the issue of how to improve the accuracy of qualitative and quantitative analyses by extracting useful information from a large amount of complex LIBS data remains the main problem for the LIBS technique. Chemometrics is a chemical subdiscipline of multi‐interdisciplinary methods; it offers advantages in data processing, signal analysis, and pattern recognition. It can solve some complicated problems that are difficult for traditional chemical methods. In this paper, we reviewed the research progress of chemometrics methods in LIBS for spectral data preprocessing as well as for qualitative and quantitative analyses in the most recent 5 years (2012‐2016).

52 citations


Cites methods from "A comparison of multivariate analys..."

  • ...The 3 models based on sums, ratios, and complex ratios of measured atomic emission‐line intensities were constructed from down‐selected independent variables, and PLS‐DA was effective at distinguishing between highly similar spectra from closely related bacterial genera (47). The PLS‐DA was used to investigate the possibility of discriminating healthy and carious tooth tissues based on atomic and ionic emission lines in the LIBS spectra of teeth; it showed excellent discrimination and prediction of unknown tooth tissues....



References

Journal Article
01 Jan 1973
Abstract: Offers an applications-oriented approach to multivariate data analysis, focusing on the use of each technique, rather than its mathematical derivation. The text introduces a six-step framework for organizing and discussing techniques with flowcharts for each. Well-suited for the non-statistician, this applications-oriented introduction to multivariate analysis focuses on the fundamental concepts that affect the use of specific techniques rather than the mathematical derivation of the technique. Provides an overview of several techniques and approaches that are available to analysts today - e.g., data warehousing and data mining, neural networks and resampling/bootstrapping. Chapters are organized to provide a practical, logical progression of the phases of analysis and to group similar types of techniques applicable to most situations. Table of Contents 1. Introduction. I. PREPARING FOR A MULTIVARIATE ANALYSIS. 2. Examining Your Data. 3. Factor Analysis. II. DEPENDENCE TECHNIQUES. 4. Multiple Regression. 5. Multiple Discriminant Analysis and Logistic Regression. 6. Multivariate Analysis of Variance. 7. Conjoint Analysis. 8. Canonical Correlation Analysis. III. INTERDEPENDENCE TECHNIQUES. 9. Cluster Analysis. 10. Multidimensional Scaling. IV. ADVANCED AND EMERGING TECHNIQUES. 11. Structural Equation Modeling. 12. Emerging Techniques in Multivariate Analysis. Appendix A: Applications of Multivariate Data Analysis. Index.

37,069 citations


01 Jan 2009

10,146 citations


"A comparison of multivariate analys..." refers methods in this paper

  • ...DFA is a multivariate analysis technique that uses independent variables (atomic emission intensities) to calculate a dependent variable (bacterial identity) to classify or discriminate between two or more groups [21]....



Book
01 Oct 2010
Abstract: Partial least squares (PLS) was not originally designed as a tool for statistical discrimination. In spite of this, applied scientists routinely use PLS for classification and there is substantial empirical evidence to suggest that it performs well in that role. The interesting question is: why can a procedure that is principally designed for overdetermined regression problems locate and emphasize group structure? Using PLS in this manner has heuristic support owing to the relationship between PLS and canonical correlation analysis (CCA) and the relationship, in turn, between CCA and linear discriminant analysis (LDA). This paper replaces the heuristics with a formal statistical explanation. As a consequence, it will become clear that PLS is to be preferred over PCA when discrimination is the goal and dimension reduction is needed. Copyright © 2003 John Wiley & Sons, Ltd.

1,799 citations


"A comparison of multivariate analys..." refers background in this paper

  • ...The PLS-DA then calculates a discrimination line (or this can be user-determined) to predict the class of each spectrum based on Bayesian statistics by minimizing the number of false positives and negatives [22]....



Journal Article
TL;DR: This review discusses the application of laser-induced breakdown spectroscopy (LIBS) to the problem of explosive residue detection and demonstrates the tremendous potential of LIBS for real-time detection of explosives residues at standoff distances.
Abstract: In this review we discuss the application of laser-induced breakdown spectroscopy (LIBS) to the problem of detection of residues of explosives. Research in this area presented in open literature is reviewed. Both laboratory and field-tested standoff LIBS instruments have been used to detect explosive materials. Recent advances in instrumentation and data analysis techniques are discussed, including the use of double-pulse LIBS to reduce air entrainment in the analytical plasma and the application of advanced chemometric techniques such as partial least-squares discriminant analysis to discriminate between residues of explosives and non-explosives on various surfaces. A number of challenges associated with detection of explosives residues using LIBS have been identified, along with their possible solutions. Several groups have investigated methods for improving the sensitivity and selectivity of LIBS for detection of explosives, including the use of femtosecond-pulse lasers, supplemental enhancement of the laser-induced plasma emission, and complementary orthogonal techniques. Despite the associated challenges, researchers have demonstrated the tremendous potential of LIBS for real-time detection of explosives residues at standoff distances.

276 citations


Journal Article
TL;DR: LIBS data from the individual laser shots were analyzed by principal-components analysis and were found to contain adequate information to afford discrimination among the different biomaterials.
Abstract: Laser-induced breakdown spectroscopy (LIBS) has been used to study bacterial spores, molds, pollens, and proteins. Biosamples were prepared and deposited onto porous silver substrates. LIBS data from the individual laser shots were analyzed by principal-components analysis and were found to contain adequate information to afford discrimination among the different biomaterials. Additional discrimination within the three bacilli studied appears feasible.

209 citations


Frequently Asked Questions (2)
Q1. What are the contributions in "A comparison of multivariate analysis techniques and variable selection strategies in a laser-induced breakdown spectroscopy bacterial classification"?

In this paper, the authors compared the use of three different down-selected variable models consisting of emission intensities, the sum of observed ∼4 intensities from the elements P, Ca, Mg, Na, C, and complex ratios of those intensities. 

Such a confirmation will need to be investigated in future work.