Journal ArticleDOI

Orthogonal signal correction of near-infrared spectra

14 Dec 1998-Chemometrics and Intelligent Laboratory Systems (Elsevier)-Vol. 44, Iss: 1, pp 175-185
TL;DR: It is shown how a variant of PLS can be used to achieve a signal correction that is as close to orthogonal as possible to a given Y-vector or Y-matrix and is applied to four different data sets of multivariate calibration.
About: This article is published in Chemometrics and Intelligent Laboratory Systems. The article was published on 1998-12-14. It has received 1003 citations to date. The article focuses on the topics: Noise (signal processing) & Orthogonality.
Citations
Journal ArticleDOI
TL;DR: In this article, a generic preprocessing method for multivariate data, called orthogonal projections to latent structures (O-PLS), is described, which removes variation from X (descriptor variables) that is not correlated to Y (property variables, e.g. yield, cost or toxicity).
Abstract: A generic preprocessing method for multivariate data, called orthogonal projections to latent structures (O-PLS), is described. O-PLS removes variation from X (descriptor variables) that is not correlated to Y (property variables, e.g. yield, cost or toxicity). In mathematical terms this is equivalent to removing systematic variation in X that is orthogonal to Y. In an earlier paper, Wold et al. (Chemometrics Intell. Lab. Syst. 1998; 44: 175-185) described orthogonal signal correction (OSC). In this paper a method with the same objective but with different means is described. The proposed O-PLS method analyzes the variation explained in each PLS component. The non-correlated systematic variation in X is removed, making interpretation of the resulting PLS model easier and with the additional benefit that the non-correlated variation itself can be analyzed further. As an example, near-infrared (NIR) reflectance spectra of wood chips were analyzed. Applying O-PLS resulted in reduced model complexity with preserved prediction ability, effective removal of non-correlated variation in X and, not least, improved interpretational ability of both correlated and non-correlated variation in the NIR spectra.

2,096 citations
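
The O-PLS abstract above describes removing systematic variation in X that is orthogonal to Y. A minimal NumPy sketch of that idea for a single y-vector and a single orthogonal component follows. The function name `opls_filter_one_component` and the demo data are illustrative assumptions, not taken from the paper, and the steps follow the commonly described single-component O-PLS filter rather than any specific published implementation.

```python
import numpy as np

def opls_filter_one_component(X, y):
    """Remove one y-orthogonal component from X (single y-vector).

    Illustrative sketch: X and y are mean-centered here before filtering.
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()

    # Predictive weight: direction in X-space of maximum covariance with y.
    w = X.T @ y
    w /= np.linalg.norm(w)

    # Predictive score and loading.
    t = X @ w
    p = X.T @ t / (t @ t)

    # Part of the loading that is orthogonal to the predictive weight.
    w_ortho = p - (w @ p) * w          # w has unit length
    w_ortho /= np.linalg.norm(w_ortho)

    # Orthogonal score and loading, then deflate X.
    t_ortho = X @ w_ortho
    p_ortho = X.T @ t_ortho / (t_ortho @ t_ortho)
    X_filtered = X - np.outer(t_ortho, p_ortho)
    return X_filtered, t_ortho, y

# Hypothetical demo data: 30 samples, 8 variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 8))
y_demo = X_demo[:, 0] + 0.1 * rng.normal(size=30)
X_filt, t_orth, y_c = opls_filter_one_component(X_demo, y_demo)
```

Because `w_ortho` is orthogonal to `X.T @ y` by construction, the removed score `t_orth` is uncorrelated with the centered y, so the filtering leaves the covariance between X and y unchanged.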

Journal ArticleDOI
TL;DR: In this paper, the characteristics of the OPLS method are investigated for the purpose of discriminant analysis (OPLS-DA), and it is shown how class-orthogonal variation can be exploited to augment classification.
Abstract: The characteristics of the OPLS method have been investigated for the purpose of discriminant analysis (OPLS-DA). We demonstrate how class-orthogonal variation can be exploited to augment classific ...

1,179 citations


Cites background or methods from "Orthogonal signal correction of nea..."

  • ...Despite the fairly unambiguous concept of OSC, a multitude of implementations occur in the literature [14–16]....

    [...]

  • ...Orthogonal signal correction (OSC) [14] is a methodology initially developed for spectral data pre-processing by Wold et al....

    [...]

Book ChapterDOI
TL;DR: A review on the state of soil visible-near infrared (vis-NIR) spectroscopy is provided in this article, focusing on important soil attributes such as soil organic matter (SOM), minerals, texture, nutrients, water, pH, and heavy metals.
Abstract: This chapter provides a review on the state of soil visible–near infrared (vis–NIR) spectroscopy. Our intention is for the review to serve as a source of up-to-date information on the past and current role of vis–NIR spectroscopy in soil science. It should also provide critical discussion on issues surrounding the use of vis–NIR for soil analysis and on future directions. To this end, we describe the fundamentals of visible and infrared diffuse reflectance spectroscopy and spectroscopic multivariate calibrations. A review of the past and current role of vis–NIR spectroscopy in soil analysis is provided, focusing on important soil attributes such as soil organic matter (SOM), minerals, texture, nutrients, water, pH, and heavy metals. We then discuss the performance and generalization capacity of vis–NIR calibrations, with particular attention on sample pretreatments, covariations in data sets, and mathematical data preprocessing. Field analyses and strategies for the practical use of vis–NIR are considered. We conclude that the technique is useful to measure soil water and mineral composition and to derive robust calibrations for SOM and clay content. Many studies show that we also can predict properties such as pH and nutrients, although their robustness may be questioned. For future work we recommend that research should focus on: (i) moving forward with more theoretical calibrations, (ii) better understanding of the complexity of soil and the physical basis for soil reflection, and (iii) applications and the use of spectra for soil mapping and monitoring, and for making inferences about soil quality, fertility and function. To do this, research in soil spectroscopy needs to be more collaborative and strategic. The development of the Global Soil Spectral Library might be a step in the right direction.

1,063 citations


Cites background from "Orthogonal signal correction of nea..."

  • ..., 1989) and orthogonal signal correction (OSC) (Wold et al., 1998)....

    [...]

  • ...9D) with or without detrending (Barnes et al., 1989) and orthogonal signal correction (OSC) (Wold et al., 1998)....

    [...]

Journal ArticleDOI
TL;DR: These studies show for the first time a technique capable of providing an accurate, noninvasive and rapid diagnosis of coronary heart disease that can be used clinically, either in population screening or to allow effective targeting of treatments such as statins.
Abstract: Although a wide range of risk factors for coronary heart disease have been identified from population studies, these measures, singly or in combination, are insufficiently powerful to provide a reliable, noninvasive diagnosis of the presence of coronary heart disease. Here we show that pattern-recognition techniques applied to proton nuclear magnetic resonance (1H-NMR) spectra of human serum can correctly diagnose not only the presence, but also the severity, of coronary heart disease. Application of supervised partial least squares-discriminant analysis to orthogonal signal-corrected data sets allows >90% of subjects with stenosis of all three major coronary vessels to be distinguished from subjects with angiographically normal coronary arteries, with a specificity of >90%. Our studies show for the first time a technique capable of providing an accurate, noninvasive and rapid diagnosis of coronary heart disease that can be used clinically, either in population screening or to allow effective targeting of treatments such as statins.

1,011 citations

Journal ArticleDOI
TL;DR: Multivariate statistical modeling of the spectra shows that the genetic predisposition of the 129S6 mouse to impaired glucose homeostasis and NAFLD is associated with disruptions of choline metabolism, and indicates that gut microbiota may play an active role in the development of insulin resistance.
Abstract: Here, we study the intricate relationship between gut microbiota and host cometabolic phenotypes associated with dietary-induced impaired glucose homeostasis and nonalcoholic fatty liver disease (NAFLD) in a mouse strain (129S6) known to be susceptible to these disease traits, using plasma and urine metabotyping, achieved by 1H NMR spectroscopy. Multivariate statistical modeling of the spectra shows that the genetic predisposition of the 129S6 mouse to impaired glucose homeostasis and NAFLD is associated with disruptions of choline metabolism, i.e., low circulating levels of plasma phosphatidylcholine and high urinary excretion of methylamines (dimethylamine, trimethylamine, and trimethylamine-N-oxide), coprocessed by symbiotic gut microbiota and mammalian enzyme systems. Conversion of choline into methylamines by microbiota in strain 129S6 on a high-fat diet reduces the bioavailability of choline and mimics the effect of choline-deficient diets, causing NAFLD. These data also indicate that gut microbiota may play an active role in the development of insulin resistance.

1,000 citations

References
Journal ArticleDOI
TL;DR: The main features of the CoMFA approach, exemplified by analyses of the affinities of 21 varied steroids to corticosteroid and testosterone-binding globulins, and a number of advances in the methodology of molecular graphics are described.
Abstract: Comparative molecular field analysis (CoMFA) is a promising new approach to structure/activity correlation. Its characteristic features are (1) representation of ligand molecules by their steric and electrostatic fields, sampled at the intersections of a three-dimensional lattice, (2) a new “field fit” technique, allowing optimal mutual alignment within a series, by minimizing the RMS field differences between molecules, (3) data analysis by partial least squares (PLS), using cross-validation to maximize the likelihood that the results have predictive validity, and (4) graphic representation of results, as contoured three-dimensional coefficient plots. CoMFA is exemplified by analyses of the affinities of 21 varied steroids to corticosteroid- and testosterone-binding globulins. Also described are the sensitivities of results to the nature of the field and the definition of the lattice and, for comparison, analyses of the same data using various combinations of other parameters. From these results, a set of ten steroid-binding affinity values unknown to us during the CoMFA analysis were well predicted.

A major goal in chemical research is to predict the behavior of new molecules, using relationships derived from analysis of the properties of previously tested molecules. Relationships derived primarily by empirical analysis of a data table, whose columns are numerical property values and whose rows are compounds, usually taking the form of a linear equation, are called quantitative structure/activity relationships (QSAR) [1]. Especially in biological applications, it has long been agreed that the most relevant numerical property values would be shape-dependent.

Work on comparative molecular field analysis (CoMFA) began 12 years ago with two additional observations: (1) at the molecular level, the interactions which produce an observed biological effect are usually non-covalent; and (2) molecular mechanics force fields, most of which treat noncovalent (non-bonded) interactions only as steric and electrostatic forces, can account precisely for a great variety of observed molecular properties [2]. Thus it seems reasonable that a suitable sampling of the steric and electrostatic fields surrounding a set of ligand (drug) molecules might provide all the information necessary for understanding their observed biological properties. However, the emergence of a practical CoMFA methodology had to await a new method of data analysis, partial least squares (PLS) [3], which can derive robust linear equations from tables having many more columns than rows, and a number of advances in the methodology of molecular graphics.

Other “3D-QSAR” methodologies have been described. The molecular shape (MS) approaches, developed independently by Simon et al. [4] and by Hopfinger [5], compare net, rather than location-dependent, differences in molecular connectivities, volumes, and/or fields. A second approach, the “distance geometry” method of Crippen [6], provides validation of a “site-point” hypothesis, a list of binding set coordinates and properties that must be proposed by the investigator. A prototype version of the CoMFA method is called “DYLOMMS” [7]. In related work, for exploring binding modes of ligands to receptors, Goodford [8] advocates the display of probe-interaction “grids”, similar to those used in CoMFA, while Hansch, Blaney, Langridge, et al. [9] have shown the complementarity of QSAR and molecular graphics in understanding enzyme inhibitor data.

Below we describe the main features of the CoMFA approach, exemplifying its use by analyzing the binding affinities of 21 varied steroid structures to human corticosteroid-binding globulins (CBG) and testosterone-binding globulins [10] (TBG). In this series, the comparative rigidity of the steroid nucleus allows the conformational variable to be neglected, and the in vitro, particularly simple, character of the test system minimizes the importance of nonreceptor-related, hence non-shape-related, compound differences on the experimental observations [11]. We then investigated the sensitivity of the excellent results obtained to critical model assumptions. For the purpose of comparison, we have also analysed these steroid binding data using both classical and other “molecular shape” parameters, in various combinations. Finally, toward the end of this work, we were informed of additional corticosteroid binding data [12], and thus were able to test the ability of our model to predict the binding constants of ten more, structurally diverse, steroids.

Computational Methods

CoMFA Methodology. The overall data flow of a CoMFA analysis appears in Figure 1. Its top two panels show how the data table is constructed from the field values at the lattice intersections. These automatically calculated parameters are the energies of steric (van der Waals 6-12) and electrostatic (Coulombic, with a 1/r dielectric) interaction between the compound of interest, and a “probe atom” placed at the various intersections of a regular three-dimensional lattice, large enough to surround all of the compounds in the series, and with a 2.0 Å separation between lattice points unless otherwise stated. The van der Waals A/B values were taken from the standard Tripos force field [13] and the atomic charges were calculated by the method of Gasteiger and Marsili [14]. Unless stated otherwise, the probe atom had the van der Waals properties of sp3 carbon and a charge of +1.0. Wherever the probe atom experiences a steric repulsion greater than “cutoff” (30 kcal/mol

[1] Martin, Y. C. Quantitative Drug Design; Marcel Dekker: New York, 1978.
[2] Burkert, U.; Allinger, N. L. Molecular Mechanics; American Chemical Society: Washington, DC, 1982.
[3] Wold, S.; Ruhe, A.; Wold, H.; Dunn, W. J., III SIAM J. Sci. Stat. Comput. 1984, 5, 135.
[4] Simon, Z.; Badileuscu, I.; Racovitan, T. J. Theor. Biol. 1977, 66, 485. Simon, Z.; Dragomir, N.; Plauchithiu, M. G.; Holban, S.; Glatt, H.; Kerek, F. Eur. J. Med. Chem. 1980, 15, 521.
[5] Hopfinger, A. J. J. Am. Chem. Soc. 1980, 102, 7196.
[6] Ghose, A. K.; Crippen, G. M. J. Med. Chem. 1985, 28, 333 and references therein.
[7] Cramer, R. D., III; Milne, M. Abstracts of the ACS Meeting, April 1979, COMP 44. Wise, M.; Cramer, R. D.; Smith, D. M.; Exman, I. In Quantitative Approaches to Drug Design; Dearden, J. C., Ed.; Elsevier: Amsterdam, 1983; p 145. Wise, M. In Molecular Graphics and Drug Design; Burgen, A. S. V., Roberts, G. C. K., Tute, M. S., Eds.; Elsevier: New York, 1986; pp 183-194. Cramer, R. D., III; Bunce, J. D. In QSAR in Drug Design and Toxicology; Hadzi, D., Jerman-Blazic, B., Eds.; Elsevier: New York, 1987; p 3.
[8] Goodford, P. J. J. Med. Chem. 1985, 28, 849.
[9] Hansch, C.; Hathaway, B. A.; Guo, Z. R.; Selassie, C. D.; Dietrich, S. W.; Blaney, J. M.; Langridge, R.; Volz, K. W.; Kaufman, B. T. J. Med. Chem. 1984, 27, 129.
[10] Dunn, J. F.; Nisula, B. C.; Rodbard, D. J. Clin. Endocrin. Metab. 1981, 63.
[11] Cramer, R. D., III Quant. Struct. Act. Pharmacol., Chem. Biol. 1983, 2, 7, 13. Yunger, L. M.; Cramer, R. D., III Quant. Struc. Act. Relat. Pharmacol., Chem. Biol. 1983, 2, 149.
[12] Westphal, U. Steroid-Protein Interactions II; Springer-Verlag: Berlin, 1986.
[13] Vinter, J. G.; Davis, A.; Saunder, M. R. J. Comp-Aided Mol. Design 1987, 1, 31.
[14] Gasteiger, J.; Marsili, M. Tetrahedron 1980, 36, 3219.

3,655 citations

Journal ArticleDOI
TL;DR: In this article, the standard normal variate (SNV) and de-trending (DT) approaches are applied to individual NIR diffuse reflectance spectra to remove the multiplicative interferences of scatter and particle size.
Abstract: Particle size, scatter, and multi-collinearity are long-standing problems encountered in diffuse reflectance spectrometry. Multiplicative combinations of these effects are the major factor inhibiting the interpretation of near-infrared diffuse reflectance spectra. Sample particle size accounts for the majority of the variance, while variance due to chemical composition is small. Procedures are presented whereby physical and chemical variance can be separated. Mathematical transformations, standard normal variate (SNV) and de-trending (DT), applicable to individual NIR diffuse reflectance spectra are presented. The standard normal variate approach effectively removes the multiplicative interferences of scatter and particle size. De-trending accounts for the variation in baseline shift and curvilinearity, generally found in the reflectance spectra of powdered or densely packed samples, with the use of a second-degree polynomial regression. NIR diffuse reflectance spectra transposed by these methods are free from multi-collinearity and are not confused by the complexity of shape encountered with the use of derivative spectroscopy.

3,062 citations
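
The SNV and de-trending transformations summarized in the abstract above act on one spectrum at a time, which makes them easy to sketch. The following NumPy example is an illustration under stated assumptions: the function names `snv` and `detrend`, the internal rescaling of the wavelength axis (added only for numerical conditioning), and the synthetic spectra are all hypothetical and not taken from the paper.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def detrend(spectra, wavelengths, degree=2):
    """Subtract a least-squares polynomial baseline from each spectrum."""
    # Rescale the axis so the polynomial fit is well conditioned.
    span = wavelengths.max() - wavelengths.min()
    x = (wavelengths - wavelengths.mean()) / span
    V = np.vander(x, degree + 1)                      # columns: x^2, x, 1
    coeffs, *_ = np.linalg.lstsq(V, spectra.T, rcond=None)
    baseline = (V @ coeffs).T
    return spectra - baseline

# Synthetic demo: a Gaussian band on top of curved baselines.
wavelengths = np.linspace(1100.0, 2500.0, 60)
band = np.exp(-0.5 * ((wavelengths - 1900.0) / 40.0) ** 2)
baseline_only = 0.5 + 1e-4 * wavelengths + 2e-8 * wavelengths ** 2
spectra = np.vstack([1.0 * band + baseline_only,
                     1.5 * band + 1.2 * baseline_only])

snv_out = snv(spectra)
snv_dt_out = detrend(snv_out, wavelengths)
```

A spectrum that consists only of a second-degree baseline is reduced to numerically zero by `detrend`, which is a quick sanity check on the baseline fit.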

Journal ArticleDOI
TL;DR: The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values.
Abstract: The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values. The probes include water, the methyl group, amine nitrogen, carboxy oxygen, and hydroxyl. Contour surfaces at appropriate energy levels are calculated for each probe and displayed by computer graphics together with the protein structure. Contours at negative energy levels delineate regions of attraction between probe and protein and are found at known ligand binding clefts in particular. The contours also enable other regions of attraction to be identified and facilitate the interpretation of protein-ligand energetics. They may, therefore, be of value for drug design.

2,676 citations

Journal ArticleDOI
TL;DR: In this paper, the mathematical and statistical structure of PLS regression is developed, the PLS regression algorithm is interpreted in a model-building setting, and the PLS decompositions of the data matrices involved are analyzed.
Abstract: In this paper we develop the mathematical and statistical structure of PLS regression. We show the PLS regression algorithm and how it can be interpreted in model building. The basic mathematical principles that lie behind two-block PLS are depicted. We also show the statistical aspects of the PLS method when it is used for model building. Finally, we show the structure of the PLS decompositions of the data matrices involved.

1,778 citations
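
The abstract above refers to the PLS regression algorithm and the decompositions it produces. As a rough illustration, the sketch below implements a NIPALS-style PLS1 fit (single response) in NumPy. The function name `pls1_fit`, the assumption that inputs are already mean-centered, and the demo data are hypothetical; this is one common textbook formulation, not necessarily the exact presentation used in the paper.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """NIPALS-style PLS1 on mean-centered X (n x k) and y (n,)."""
    n, k = X.shape
    W = np.zeros((k, n_components))   # weights
    P = np.zeros((k, n_components))   # X loadings
    T = np.zeros((n, n_components))   # scores
    q = np.zeros(n_components)        # y loadings
    Xa, ya = X.copy(), y.copy()
    for a in range(n_components):
        w = Xa.T @ ya
        w /= np.linalg.norm(w)
        t = Xa @ w
        tt = t @ t
        p = Xa.T @ t / tt
        q[a] = ya @ t / tt
        # Deflate both blocks before extracting the next component.
        Xa = Xa - np.outer(t, p)
        ya = ya - q[a] * t
        W[:, a], P[:, a], T[:, a] = w, p, t
    # Regression coefficients expressed in the original X variables.
    B = W @ np.linalg.solve(P.T @ W, q)
    return B, T, P, W

# Hypothetical demo: 25 samples, 4 variables, slightly noisy linear response.
rng = np.random.default_rng(1)
X_raw = rng.normal(size=(25, 4))
y_raw = X_raw @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.05 * rng.normal(size=25)
Xc = X_raw - X_raw.mean(axis=0)
yc = y_raw - y_raw.mean()

B, T, P, W = pls1_fit(Xc, yc, n_components=4)
B_ols = np.linalg.lstsq(Xc, yc, rcond=None)[0]
```

With as many components as X has columns (and full column rank), the PLS1 coefficients coincide with ordinary least squares; in practice fewer components are retained, typically chosen by cross-validation.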

Journal ArticleDOI
TL;DR: In this article, a multi-wavelength concept for optical correction (Multiplicative Scatter Correction, MSC) is proposed for separating the chemical light absorption from the physical light scatter.
Abstract: This paper is concerned with the quantitative analysis of multicomponent mixtures by diffuse reflectance spectroscopy. Near-infrared reflectance (NIRR) measurements are related to chemical composition but in a nonlinear way, and light scatter distorts the data. Various response linearizations of reflectance (R) are compared (R with Saunderson correction for internal reflectance, log 1/R, and Kubelka-Munk transformations and its inverse). A multi-wavelength concept for optical correction (Multiplicative Scatter Correction, MSC) is proposed for separating the chemical light absorption from the physical light scatter. Partial Least Squares (PLS) regression is used as the multivariate linear calibration method for predicting fat in meat from linearized and scatter-corrected NIRR data over a broad concentration range. All the response linearization methods improved fat prediction when used with the MSC; corrected log 1/R and inverse Kubelka-Munk transformations yielded the best results. The MSC provided simpler calibration models with good correspondence to the expected physical model of meat. The scatter coefficients obtained from the MSC correlated with fat content, indicating that fat affects the NIRR of meat with an additive absorption component and a multiplicative scatter component.

1,309 citations
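
The Multiplicative Scatter Correction described in the abstract above separates an additive and a multiplicative scatter term from the chemical signal. A minimal NumPy sketch is given below, assuming the usual formulation in which each spectrum is regressed on a reference spectrum (here the mean spectrum) and then corrected with the fitted offset and slope. The function name `msc` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    if reference is None:
        reference = spectra.mean(axis=0)   # assumption: mean spectrum as reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, x in enumerate(spectra):
        # Fit x ≈ intercept + slope * reference by least squares.
        slope, intercept = np.polyfit(reference, x, 1)
        corrected[i] = (x - intercept) / slope
    return corrected, reference

# Synthetic demo: one underlying spectrum distorted by offsets and gains.
base = np.sin(np.linspace(0.0, 3.0, 50)) + 2.0
offsets = np.array([0.1, -0.2, 0.3])
gains = np.array([0.8, 1.0, 1.3])
spectra = offsets[:, None] + gains[:, None] * base

corrected, reference = msc(spectra)
```

In this idealized demo every spectrum is an exact affine function of the reference, so all corrected spectra collapse onto the reference; real spectra retain the chemical differences that the linear fit does not absorb.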