Journal Article

A new efficient method for determining the number of components in PARAFAC models

01 May 2003 - Journal of Chemometrics (John Wiley & Sons, Ltd) - Vol. 17, Iss. 5, pp. 274-286
TL;DR: The core consistency diagnostic (CORCONDIA) helps determine the appropriate number of components for multiway models by scrutinizing the appropriateness of the structural model, based on the data and the estimated parameters.
Abstract: A new diagnostic called the core consistency diagnostic (CORCONDIA) is suggested for determining the proper number of components for multiway models. It applies especially to the parallel factor analysis (PARAFAC) model, but also to other models that can be considered as restricted Tucker3 models. It is based on scrutinizing the ‘appropriateness’ of the structural model based on the data and the estimated parameters of gradually augmented models. A PARAFAC model (employing dimension-wise combinations of components for all modes) is called appropriate if adding other combinations of the same components does not improve the fit considerably. It is proposed to choose the largest model that is still sufficiently appropriate. Using examples from a range of different types of data, it is shown that the core consistency diagnostic is an effective tool for determining the appropriate number of components in e.g. PARAFAC models. However, it is also shown, using simulated data, that the theoretical understanding of CORCONDIA is not yet complete. Copyright © 2003 John Wiley & Sons, Ltd.
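As a rough illustration of the idea in the abstract, the sketch below computes a core consistency value for a three-way PARAFAC model. It assumes the commonly used formulation, which the abstract itself does not spell out: the least-squares Tucker3 core G is computed for the fixed PARAFAC loadings and compared with the superdiagonal core T of ones. The function name `core_consistency`, the array names, and the use of NumPy are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def core_consistency(X, A, B, C):
    """Core consistency (in percent) for a three-way PARAFAC model.

    X is an I x J x K data array; A (I x F), B (J x F) and C (K x F) are
    the estimated PARAFAC loadings (hypothetical names for this sketch).
    """
    F = A.shape[1]
    # Least-squares Tucker3 core for fixed loadings:
    # G = X x1 pinv(A) x2 pinv(B) x3 pinv(C)
    G = np.einsum(
        "ijk,pi,qj,rk->pqr",
        X,
        np.linalg.pinv(A),
        np.linalg.pinv(B),
        np.linalg.pinv(C),
    )
    # Ideal PARAFAC core: ones on the superdiagonal, zeros elsewhere.
    T = np.zeros((F, F, F))
    idx = np.arange(F)
    T[idx, idx, idx] = 1.0
    # Off-superdiagonal energy in G signals that other combinations of
    # the same components would improve the fit. sum(T**2) equals F.
    return 100.0 * (1.0 - np.sum((G - T) ** 2) / F)


# Usage: noise-free trilinear data gives a value close to 100.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 2))
B = rng.standard_normal((6, 2))
C = rng.standard_normal((7, 2))
X = np.einsum("if,jf,kf->ijk", A, B, C)
print(core_consistency(X, A, B, C))
```

In practice the value would be computed for models with an increasing number of components, and the largest model whose value is still high would be retained, in line with the rule stated in the abstract.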
Citations
Journal Article
TL;DR: This survey provides an overview of higher-order tensor decompositions, their applications, and available software.
Abstract: This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or $N$-way array. Decompositions of higher-order tensors (i.e., $N$-way arrays with $N \geq 3$) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, and elsewhere. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal component analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox, Tensor Toolbox, and Multilinear Engine are examples of software packages for working with tensors.
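For a three-way array, the two decompositions named in this abstract can be written as follows. The symbols are assumptions chosen for illustration: $\mathcal{X}$ is the data tensor, $a_r$, $b_r$, $c_r$ are the columns of the factor matrices $A$, $B$, $C$, and $\mathcal{G}$ is the Tucker core.

```latex
% CP (CANDECOMP/PARAFAC): a sum of R rank-one tensors
\mathcal{X} \approx \sum_{r=1}^{R} a_r \circ b_r \circ c_r

% Tucker: a core tensor multiplied by a factor matrix in each mode
\mathcal{X} \approx \mathcal{G} \times_1 A \times_2 B \times_3 C
```

In this notation, CP corresponds to a Tucker model whose core $\mathcal{G}$ is superdiagonal, which is why PARAFAC can be viewed as a restricted Tucker3 model.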

9,227 citations

Book
12 Oct 2009
TL;DR: This book provides a broad survey of models and efficient algorithms for Nonnegative Matrix Factorization (NMF), including NMF's various extensions and modifications, especially Nonnegative Tensor Factorizations (NTF) and Nonnegative Tucker Decompositions (NTD).
Abstract: This book provides a broad survey of models and efficient algorithms for Nonnegative Matrix Factorization (NMF). This includes NMF's various extensions and modifications, especially Nonnegative Tensor Factorizations (NTF) and Nonnegative Tucker Decompositions (NTD). NMF/NTF and their extensions are increasingly used as tools in signal and image processing, and data analysis, having garnered interest due to their capability to provide new insights and relevant information about the complex latent relationships in experimental data sets. It is suggested that NMF can provide meaningful components with physical interpretations; for example, in bioinformatics, NMF and its extensions have been successfully applied to gene expression, sequence analysis, the functional characterization of genes, clustering and text mining. As such, the authors focus on the algorithms that are most useful in practice, looking at those that are fastest, most robust, and suitable for large-scale models. Key features: Acts as a single source reference guide to NMF, collating information that is widely dispersed in current literature, including the authors' own recently developed techniques in the subject area. Uses generalized cost functions such as Bregman, Alpha and Beta divergences, to present practical implementations of several types of robust algorithms, in particular Multiplicative, Alternating Least Squares, Projected Gradient and Quasi Newton algorithms. Provides a comparative analysis of the different methods in order to identify approximation error and complexity. Includes pseudo codes and optimized MATLAB source codes for almost all algorithms presented in the book. The increasing interest in nonnegative matrix and tensor factorizations, as well as decompositions and sparse representation of data, will ensure that this book is essential reading for engineers, scientists, researchers, industry practitioners and graduate students across signal and image processing; neuroscience; data mining and data analysis; computer science; bioinformatics; speech processing; biomedical engineering; and multimedia.

2,136 citations

Journal Article
TL;DR: In this paper, a toolbox for MATLAB is presented to support improved visualisation and sensitivity analyses of PARAFAC models in fluorescence spectroscopy, demonstrated using a dissolved organic matter (DOM) fluorescence dataset.
Abstract: PARAllel FACtor analysis (PARAFAC) is increasingly used to decompose fluorescence excitation emission matrices (EEMs) into their underlying chemical components. In the ideal case where fluorescence conforms to Beer's Law, this process can lead to the mathematical identification and quantification of independently varying fluorophores. However, many practical and analytical hurdles stand between EEM datasets and their chemical interpretation. This article provides a tutorial in the practical application of PARAFAC to fluorescence datasets, demonstrated using a dissolved organic matter (DOM) fluorescence dataset. A new toolbox for MATLAB is presented to support improved visualisation and sensitivity analyses of PARAFAC models in fluorescence spectroscopy.

1,210 citations

Journal Article
TL;DR: In this paper, the authors used parallel factor analysis (PARAFAC) of fluorescence spectra collected on trans-oceanic cruises in the Pacific and Atlantic oceans to investigate the optical characteristics of dissolved organic matter in waters with limited freshwater influence (salinity > 30).

658 citations


Additional excerpts

  • ...Additional diagnostics included with the NWAY- (Andersson and Bro, 2002) and PLS- Matlab toolboxes, particularly core-consistency (Bro and Kiers, 2003) and influence plots, were also examined....


Journal Article
TL;DR: While the multimodel comparisons provide a compelling demonstration of PARAFAC's ability to distill chemical information from EEMs, deficiencies identified through this process have broad implications for interpreting and reusing (D)OM-PARAFAC models.
Abstract: Organic matter (OM) is a ubiquitous constituent of natural waters quantifiable at very low levels using fluorescence spectroscopy. This technique has recognized potential in a range of applications where the ability to monitor water quality in real time is desirable, such as in water treatment systems. This study used PARAFAC to characterize a large (n = 1479) and diverse excitation emission matrix (EEM) data set from six recycled water treatment plants in Australia, for which sources of variability included geography, season, treatment processes, pH and fluorometer settings. Five components were identified independently in four or more plants, none of which were generated during the treatment process nor were typically entirely removed. PARAFAC scores could be obtained from EEMs by simple regression. The results have important implications for online monitoring of OM fluorescence in treatment plants, affecting choices regarding experimental design, instrumentation and the optimal wavelengths for tracking...

589 citations

References
Journal Article
TL;DR: The multi-way decomposition method PARAFAC is a generalization of PCA to higher order arrays, but some of the characteristics of the method are quite different from the ordinary two-way case.

2,468 citations

Journal Article
TL;DR: This communication is by no means an attempt to summarize or review the extensive work done in multiway data analysis but is intended solely for informing the reader of the existence, functionality, and applicability of the N-way Toolbox for MATLAB.

1,173 citations

Book
01 Jan 1954
TL;DR: In this article, the authors present a course to make the student understand the advanced instrumentation available for chemical analysis, and the student would be able to choose the instrument needed for analysis.
Abstract: Course Educational Objectives: To make the student understand the advanced instrumentation available for chemical analysis. Course Outcomes: After studying this course the student would be able to choose the instrument needed for analysis. UNIT-I (12 Lectures) AN INTRODUCTION TO INSTRUMENTAL METHODS: Terms Associated With Chemical Analysis, Classification Of Instrumental Techniques, A Review Of The Important Considerations In Analytical Methods, Basic Functions of Instrumentation, Important Considerations in Evaluating an Instrumental Method.

867 citations

Journal Article
TL;DR: In this paper, an approach for handling retention time shifts in resolving chromatographic data using the PARAFAC2 model is presented, where the matrix of elution profiles preserves its inner product structure from sample to sample.
Abstract: This paper offers an approach for handling retention time shifts in resolving chromatographic data using the PARAFAC2 model. In Part I of this series an algorithm for PARAFAC2 was developed and extended to N-way arrays. It was discussed that the PARAFAC2 model has a number of attractive features. It is unique under mild conditions though it puts fewer restrictions on the data than the well-known PARAFAC1 model. This has important implications for the modeling of chromatographic data in which retention time shifts can be regarded as a violation of the assumption of parallel proportional profiles underlying the PARAFAC1 model. The PARAFAC2 model does not assume parallel proportional elution profiles, but only that the matrix of elution profiles preserve its ‘inner-product structure’ from sample to sample. This means that the cross-products of the matrix holding the elution profiles in its columns remain constant. Here an application using chromatographic separation based on the molecular size of thick juice samples from the beet sugar industry illustrates the benefit of using the PARAFAC2 model. Copyright © 1999 John Wiley & Sons, Ltd.
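The inner-product constraint described in this abstract can be stated compactly. The notation is an assumption chosen for illustration: $X_k$ is the data matrix for sample $k$, $F_k$ holds the elution profiles of that sample in its columns, $D_k$ is a diagonal matrix of sample-specific weights, $A$ holds the profiles shared by all samples, and $E_k$ is the residual.

```latex
% PARAFAC2 model for sample k = 1, ..., K
X_k = F_k D_k A^{\mathsf{T}} + E_k

% Constraint: the cross-product of the elution profiles is the same for every sample
F_k^{\mathsf{T}} F_k = \Phi \quad \text{for all } k
```

The PARAFAC1 model is the stricter special case $F_k = F$ for all $k$, which is the assumption that retention time shifts violate.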

295 citations

Journal Article
TL;DR: In this article, the authors used the unique parallel factor analysis (PARAFAC) model to decompose complex mixture signals into contributions from individual chemical components to analyze, understand, predict and monitor the quality based on a chemical foundation.

217 citations