scispace - formally typeset
Author

Richard G. Brereton

Bio: Richard G. Brereton is an academic researcher at the University of Bristol. He has contributed to research in chemometrics and principal component analysis, has an h-index of 41, and has co-authored 237 publications receiving 10,827 citations. His previous affiliations include the University of Cambridge and the Austrian Academy of Sciences.


Papers
Journal ArticleDOI
25 Jan 2010-Analyst
TL;DR: The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described, from two-class and multiclass classifiers through one-class Support Vector Domain Description to Support Vector Regression, including SVR's application to multivariate calibration and why it is useful when there are outliers and non-linearities.
Abstract: The increasing interest in Support Vector Machines (SVMs) over the past 15 years is described. Methods are illustrated using simulated case studies, and 4 experimental case studies, namely mass spectrometry for studying pollution, near infrared analysis of food, thermal analysis of polymers and UV/visible spectroscopy of polyaromatic hydrocarbons. The basis of SVMs as two-class classifiers is shown with extensive visualisation, including learning machines, kernels and penalty functions. The influence of the penalty error and radial basis function radius on the model is illustrated. Multiclass implementations including one vs. all, one vs. one, fuzzy rules and Directed Acyclic Graph (DAG) trees are described. One-class Support Vector Domain Description (SVDD) is described and contrasted to conventional two- or multi-class classifiers. The use of Support Vector Regression (SVR) is illustrated including its application to multivariate calibration, and why it is useful when there are outliers and non-linearities.
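The abstract's two main tuning ideas, the penalty error C and the RBF kernel radius (gamma), and the one-vs-one multiclass scheme and epsilon-insensitive SVR it describes, can be sketched with scikit-learn. This is an illustrative sketch on synthetic data, not the paper's own code or case studies; the data sets and parameter values here are arbitrary.

```python
import numpy as np
from sklearn.svm import SVC, SVR
from sklearn.datasets import make_blobs

# Multiclass SVM: C (penalty error) and gamma (RBF radius) are the two
# knobs whose influence the paper illustrates; "ovo" is one-vs-one.
X, y = make_blobs(n_samples=150, centers=3, cluster_std=1.0, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma=0.5, decision_function_shape="ovo")
clf.fit(X, y)
train_acc = clf.score(X, y)

# Support Vector Regression: the epsilon-insensitive loss ignores errors
# smaller than epsilon, which helps with outliers and non-linearities.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 60)[:, None]
t = np.sin(2 * np.pi * x).ravel() + 0.05 * rng.standard_normal(60)
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1).fit(x, t)
r2 = reg.score(x, t)
```

Shrinking gamma smooths the boundary; raising C penalises margin violations more heavily, both of which the paper visualises on its case studies.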

1,899 citations

Book
12 Mar 2003
TL;DR: The concept of and need for Principal Components Analysis, unsupervised pattern recognition (cluster analysis), supervised pattern recognition, and their application in chemistry are explained.
Abstract: Preface. Supplementary Information. Acknowledgements.
1. INTRODUCTION: Points of View. Software and Calculations. Further Reading. References.
2. EXPERIMENTAL DESIGN: Introduction. Basic Principles. Factorial Designs. Central Composite or Response Surface Designs. Mixture Designs. Simplex Optimisation. Problems.
3. SIGNAL PROCESSING: Sequential Signals in Chemistry. Basics. Linear Filters. Correlograms and Time Series Analysis. Fourier Transform Techniques. Topical Methods. Problems.
4. PATTERN RECOGNITION: Introduction. The Concept and Need for Principal Components Analysis. Principal Components Analysis: the Method. Unsupervised Pattern Recognition: Cluster Analysis. Supervised Pattern Recognition. Multiway Pattern Recognition. Problems.
5. CALIBRATION: Introduction. Univariate Calibration. Multiple Linear Regression. Principal Components Regression. Partial Least Squares. Model Validation. Problems.
6. EVOLUTIONARY SIGNALS: Introduction. Exploratory Data Analysis and Preprocessing. Determining Composition. Resolution. Problems.
Appendices: A.1 Vectors and Matrices. A.2 Algorithms. A.3 Basic Statistical Concepts. A.4 Excel for Chemometrics. A.5 Matlab for Chemometrics.
Index
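Principal Components Analysis, central to the pattern-recognition chapter above, reduces to a singular value decomposition of the mean-centred data matrix. A minimal numpy sketch (illustrative only, not from the book):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data with correlated columns, as is typical of spectroscopic tables.
X = rng.standard_normal((50, 4)) @ rng.standard_normal((4, 4))

Xc = X - X.mean(axis=0)            # mean-centre: the usual chemometrics preprocessing
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = U * s                     # PCA scores (samples in component space)
loadings = Vt                      # PCA loadings (variable contributions)
explained = s**2 / np.sum(s**2)    # fraction of variance per component

# With all components retained, scores @ loadings reconstructs Xc exactly.
reconstruction_error = np.abs(scores @ loadings - Xc).max()
```

Truncating to the first few rows of `loadings` gives the low-dimensional summary used for scores plots and cluster analysis.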

1,411 citations

Journal ArticleDOI
TL;DR: Partial least squares discriminant analysis (PLS-DA) has been available for nearly 20 years yet is poorly understood by most users. Despite its limitations, PLS-DA can provide good insight into the causes of discrimination via weights and loadings, which gives it a unique role in exploratory data analysis, for example in metabolomics via visualisation of significant variables such as metabolites or spectroscopic peaks.
Abstract: Partial least squares discriminant analysis (PLS-DA) has been available for nearly 20 years yet is poorly understood by most users. By simple examples, it is shown graphically and algebraically that for two equal class sizes, PLS-DA using one partial least squares (PLS) component provides equivalent classification results to Euclidean distance to centroids, and by using all nonzero components to linear discriminant analysis. Extensions where there are unequal class sizes and more than two classes are discussed including common pitfalls and dilemmas. Finally, the problems of overfitting and PLS scores plots are discussed. It is concluded that for classification purposes, PLS-DA has no significant advantages over traditional procedures and is an algorithm full of dangers. It should not be viewed as a single integrated method but as a step in a full classification procedure. However, despite these limitations, PLS-DA can provide good insight into the causes of discrimination via weights and loadings, which gives it a unique role in exploratory data analysis, for example in metabolomics via visualisation of significant variables such as metabolites or spectroscopic peaks. Copyright © 2014 John Wiley & Sons, Ltd.

578 citations

Book
02 Apr 2007
TL;DR: This book traces the development of chemometrics from experimental design, statistical concepts and signal processing through pattern recognition and calibration to applications in chromatography, process analytics, biology, medicine, image analysis and food science.
Abstract: Preface.
1 Introduction: Development of Chemometrics. Application Areas. How to Use this Book. Literature and Other Sources of Information. References.
2 Experimental Design: Why Design Experiments in Chemistry? Degrees of Freedom and Sources of Error. Analysis of Variance and Interpretation of Errors. Matrices, Vectors and the Pseudoinverse. Design Matrices. Factorial Designs. An Example of a Factorial Design. Fractional Factorial Designs. Plackett-Burman and Taguchi Designs. The Application of a Plackett-Burman Design to the Screening of Factors Influencing a Chemical Reaction. Central Composite Designs. Mixture Designs. A Four Component Mixture Design Used to Study Blending of Olive Oils. Simplex Optimization. Leverage and Confidence in Models. Designs for Multivariate Calibration. References.
3 Statistical Concepts: Statistics for Chemists. Errors. Describing Data. The Normal Distribution. Is a Distribution Normal? Hypothesis Tests. Comparison of Means: the t-Test. F-Test for Comparison of Variances. Confidence in Linear Regression. More about Confidence. Consequences of Outliers and How to Deal with Them. Detection of Outliers. Shewhart Charts. More about Control Charts. References.
4 Sequential Methods: Sequential Data. Correlograms. Linear Smoothing Functions and Filters. Fourier Transforms. Maximum Entropy and Bayesian Methods. Fourier Filters. Peakshapes in Chromatography and Spectroscopy. Derivatives in Spectroscopy and Chromatography. Wavelets. References.
5 Pattern Recognition: Introduction. Principal Components Analysis. Graphical Representation of Scores and Loadings. Comparing Multivariate Patterns. Preprocessing. Unsupervised Pattern Recognition: Cluster Analysis. Supervised Pattern Recognition. Statistical Classification Techniques. K Nearest Neighbour Method. How Many Components Characterize a Dataset? Multiway Pattern Recognition. References.
6 Calibration: Introduction. Univariate Calibration. Multivariate Calibration and the Spectroscopy of Mixtures. Multiple Linear Regression. Principal Components Regression. Partial Least Squares. How Good is the Calibration and What is the Most Appropriate Model? Multiway Calibration. References.
7 Coupled Chromatography: Introduction. Preparing the Data. Chemical Composition of Sequential Data. Univariate Purity Curves. Similarity Based Methods. Evolving and Window Factor Analysis. Derivative Based Methods. Deconvolution of Evolutionary Signals. Noniterative Methods for Resolution. Iterative Methods for Resolution.
8 Equilibria, Reactions and Process Analytics: The Study of Equilibria using Spectroscopy. Spectroscopic Monitoring of Reactions. Kinetics and Multivariate Models for the Quantitative Study of Reactions. Developments in the Analysis of Reactions using On-line Spectroscopy. The Process Analytical Technology Initiative. References.
9 Improving Yields and Processes Using Experimental Designs: Introduction. Use of Statistical Designs for Improving the Performance of Synthetic Reactions. Screening for Factors that Influence the Performance of a Reaction. Optimizing the Process Variables. Handling Mixture Variables using Simplex Designs. More about Mixture Variables.
10 Biological and Medical Applications of Chemometrics: Introduction. Taxonomy. Discrimination. Mahalanobis Distance. Bayesian Methods and Contingency Tables. Support Vector Machines. Discriminant Partial Least Squares. Micro-organisms. Medical Diagnosis using Spectroscopy. Metabolomics using Coupled Chromatography and Nuclear Magnetic Resonance. References.
11 Biological Macromolecules: Introduction. Sequence Alignment and Scoring Matches. Sequence Similarity. Tree Diagrams. Phylogenetic Trees. References.
12 Multivariate Image Analysis: Introduction. Scaling Images. Filtering and Smoothing the Image. Principal Components for the Enhancement of Images. Regression of Images. Alternating Least Squares as Employed in Image Analysis. Multiway Methods in Image Analysis. References.
13 Food: Introduction. How to Determine the Origin of a Food Product using Chromatography. Near Infrared Spectroscopy. Other Information. Sensory Analysis: Linking Composition to Properties. Varimax Rotation. Calibrating Sensory Descriptors to Composition. References.
Index.

496 citations


Cited by
Journal ArticleDOI
TL;DR: The asynchronous pipeline scheme provides other substantial advantages, including high flexibility, favorable processing speeds, choice of both all-in-memory and disk-bound processing, easy adaptation to different data formats, simpler software development and maintenance, and the ability to distribute processing tasks on multi-CPU computers and computer networks.
Abstract: The NMRPipe system is a UNIX software environment of processing, graphics, and analysis tools designed to meet current routine and research-oriented multidimensional processing requirements, and to anticipate and accommodate future demands and developments. The system is based on UNIX pipes, which allow programs running simultaneously to exchange streams of data under user control. In an NMRPipe processing scheme, a stream of spectral data flows through a pipeline of processing programs, each of which performs one component of the overall scheme, such as Fourier transformation or linear prediction. Complete multidimensional processing schemes are constructed as simple UNIX shell scripts. The processing modules themselves maintain and exploit accurate records of data sizes, detection modes, and calibration information in all dimensions, so that schemes can be constructed without the need to explicitly define or anticipate data sizes or storage details of real and imaginary channels during processing. The asynchronous pipeline scheme provides other substantial advantages, including high flexibility, favorable processing speeds, choice of both all-in-memory and disk-bound processing, easy adaptation to different data formats, simpler software development and maintenance, and the ability to distribute processing tasks on multi-CPU computers and computer networks.
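NMRPipe itself is a UNIX shell environment whose stages are separate programs joined by pipes. Purely as an illustration of the architecture the abstract describes, a stream of spectral data flowing through single-purpose stages, here is a toy Python analogue (not NMRPipe code, and the stage names and window parameters are invented for the example):

```python
import numpy as np
from functools import reduce

def apodise(fid):
    # exponential window applied to the free-induction decay
    return fid * np.exp(-np.linspace(0.0, 3.0, fid.size))

def zero_fill(fid):
    # double the length, as zero-filling before the FT typically does
    return np.concatenate([fid, np.zeros(fid.size, dtype=fid.dtype)])

def fourier(fid):
    return np.fft.fftshift(np.fft.fft(fid))

def pipeline(data, *stages):
    # each stage consumes the previous stage's output,
    # like programs exchanging a data stream through UNIX pipes
    return reduce(lambda d, stage: stage(d), stages, data)

# Synthetic one-dimensional "FID": a complex sinusoid at 0.1 cycles/sample.
t = np.arange(256)
fid = np.exp(2j * np.pi * 0.1 * t)

spectrum = pipeline(fid, apodise, zero_fill, fourier)
```

In NMRPipe the equivalent scheme is a shell script, and the modules additionally carry size, detection-mode and calibration metadata along the stream, which this sketch omits.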

13,804 citations

Journal ArticleDOI
TL;DR: Methods specifically designed for collinearity, such as latent variable methods and tree-based models, did not outperform the traditional GLM with threshold-based pre-selection; the results highlight the value of GLM combined with penalised methods and thresholds when omitted variables are considered in the final interpretation.
Abstract: Collinearity refers to the non independence of predictor variables, usually in a regression-type analysis. It is a common feature of any descriptive ecological data set and can be a problem for parameter estimation because it inflates the variance of regression parameters and hence potentially leads to the wrong identification of relevant predictors in a statistical model. Collinearity is a severe problem when a model is trained on data from one region or time, and predicted to another with a different or unknown structure of collinearity. To demonstrate the reach of the problem of collinearity in ecology, we show how relationships among predictors differ between biomes, change over spatial scales and through time. Across disciplines, different approaches to addressing collinearity problems have been developed, ranging from clustering of predictors, threshold-based pre-selection, through latent variable methods, to shrinkage and regularisation. Using simulated data with five predictor-response relationships of increasing complexity and eight levels of collinearity we compared ways to address collinearity with standard multiple regression and machine-learning approaches. We assessed the performance of each approach by testing its impact on prediction to new data. In the extreme, we tested whether the methods were able to identify the true underlying relationship in a training dataset with strong collinearity by evaluating its performance on a test dataset without any collinearity. We found that methods specifically designed for collinearity, such as latent variable methods and tree based models, did not outperform the traditional GLM and threshold-based pre-selection. Our results highlight the value of GLM in combination with penalised methods (particularly ridge) and threshold-based pre-selection when omitted variables are considered in the final interpretation. 
However, all approaches tested yielded degraded predictions under a change in collinearity structure, and the folklore threshold of a correlation between predictor variables of |r| > 0.7 was an appropriate indicator for when collinearity begins to severely distort model estimation and subsequent prediction. The use of ecological understanding of the system in pre-analysis variable selection and the choice of the least sensitive statistical approaches reduce the problems of collinearity, but cannot ultimately solve them.
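Two of the ingredients above, the |r| > 0.7 screening rule and penalised (ridge) regression as a remedy, can be illustrated in a few lines. This is a hedged sketch on simulated data, not the paper's simulation design:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(42)
n = 200
x1 = rng.standard_normal(n)
x2 = x1 + 0.05 * rng.standard_normal(n)   # nearly a copy of x1: strong collinearity
X = np.column_stack([x1, x2])
y = 1.0 * x1 + 0.5 * x2 + 0.1 * rng.standard_normal(n)

# The screening statistic behind the |r| > 0.7 rule of thumb.
r = np.corrcoef(x1, x2)[0, 1]

# Collinearity inflates the variance of OLS coefficients;
# the ridge penalty shrinks them back toward zero.
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)
ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
```

The shrinkage is guaranteed: each ridge coefficient component is the OLS component scaled by s²/(s² + alpha) along the singular directions of X, which is why ridge fared well in the paper's comparison.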

6,199 citations

Journal ArticleDOI
01 May 1981
TL;DR: This book covers detecting influential observations and outliers, detecting and assessing collinearity, and remedies and applications of these regression diagnostics.
Abstract: 1. Introduction and Overview. 2. Detecting Influential Observations and Outliers. 3. Detecting and Assessing Collinearity. 4. Applications and Remedies. 5. Research Issues and Directions for Extensions. Bibliography. Author Index. Subject Index.
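Two of the diagnostics listed above, leverage from the hat matrix and Cook's distance for flagging influential observations, can be computed directly. A minimal numpy sketch (illustrative, not the book's own code), with one gross outlier planted:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), rng.standard_normal(n)])  # intercept + one predictor
y = 2.0 + 3.0 * X[:, 1] + 0.2 * rng.standard_normal(n)
y[0] += 10.0                                   # plant one gross outlier

H = X @ np.linalg.inv(X.T @ X) @ X.T           # hat matrix
leverage = np.diag(H)                          # h_ii: pull of each point on its own fit
resid = y - H @ y
p = X.shape[1]
s2 = resid @ resid / (n - p)                   # residual variance estimate

# Cook's distance: D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2
cooks = resid**2 / (p * s2) * leverage / (1.0 - leverage) ** 2
most_influential = int(np.argmax(cooks))
```

The trace of the hat matrix equals the number of parameters, so the leverages always sum to p; points with both large residual and large leverage dominate Cook's distance, which is the book's central message.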

4,948 citations

01 Aug 2000
TL;DR: A Bioentrepreneur course assessing medical technology in the context of commercialization, addressing many issues unique to biomedical products.
Abstract: BIOE 402. Medical Technology Assessment. 2 or 3 hours. Bioentrepreneur course. Assessment of medical technology in the context of commercialization. Objectives, competition, market share, funding, pricing, manufacturing, growth, and intellectual property; many issues unique to biomedical products. Course Information: 2 undergraduate hours. 3 graduate hours. Prerequisite(s): Junior standing or above and consent of the instructor.

4,833 citations