Title:
Introduction to multivariate statistical analysis in chemometrics
Publication Information:
Boca Raton, FL : CRC Press, 2009
Physical Description:
xiii, 321 p. : ill. ; 25 cm.
ISBN:
9781420059472
Available:

Item Barcode | Call Number | Material Type | Item Category 1
30000010207100 | QD75.4.C45 V37 2009 | Open Access Book | Book
30000010231388 | QD75.4.C45 V37 2009 | Open Access Book | Book

Summary

Using formal descriptions, graphical illustrations, practical examples, and R software tools, Introduction to Multivariate Statistical Analysis in Chemometrics presents simple yet thorough explanations of the most important multivariate statistical methods for analyzing chemical data. It includes discussions of various statistical methods, such as principal component analysis, regression analysis, classification methods, and clustering.

Written by a chemometrician and a statistician, the book reflects the practical approach of chemometrics and the more formally oriented one of statistics. To enable a better understanding of the statistical methods, the authors apply them to real data examples from chemistry. They also examine results of the different methods, comparing traditional approaches with their robust counterparts. In addition, the authors use the freely available R software environment to implement methods, encouraging readers to go through the examples and adapt the procedures to their own problems.

Focusing on the practicality of the methods and the validity of the results, this book offers concise mathematical descriptions of many multivariate methods and employs graphical schemes to visualize key concepts. It effectively imparts a basic understanding of how to apply statistical methods to multivariate scientific data.


Table of Contents

Preface
Acknowledgments
Authors
Chapter 1 Introduction
1.1 Chemoinformatics-Chemometrics-Statistics
1.2 This Book
1.3 Historical Remarks about Chemometrics
1.4 Bibliography
1.5 Starting Examples
1.5.1 Univariate versus Bivariate Classification
1.5.2 Nitrogen Content of Cereals Computed from NIR Data
1.5.3 Elemental Composition of Archaeological Glasses
1.6 Univariate Statistics-A Reminder
1.6.1 Empirical Distributions
1.6.2 Theoretical Distributions
1.6.3 Central Value
1.6.4 Spread
1.6.5 Statistical Tests
References
Chapter 2 Multivariate Data
2.1 Definitions
2.2 Basic Preprocessing
2.2.1 Data Transformation
2.2.2 Centering and Scaling
2.2.3 Normalization
2.2.4 Transformations for Compositional Data
2.3 Covariance and Correlation
2.3.1 Overview
2.3.2 Estimating Covariance and Correlation
2.4 Distances and Similarities
2.5 Multivariate Outlier Identification
2.6 Linear Latent Variables
2.6.1 Overview
2.6.2 Projection and Mapping
2.6.3 Example
2.7 Summary
References
Chapter 3 Principal Component Analysis
3.1 Concepts
3.2 Number of PCA Components
3.3 Centering and Scaling
3.4 Outliers and Data Distribution
3.5 Robust PCA
3.6 Algorithms for PCA
3.6.1 Mathematics of PCA
3.6.2 Jacobi Rotation
3.6.3 Singular Value Decomposition
3.6.4 NIPALS
3.7 Evaluation and Diagnostics
3.7.1 Cross Validation for Determination of the Number of Principal Components
3.7.2 Explained Variance for Each Variable
3.7.3 Diagnostic Plots
3.8 Complementary Methods for Exploratory Data Analysis
3.8.1 Factor Analysis
3.8.2 Cluster Analysis and Dendrogram
3.8.3 Kohonen Mapping
3.8.4 Sammon's Nonlinear Mapping
3.8.5 Multiway PCA
3.9 Examples
3.9.1 Tissue Samples from Human Mummies and Fatty Acid Concentrations
3.9.2 Polycyclic Aromatic Hydrocarbons in Aerosol
3.10 Summary
References
Chapter 4 Calibration
4.1 Concepts
4.2 Performance of Regression Models
4.2.1 Overview
4.2.2 Overfitting and Underfitting
4.2.3 Performance Criteria
4.2.4 Criteria for Models with Different Numbers of Variables
4.2.5 Cross Validation
4.2.6 Bootstrap
4.3 Ordinary Least-Squares Regression
4.3.1 Simple OLS
4.3.2 Multiple OLS
4.3.2.1 Confidence Intervals and Statistical Tests in OLS
4.3.2.2 Hat Matrix and Full Cross Validation in OLS
4.3.3 Multivariate OLS
4.4 Robust Regression
4.4.1 Overview
4.4.2 Regression Diagnostics
4.4.3 Practical Hints
4.5 Variable Selection
4.5.1 Overview
4.5.2 Univariate and Bivariate Selection Methods
4.5.3 Stepwise Selection Methods
4.5.4 Best-Subset Regression
4.5.5 Variable Selection Based on PCA or PLS Models
4.5.6 Genetic Algorithms
4.5.7 Cluster Analysis of Variables
4.5.8 Example
4.6 Principal Component Regression
4.6.1 Overview
4.6.2 Number of PCA Components
4.7 Partial Least-Squares Regression
4.7.1 Overview
4.7.2 Mathematical Aspects
4.7.3 Kernel Algorithm for PLS
4.7.4 NIPALS Algorithm for PLS
4.7.5 SIMPLS Algorithm for PLS
4.7.6 Other Algorithms for PLS
4.7.7 Robust PLS
4.8 Related Methods
4.8.1 Canonical Correlation Analysis
4.8.2 Ridge and Lasso Regression
4.8.3 Nonlinear Regression
4.8.3.1 Basis Expansions
4.8.3.2 Kernel Methods
4.8.3.3 Regression Trees
4.8.3.4 Artificial Neural Networks
4.9 Examples
4.9.1 GC Retention Indices of Polycyclic Aromatic Compounds
4.9.1.1 Principal Component Regression
4.9.1.2 Partial Least-Squares Regression
4.9.1.3 Robust PLS
4.9.1.4 Ridge Regression
4.9.1.5 Lasso Regression
4.9.1.6 Stepwise Regression
4.9.1.7 Summary
4.9.2 Cereal Data
4.10 Summary
References
Chapter 5 Classification
5.1 Concepts
5.2 Linear Classification Methods
5.2.1 Linear Discriminant Analysis
5.2.1.1 Bayes Discriminant Analysis
5.2.1.2 Fisher Discriminant Analysis
5.2.1.3 Example
5.2.2 Linear Regression for Discriminant Analysis
5.2.2.1 Binary Classification
5.2.2.2 Multicategory Classification with OLS
5.2.2.3 Multicategory Classification with PLS
5.2.3 Logistic Regression
5.3 Kernel and Prototype Methods
5.3.1 SIMCA
5.3.2 Gaussian Mixture Models
5.3.3 k-NN Classification
5.4 Classification Trees
5.5 Artificial Neural Networks
5.6 Support Vector Machine
5.7 Evaluation
5.7.1 Principles and Misclassification Error
5.7.2 Predictive Ability
5.7.3 Confidence in Classification Answers
5.8 Examples
5.8.1 Origin of Glass Samples
5.8.1.1 Linear Discriminant Analysis
5.8.1.2 Logistic Regression
5.8.1.3 Gaussian Mixture Models
5.8.1.4 k-NN Methods
5.8.1.5 Classification Trees
5.8.1.6 Artificial Neural Networks
5.8.1.7 Support Vector Machines
5.8.1.8 Overall Comparison
5.8.2 Recognition of Chemical Substructures from Mass Spectra
5.9 Summary
References
Chapter 6 Cluster Analysis
6.1 Concepts
6.2 Distance and Similarity Measures
6.3 Partitioning Methods
6.4 Hierarchical Clustering Methods
6.5 Fuzzy Clustering
6.6 Model-Based Clustering
6.7 Cluster Validity and Clustering Tendency Measures
6.8 Examples
6.8.1 Chemotaxonomy of Plants
6.8.2 Glass Samples
6.9 Summary
References
Chapter 7 Preprocessing
7.1 Concepts
7.2 Smoothing and Differentiation
7.3 Multiplicative Signal Correction
7.4 Mass Spectral Features
7.4.1 Logarithmic Intensity Ratios
7.4.2 Averaged Intensities of Mass Intervals
7.4.3 Intensities Normalized to Local Intensity Sum
7.4.4 Modulo-14 Summation
7.4.5 Autocorrelation
7.4.6 Spectra Type
7.4.7 Example
References
Appendix 1 Symbols and Abbreviations
Appendix 2 Matrix Algebra
A.2.1 Definitions
A.2.2 Addition and Subtraction of Matrices
A.2.3 Multiplication of Vectors
A.2.4 Multiplication of Matrices
A.2.5 Matrix Inversion
A.2.6 Eigenvectors
A.2.7 Singular Value Decomposition
References
Appendix 3 Introduction to R
A.3.1 General Information on R
A.3.2 Installing R
A.3.3 Starting R
A.3.4 Working Directory
A.3.5 Loading and Saving Data
A.3.6 Important R Functions
A.3.7 Operators and Basic Functions
Mathematical and Logical Operators, Comparison
Special Elements
Mathematical Functions
Matrix Manipulation
Statistical Functions
A.3.8 Data Types
Missing Values
A.3.9 Data Structures
A.3.10 Selection and Extraction from Data Objects
Examples for Creating Vectors
Examples for Selecting Elements from a Vector or Factor
Examples for Selecting Elements from a Matrix, Array, or Data Frame
Examples for Selecting Elements from a List
A.3.11 Generating and Saving Graphics
Functions Relevant for Graphics
Relevant Plot Parameters
Statistical Graphics
Saving Graphic Output
References
Index