Title:
Introduction to high-dimensional statistics
Personal Author:
Giraud, Christophe
Series:
Monographs on statistics and applied probability ; 139
Publication Information:
Boca Raton : CRC Press, Taylor & Francis Group, 2015
Physical Description:
xv, 255 pages : illustrations ; 24 cm.
ISBN:
9781482237948

Available:
Item Barcode: 30000010344150
Call Number: QC20.7.D55 G57 2015
Material Type: Open Access Book
Item Category 1: Book

Summary

Ever more powerful computing technologies have given rise to an exponentially growing volume of data. Today, massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. Analyzing such data, however, challenges statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise.

Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for handling high-dimensional data. The book is intended to expose the reader to the key concepts and ideas in the simplest possible settings, while avoiding unnecessary technicalities.

Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this highly accessible text:

Describes the challenges related to the analysis of high-dimensional data
Covers cutting-edge statistical methods including model selection, sparsity and the lasso, aggregation, and learning theory
Provides detailed exercises at the end of every chapter with collaborative solutions on a wikisite
Illustrates concepts with simple but clear practical examples

Introduction to High-Dimensional Statistics is suitable for graduate students and researchers interested in discovering modern statistics for massive data. It can be used as a graduate text or for self-study.


Author Notes

Christophe Giraud was a student of the École Normale Supérieure de Paris and received a Ph.D. in probability theory from the University of Paris 6. He was an assistant professor at the University of Nice from 2002 to 2008. He has been an associate professor at the École Polytechnique since 2008 and a professor at Paris-Sud University (Orsay) since 2012. His current research focuses mainly on the statistical theory of high-dimensional data analysis and its applications to the life sciences.


Table of Contents

Preface p. xiii
Acknowledgments p. xv
1 Introduction p. 1
1.1 High-Dimensional Data p. 1
1.2 Curse of Dimensionality p. 3
1.2.1 Lost in the Immensity of High-Dimensional Spaces p. 3
1.2.2 Fluctuations Cumulate p. 6
1.2.3 An Accumulation of Rare Events May Not Be Rare p. 10
1.2.4 Computational Complexity p. 13
1.3 High-Dimensional Statistics p. 13
1.3.1 Circumventing the Curse of Dimensionality p. 13
1.3.2 A Paradigm Shift p. 16
1.3.3 Mathematics of High-Dimensional Statistics p. 17
1.4 About This Book p. 18
1.4.1 Statistics and Data Analysis p. 18
1.4.2 Purpose of This Book p. 19
1.4.3 Overview p. 19
1.5 Discussion and References p. 21
1.5.1 Take-Home Message p. 21
1.5.2 References p. 21
1.6 Exercises p. 21
1.6.1 Strange Geometry of High-Dimensional Spaces p. 21
1.6.2 Volume of a p-Dimensional Ball p. 22
1.6.3 Tails of a Standard Gaussian Distribution p. 22
1.6.4 Principal Component Analysis p. 23
1.6.5 Basics of Linear Regression p. 24
1.6.6 Concentration of Square Norm of Gaussian Random Variable p. 24
2 Model Selection p. 27
2.1 Statistical Setting p. 27
2.2 To Select among a Collection of Models p. 30
2.2.1 Models and Oracle p. 30
2.2.2 Model Selection Procedures p. 33
2.3 Risk Bound for Model Selection p. 37
2.3.1 Oracle Risk Bound p. 37
2.4 Optimality p. 41
2.4.1 Minimax Optimality p. 41
2.4.2 Frontier of Estimation in High Dimensions p. 45
2.4.3 Minimal Penalties p. 45
2.5 Computational Issues p. 46
2.6 Illustration p. 47
2.7 An Alternative Point of View on Model Selection p. 49
2.8 Discussion and References p. 50
2.8.1 Take-Home Message p. 50
2.8.2 References p. 50
2.9 Exercises p. 51
2.9.1 Orthogonal Design p. 51
2.9.2 Risk Bounds for the Different Sparsity Settings p. 53
2.9.3 Collections of Nested Models p. 54
2.9.4 Segmentation with Dynamic Programming p. 55
2.9.5 Goldenshluger-Lepski Method p. 56
2.9.6 Minimax Lower Bounds p. 58
3 Aggregation of Estimators p. 61
3.1 Introduction p. 61
3.2 Gibbs Mixing of Estimators p. 61
3.3 Oracle Risk Bound p. 62
3.4 Numerical Approximation by Metropolis-Hastings p. 65
3.5 Numerical Illustration p. 68
3.6 Discussion and References p. 69
3.6.1 Take-Home Message p. 69
3.6.2 References p. 69
3.7 Exercises p. 69
3.7.1 Gibbs Distribution p. 69
3.7.2 Orthonormal Setting with Power Law Prior p. 70
3.7.3 Group-Sparse Setting p. 70
3.7.4 Gain of Combining p. 70
3.7.5 Online Aggregation p. 71
4 Convex Criteria p. 73
4.1 Reminder on Convex Multivariate Functions p. 73
4.1.1 Subdifferentials p. 73
4.1.2 Two Useful Properties p. 74
4.2 Lasso Estimator p. 74
4.2.1 Geometric Insights p. 76
4.2.2 Analytic Insights p. 76
4.2.3 Oracle Risk Bound p. 79
4.2.4 Computing the Lasso Estimator p. 82
4.2.5 Removing the Bias of the Lasso Estimator p. 86
4.3 Convex Criteria for Various Sparsity Patterns p. 87
4.3.1 Group-Lasso (Group Sparsity) p. 87
4.3.2 Sparse-Group Lasso (Sparse-Group Sparsity) p. 90
4.3.3 Fused-Lasso (Variation Sparsity) p. 91
4.4 Discussion and References p. 91
4.4.1 Take-Home Message p. 91
4.4.2 References p. 91
4.5 Exercises p. 92
4.5.1 When Is the Lasso Solution Unique? p. 92
4.5.2 Support Recovery via the Witness Approach p. 93
4.5.3 Lower Bound on the Compatibility Constant p. 94
4.5.4 On the Group-Lasso p. 94
4.5.5 Dantzig Selector p. 96
4.5.6 Projection on the l1-Ball p. 97
4.5.7 Ridge and Elastic-Net p. 98
5 Estimator Selection p. 101
5.1 Estimator Selection p. 102
5.2 Cross-Validation Techniques p. 103
5.3 Complexity Selection Techniques p. 105
5.3.1 Coordinate-Sparse Regression p. 106
5.3.2 Group-Sparse Regression p. 107
5.3.3 Multiple Structures p. 108
5.4 Scale-Invariant Criteria p. 108
5.5 References and Discussion p. 113
5.5.1 Take-Home Message p. 113
5.5.2 References p. 115
5.6 Exercises p. 116
5.6.1 Expected V-Fold CV l2-Risk p. 116
5.6.2 Proof of Corollary 5.5 p. 117
5.6.3 Some Properties of Penalty (5.4) p. 117
5.6.4 Selecting the Number of Steps for the Forward Algorithm p. 119
6 Multivariate Regression p. 121
6.1 Statistical Setting p. 121
6.2 A Reminder on Singular Values p. 122
6.3 Low-Rank Estimation p. 123
6.3.1 If We Knew the Rank of A* p. 123
6.3.2 When the Rank of A* Is Unknown p. 126
6.4 Low Rank and Sparsity p. 129
6.4.1 Row-Sparse Matrices p. 129
6.4.2 Criterion for Row-Sparse and Low-Rank Matrices p. 130
6.4.3 Convex Criterion for Low-Rank Matrices p. 134
6.4.4 Convex Criterion for Sparse and Low-Rank Matrices p. 137
6.5 Discussion and References p. 137
6.5.1 Take-Home Message p. 137
6.5.2 References p. 138
6.6 Exercises p. 138
6.6.1 Hard-Thresholding of the Singular Values p. 138
6.6.2 Exact Rank Recovery p. 138
6.6.3 Rank Selection with Unknown Variance p. 139
7 Graphical Models p. 141
7.1 Reminder on Conditional Independence p. 142
7.2 Graphical Models p. 142
7.2.1 Directed Acyclic Graphical Models p. 142
7.2.2 Nondirected Models p. 144
7.3 Gaussian Graphical Models (GGM) p. 147
7.3.1 Connection with the Precision Matrix and the Linear Regression p. 147
7.3.2 Estimating g by Multiple Testing p. 148
7.3.3 Sparse Estimation of the Precision Matrix p. 149
7.3.4 Estimation of g by Regression p. 150
7.4 Practical Issues p. 154
7.5 Discussion and References p. 155
7.5.1 Take-Home Message p. 155
7.5.2 References p. 155
7.6 Exercises p. 156
7.6.1 Factorization in Directed Models p. 156
7.6.2 Moralization of a Directed Graph p. 156
7.6.3 Convexity of -log(det(K)) p. 157
7.6.4 Block Gradient Descent with the l1/l2 Penalty p. 157
7.6.5 Gaussian Graphical Models with Hidden Variables p. 157
7.6.6 Dantzig Estimation of Sparse Gaussian Graphical Models p. 158
7.6.7 Gaussian Copula Graphical Models p. 160
7.6.8 Restricted Isometry Constant for Gaussian Matrices p. 162
8 Multiple Testing p. 165
8.1 An Introductory Example p. 165
8.1.1 Differential Expression of a Single Gene p. 165
8.1.2 Differential Expression of Multiple Genes p. 166
8.2 Statistical Setting p. 167
8.2.1 p-Values p. 167
8.2.2 Multiple Testing Setting p. 168
8.2.3 Bonferroni Correction p. 169
8.3 Controlling the False Discovery Rate p. 169
8.3.1 Heuristics p. 169
8.3.2 Step-Up Procedures p. 170
8.3.3 FDR Control under the WPRDS Property p. 173
8.4 Illustration p. 175
8.5 Discussion and References p. 177
8.5.1 Take-Home Message p. 177
8.5.2 References p. 177
8.6 Exercises p. 177
8.6.1 FDR versus FWER p. 177
8.6.2 WPRDS Property p. 178
8.6.3 Positively Correlated Normal Test Statistics p. 178
9 Supervised Classification p. 181
9.1 Statistical Modeling p. 181
9.1.1 Bayes Classifier p. 181
9.1.2 Parametric Modeling p. 182
9.1.3 Semi-Parametric Modeling p. 183
9.1.4 Nonparametric Modeling p. 185
9.2 Empirical Risk Minimization p. 185
9.2.1 Misclassification Probability of the Empirical Risk Minimizer p. 186
9.2.2 Vapnik-Chervonenkis Dimension p. 190
9.2.3 Dictionary Selection p. 193
9.3 From Theoretical to Practical Classifiers p. 195
9.3.1 Empirical Risk Convexification p. 195
9.3.2 Statistical Properties p. 198
9.3.3 Support Vector Machines p. 202
9.3.4 AdaBoost p. 206
9.3.5 Classifier Selection p. 207
9.4 Discussion and References p. 208
9.4.1 Take-Home Message p. 208
9.4.2 References p. 208
9.5 Exercises p. 208
9.5.1 Linear Discriminant Analysis p. 208
9.5.2 VC Dimension of Linear Classifiers in R^d p. 209
9.5.3 Linear Classifiers with Margin Constraints p. 209
9.5.4 Spectral Kernel p. 210
9.5.5 Computation of the SVM Classifier p. 210
9.5.6 Kernel Principal Component Analysis (KPCA) p. 211
Appendix A Gaussian Distribution p. 213
A.1 Gaussian Random Vectors p. 213
A.2 Chi-Square Distribution p. 214
A.3 Gaussian Conditioning p. 215
Appendix B Probabilistic Inequalities p. 217
B.1 Basic Inequalities p. 217
B.2 Concentration Inequalities p. 218
B.2.1 McDiarmid Inequality p. 218
B.2.2 Gaussian Concentration Inequality p. 221
B.3 Symmetrization and Contraction Lemmas p. 223
B.3.1 Symmetrization Lemma p. 223
B.3.2 Contraction Principle p. 224
B.4 Birgé's Inequality p. 227
Appendix C Linear Algebra p. 229
C.1 Singular Value Decomposition (SVD) p. 229
C.2 Moore-Penrose Pseudo-Inverse p. 230
C.3 Matrix Norms p. 231
C.4 Matrix Analysis p. 232
Appendix D Subdifferentials of Convex Functions p. 235
D.1 Subdifferentials and Subgradients p. 235
D.2 Examples of Subdifferentials p. 237
Appendix E Reproducing Kernel Hilbert Spaces p. 239
Notations p. 243
Bibliography p. 245
Index p. 253