Item Barcode | Call Number | Material Type | Item Category |
---|---|---|---|
30000010369665 | Q325.5 A76 2019 | Open Access Book | Book |
33000000003190 | Q325.5 A76 2019 | Open Access Book | Book |
On Order
Summary
A Computational Approach to Statistical Learning gives a novel introduction to predictive modeling by focusing on the algorithmic and numeric motivations behind popular statistical methods. The text contains annotated code for over 80 original reference functions. These functions provide minimal working implementations of common statistical learning algorithms. Every chapter concludes with a fully worked-out application that illustrates predictive modeling tasks using a real-world dataset.
The text begins with a detailed analysis of linear models and ordinary least squares. Subsequent chapters explore extensions such as ridge regression, generalized linear models, and additive models. The second half focuses on the use of general-purpose algorithms for convex optimization and their application to tasks in statistical learning. Models covered include the elastic net, dense neural networks, convolutional neural networks (CNNs), and spectral clustering. A unifying theme throughout the text is the use of optimization theory in the description of predictive models, with a particular focus on the singular value decomposition (SVD). Through this theme, the computational approach motivates and clarifies the relationships between various predictive models.
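The SVD-based approach to least squares that the summary describes (covered in section 2.4 of the book) can be sketched briefly. The book's reference functions are written in R; this NumPy version is only an illustrative analogue, with made-up example data:

```python
import numpy as np

def ols_svd(X, y):
    """Least-squares coefficients via the SVD of the data matrix X.

    Writing X = U diag(s) V^T, the minimum-norm least-squares solution
    is beta = V diag(1/s) U^T y (the pseudo-inverse applied to y).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ ((U.T @ y) / s)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.01 * rng.normal(size=100)

beta_hat = ols_svd(X, y)
```

Compared with solving the normal equations directly, the SVD route is numerically more stable for ill-conditioned data matrices, which is one reason the book returns to the decomposition repeatedly.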
Author Notes
Taylor Arnold is an assistant professor of statistics at the University of Richmond. His work at the intersection of computer vision, natural language processing, and digital humanities has been supported by multiple grants from the National Endowment for the Humanities and the American Council of Learned Societies. His first book, Humanities Data in R, was published in 2015.
Michael Kane is an assistant professor of biostatistics at Yale University. He is the recipient of grants from the National Institutes of Health, DARPA, and the Bill and Melinda Gates Foundation. His R package bigmemory won the Chambers prize for statistical software in 2010.
Bryan W. Lewis is an applied mathematician and author of many popular R packages, including irlba, doRedis, and threejs.
Table of Contents
Preface | p. xi |
1 Introduction | p. 1 |
1.1 Computational approach | p. 1 |
1.2 Statistical learning | p. 2 |
1.3 Example | p. 3 |
1.4 Prerequisites | p. 5 |
1.5 How to read this book | p. 6 |
1.6 Supplementary materials | p. 7 |
1.7 Formalisms and terminology | p. 7 |
1.8 Exercises | p. 9 |
2 Linear Models | p. 11 |
2.1 Introduction | p. 11 |
2.2 Ordinary least squares | p. 13 |
2.3 The normal equations | p. 15 |
2.4 Solving least squares with the singular value decomposition | p. 17 |
2.5 Directly solving the linear system | p. 19 |
2.6 (*) Solving linear models using the QR decomposition | p. 22 |
2.7 (*) Sensitivity analysis | p. 24 |
2.8 (*) Relationship between numerical and statistical error | p. 28 |
2.9 Implementation and notes | p. 31 |
2.10 Application: Cancer incidence rates | p. 32 |
2.11 Exercises | p. 40 |
3 Ridge Regression and Principal Component Analysis | p. 43 |
3.1 Variance in OLS | p. 43 |
3.2 Ridge regression | p. 46 |
3.3 (*) A Bayesian perspective | p. 53 |
3.4 Principal component analysis | p. 56 |
3.5 Implementation and notes | p. 63 |
3.6 Application: NYC taxicab data | p. 65 |
3.7 Exercises | p. 72 |
4 Linear Smoothers | p. 75 |
4.1 Non-linearity | p. 75 |
4.2 Basis expansion | p. 76 |
4.3 Kernel regression | p. 81 |
4.4 Local regression | p. 85 |
4.5 Regression splines | p. 89 |
4.6 (*) Smoothing splines | p. 95 |
4.7 (*) B-splines | p. 100 |
4.8 Implementation and notes | p. 104 |
4.9 Application: U.S. census tract data | p. 105 |
4.10 Exercises | p. 120 |
5 Generalized Linear Models | p. 123 |
5.1 Classification with linear models | p. 123 |
5.2 Exponential families | p. 128 |
5.3 Iteratively reweighted GLMs | p. 131 |
5.4 (*) Numerical issues | p. 135 |
5.5 (*) Multi-class regression | p. 138 |
5.6 Implementation and notes | p. 139 |
5.7 Application: Chicago crime prediction | p. 140 |
5.8 Exercises | p. 148 |
6 Additive Models | p. 151 |
6.1 Multivariate linear smoothers | p. 151 |
6.2 Curse of dimensionality | p. 155 |
6.3 Additive models | p. 158 |
6.4 (*) Additive models as linear models | p. 163 |
6.5 (*) Standard errors in additive models | p. 166 |
6.6 Implementation and notes | p. 170 |
6.7 Application: NYC flights data | p. 172 |
6.8 Exercises | p. 178 |
7 Penalized Regression Models | p. 179 |
7.1 Variable selection | p. 179 |
7.2 Penalized regression with the ℓ0- and ℓ1-norms | p. 180 |
7.3 Orthogonal data matrix | p. 182 |
7.4 Convex optimization and the elastic net | p. 186 |
7.5 Coordinate descent | p. 188 |
7.6 (*) Active set screening using the KKT conditions | p. 193 |
7.7 (*) The generalized elastic net model | p. 198 |
7.8 Implementation and notes | p. 200 |
7.9 Application: Amazon product reviews | p. 201 |
7.10 Exercises | p. 206 |
8 Neural Networks | p. 207 |
8.1 Dense neural network architecture | p. 207 |
8.2 Stochastic gradient descent | p. 211 |
8.3 Backward propagation of errors | p. 213 |
8.4 Implementing backpropagation | p. 216 |
8.5 Recognizing handwritten digits | p. 224 |
8.6 (*) Improving SGD and regularization | p. 226 |
8.7 (*) Classification with neural networks | p. 232 |
8.8 (*) Convolutional neural networks | p. 239 |
8.9 Implementation and notes | p. 249 |
8.10 Application: Image classification with EMNIST | p. 249 |
8.11 Exercises | p. 259 |
9 Dimensionality Reduction | p. 261 |
9.1 Unsupervised learning | p. 261 |
9.2 Kernel functions | p. 262 |
9.3 Kernel principal component analysis | p. 266 |
9.4 Spectral clustering | p. 272 |
9.5 t-Distributed stochastic neighbor embedding (t-SNE) | p. 277 |
9.6 Autoencoders | p. 282 |
9.7 Implementation and notes | p. 283 |
9.8 Application: Classifying and visualizing fashion MNIST | p. 284 |
9.9 Exercises | p. 295 |
10 Computation in Practice | p. 297 |
10.1 Reference implementations | p. 297 |
10.2 Sparse matrices | p. 298 |
10.3 Sparse generalized linear models | p. 304 |
10.4 Computation on row chunks | p. 307 |
10.5 Feature hashing | p. 311 |
10.6 Data quality issues | p. 318 |
10.7 Implementation and notes | p. 320 |
10.8 Application | p. 321 |
10.9 Exercises | p. 329 |
A Linear algebra and matrices | p. 331 |
A.1 Vector spaces | p. 331 |
A.2 Matrices | p. 333 |
B Floating Point Arithmetic and Numerical Computation | p. 337 |
B.1 Floating point arithmetic | p. 337 |
B.2 Computational effort | p. 340 |
Bibliography | p. 343 |
Index | p. 359 |