Title:

Probability and Statistics for Data Science : Math + R + Data

Personal Author:

Matloff, Norman S., author

Physical Description:

xxxii, 412 pages : illustrations ; 24 cm.

ISBN:

9780367260934

Subject Term:

Probabilities -- Textbooks

Mathematical statistics -- Textbooks

Probabilities -- Data processing

Mathematical statistics -- Data processing

Available:*

Library	Item Barcode	Call Number	Material Type	Item Category 1	Status
PSZ JB	30000010371634	QA273 M384 2020	Open Access Book	Book	Unknown

Summary

Probability and Statistics for Data Science: Math + R + Data covers "math stat"--distributions, expected value, estimation, etc.--but takes the phrase "Data Science" in the title quite seriously:

* Real datasets are used extensively.

* All data analysis is supported by R coding.

* Includes many Data Science applications, such as PCA, mixture distributions, random graph models, Hidden Markov models, linear and logistic regression, and neural networks.

* Leads the student to think critically about the "how" and "why" of statistics, and to "see the big picture."

* Not "theorem/proof"-oriented, but concepts and models are stated in a mathematically precise manner.

Prerequisites are calculus, some matrix algebra, and some experience in programming.

Norman Matloff is a professor of computer science at the University of California, Davis, and was formerly a statistics professor there. He is on the editorial boards of the Journal of Statistical Software and The R Journal. His book Statistical Regression and Classification: From Linear Models to Machine Learning was the recipient of the Ziegel Award for the best book reviewed in Technometrics in 2017. He is a recipient of his university's Distinguished Teaching Award.

Choice Review

This text by Matloff (Univ. of California, Davis) affords an excellent introduction to statistics for the data science student. It is different from other mathematics books on probability and statistics for a number of reasons. Its examples are often drawn from data science applications such as hidden Markov models (HMMs) and remote sensing, to name a few. The text adopts the R language extensively, using real data throughout to help students start thinking about the how and why of statistics. All the models and concepts are explained well in precise mathematical terms (not presented as formal proofs), to help students gain an intuitive understanding. All the data sets are publicly available, allowing an instructor to delve further into the data to pursue additional examples. This is an applied mathematics book, designed to foster an intuitive understanding of things like probability by examining "long run" data sets. It is helpful if students have had some exposure to programming in Python, C, Java, or R, but no prior experience with R is assumed. A brief introduction to R is included as an appendix, while applied uses are interspersed throughout the text. Summing Up: Recommended. Upper-division undergraduates and graduate students. Students enrolled in two-year technical programs. --Devon B. Mason, Albright College
