Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010303026 | HF5415.126 P88 2012 | Open Access Book | Book |
Summary
Customer and Business Analytics: Applied Data Mining for Business Decision Making Using R explains and demonstrates, via the accompanying open-source software, how advanced analytical tools can address various business problems. It also gives insight into some of the challenges faced when deploying these tools. Extensively classroom-tested, the text is ideal for students in customer and business analytics or applied data mining as well as professionals in small- to medium-sized organizations.
The book offers an intuitive understanding of how different analytics algorithms work. Where necessary, the authors explain the underlying mathematics in an accessible manner. Each technique presented includes a detailed tutorial that enables hands-on experience with real data. The authors also discuss issues often encountered in applied data mining projects and present the CRISP-DM process model as a practical framework for organizing these projects.
Showing how data mining can improve the performance of organizations, this book and its R-based software provide the skills and tools needed to successfully develop advanced analytics capabilities.
Author Notes
Dr. Daniel S. Putler is a Data Artisan in Residence at Alteryx, a business intelligence/analytics software company. Dr. Robert E. Krider is a professor of marketing in the Beedie School of Business at Simon Fraser University. He has also taught in Hong Kong, Shanghai, Portugal, and Germany. His research tackles questions of customer and competitor behavior in retailing and media industries.
Table of Contents
List of Figures | p. xiii |
List of Tables | p. xxi |
Preface | p. xxiii |
I Purpose and Process | p. 1 |
1 Database Marketing and Data Mining | p. 3 |
1.1 Database Marketing | p. 4 |
1.1.1 Common Database Marketing Applications | p. 5 |
1.1.2 Obstacles to Implementing a Database Marketing Program | p. 8 |
1.1.3 Who Stands to Benefit the Most from the Use of Database Marketing? | p. 9 |
1.2 Data Mining | p. 9 |
1.2.1 Two Definitions of Data Mining | p. 9 |
1.2.2 Classes of Data Mining Methods | p. 10 |
1.2.2.1 Grouping Methods | p. 10 |
1.2.2.2 Predictive Modeling Methods | p. 11 |
1.3 Linking Methods to Marketing Applications | p. 14 |
2 A Process Model for Data Mining: CRISP-DM | p. 17 |
2.1 History and Background | p. 17 |
2.2 The Basic Structure of CRISP-DM | p. 19 |
2.2.1 CRISP-DM Phases | p. 19 |
2.2.2 The Process Model within a Phase | p. 21 |
2.2.3 The CRISP-DM Phases in More Detail | p. 21 |
2.2.3.1 Business Understanding | p. 21 |
2.2.3.2 Data Understanding | p. 22 |
2.2.3.3 Data Preparation | p. 23 |
2.2.3.4 Modeling | p. 25 |
2.2.3.5 Evaluation | p. 26 |
2.2.3.6 Deployment | p. 27 |
2.2.4 The Typical Allocation of Effort across Project Phases | p. 28 |
II Predictive Modeling Tools | p. 31 |
3 Basic Tools for Understanding Data | p. 33 |
3.1 Measurement Scales | p. 34 |
3.2 Software Tools | p. 36 |
3.2.1 Getting R | p. 37 |
3.2.2 Installing R on Windows | p. 41 |
3.2.3 Installing R on OS X | p. 43 |
3.2.4 Installing the RcmdrPlugin.BCA Package and Its Dependencies | p. 45 |
3.3 Reading Data into R Tutorial | p. 48 |
3.4 Creating Simple Summary Statistics Tutorial | p. 57 |
3.5 Frequency Distributions and Histograms Tutorial | p. 63 |
3.6 Contingency Tables Tutorial | p. 73 |
4 Multiple Linear Regression | p. 81 |
4.1 Jargon Clarification | p. 82 |
4.2 Graphical and Algebraic Representation of the Single Predictor Problem | p. 83 |
4.2.1 The Probability of a Relationship between the Variables | p. 89 |
4.2.2 Outliers | p. 91 |
4.3 Multiple Regression | p. 91 |
4.3.1 Categorical Predictors | p. 92 |
4.3.2 Nonlinear Relationships and Variable Transformations | p. 94 |
4.3.3 Too Many Predictor Variables: Overfitting and Adjusted R² | p. 97 |
4.4 Summary | p. 98 |
4.5 Data Visualization and Linear Regression Tutorial | p. 99 |
5 Logistic Regression | p. 117 |
5.1 A Graphical Illustration of the Problem | p. 118 |
5.2 The Generalized Linear Model | p. 121 |
5.3 Logistic Regression Details | p. 124 |
5.4 Logistic Regression Tutorial | p. 126 |
5.4.1 Highly Targeted Database Marketing | p. 126 |
5.4.2 Oversampling | p. 127 |
5.4.3 Overfitting and Model Validation | p. 128 |
6 Lift Charts | p. 147 |
6.1 Constructing Lift Charts | p. 147 |
6.1.1 Predict, Sort, and Compare to Actual Behavior | p. 147 |
6.1.2 Correcting Lift Charts for Oversampling | p. 151 |
6.2 Using Lift Charts | p. 154 |
6.3 Lift Chart Tutorial | p. 159 |
7 Tree Models | p. 165 |
7.1 The Tree Algorithm | p. 166 |
7.1.1 Calibrating the Tree on an Estimation Sample | p. 167 |
7.1.2 Stopping Rules and Controlling Overfitting | p. 170 |
7.2 Tree Models Tutorial | p. 172 |
8 Neural Network Models | p. 187 |
8.1 The Biological Inspiration for Artificial Neural Networks | p. 187 |
8.2 Artificial Neural Networks as Predictive Models | p. 192 |
8.3 Neural Network Models Tutorial | p. 194 |
9 Putting It All Together | p. 201 |
9.1 Stepwise Variable Selection | p. 201 |
9.2 The Rapid Model Development Framework | p. 204 |
9.2.1 Up-Selling Using the Wesbrook Database | p. 204 |
9.2.2 Think about the Behavior That You Are Trying to Predict | p. 205 |
9.2.3 Carefully Examine the Variables Contained in the Data Set | p. 205 |
9.2.4 Use Decision Trees and Regression to Find the Important Predictor Variables | p. 207 |
9.2.5 Use a Neural Network to Examine Whether Nonlinear Relationships Are Present | p. 208 |
9.2.6 If There Are Nonlinear Relationships, Use Visualization to Find and Understand Them | p. 209 |
9.3 Applying the Rapid Development Framework Tutorial | p. 210 |
III Grouping Methods | p. 233 |
10 Ward's Method of Cluster Analysis and Principal Components | p. 235 |
10.1 Summarizing Data Sets | p. 235 |
10.2 Ward's Method of Cluster Analysis | p. 236 |
10.2.1 A Single Variable Example | p. 238 |
10.2.2 Extension to Two or More Variables | p. 240 |
10.3 Principal Components | p. 242 |
10.4 Ward's Method Tutorial | p. 248 |
11 K-Centroids Partitioning Cluster Analysis | p. 259 |
11.1 How K-Centroid Clustering Works | p. 260 |
11.1.1 The Basic Algorithm to Find K-Centroids Clusters | p. 260 |
11.1.2 Specific K-Centroid Clustering Algorithms | p. 261 |
11.2 Cluster Types and the Nature of Customer Segments | p. 264 |
11.3 Methods to Assess Cluster Structure | p. 267 |
11.3.1 The Adjusted Rand Index to Assess Cluster Structure Reproducibility | p. 268 |
11.3.2 The Calinski-Harabasz Index to Assess within Cluster Homogeneity and between Cluster Separation | p. 274 |
11.4 K-Centroids Clustering Tutorial | p. 275 |
Bibliography | p. 283 |
Index | p. 287 |