Title:
Customer and business analytics : applied data mining for business decision making using R
Personal Author:
Putler, Daniel S.
Series:
Chapman & Hall/CRC the R series
Publication Information:
Boca Raton, FL : CRC Press, c2012.
Physical Description:
xxvi, 289 p. : ill. ; 27 cm.
ISBN:
9781466503960
Added Author:
Krider, Robert E.

Available:
Item Barcode: 30000010303026
Call Number: HF5415.126 P88 2012
Material Type: Open Access Book
Item Category 1: Book

Summary

Customer and Business Analytics: Applied Data Mining for Business Decision Making Using R explains and demonstrates, via the accompanying open-source software, how advanced analytical tools can address various business problems. It also gives insight into some of the challenges faced when deploying these tools. Extensively classroom-tested, the text is ideal for students in customer and business analytics or applied data mining as well as professionals in small- to medium-sized organizations.

The book offers an intuitive understanding of how different analytics algorithms work. Where necessary, the authors explain the underlying mathematics in an accessible manner. Each technique presented includes a detailed tutorial that enables hands-on experience with real data. The authors also discuss issues often encountered in applied data mining projects and present the CRISP-DM process model as a practical framework for organizing these projects.

Showing how data mining can improve the performance of organizations, this book and its R-based software provide the skills and tools needed to successfully develop advanced analytics capabilities.


Author Notes

Dr. Daniel S. Putler is a Data Artisan in Residence at Alteryx, a business intelligence/analytics software company. Dr. Robert E. Krider is a professor of marketing in the Beedie School of Business at Simon Fraser University. He has also taught in Hong Kong, Shanghai, Portugal, and Germany. His research tackles questions of customer and competitor behavior in retailing and media industries.


Table of Contents

List of Figures
List of Tables
Preface
I Purpose and Process
1 Database Marketing and Data Mining
1.1 Database Marketing
1.1.1 Common Database Marketing Applications
1.1.2 Obstacles to Implementing a Database Marketing Program
1.1.3 Who Stands to Benefit the Most from the Use of Database Marketing?
1.2 Data Mining
1.2.1 Two Definitions of Data Mining
1.2.2 Classes of Data Mining Methods
1.2.2.1 Grouping Methods
1.2.2.2 Predictive Modeling Methods
1.3 Linking Methods to Marketing Applications
2 A Process Model for Data Mining: CRISP-DM
2.1 History and Background
2.2 The Basic Structure of CRISP-DM
2.2.1 CRISP-DM Phases
2.2.2 The Process Model within a Phase
2.2.3 The CRISP-DM Phases in More Detail
2.2.3.1 Business Understanding
2.2.3.2 Data Understanding
2.2.3.3 Data Preparation
2.2.3.4 Modeling
2.2.3.5 Evaluation
2.2.3.6 Deployment
2.2.4 The Typical Allocation of Effort across Project Phases
II Predictive Modeling Tools
3 Basic Tools for Understanding Data
3.1 Measurement Scales
3.2 Software Tools
3.2.1 Getting R
3.2.2 Installing R on Windows
3.2.3 Installing R on OS X
3.2.4 Installing the RcmdrPlugin.BCA Package and Its Dependencies
3.3 Reading Data into R Tutorial
3.4 Creating Simple Summary Statistics Tutorial
3.5 Frequency Distributions and Histograms Tutorial
3.6 Contingency Tables Tutorial
4 Multiple Linear Regression
4.1 Jargon Clarification
4.2 Graphical and Algebraic Representation of the Single Predictor Problem
4.2.1 The Probability of a Relationship between the Variables
4.2.2 Outliers
4.3 Multiple Regression
4.3.1 Categorical Predictors
4.3.2 Nonlinear Relationships and Variable Transformations
4.3.3 Too Many Predictor Variables: Overfitting and Adjusted R²
4.4 Summary
4.5 Data Visualization and Linear Regression Tutorial
5 Logistic Regression
5.1 A Graphical Illustration of the Problem
5.2 The Generalized Linear Model
5.3 Logistic Regression Details
5.4 Logistic Regression Tutorial
5.4.1 Highly Targeted Database Marketing
5.4.2 Oversampling
5.4.3 Overfitting and Model Validation
6 Lift Charts
6.1 Constructing Lift Charts
6.1.1 Predict, Sort, and Compare to Actual Behavior
6.1.2 Correcting Lift Charts for Oversampling
6.2 Using Lift Charts
6.3 Lift Chart Tutorial
7 Tree Models
7.1 The Tree Algorithm
7.1.1 Calibrating the Tree on an Estimation Sample
7.1.2 Stopping Rules and Controlling Overfitting
7.2 Trees Models Tutorial
8 Neural Network Models
8.1 The Biological Inspiration for Artificial Neural Networks
8.2 Artificial Neural Networks as Predictive Models
8.3 Neural Network Models Tutorial
9 Putting It All Together
9.1 Stepwise Variable Selection
9.2 The Rapid Model Development Framework
9.2.1 Up-Selling Using the Wesbrook Database
9.2.2 Think about the Behavior That You Are Trying to Predict
9.2.3 Carefully Examine the Variables Contained in the Data Set
9.2.4 Use Decision Trees and Regression to Find the Important Predictor Variables
9.2.5 Use a Neural Network to Examine Whether Nonlinear Relationships Are Present
9.2.6 If There Are Nonlinear Relationships, Use Visualization to Find and Understand Them
9.3 Applying the Rapid Development Framework Tutorial
III Grouping Methods
10 Ward's Method of Cluster Analysis and Principal Components
10.1 Summarizing Data Sets
10.2 Ward's Method of Cluster Analysis
10.2.1 A Single Variable Example
10.2.2 Extension to Two or More Variables
10.3 Principal Components
10.4 Ward's Method Tutorial
11 K-Centroids Partitioning Cluster Analysis
11.1 How K-Centroid Clustering Works
11.1.1 The Basic Algorithm to Find K-Centroids Clusters
11.1.2 Specific K-Centroid Clustering Algorithms
11.2 Cluster Types and the Nature of Customer Segments
11.3 Methods to Assess Cluster Structure
11.3.1 The Adjusted Rand Index to Assess Cluster Structure Reproducibility
11.3.2 The Calinski-Harabasz Index to Assess within Cluster Homogeneity and between Cluster Separation
11.4 K-Centroids Clustering Tutorial
Bibliography
Index