Title:
Commercial data mining : processing, analysis and modeling for predictive analytics projects
Publication Information:
Amsterdam : Elsevier, 2014
Physical Description:
xi, 288 pages ; 23 cm.
ISBN:
9780124166028

Available:*

Item Barcode | Call Number | Material Type | Item Category 1
30000010337167 | HD30.25 N48 2014 | Open Access Book | Book
33000000016496 | HD30.25 N48 2014 | Open Access Book | Book

Summary

Whether you are brand new to data mining or working on your tenth predictive analytics project, Commercial Data Mining is an accessible reference that outlines the entire process and its related themes. In this book, you'll learn that your organization does not need a huge volume of data or a Fortune 500 budget to generate business from its existing information assets. Expert author David Nettleton guides you through the process from beginning to end, covering everything from business objectives and data sources to variable selection, analysis, and predictive modeling.

Commercial Data Mining includes case studies and practical examples from Nettleton's more than 20 years of commercial experience. Real-world cases covering customer loyalty, cross-selling, and audience prediction in industries including insurance, banking, and media illustrate the concepts and techniques explained throughout the book.


Author Notes

David F. Nettleton has more than 25 years of experience in IT system development, specializing in databases and data analysis. He has a Bachelor of Science degree in Computer Science, a Master of Science degree in Computer Software and Systems Design, and a Ph.D. in Artificial Intelligence. He has worked for IBM as a Business Intelligence Consultant, among other companies. In 1995 he founded his own consultancy dedicated to commercial data analysis projects, working in the banking, insurance, media, industry, and health sectors. He has published over 40 articles and papers in journals, national and international congresses, and magazines, and has given many presentations at conferences and workshops. He is currently a contract researcher at the Universitat Pompeu Fabra, Barcelona, Spain, and at the IIIA-CSIC, Spain, specializing in data mining applied to online social networks and data privacy. Dr. Nettleton was born in England and has lived in Barcelona, Spain, since 1988.


Table of Contents

Acknowledgments p. xi
1 Introduction p. 1
2 Business Objectives p. 7
Introduction p. 7
Criteria for Choosing a Viable Project p. 8
Evaluation of Potential Commercial Data Analysis Projects - General Considerations p. 8
Evaluation of Viability in Terms of Available Data - Specific Considerations p. 8
Factors That Influence Project Benefits p. 9
Factors That Influence Project Costs p. 10
Example 1: Customer Call Center - Objective: IT Support for Customer Reclamations p. 10
Overall Evaluation of the Cost and Benefit of Mr. Strong's Project p. 12
Example 2: Online Music App - Objective: Determine Effectiveness of Advertising for Mobile Device Apps p. 13
Overall Evaluation of the Cost and Benefit of Melody-online's Project p. 14
Summary p. 15
Further Reading p. 16
3 Incorporating Various Sources of Data and Information p. 17
Introduction p. 17
Data about a Business's Products and Services p. 19
Surveys and Questionnaires p. 20
Examples of Survey and Questionnaire Forms p. 21
Surveys and Questionnaires: Data Table Population p. 24
Issues When Designing Forms p. 24
Loyalty Card/Customer Card p. 26
Registration Form for a Customer Card p. 27
Customer Card Registrations: Data Table Population p. 30
Transactional Analysis of Customer Card Usage p. 36
Demographic Data p. 38
The Census: Census Data, United States, 2010 p. 39
Macro-Economic Data p. 40
Data about Competitors p. 43
Financial Markets Data: Stocks, Shares, Commodities, and Investments p. 45
4 Data Representation p. 49
Introduction p. 49
Basic Data Representation p. 49
Basic Data Types p. 49
Representation, Comparison, and Processing of Variables of Different Types p. 51
Normalization of the Values of a Variable p. 56
Distribution of the Values of a Variable p. 57
Atypical Values - Outliers p. 58
Advanced Data Representation p. 61
Hierarchical Data p. 61
Semantic Networks p. 62
Graph Data p. 63
Fuzzy Data p. 64
5 Data Quality p. 67
Introduction p. 67
Examples of Typical Data Problems p. 69
Content Errors in the Data p. 70
Relevance and Reliability p. 71
Quantitative Evaluation of the Data Quality p. 73
Data Extraction and Data Quality - Common Mistakes and How to Avoid Them p. 74
Data Extraction p. 74
Derived Data p. 77
Summary of Data Extraction Example p. 77
How Data Entry and Data Creation May Affect Data Quality p. 78
6 Selection of Variables and Factor Derivation p. 79
Introduction p. 79
Selection from the Available Data p. 80
Statistical Techniques for Evaluating a Set of Input Variables p. 81
Summary of the Approach of Selecting from the Available Data p. 87
Reverse Engineering: Selection by Considering the Desired Result p. 87
Statistical Techniques for Evaluating and Selecting Input Variables for a Specific Business Objective p. 87
Transforming Numerical Variables into Ordinal Categorical Variables p. 90
Customer Segmentation p. 92
Summary of the Reverse Engineering Approach p. 99
Data Mining Approaches to Selecting Variables p. 99
Rule Induction p. 99
Neural Networks p. 100
Clustering p. 101
Packaged Solutions: Preselecting Specific Variables for a Given Business Sector p. 101
The FAMS (Fraud and Abuse Management) System p. 103
Summary p. 104
7 Data Sampling and Partitioning p. 105
Introduction p. 105
Sampling for Data Reduction p. 106
Partitioning the Data Based on Business Criteria p. 111
Issues Related to Sampling p. 115
Sampling versus Big Data p. 116
8 Data Analysis p. 119
Introduction p. 119
Visualization p. 120
Associations p. 121
Clustering and Segmentation p. 122
Segmentation and Visualization p. 124
Analysis of Transactional Sequences p. 129
Analysis of Time Series p. 130
Bank Current Account: Time Series Data Profiles p. 131
Typical Mistakes when Performing Data Analysis and Interpreting Results p. 134
9 Data Modeling p. 137
Introduction p. 137
Modeling Concepts and Issues p. 137
Supervised and Unsupervised Learning p. 137
Cross-Validation p. 138
Evaluating the Results of Data Models - Measuring Precision p. 139
Neural Networks p. 141
Predictive Neural Networks p. 141
Kohonen Neural Network for Clustering p. 144
Classification: Rule/Tree Induction p. 144
The ID3 Decision Tree Induction Algorithm p. 146
The C4.5 Decision Tree Induction Algorithm p. 147
The C5.0 Decision Tree Induction Algorithm p. 148
Traditional Statistical Models p. 149
Regression Techniques p. 149
Summary of the Use of Regression Techniques p. 151
K-means p. 151
Other Methods and Techniques for Creating Predictive Models p. 152
Applying the Models to the Data p. 153
Simulation Models - "What If?" p. 154
Summary of Modeling p. 156
10 Deployment Systems: From Query Reporting to EIS and Expert Systems p. 159
Introduction p. 159
Query and Report Generation p. 159
Query and Reporting Systems p. 163
Executive Information Systems p. 164
EIS Interface for a "What If" Scenario Modeler p. 164
Executive Information Systems (EIS) p. 166
Expert Systems p. 167
Case-Based Systems p. 169
Summary p. 170
11 Text Analysis p. 171
Basic Analysis of Textual Information p. 171
Advanced Analysis of Textual Information p. 172
Keyword Definition and Information Retrieval p. 173
Identification of Names and Personal Information of Individuals p. 173
Identifying Blocks of Interesting Text p. 174
Information Retrieval Concepts p. 175
Assessing Sentiment on Social Media p. 176
Commercial Text Mining Products p. 178
12 Data Mining from Relationally Structured Data, Marts, and Warehouses p. 181
Introduction p. 181
Data Warehouse and Data Marts p. 182
Creating a File or Table for Data Mining p. 186
13 CRM - Customer Relationship Management and Analysis p. 195
Introduction p. 195
CRM Metrics and Data Collection p. 195
Customer Life Cycle p. 196
Example: Retail Bank p. 198
Integrated CRM Systems p. 200
CRM Application Software p. 200
Customer Satisfaction p. 201
Example CRM Application p. 201
14 Analysis of Data on the Internet I - Website Analysis and Internet Search (Online Chapter) p. 209
15 Analysis of Data on the Internet II - Search Experience Analysis (Online Chapter) p. 211
16 Analysis of Data on the Internet III - Online Social Network Analysis (Online Chapter) p. 213
17 Analysis of Data on the Internet IV - Search Trend Analysis over Time (Online Chapter) p. 215
18 Data Privacy and Privacy-Preserving Data Publishing p. 217
Introduction p. 217
Popular Applications and Data Privacy p. 218
Legal Aspects - Responsibility and Limits p. 220
Privacy-Preserving Data Publishing p. 221
Privacy Concepts p. 221
Anonymization Techniques p. 223
Document Sanitization p. 226
19 Creating an Environment for Commercial Data Analysis p. 229
Introduction p. 229
Integrated Commercial Data Analysis Tools p. 229
Creating an Ad Hoc/Low-Cost Environment for Commercial Data Analysis p. 233
20 Summary p. 239
Appendix: Case Studies p. 241
Case Study 1 Customer Loyalty at an Insurance Company p. 241
Introduction p. 241
Definition of the Operational and Informational Data of Interest p. 242
Data Extraction and Creation of Files for Analysis p. 242
Data Exploration p. 243
Modeling Phase p. 248
Case Study 2 Cross-Selling a Pension Plan at a Retail Bank p. 251
Introduction p. 252
Data Definition p. 252
Data Analysis p. 255
Model Generation p. 259
Results and Conclusions p. 262
Example Weka Screens: Data Processing, Analysis, and Modeling p. 262
Case Study 3 Audience Prediction for a Television Channel p. 268
Introduction p. 268
Data Definition p. 269
Data Analysis p. 270
Audience Prediction by Program p. 272
Audience Prediction for Publicity Blocks p. 273
Glossary (Online) p. 277
Bibliography p. 279
Index p. 281