Title:
Business intelligence : data mining and optimization for decision making
Personal Author:
Carlo Vercellis
Publication Information:
West Sussex, England : John Wiley & Sons, 2009
Physical Description:
xviii, 417 p. : ill., maps ; 24 cm.
ISBN:
9780470511381

Available:

Item Barcode:
30000010201315
Call Number:
HD30.23 V47 2009
Material Type:
Open Access Book
Item Category 1:
Book

Summary

Business intelligence is a broad category of applications and technologies for gathering, providing access to, and analyzing data to help enterprise users make better business decisions. The term implies comprehensive knowledge of all the factors that affect a business, such as customers, competitors, business partners, the economic environment, and internal operations, thereby enabling optimal decisions to be made.

Business Intelligence provides readers with an introduction and practical guide to the mathematical models and analysis methodologies vital to business intelligence.

This book:

Combines detailed coverage with a practical guide to the mathematical models and analysis methodologies of business intelligence.
Covers all the hot topics such as data warehousing, data mining and its applications, machine learning, classification, supply optimization models, decision support systems, and analytical methods for performance evaluation.
Is made accessible to readers through the careful definition and introduction of each concept, followed by the extensive use of examples and numerous real-life case studies.
Explains how to use mathematical models and analysis methods to make effective, good-quality business decisions.

This book is aimed at postgraduate students following data analysis and data mining courses.

Researchers looking for a systematic and broad coverage of topics in operations research and mathematical models for decision-making will find this an invaluable guide.


Author Notes

Carlo Vercellis - School of Management, Politecnico di Milano, Italy

As well as teaching courses in Operations Research and Business Intelligence, Professor Vercellis is director of the research group MOLD (Mathematical Modeling, Optimization, Learning from Data). He has written four books in Italian, contributed to numerous other books, and has had many papers published in a variety of international journals.


Table of Contents

Preface
I Components of the decision-making process
1 Business intelligence
1.1 Effective and timely decisions
1.2 Data, information and knowledge
1.3 The role of mathematical models
1.4 Business intelligence architectures
1.4.1 Cycle of a business intelligence analysis
1.4.2 Enabling factors in business intelligence projects
1.4.3 Development of a business intelligence system
1.5 Ethics and business intelligence
1.6 Notes and readings
2 Decision support systems
2.1 Definition of system
2.2 Representation of the decision-making process
2.2.1 Rationality and problem solving
2.2.2 The decision-making process
2.2.3 Types of decisions
2.2.4 Approaches to the decision-making process
2.3 Evolution of information systems
2.4 Definition of decision support system
2.5 Development of a decision support system
2.6 Notes and readings
3 Data warehousing
3.1 Definition of data warehouse
3.1.1 Data marts
3.1.2 Data quality
3.2 Data warehouse architecture
3.2.1 ETL tools
3.2.2 Metadata
3.3 Cubes and multidimensional analysis
3.3.1 Hierarchies of concepts and OLAP operations
3.3.2 Materialization of cubes of data
3.4 Notes and readings
II Mathematical models and methods
4 Mathematical models for decision making
4.1 Structure of mathematical models
4.2 Development of a model
4.3 Classes of models
4.4 Notes and readings
5 Data mining
5.1 Definition of data mining
5.1.1 Models and methods for data mining
5.1.2 Data mining, classical statistics and OLAP
5.1.3 Applications of data mining
5.2 Representation of input data
5.3 Data mining process
5.4 Analysis methodologies
5.5 Notes and readings
6 Data preparation
6.1 Data validation
6.1.1 Incomplete data
6.1.2 Data affected by noise
6.2 Data transformation
6.2.1 Standardization
6.2.2 Feature extraction
6.3 Data reduction
6.3.1 Sampling
6.3.2 Feature selection
6.3.3 Principal component analysis
6.3.4 Data discretization
7 Data exploration
7.1 Univariate analysis
7.1.1 Graphical analysis of categorical attributes
7.1.2 Graphical analysis of numerical attributes
7.1.3 Measures of central tendency for numerical attributes
7.1.4 Measures of dispersion for numerical attributes
7.1.5 Measures of relative location for numerical attributes
7.1.6 Identification of outliers for numerical attributes
7.1.7 Measures of heterogeneity for categorical attributes
7.1.8 Analysis of the empirical density
7.1.9 Summary statistics
7.2 Bivariate analysis
7.2.1 Graphical analysis
7.2.2 Measures of correlation for numerical attributes
7.2.3 Contingency tables for categorical attributes
7.3 Multivariate analysis
7.3.1 Graphical analysis
7.3.2 Measures of correlation for numerical attributes
7.4 Notes and readings
8 Regression
8.1 Structure of regression models
8.2 Simple linear regression
8.2.1 Calculating the regression line
8.3 Multiple linear regression
8.3.1 Calculating the regression coefficients
8.3.2 Assumptions on the residuals
8.3.3 Treatment of categorical predictive attributes
8.3.4 Ridge regression
8.3.5 Generalized linear regression
8.4 Validation of regression models
8.4.1 Normality and independence of the residuals
8.4.2 Significance of the coefficients
8.4.3 Analysis of variance
8.4.4 Coefficient of determination
8.4.5 Coefficient of linear correlation
8.4.6 Multicollinearity of the independent variables
8.4.7 Confidence and prediction limits
8.5 Selection of predictive variables
8.5.1 Example of development of a regression model
8.6 Notes and readings
9 Time series
9.1 Definition of time series
9.1.1 Index numbers
9.2 Evaluating time series models
9.2.1 Distortion measures
9.2.2 Dispersion measures
9.2.3 Tracking signal
9.3 Analysis of the components of time series
9.3.1 Moving average
9.3.2 Decomposition of a time series
9.4 Exponential smoothing models
9.4.1 Simple exponential smoothing
9.4.2 Exponential smoothing with trend adjustment
9.4.3 Exponential smoothing with trend and seasonality
9.4.4 Simple adaptive exponential smoothing
9.4.5 Exponential smoothing with damped trend
9.4.6 Initial values for exponential smoothing models
9.4.7 Removal of trend and seasonality
9.5 Autoregressive models
9.5.1 Moving average models
9.5.2 Autoregressive moving average models
9.5.3 Autoregressive integrated moving average models
9.5.4 Identification of autoregressive models
9.6 Combination of predictive models
9.7 The forecasting process
9.7.1 Characteristics of the forecasting process
9.7.2 Selection of a forecasting method
9.8 Notes and readings
10 Classification
10.1 Classification problems
10.1.1 Taxonomy of classification models
10.2 Evaluation of classification models
10.2.1 Holdout method
10.2.2 Repeated random sampling
10.2.3 Cross-validation
10.2.4 Confusion matrices
10.2.5 ROC curve charts
10.2.6 Cumulative gain and lift charts
10.3 Classification trees
10.3.1 Splitting rules
10.3.2 Univariate splitting criteria
10.3.3 Example of development of a classification tree
10.3.4 Stopping criteria and pruning rules
10.4 Bayesian methods
10.4.1 Naive Bayesian classifiers
10.4.2 Example of naive Bayes classifier
10.4.3 Bayesian networks
10.5 Logistic regression
10.6 Neural networks
10.6.1 The Rosenblatt perceptron
10.6.2 Multi-level feed-forward networks
10.7 Support vector machines
10.7.1 Structural risk minimization
10.7.2 Maximal margin hyperplane for linear separation
10.7.3 Nonlinear separation
10.8 Notes and readings
11 Association rules
11.1 Motivation and structure of association rules
11.2 Single-dimension association rules
11.3 Apriori algorithm
11.3.1 Generation of frequent itemsets
11.3.2 Generation of strong rules
11.4 General association rules
11.5 Notes and readings
12 Clustering
12.1 Clustering methods
12.1.1 Taxonomy of clustering methods
12.1.2 Affinity measures
12.2 Partition methods
12.2.1 K-means algorithm
12.2.2 K-medoids algorithm
12.3 Hierarchical methods
12.3.1 Agglomerative hierarchical methods
12.3.2 Divisive hierarchical methods
12.4 Evaluation of clustering models
12.5 Notes and readings
III Business intelligence applications
13 Marketing models
13.1 Relational marketing
13.1.1 Motivations and objectives
13.1.2 An environment for relational marketing analysis
13.1.3 Lifetime value
13.1.4 The effect of latency in predictive models
13.1.5 Acquisition
13.1.6 Retention
13.1.7 Cross-selling and up-selling
13.1.8 Market basket analysis
13.1.9 Web mining
13.2 Salesforce management
13.2.1 Decision processes in salesforce management
13.2.2 Models for salesforce management
13.2.3 Response functions
13.2.4 Sales territory design
13.2.5 Calls and product presentations planning
13.3 Business case studies
13.3.1 Retention in telecommunications
13.3.2 Acquisition in the automotive industry
13.3.3 Cross-selling in the retail industry
13.4 Notes and readings
14 Logistic and production models
14.1 Supply chain optimization
14.2 Optimization models for logistics planning
14.2.1 Tactical planning
14.2.2 Extra capacity
14.2.3 Multiple resources
14.2.4 Backlogging
14.2.5 Minimum lots and fixed costs
14.2.6 Bill of materials
14.2.7 Multiple plants
14.3 Revenue management systems
14.3.1 Decision processes in revenue management
14.4 Business case studies
14.4.1 Logistics planning in the food industry
14.4.2 Logistics planning in the packaging industry
14.5 Notes and readings
15 Data envelopment analysis
15.1 Efficiency measures
15.2 Efficient frontier
15.3 The CCR model
15.3.1 Definition of target objectives
15.3.2 Peer groups
15.4 Identification of good operating practices
15.4.1 Cross-efficiency analysis
15.4.2 Virtual inputs and virtual outputs
15.4.3 Weight restrictions
15.5 Other models
15.6 Notes and readings
Appendix A Software tools
Appendix B Dataset repositories
References
Index