Title:
Pattern classification using ensemble methods
Personal Author:
Rokach, Lior
Series:
Series in machine perception and artificial intelligence ; 75
Publication Information:
Singapore ; Hackensack, NJ : World Scientific Publishing, 2010
Physical Description:
xv, 225 p. : ill. ; 24 cm.
ISBN:
9789814271066
Available:
Library | Item Barcode | Call Number | Material Type | Item Category 1 | Status |
---|---|---|---|---|---|
 | 30000010270383 | TK7882.P3 R65 2010 | Open Access Book | Book | On Order |
Table of Contents
Preface | p. vii |
1 Introduction to Pattern Classification | p. 1 |
1.1 Pattern Classification | p. 2 |
1.2 Induction Algorithms | p. 4 |
1.3 Rule Induction | p. 5 |
1.4 Decision Trees | p. 5 |
1.5 Bayesian Methods | p. 8 |
1.5.1 Overview | p. 8 |
1.5.2 Naïve Bayes | p. 9 |
1.5.2.1 The Basic Naïve Bayes Classifier | p. 9 |
1.5.2.2 Naïve Bayes Induction for Numeric Attributes | p. 12 |
1.5.2.3 Correction to the Probability Estimation | p. 12 |
1.5.2.4 Laplace Correction | p. 13 |
1.5.2.5 No Match | p. 14 |
1.5.3 Other Bayesian Methods | p. 14 |
1.6 Other Induction Methods | p. 14 |
1.6.1 Neural Networks | p. 14 |
1.6.2 Genetic Algorithms | p. 17 |
1.6.3 Instance-based Learning | p. 17 |
1.6.4 Support Vector Machines | p. 18 |
2 Introduction to Ensemble Learning | p. 19 |
2.1 Back to the Roots | p. 20 |
2.2 The Wisdom of Crowds | p. 22 |
2.3 The Bagging Algorithm | p. 22 |
2.4 The Boosting Algorithm | p. 28 |
2.5 The AdaBoost Algorithm | p. 28 |
2.6 No Free Lunch Theorem and Ensemble Learning | p. 36 |
2.7 Bias-Variance Decomposition and Ensemble Learning | p. 38 |
2.8 Occam's Razor and Ensemble Learning | p. 40 |
2.9 Classifier Dependency | p. 41 |
2.9.1 Dependent Methods | p. 42 |
2.9.1.1 Model-guided Instance Selection | p. 42 |
2.9.1.2 Basic Boosting Algorithms | p. 42 |
2.9.1.3 Advanced Boosting Algorithms | p. 44 |
2.9.1.4 Incremental Batch Learning | p. 51 |
2.9.2 Independent Methods | p. 51 |
2.9.2.1 Bagging | p. 53 |
2.9.2.2 Wagging | p. 54 |
2.9.2.3 Random Forest and Random Subspace Projection | p. 55 |
2.9.2.4 Non-Linear Boosting Projection (NLBP) | p. 56 |
2.9.2.5 Cross-validated Committees | p. 58 |
2.9.2.6 Robust Boosting | p. 59 |
2.10 Ensemble Methods for Advanced Classification Tasks | p. 61 |
2.10.1 Cost-Sensitive Classification | p. 61 |
2.10.2 Ensemble for Learning Concept Drift | p. 63 |
2.10.3 Reject-Driven Classification | p. 63 |
3 Ensemble Classification | p. 65 |
3.1 Fusion Methods | p. 65 |
3.1.1 Weighting Methods | p. 65 |
3.1.2 Majority Voting | p. 66 |
3.1.3 Performance Weighting | p. 67 |
3.1.4 Distribution Summation | p. 68 |
3.1.5 Bayesian Combination | p. 68 |
3.1.6 Dempster-Shafer | p. 69 |
3.1.7 Vogging | p. 69 |
3.1.8 Naïve Bayes | p. 69 |
3.1.9 Entropy Weighting | p. 70 |
3.1.10 Density-based Weighting | p. 70 |
3.1.11 DEA Weighting Method | p. 70 |
3.1.12 Logarithmic Opinion Pool | p. 71 |
3.1.13 Order Statistics | p. 71 |
3.2 Classifier Selection | p. 71 |
3.2.1 Partitioning the Instance Space | p. 74 |
3.2.1.1 The K-Means Algorithm as a Decomposition Tool | p. 75 |
3.2.1.2 Determining the Number of Subsets | p. 78 |
3.2.1.3 The Basic K-Classifier Algorithm | p. 78 |
3.2.1.4 The Heterogeneity Detecting K-Classifier (HDK-Classifier) | p. 81 |
3.2.1.5 Running-Time Complexity | p. 81 |
3.3 Mixture of Experts and Meta Learning | p. 82 |
3.3.1 Stacking | p. 82 |
3.3.2 Arbiter Trees | p. 85 |
3.3.3 Combiner Trees | p. 88 |
3.3.4 Grading | p. 88 |
3.3.5 Gating Network | p. 89 |
4 Ensemble Diversity | p. 93 |
4.1 Overview | p. 93 |
4.2 Manipulating the Inducer | p. 94 |
4.2.1 Manipulation of the Inducer's Parameters | p. 95 |
4.2.2 Starting Point in Hypothesis Space | p. 95 |
4.2.3 Hypothesis Space Traversal | p. 95 |
4.3 Manipulating the Training Samples | p. 96 |
4.3.1 Resampling | p. 96 |
4.3.2 Creation | p. 97 |
4.3.3 Partitioning | p. 100 |
4.4 Manipulating the Target Attribute Representation | p. 101 |
4.4.1 Label Switching | p. 102 |
4.5 Partitioning the Search Space | p. 103 |
4.5.1 Divide and Conquer | p. 104 |
4.5.2 Feature Subset-based Ensemble Methods | p. 105 |
4.5.2.1 Random-based Strategy | p. 106 |
4.5.2.2 Reduct-based Strategy | p. 106 |
4.5.2.3 Collective-Performance-based Strategy | p. 107 |
4.5.2.4 Feature Set Partitioning | p. 108 |
4.5.2.5 Rotation Forest | p. 111 |
4.6 Multi-Inducers | p. 112 |
4.7 Measuring the Diversity | p. 114 |
5 Ensemble Selection | p. 119 |
5.1 Ensemble Selection | p. 119 |
5.2 Pre-Selection of the Ensemble Size | p. 120 |
5.3 Selection of the Ensemble Size While Training | p. 120 |
5.4 Pruning - Post-Selection of the Ensemble Size | p. 121 |
5.4.1 Ranking-based Methods | p. 122 |
5.4.2 Search-based Methods | p. 123 |
5.4.2.1 Collective Agreement-based Ensemble Pruning Method | p. 124 |
5.4.3 Clustering-based Methods | p. 129 |
5.4.4 Pruning Timing | p. 129 |
5.4.4.1 Pre-combining Pruning | p. 129 |
5.4.4.2 Post-combining Pruning | p. 130 |
6 Error Correcting Output Codes | p. 133 |
6.1 Code-Matrix Decomposition of Multiclass Problems | p. 135 |
6.2 Type I - Training an Ensemble Given a Code-Matrix | p. 136 |
6.2.1 Error Correcting Output Codes | p. 138 |
6.2.2 Code-Matrix Framework | p. 139 |
6.2.3 Code-Matrix Design Problem | p. 140 |
6.2.4 Orthogonal Arrays (OA) | p. 144 |
6.2.5 Hadamard Matrix | p. 146 |
6.2.6 Probabilistic Error Correcting Output Code | p. 146 |
6.2.7 Other ECOC Strategies | p. 147 |
6.3 Type II - Adapting Code-Matrices to Multiclass Problems | p. 149 |
7 Evaluating Ensembles of Classifiers | p. 153 |
7.1 Generalization Error | p. 153 |
7.1.1 Theoretical Estimation of Generalization Error | p. 154 |
7.1.2 Empirical Estimation of Generalization Error | p. 155 |
7.1.3 Alternatives to the Accuracy Measure | p. 157 |
7.1.4 The F-Measure | p. 158 |
7.1.5 Confusion Matrix | p. 160 |
7.1.6 Classifier Evaluation under Limited Resources | p. 161 |
7.1.6.1 ROC Curves | p. 163 |
7.1.6.2 Hit Rate Curve | p. 163 |
7.1.6.3 Qrecall (Quota Recall) | p. 164 |
7.1.6.4 Lift Curve | p. 164 |
7.1.6.5 Pearson Correlation Coefficient | p. 165 |
7.1.6.6 Area Under Curve (AUC) | p. 166 |
7.1.6.7 Average Hit Rate | p. 167 |
7.1.6.8 Average Qrecall | p. 168 |
7.1.6.9 Potential Extract Measure (PEM) | p. 170 |
7.1.7 Statistical Tests for Comparing Ensembles | p. 172 |
7.1.7.1 McNemar's Test | p. 173 |
7.1.7.2 A Test for the Difference of Two Proportions | p. 174 |
7.1.7.3 The Resampled Paired t Test | p. 175 |
7.1.7.4 The k-fold Cross-validated Paired t Test | p. 176 |
7.2 Computational Complexity | p. 176 |
7.3 Interpretability of the Resulting Ensemble | p. 177 |
7.4 Scalability to Large Datasets | p. 178 |
7.5 Robustness | p. 179 |
7.6 Stability | p. 180 |
7.7 Flexibility | p. 180 |
7.8 Usability | p. 180 |
7.9 Software Availability | p. 180 |
7.10 Which Ensemble Method Should Be Used? | p. 181 |
Bibliography | p. 185 |
Index | p. 223 |