Title:
Pattern classification using ensemble methods
Personal Author:
Rokach, Lior
Series:
Series in machine perception and artificial intelligence ; 75
Publication Information:
Singapore ; Hackensack, NJ : World Scientific Publishing, 2010
Physical Description:
xv, 225 p. : ill. ; 24 cm.
ISBN:
9789814271066
Available:
Library | Item Barcode | Call Number | Material Type | Item Category 1 | Status |
---|---|---|---|---|---|
 | 30000010270383 | TK7882.P3 R65 2010 | Open Access Book | Book | On Order |
Table of Contents
Preface | p. vii |
1 Introduction to Pattern Classification | p. 1 |
1.1 Pattern Classification | p. 2 |
1.2 Induction Algorithms | p. 4 |
1.3 Rule Induction | p. 5 |
1.4 Decision Trees | p. 5 |
1.5 Bayesian Methods | p. 8 |
1.5.1 Overview | p. 8 |
1.5.2 Naïve Bayes | p. 9 |
1.5.2.1 The Basic Naïve Bayes Classifier | p. 9 |
1.5.2.2 Naïve Bayes Induction for Numeric Attributes | p. 12 |
1.5.2.3 Correction to the Probability Estimation | p. 12 |
1.5.2.4 Laplace Correction | p. 13 |
1.5.2.5 No Match | p. 14 |
1.5.3 Other Bayesian Methods | p. 14 |
1.6 Other Induction Methods | p. 14 |
1.6.1 Neural Networks | p. 14 |
1.6.2 Genetic Algorithms | p. 17 |
1.6.3 Instance-based Learning | p. 17 |
1.6.4 Support Vector Machines | p. 18 |
2 Introduction to Ensemble Learning | p. 19 |
2.1 Back to the Roots | p. 20 |
2.2 The Wisdom of Crowds | p. 22 |
2.3 The Bagging Algorithm | p. 22 |
2.4 The Boosting Algorithm | p. 28 |
2.5 The AdaBoost Algorithm | p. 28 |
2.6 No Free Lunch Theorem and Ensemble Learning | p. 36 |
2.7 Bias-Variance Decomposition and Ensemble Learning | p. 38 |
2.8 Occam's Razor and Ensemble Learning | p. 40 |
2.9 Classifier Dependency | p. 41 |
2.9.1 Dependent Methods | p. 42 |
2.9.1.1 Model-guided Instance Selection | p. 42 |
2.9.1.2 Basic Boosting Algorithms | p. 42 |
2.9.1.3 Advanced Boosting Algorithms | p. 44 |
2.9.1.4 Incremental Batch Learning | p. 51 |
2.9.2 Independent Methods | p. 51 |
2.9.2.1 Bagging | p. 53 |
2.9.2.2 Wagging | p. 54 |
2.9.2.3 Random Forest and Random Subspace Projection | p. 55 |
2.9.2.4 Non-Linear Boosting Projection (NLBP) | p. 56 |
2.9.2.5 Cross-validated Committees | p. 58 |
2.9.2.6 Robust Boosting | p. 59 |
2.10 Ensemble Methods for Advanced Classification Tasks | p. 61 |
2.10.1 Cost-Sensitive Classification | p. 61 |
2.10.2 Ensemble for Learning Concept Drift | p. 63 |
2.10.3 Reject-Driven Classification | p. 63 |
3 Ensemble Classification | p. 65 |
3.1 Fusion Methods | p. 65 |
3.1.1 Weighting Methods | p. 65 |
3.1.2 Majority Voting | p. 66 |
3.1.3 Performance Weighting | p. 67 |
3.1.4 Distribution Summation | p. 68 |
3.1.5 Bayesian Combination | p. 68 |
3.1.6 Dempster-Shafer | p. 69 |
3.1.7 Vogging | p. 69 |
3.1.8 Naïve Bayes | p. 69 |
3.1.9 Entropy Weighting | p. 70 |
3.1.10 Density-based Weighting | p. 70 |
3.1.11 DEA Weighting Method | p. 70 |
3.1.12 Logarithmic Opinion Pool | p. 71 |
3.1.13 Order Statistics | p. 71 |
3.2 Classifier Selection | p. 71 |
3.2.1 Partitioning the Instance Space | p. 74 |
3.2.1.1 The K-Means Algorithm as a Decomposition Tool | p. 75 |
3.2.1.2 Determining the Number of Subsets | p. 78 |
3.2.1.3 The Basic K-Classifier Algorithm | p. 78 |
3.2.1.4 The Heterogeneity Detecting K-Classifier (HDK-Classifier) | p. 81 |
3.2.1.5 Running-Time Complexity | p. 81 |
3.3 Mixture of Experts and Meta Learning | p. 82 |
3.3.1 Stacking | p. 82 |
3.3.2 Arbiter Trees | p. 85 |
3.3.3 Combiner Trees | p. 88 |
3.3.4 Grading | p. 88 |
3.3.5 Gating Network | p. 89 |
4 Ensemble Diversity | p. 93 |
4.1 Overview | p. 93 |
4.2 Manipulating the Inducer | p. 94 |
4.2.1 Manipulation of the Inducer's Parameters | p. 95 |
4.2.2 Starting Point in Hypothesis Space | p. 95 |
4.2.3 Hypothesis Space Traversal | p. 95 |
4.3 Manipulating the Training Samples | p. 96 |
4.3.1 Resampling | p. 96 |
4.3.2 Creation | p. 97 |
4.3.3 Partitioning | p. 100 |
4.4 Manipulating the Target Attribute Representation | p. 101 |
4.4.1 Label Switching | p. 102 |
4.5 Partitioning the Search Space | p. 103 |
4.5.1 Divide and Conquer | p. 104 |
4.5.2 Feature Subset-based Ensemble Methods | p. 105 |
4.5.2.1 Random-based Strategy | p. 106 |
4.5.2.2 Reduct-based Strategy | p. 106 |
4.5.2.3 Collective-Performance-based Strategy | p. 107 |
4.5.2.4 Feature Set Partitioning | p. 108 |
4.5.2.5 Rotation Forest | p. 111 |
4.6 Multi-Inducers | p. 112 |
4.7 Measuring the Diversity | p. 114 |
5 Ensemble Selection | p. 119 |
5.1 Ensemble Selection | p. 119 |
5.2 Pre-Selection of the Ensemble Size | p. 120 |
5.3 Selection of the Ensemble Size While Training | p. 120 |
5.4 Pruning - Post-Selection of the Ensemble Size | p. 121 |
5.4.1 Ranking-based Methods | p. 122 |
5.4.2 Search-based Methods | p. 123 |
5.4.2.1 Collective Agreement-based Ensemble Pruning Method | p. 124 |
5.4.3 Clustering-based Methods | p. 129 |
5.4.4 Pruning Timing | p. 129 |
5.4.4.1 Pre-combining Pruning | p. 129 |
5.4.4.2 Post-combining Pruning | p. 130 |
6 Error Correcting Output Codes | p. 133 |
6.1 Code-Matrix Decomposition of Multiclass Problems | p. 135 |
6.2 Type I - Training an Ensemble Given a Code-Matrix | p. 136 |
6.2.1 Error Correcting Output Codes | p. 138 |
6.2.2 Code-Matrix Framework | p. 139 |
6.2.3 Code-Matrix Design Problem | p. 140 |
6.2.4 Orthogonal Arrays (OA) | p. 144 |
6.2.5 Hadamard Matrix | p. 146 |
6.2.6 Probabilistic Error Correcting Output Code | p. 146 |
6.2.7 Other ECOC Strategies | p. 147 |
6.3 Type II - Adapting Code-Matrices to Multiclass Problems | p. 149 |
7 Evaluating Ensembles of Classifiers | p. 153 |
7.1 Generalization Error | p. 153 |
7.1.1 Theoretical Estimation of Generalization Error | p. 154 |
7.1.2 Empirical Estimation of Generalization Error | p. 155 |
7.1.3 Alternatives to the Accuracy Measure | p. 157 |
7.1.4 The F-Measure | p. 158 |
7.1.5 Confusion Matrix | p. 160 |
7.1.6 Classifier Evaluation under Limited Resources | p. 161 |
7.1.6.1 ROC Curves | p. 163 |
7.1.6.2 Hit Rate Curve | p. 163 |
7.1.6.3 Qrecall (Quota Recall) | p. 164 |
7.1.6.4 Lift Curve | p. 164 |
7.1.6.5 Pearson Correlation Coefficient | p. 165 |
7.1.6.6 Area Under Curve (AUC) | p. 166 |
7.1.6.7 Average Hit Rate | p. 167 |
7.1.6.8 Average Qrecall | p. 168 |
7.1.6.9 Potential Extract Measure (PEM) | p. 170 |
7.1.7 Statistical Tests for Comparing Ensembles | p. 172 |
7.1.7.1 McNemar's Test | p. 173 |
7.1.7.2 A Test for the Difference of Two Proportions | p. 174 |
7.1.7.3 The Resampled Paired t Test | p. 175 |
7.1.7.4 The k-fold Cross-validated Paired t Test | p. 176 |
7.2 Computational Complexity | p. 176 |
7.3 Interpretability of the Resulting Ensemble | p. 177 |
7.4 Scalability to Large Datasets | p. 178 |
7.5 Robustness | p. 179 |
7.6 Stability | p. 180 |
7.7 Flexibility | p. 180 |
7.8 Usability | p. 180 |
7.9 Software Availability | p. 180 |
7.10 Which Ensemble Method Should Be Used? | p. 181 |
Bibliography | p. 185 |
Index | p. 223 |