Available:
Item Barcode | Call Number | Material Type | Item Category |
---|---|---|---|
30000010304288 | QA402.5 D46 2013 | Open Access Book | Book |
30000010327971 | QA402.5 D46 2013 | Open Access Book | Book |
33000000000038 | QA402.5 D46 2013 | Open Access Book | Book |
On Order
Summary
Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs): classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs, since optimization is one of the pillars on which SVMs are built.
The authors share insight on many of their research achievements. They give a precise interpretation of statistical learning theory for C-support vector classification. They also discuss regularized twin SVMs for binary classification problems, SVMs for solving multi-classification problems based on ordinal regression, SVMs for semi-supervised problems, and SVMs for problems with perturbations.
To improve readability, concepts, methods, and results are introduced graphically and with clear explanations. For important concepts and algorithms, such as the Crammer-Singer SVM for multi-class classification problems, the text provides geometric interpretations that are not depicted in current literature.
Enabling a sound understanding of SVMs, this book gives beginners as well as more experienced researchers and engineers the tools to solve real-world problems using SVMs.
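The summary stresses that SVMs rest on optimization: C-support vector classification solves min ½‖w‖² + C Σᵢ ξᵢ subject to margin constraints. As a minimal sketch of that idea (not code from the book), the fragment below trains a linear classifier by subgradient descent on the equivalent unconstrained hinge-loss objective ½‖w‖² + C Σᵢ max(0, 1 − yᵢ(w·xᵢ + b)); the toy data, learning rate, and epoch count are illustrative assumptions.

```python
# Minimal sketch: linear C-support vector classification via subgradient
# descent on the primal objective
#     1/2 ||w||^2 + C * sum_i max(0, 1 - y_i (w . x_i + b)).
# Hypothetical helper and toy data; real solvers use dual/SMO methods.

def train_linear_svc(X, y, C=1.0, lr=0.01, epochs=1000):
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw = list(w)   # subgradient of the regularizer 1/2 ||w||^2 is w
        gb = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:   # hinge term active: subgradient -C * y_i * x_i
                for j in range(d):
                    gw[j] -= C * yi * xi[j]
                gb -= C * yi
        w = [wj - lr * gwj for wj, gwj in zip(w, gw)]
        b -= lr * gb
    return w, b

# Linearly separable toy problem with labels y in {+1, -1}.
X = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svc(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
         for xi in X]
```

On separable data like this, the learned hyperplane classifies all four training points correctly; the book's Chapters 1–2 develop the duality theory behind the standard (dual, SMO-based) solvers instead of this primal sketch.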
Table of Contents
List of Figures | p. xvii |
List of Tables | p. xxi |
Preface | p. xxiii |
List of Symbols | p. xxvii |
1 Optimization | p. 1 |
1.1 Optimization Problems in Euclidean Space | p. 1 |
1.1.1 An example of optimization problems | p. 1 |
1.1.2 Optimization problems and their solutions | p. 3 |
1.1.3 Geometric interpretation of optimization problems | p. 4 |
1.2 Convex Programming in Euclidean Space | p. 5 |
1.2.1 Convex sets and convex functions | p. 6 |
1.2.1.1 Convex sets | p. 6 |
1.2.1.2 Convex functions | p. 7 |
1.2.2 Convex programming problems and their properties | p. 8 |
1.2.2.1 Convex programming problems | p. 8 |
1.2.2.2 Basic properties | p. 9 |
1.2.3 Duality theory | p. 12 |
1.2.3.1 Derivation of the dual problem | p. 12 |
1.2.3.2 Duality theory | p. 13 |
1.2.4 Optimality conditions | p. 15 |
1.2.5 Linear programming | p. 16 |
1.3 Convex Programming in Hilbert Space | p. 18 |
1.3.1 Convex sets and Fréchet derivative | p. 18 |
1.3.2 Convex programming problems | p. 19 |
1.3.3 Duality theory | p. 20 |
1.3.4 Optimality conditions | p. 20 |
1.4 Convex Programming with Generalized Inequality Constraints in Euclidean Space | p. 21 |
1.4.1 Convex programming with generalized inequality constraints | p. 21 |
1.4.1.1 Cones | p. 21 |
1.4.1.2 Generalized inequalities | p. 21 |
1.4.1.3 Convex programming with generalized inequality constraints | p. 22 |
1.4.2 Duality theory | p. 23 |
1.4.2.1 Dual cones | p. 23 |
1.4.2.2 Derivation of the dual problem | p. 23 |
1.4.2.3 Duality theory | p. 25 |
1.4.3 Optimality conditions | p. 26 |
1.4.4 Second-order cone programming | p. 27 |
1.4.4.1 Second-order cone programming and its dual problem | p. 27 |
1.4.4.2 Software for second-order cone programming | p. 30 |
1.4.5 Semidefinite programming | p. 31 |
1.4.5.1 Semidefinite programming and its dual problem | p. 31 |
1.4.5.2 Software for semidefinite programming | p. 35 |
1.5 Convex Programming with Generalized Inequality Constraints in Hilbert Space | p. 36 |
1.5.1 K-convex function and Fréchet derivative | p. 36 |
1.5.2 Convex programming | p. 36 |
1.5.3 Duality theory | p. 37 |
1.5.4 Optimality conditions | p. 38 |
2 Linear Classification | p. 41 |
2.1 Presentation of Classification Problems | p. 41 |
2.1.1 A sample (diagnosis of heart disease) | p. 41 |
2.1.2 Classification problems and classification machines | p. 43 |
2.2 Support Vector Classification (SVC) for Linearly Separable Problems | p. 45 |
2.2.1 Maximal margin method | p. 45 |
2.2.1.1 Derivation of the maximal margin method | p. 45 |
2.2.1.2 Properties of the maximal margin method | p. 48 |
2.2.2 Linearly separable support vector classification | p. 50 |
2.2.2.1 Relationship between the primal and dual problems | p. 50 |
2.2.2.2 Linearly separable support vector classification | p. 53 |
2.2.3 Support vector | p. 54 |
2.3 Linear C-Support Vector Classification | p. 56 |
2.3.1 Maximal margin method | p. 56 |
2.3.1.1 Derivation of the maximal margin method | p. 56 |
2.3.1.2 Properties of the maximal margin method | p. 57 |
2.3.2 Linear C-support vector classification | p. 59 |
2.3.2.1 Relationship between the primal and dual problems | p. 59 |
2.3.2.2 Linear C-support vector classification | p. 60 |
3 Linear Regression | p. 63 |
3.1 Regression Problems and Linear Regression Problems | p. 63 |
3.2 Hard ε-Band Hyperplane | p. 65 |
3.2.1 Linear regression problem and hard ε-band hyperplane | p. 65 |
3.2.2 Hard ε-band hyperplane and linear classification | p. 66 |
3.2.3 Optimization problem of constructing a hard ε-band hyperplane | p. 68 |
3.3 Linear Hard ε-Band Support Vector Regression | p. 70 |
3.3.1 Primal problem | p. 70 |
3.3.2 Dual problem and relationship between the primal and dual problems | p. 70 |
3.3.3 Linear hard ε-band support vector regression | p. 74 |
3.4 Linear ε-Support Vector Regression | p. 76 |
3.4.1 Primal problem | p. 76 |
3.4.2 Dual problem and relationship between the primal and dual problems | p. 77 |
3.4.3 Linear ε-support vector regression | p. 79 |
4 Kernels and Support Vector Machines | p. 81 |
4.1 From Linear Classification to Nonlinear Classification | p. 81 |
4.1.1 An example of nonlinear classification | p. 81 |
4.1.2 Classification machine based on nonlinear separation | p. 82 |
4.1.3 Regression machine based on nonlinear separation | p. 87 |
4.2 Kernels | p. 92 |
4.2.1 Properties | p. 93 |
4.2.2 Construction of kernels | p. 93 |
4.2.2.1 Basic kernels | p. 94 |
4.2.2.2 Operations keeping kernels | p. 94 |
4.2.2.3 Commonly used kernels | p. 96 |
4.2.2.4 Graph kernel | p. 97 |
4.3 Support Vector Machines and Their Properties | p. 101 |
4.3.1 Support vector classification | p. 101 |
4.3.1.1 Algorithm | p. 101 |
4.3.1.2 Support vector | p. 103 |
4.3.1.3 Properties | p. 104 |
4.3.1.4 Soft margin loss function | p. 105 |
4.3.1.5 Probabilistic outputs | p. 106 |
4.3.2 Support vector regression | p. 109 |
4.3.2.1 Algorithm | p. 109 |
4.3.2.2 Support vector | p. 110 |
4.3.2.3 Properties | p. 112 |
4.3.2.4 ε-Insensitive loss function | p. 112 |
4.3.3 Flatness of support vector machines | p. 113 |
4.3.3.1 Runge phenomenon | p. 114 |
4.3.3.2 Flatness of ε-support vector regression | p. 115 |
4.3.3.3 Flatness of C-support vector classification | p. 118 |
4.4 Meaning of Kernels | p. 120 |
5 Basic Statistical Learning Theory of C-Support Vector Classification | p. 127 |
5.1 Classification Problems on Statistical Learning Theory | p. 127 |
5.1.1 Probability distribution | p. 127 |
5.1.2 Description of classification problems | p. 131 |
5.2 Empirical Risk Minimization | p. 134 |
5.3 Vapnik-Chervonenkis (VC) Dimension | p. 135 |
5.4 Structural Risk Minimization | p. 138 |
5.5 An Implementation of Structural Risk Minimization | p. 140 |
5.5.1 Primal problem | p. 140 |
5.5.2 Quasi-dual problem and relationship between quasi-dual problem and primal problem | p. 141 |
5.5.3 Structural risk minimization classification | p. 144 |
5.6 Theoretical Foundation of C-Support Vector Classification on Statistical Learning Theory | p. 145 |
5.6.1 Linear C-support vector classification | p. 145 |
5.6.2 Relationship between dual problem and quasi-dual problem | p. 146 |
5.6.3 Interpretation of C-support vector classification | p. 148 |
6 Model Construction | p. 151 |
6.1 Data Generation | p. 151 |
6.1.1 Orthogonal encoding | p. 152 |
6.1.2 Spectrum profile encoding | p. 153 |
6.1.3 Positional weighted matrix encoding | p. 154 |
6.2 Data Preprocessing | p. 155 |
6.2.1 Representation of nominal features | p. 155 |
6.2.2 Feature selection | p. 156 |
6.2.2.1 F-score method | p. 156 |
6.2.2.2 Recursive feature elimination method | p. 157 |
6.2.2.3 Methods based on p-norm support vector classification (0 < p ≤ 1) | p. 158 |
6.2.3 Feature extraction | p. 164 |
6.2.3.1 Linear dimensionality reduction | p. 164 |
6.2.3.2 Nonlinear dimensionality reduction | p. 165 |
6.2.4 Data compression | p. 168 |
6.2.5 Data rebalancing | p. 169 |
6.3 Model Selection | p. 171 |
6.3.1 Algorithm evaluation | p. 171 |
6.3.1.1 Some evaluation measures for a decision function | p. 172 |
6.3.1.2 Some evaluation measures for a concrete algorithm | p. 175 |
6.3.2 Selection of kernels and parameters | p. 178 |
6.4 Rule Extraction | p. 180 |
6.4.1 A toy example | p. 180 |
6.4.2 Rule extraction | p. 182 |
7 Implementation | p. 187 |
7.1 Stopping Criterion | p. 188 |
7.1.1 The first stopping criterion | p. 188 |
7.1.2 The second stopping criterion | p. 189 |
7.1.3 The third stopping criterion | p. 190 |
7.2 Chunking | p. 192 |
7.3 Decomposing | p. 194 |
7.4 Sequential Minimal Optimization | p. 197 |
7.4.1 Main steps | p. 198 |
7.4.2 Selecting the working set | p. 198 |
7.4.3 Analytical solution of the two-variables problem | p. 199 |
7.5 Software | p. 201 |
8 Variants and Extensions of Support Vector Machines | p. 203 |
8.1 Variants of Binary Support Vector Classification | p. 203 |
8.1.1 Support vector classification with homogeneous decision function | p. 203 |
8.1.2 Bounded support vector classification | p. 206 |
8.1.3 Least squares support vector classification | p. 209 |
8.1.4 Proximal support vector classification | p. 211 |
8.1.5 ν-Support vector classification | p. 213 |
8.1.5.1 ν-Support vector classification | p. 213 |
8.1.5.2 Relationship between ν-SVC and C-SVC | p. 215 |
8.1.5.3 Significance of the parameter ν | p. 215 |
8.1.6 Linear programming support vector classifications (LPSVC) | p. 216 |
8.1.6.1 LPSVC corresponding to C-SVC | p. 216 |
8.1.6.2 LPSVC corresponding to ν-SVC | p. 218 |
8.1.7 Twin support vector classification | p. 218 |
8.2 Variants of Support Vector Regression | p. 224 |
8.2.1 Least squares support vector regression | p. 225 |
8.2.2 ν-Support vector regression | p. 226 |
8.2.2.1 ν-Support vector regression | p. 227 |
8.2.2.2 Relationship between ν-SVR and ε-SVR | p. 229 |
8.2.2.3 The significance of the parameter ν | p. 229 |
8.2.2.4 Linear programming support vector regression (LPSVR) | p. 229 |
8.3 Multiclass Classification | p. 232 |
8.3.1 Approaches based on binary classifiers | p. 232 |
8.3.1.1 One versus one | p. 232 |
8.3.1.2 One versus the rest | p. 232 |
8.3.1.3 Error-correcting output coding | p. 234 |
8.3.2 Approach based on ordinal regression machines | p. 236 |
8.3.2.1 Ordinal regression machine | p. 237 |
8.3.2.2 Approach based on ordinal regression machines | p. 240 |
8.3.3 Crammer-Singer multiclass support vector classification | p. 243 |
8.3.3.1 Basic idea | p. 243 |
8.3.3.2 Primal problem | p. 243 |
8.3.3.3 Crammer-Singer support vector classification | p. 245 |
8.4 Semisupervised Classification | p. 247 |
8.4.1 PU classification problem | p. 247 |
8.4.2 Biased support vector classification [101] | p. 247 |
8.4.2.1 Optimization problem | p. 247 |
8.4.2.2 The selection of the parameters C+ and C- | p. 249 |
8.4.3 Classification problem with labeled and unlabeled inputs | p. 250 |
8.4.4 Support vector classification by semidefinite programming | p. 251 |
8.4.4.1 Optimization problem | p. 251 |
8.4.4.2 Approximate solution via semidefinite programming | p. 252 |
8.4.4.3 Support vector classification by semidefinite programming | p. 254 |
8.5 Universum Classification | p. 255 |
8.5.1 Universum classification problem | p. 255 |
8.5.2 Primal problem and dual problem | p. 256 |
8.5.2.1 Algorithm and its relationship with three-class classification | p. 258 |
8.5.2.2 Construction of Universum | p. 258 |
8.6 Privileged Classification | p. 259 |
8.6.1 Linear privileged support vector classification | p. 259 |
8.6.2 Nonlinear privileged support vector classification | p. 261 |
8.6.3 A variation | p. 263 |
8.7 Knowledge-based Classification | p. 265 |
8.7.1 Knowledge-based linear support vector classification | p. 265 |
8.7.2 Knowledge-based nonlinear support vector classification | p. 268 |
8.8 Robust Classification | p. 272 |
8.8.1 Robust classification problem | p. 272 |
8.8.2 The solution when the input sets are polyhedrons | p. 273 |
8.8.2.1 Linear robust support vector classification | p. 273 |
8.8.2.2 Robust support vector classification | p. 277 |
8.8.3 The solution when the input sets are superspheres | p. 277 |
8.8.3.1 Linear robust support vector classification | p. 277 |
8.8.3.2 Robust support vector classification | p. 281 |
8.9 Multi-instance Classification | p. 282 |
8.9.1 Multi-instance classification problem | p. 282 |
8.9.2 Multi-instance linear support vector classification | p. 283 |
8.9.2.1 Optimization problem | p. 283 |
8.9.2.2 Linear support vector classification | p. 286 |
8.9.3 Multi-instance support vector classification | p. 288 |
8.10 Multi-label Classification | p. 292 |
8.10.1 Problem transformation methods | p. 292 |
8.10.2 Algorithm adaptation methods | p. 294 |
8.10.2.1 A ranking system | p. 294 |
8.10.2.2 Label set size prediction | p. 296 |
8.10.3 Algorithm | p. 297 |
Bibliography | p. 299 |
Index | p. 315 |