Title:
Support vector machines : optimization based theory, algorithms, and extensions
Personal Author:
Deng, Naiyang; Tian, Yingjie; Zhang, Chunhua
Series:
Chapman & Hall/CRC data mining and knowledge discovery series
Publication Information:
London : CRC, 2012
Physical Description:
xxvii, 335 pages : illustrations ; 24 cm.
ISBN:
9781439857922
Abstract:
"Preface Support vector machines (SVMs), which were introduced by Vapnik in the early 1990s, are proved effective and promising techniques for data mining. SVMs have recently been breakthroughs in advance in their theoretical studies and implementations of algorithms. They have been successfully applied in many fields such as text categorization, speech recognition, remote sensing image analysis, time series forecasting, information security and etc. SVMs, having their roots in Statistical Learning Theory (SLT) and optimization methods, become powerful tools to solve the problems of machine learning with finite training points and to overcome some traditional difficulties such as the "curse of dimensionality", "over-fitting" and etc. SVMs theoretical foundation and implementation techniques have been established and SVMs are gaining quick development and popularity due to their many attractive features: nice mathematical representations, geometrical explanations, good generalization abilities and promising empirical performance. Some SVM monographs, including more sophisticated ones such as Cristianini & Shawe-Taylor [39] and Scholkopf & Smola [124], have been published. We have published two books about SVMs in Science Press of China since 2004 [42, 43], which attracted widespread concerns and received favorable comments. After several years research and teaching, we decide to rewrite the books and add new research achievements. The starting point and focus of the book is optimization theory, which is different from other books on SVMs in this respect. Optimization is one of the pillars on which SVMs are built, so it makes a lot of sense to consider them from this point of view"-- Provided by publisher.

Available:

Item Barcode     Call Number        Material Type      Item Category 1
30000010304288   QA402.5 D46 2013   Open Access Book   Book
30000010327971   QA402.5 D46 2013   Open Access Book   Book
33000000000038   QA402.5 D46 2013   Open Access Book   Book

Summary

Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs)--classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs since optimization is one of the pillars on which SVMs are built.
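
As a concrete illustration of that connection (standard textbook material, not quoted from this book), the linear C-support vector classification problem is itself a convex quadratic program,

\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \frac{1}{2}\lVert w\rVert^2 + C\sum_{i=1}^{l}\xi_i \\
\text{s.t.}\quad & y_i\,(w^\top x_i + b) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,l,
\end{aligned}

whose Wolfe dual,

\begin{aligned}
\max_{\alpha}\quad & \sum_{i=1}^{l}\alpha_i - \frac{1}{2}\sum_{i,j=1}^{l}\alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j \\
\text{s.t.}\quad & \sum_{i=1}^{l} y_i \alpha_i = 0,\quad 0 \le \alpha_i \le C,\quad i = 1,\dots,l,
\end{aligned}

is the problem actually solved in practice and the gateway to kernel methods (Chapters 2 and 4 in the table of contents below).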

The authors share insights from many of their research achievements. They give a precise interpretation of statistical learning theory for C-support vector classification. They also discuss regularized twin SVMs for binary classification problems, SVMs for multi-class classification problems based on ordinal regression, SVMs for semi-supervised problems, and SVMs for problems with perturbations.
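
For orientation, the twin SVM mentioned above can be sketched as follows (this is the standard non-regularized formulation the regularized variant builds on; the notation A and B for the matrices of positive and negative inputs and e for vectors of ones is assumed here, not quoted from the book). Two nonparallel hyperplanes are found, each from a smaller quadratic program such as

\begin{aligned}
\min_{w_+,\,b_+,\,\xi}\quad & \frac{1}{2}\lVert A w_+ + e_+ b_+\rVert^2 + C_1\, e_-^\top \xi \\
\text{s.t.}\quad & -(B w_+ + e_- b_+) + \xi \ge e_-,\quad \xi \ge 0,
\end{aligned}

with a symmetric problem for the second hyperplane; a new point is assigned to the class whose hyperplane lies closer.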

To improve readability, concepts, methods, and results are introduced graphically and with clear explanations. For important concepts and algorithms, such as the Crammer-Singer SVM for multi-class classification problems, the text provides geometric interpretations that are not depicted in the current literature.
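
For reference, the Crammer-Singer problem referred to above is usually stated (in its linear, bias-free form; the notation here is assumed, not quoted from the book) as a single joint quadratic program over one weight vector per class:

\begin{aligned}
\min_{w_1,\dots,w_k,\,\xi}\quad & \frac{1}{2}\sum_{m=1}^{k}\lVert w_m\rVert^2 + C\sum_{i=1}^{l}\xi_i \\
\text{s.t.}\quad & w_{y_i}^\top x_i - w_m^\top x_i \ge 1 - \delta_{y_i,m} - \xi_i,\quad \forall\, i, m,
\end{aligned}

where \delta is the Kronecker delta; the decision function is f(x) = \arg\max_m w_m^\top x.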

Enabling a sound understanding of SVMs, this book gives beginners as well as more experienced researchers and engineers the tools to solve real-world problems using SVMs.
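
As a practical complement, here is a minimal sketch of the C-SVC workflow the book formalizes, written with scikit-learn (a library the book does not cover; the synthetic data and parameter values below are illustrative assumptions, not the authors' recommendations):

    # Minimal C-support vector classification example with an RBF kernel.
    # Illustrative only: scikit-learn and these parameter values are not
    # from the book, which develops the underlying optimization theory.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    # Synthetic binary classification data standing in for a real problem.
    X, y = make_classification(n_samples=200, n_features=5, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Feature scaling matters for RBF kernels (cf. Chapter 6, Data Preprocessing).
    scaler = StandardScaler().fit(X_train)

    # C is the soft-margin penalty from the primal problem; gamma is the
    # RBF kernel parameter (cf. Chapter 4, Kernels).
    clf = SVC(kernel="rbf", C=1.0, gamma="scale")
    clf.fit(scaler.transform(X_train), y_train)
    print("test accuracy:", clf.score(scaler.transform(X_test), y_test))

In practice C and gamma are chosen by cross-validation, a topic treated in Section 6.3, Model Selection.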


Table of Contents

List of Figures p. xvii
List of Tables p. xxi
Preface p. xxiii
List of Symbols p. xxvii
1 Optimization p. 1
1.1 Optimization Problems in Euclidean Space p. 1
1.1.1 An example of optimization problems p. 1
1.1.2 Optimization problems and their solutions p. 3
1.1.3 Geometric interpretation of optimization problems p. 4
1.2 Convex Programming in Euclidean Space p. 5
1.2.1 Convex sets and convex functions p. 6
1.2.1.1 Convex sets p. 6
1.2.1.2 Convex functions p. 7
1.2.2 Convex programming problems and their properties p. 8
1.2.2.1 Convex programming problems p. 8
1.2.2.2 Basic properties p. 9
1.2.3 Duality theory p. 12
1.2.3.1 Derivation of the dual problem p. 12
1.2.3.2 Duality theory p. 13
1.2.4 Optimality conditions p. 15
1.2.5 Linear programming p. 16
1.3 Convex Programming in Hilbert Space p. 18
1.3.1 Convex sets and Fréchet derivative p. 18
1.3.2 Convex programming problems p. 19
1.3.3 Duality theory p. 20
1.3.4 Optimality conditions p. 20
1.4 Convex Programming with Generalized Inequality Constraints in Euclidean Space p. 21
1.4.1 Convex programming with generalized inequality constraints p. 21
1.4.1.1 Cones p. 21
1.4.1.2 Generalized inequalities p. 21
1.4.1.3 Convex programming with generalized inequality constraints p. 22
1.4.2 Duality theory p. 23
1.4.2.1 Dual cones p. 23
1.4.2.2 Derivation of the dual problem p. 23
1.4.2.3 Duality theory p. 25
1.4.3 Optimality conditions p. 26
1.4.4 Second-order cone programming p. 27
1.4.4.1 Second-order cone programming and its dual problem p. 27
1.4.4.2 Software for second-order cone programming p. 30
1.4.5 Semidefinite programming p. 31
1.4.5.1 Semidefinite programming and its dual problem p. 31
1.4.5.2 Software for semidefinite programming p. 35
1.5 Convex Programming with Generalized Inequality Constraints in Hilbert Space p. 36
1.5.1 K-convex function and Fréchet derivative p. 36
1.5.2 Convex programming p. 36
1.5.3 Duality theory p. 37
1.5.4 Optimality conditions p. 38
2 Linear Classification p. 41
2.1 Presentation of Classification Problems p. 41
2.1.1 A sample (diagnosis of heart disease) p. 41
2.1.2 Classification problems and classification machines p. 43
2.2 Support Vector Classification (SVC) for Linearly Separable Problems p. 45
2.2.1 Maximal margin method p. 45
2.2.1.1 Derivation of the maximal margin method p. 45
2.2.1.2 Properties of the maximal margin method p. 48
2.2.2 Linearly separable support vector classification p. 50
2.2.2.1 Relationship between the primal and dual problems p. 50
2.2.2.2 Linearly separable support vector classification p. 53
2.2.3 Support vector p. 54
2.3 Linear C-Support Vector Classification p. 56
2.3.1 Maximal margin method p. 56
2.3.1.1 Derivation of the maximal margin method p. 56
2.3.1.2 Properties of the maximal margin method p. 57
2.3.2 Linear C-support vector classification p. 59
2.3.2.1 Relationship between the primal and dual problems p. 59
2.3.2.2 Linear C-support vector classification p. 60
3 Linear Regression p. 63
3.1 Regression Problems and Linear Regression Problems p. 63
3.2 Hard ε-Band Hyperplane p. 65
3.2.1 Linear regression problem and hard ε-band hyperplane p. 65
3.2.2 Hard ε-band hyperplane and linear classification p. 66
3.2.3 Optimization problem of constructing a hard ε-band hyperplane p. 68
3.3 Linear Hard ε-Band Support Vector Regression p. 70
3.3.1 Primal problem p. 70
3.3.2 Dual problem and relationship between the primal and dual problems p. 70
3.3.3 Linear hard ε-band support vector regression p. 74
3.4 Linear ε-Support Vector Regression p. 76
3.4.1 Primal problem p. 76
3.4.2 Dual problem and relationship between the primal and dual problems p. 77
3.4.3 Linear ε-support vector regression p. 79
4 Kernels and Support Vector Machines p. 81
4.1 From Linear Classification to Nonlinear Classification p. 81
4.1.1 An example of nonlinear classification p. 81
4.1.2 Classification machine based on nonlinear separation p. 82
4.1.3 Regression machine based on nonlinear separation p. 87
4.2 Kernels p. 92
4.2.1 Properties p. 93
4.2.2 Construction of kernels p. 93
4.2.2.1 Basic kernels p. 94
4.2.2.2 Operations keeping kernels p. 94
4.2.2.3 Commonly used kernels p. 96
4.2.2.4 Graph kernel p. 97
4.3 Support Vector Machines and Their Properties p. 101
4.3.1 Support vector classification p. 101
4.3.1.1 Algorithm p. 101
4.3.1.2 Support vector p. 103
4.3.1.3 Properties p. 104
4.3.1.4 Soft margin loss function p. 105
4.3.1.5 Probabilistic outputs p. 106
4.3.2 Support vector regression p. 109
4.3.2.1 Algorithm p. 109
4.3.2.2 Support vector p. 110
4.3.2.3 Properties p. 112
4.3.2.4 ε-Insensitive loss function p. 112
4.3.3 Flatness of support vector machines p. 113
4.3.3.1 Runge phenomenon p. 114
4.3.3.2 Flatness of ε-support vector regression p. 115
4.3.3.3 Flatness of C-support vector classification p. 118
4.4 Meaning of Kernels p. 120
5 Basic Statistical Learning Theory of C-Support Vector Classification p. 127
5.1 Classification Problems on Statistical Learning Theory p. 127
5.1.1 Probability distribution p. 127
5.1.2 Description of classification problems p. 131
5.2 Empirical Risk Minimization p. 134
5.3 Vapnik-Chervonenkis (VC) Dimension p. 135
5.4 Structural Risk Minimization p. 138
5.5 An Implementation of Structural Risk Minimization p. 140
5.5.1 Primal problem p. 140
5.5.2 Quasi-dual problem and its relationship with the primal problem p. 141
5.5.3 Structural risk minimization classification p. 144
5.6 Theoretical Foundation of C-Support Vector Classification on Statistical Learning Theory p. 145
5.6.1 Linear C-support vector classification p. 145
5.6.2 Relationship between dual problem and quasi-dual problem p. 146
5.6.3 Interpretation of C-support vector classification p. 148
6 Model Construction p. 151
6.1 Data Generation p. 151
6.1.1 Orthogonal encoding p. 152
6.1.2 Spectrum profile encoding p. 153
6.1.3 Positional weighted matrix encoding p. 154
6.2 Data Preprocessing p. 155
6.2.1 Representation of nominal features p. 155
6.2.2 Feature selection p. 156
6.2.2.1 F-score method p. 156
6.2.2.2 Recursive feature elimination method p. 157
6.2.2.3 Methods based on p-norm support vector classification (0 ≤ p ≤ 1) p. 158
6.2.3 Feature extraction p. 164
6.2.3.1 Linear dimensionality reduction p. 164
6.2.3.2 Nonlinear dimensionality reduction p. 165
6.2.4 Data compression p. 168
6.2.5 Data rebalancing p. 169
6.3 Model Selection p. 171
6.3.1 Algorithm evaluation p. 171
6.3.1.1 Some evaluation measures for a decision function p. 172
6.3.1.2 Some evaluation measures for a concrete algorithm p. 175
6.3.2 Selection of kernels and parameters p. 178
6.4 Rule Extraction p. 180
6.4.1 A toy example p. 180
6.4.2 Rule extraction p. 182
7 Implementation p. 187
7.1 Stopping Criterion p. 188
7.1.1 The first stopping criterion p. 188
7.1.2 The second stopping criterion p. 189
7.1.3 The third stopping criterion p. 190
7.2 Chunking p. 192
7.3 Decomposing p. 194
7.4 Sequential Minimal Optimization p. 197
7.4.1 Main steps p. 198
7.4.2 Selecting the working set p. 198
7.4.3 Analytical solution of the two-variable problem p. 199
7.5 Software p. 201
8 Variants and Extensions of Support Vector Machines p. 203
8.1 Variants of Binary Support Vector Classification p. 203
8.1.1 Support vector classification with homogeneous decision function p. 203
8.1.2 Bounded support vector classification p. 206
8.1.3 Least squares support vector classification p. 209
8.1.4 Proximal support vector classification p. 211
8.1.5 ν-Support vector classification p. 213
8.1.5.1 ν-Support vector classification p. 213
8.1.5.2 Relationship between ν-SVC and C-SVC p. 215
8.1.5.3 Significance of the parameter ν p. 215
8.1.6 Linear programming support vector classifications (LPSVC) p. 216
8.1.6.1 LPSVC corresponding to C-SVC p. 216
8.1.6.2 LPSVC corresponding to ν-SVC p. 218
8.1.7 Twin support vector classification p. 218
8.2 Variants of Support Vector Regression p. 224
8.2.1 Least squares support vector regression p. 225
8.2.2 ν-Support vector regression p. 226
8.2.2.1 ν-Support vector regression p. 227
8.2.2.2 Relationship between ν-SVR and ε-SVR p. 229
8.2.2.3 The significance of the parameter ν p. 229
8.2.2.4 Linear programming support vector regression (LPSVR) p. 229
8.3 Multiclass Classification p. 232
8.3.1 Approaches based on binary classifiers p. 232
8.3.1.1 One versus one p. 232
8.3.1.2 One versus the rest p. 232
8.3.1.3 Error-correcting output coding p. 234
8.3.2 Approach based on ordinal regression machines p. 236
8.3.2.1 Ordinal regression machine p. 237
8.3.2.2 Approach based on ordinal regression machines p. 240
8.3.3 Crammer-Singer multiclass support vector classification p. 243
8.3.3.1 Basic idea p. 243
8.3.3.2 Primal problem p. 243
8.3.3.3 Crammer-Singer support vector classification p. 245
8.4 Semisupervised Classification p. 247
8.4.1 PU classification problem p. 247
8.4.2 Biased support vector classification [101] p. 247
8.4.2.1 Optimization problem p. 247
8.4.2.2 The selection of the parameters C+ and C− p. 249
8.4.3 Classification problem with labeled and unlabeled inputs p. 250
8.4.4 Support vector classification by semidefinite programming p. 251
8.4.4.1 Optimization problem p. 251
8.4.4.2 Approximate solution via semidefinite programming p. 252
8.4.4.3 Support vector classification by semidefinite programming p. 254
8.5 Universum Classification p. 255
8.5.1 Universum classification problem p. 255
8.5.2 Primal problem and dual problem p. 256
8.5.2.1 Algorithm and its relationship with three-class classification p. 258
8.5.2.2 Construction of Universum p. 258
8.6 Privileged Classification p. 259
8.6.1 Linear privileged support vector classification p. 259
8.6.2 Nonlinear privileged support vector classification p. 261
8.6.3 A variation p. 263
8.7 Knowledge-based Classification p. 265
8.7.1 Knowledge-based linear support vector classification p. 265
8.7.2 Knowledge-based nonlinear support vector classification p. 268
8.8 Robust Classification p. 272
8.8.1 Robust classification problem p. 272
8.8.2 The solution when the input sets are polyhedrons p. 273
8.8.2.1 Linear robust support vector classification p. 273
8.8.2.2 Robust support vector classification p. 277
8.8.3 The solution when the input sets are superspheres p. 277
8.8.3.1 Linear robust support vector classification p. 277
8.8.3.2 Robust support vector classification p. 281
8.9 Multi-instance Classification p. 282
8.9.1 Multi-instance classification problem p. 282
8.9.2 Multi-instance linear support vector classification p. 283
8.9.2.1 Optimization problem p. 283
8.9.2.2 Linear support vector classification p. 286
8.9.3 Multi-instance support vector classification p. 288
8.10 Multi-label Classification p. 292
8.10.1 Problem transformation methods p. 292
8.10.2 Algorithm adaptation methods p. 294
8.10.2.1 A ranking system p. 294
8.10.2.2 Label set size prediction p. 296
8.10.3 Algorithm p. 297
Bibliography p. 299
Index p. 315