Available:
Item Barcode | Call Number | Material Type | Item Category |
---|---|---|---|
30000010304288 | QA402.5 D46 2013 | Open Access Book | Book |
30000010327971 | QA402.5 D46 2013 | Open Access Book | Book |
33000000000038 | QA402.5 D46 2013 | Open Access Book | Book |
On Order
Summary
Support Vector Machines: Optimization Based Theory, Algorithms, and Extensions presents an accessible treatment of the two main components of support vector machines (SVMs): classification problems and regression problems. The book emphasizes the close connection between optimization theory and SVMs, since optimization is one of the pillars on which SVMs are built.
The authors share insight on many of their research achievements. They give a precise interpretation of statistical learning theory for C-support vector classification. They also discuss regularized twin SVMs for binary classification problems, SVMs for solving multi-classification problems based on ordinal regression, SVMs for semi-supervised problems, and SVMs for problems with perturbations.
To improve readability, concepts, methods, and results are introduced graphically and with clear explanations. For important concepts and algorithms, such as the Crammer-Singer SVM for multi-class classification problems, the text provides geometric interpretations that are not depicted in current literature.
Enabling a sound understanding of SVMs, this book gives beginners as well as more experienced researchers and engineers the tools to solve real-world problems using SVMs.
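The summary stresses that SVMs rest on optimization: C-support vector classification solves min ½‖w‖² + C Σᵢ ξᵢ subject to margin constraints. As a minimal sketch of that idea (not code from the book), the fragment below trains a linear classifier by subgradient descent on the equivalent unconstrained hinge-loss objective ½‖w‖² + C Σᵢ max(0, 1 − yᵢ(w·xᵢ + b)); the toy data, learning rate, and epoch count are illustrative assumptions.

```python
# Minimal sketch: linear C-support vector classification via subgradient
# descent on the primal objective
#     1/2 ||w||^2 + C * sum_i max(0, 1 - y_i (w . x_i + b)).
# Hypothetical helper and toy data; real solvers use dual/SMO methods.

def train_linear_svc(X, y, C=1.0, lr=0.01, epochs=1000):
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw = list(w)   # subgradient of the regularizer 1/2 ||w||^2 is w
        gb = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:   # hinge term active: subgradient -C * y_i * x_i
                for j in range(d):
                    gw[j] -= C * yi * xi[j]
                gb -= C * yi
        w = [wj - lr * gwj for wj, gwj in zip(w, gw)]
        b -= lr * gb
    return w, b

# Linearly separable toy problem with labels y in {+1, -1}.
X = [[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]
w, b = train_linear_svc(X, y)
preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1
         for xi in X]
```

On separable data like this, the learned hyperplane classifies all four training points correctly; the book's Chapters 1–2 develop the duality theory behind the standard (dual, SMO-based) solvers instead of this primal sketch.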
Table of Contents
List of Figures | p. xvii |
List of Tables | p. xxi |
Preface | p. xxiii |
List of Symbols | p. xxvii |
1 Optimization | p. 1 |
1.1 Optimization Problems in Euclidean Space | p. 1 |
1.1.1 An example of optimization problems | p. 1 |
1.1.2 Optimization problems and their solutions | p. 3 |
1.1.3 Geometric interpretation of optimization problems | p. 4 |
1.2 Convex Programming in Euclidean Space | p. 5 |
1.2.1 Convex sets and convex functions | p. 6 |
1.2.1.1 Convex sets | p. 6 |
1.2.1.2 Convex functions | p. 7 |
1.2.2 Convex programming problems and their properties | p. 8 |
1.2.2.1 Convex programming problems | p. 8 |
1.2.2.2 Basic properties | p. 9 |
1.2.3 Duality theory | p. 12 |
1.2.3.1 Derivation of the dual problem | p. 12 |
1.2.3.2 Duality theory | p. 13 |
1.2.4 Optimality conditions | p. 15 |
1.2.5 Linear programming | p. 16 |
1.3 Convex Programming in Hilbert Space | p. 18 |
1.3.1 Convex sets and Fréchet derivative | p. 18 |
1.3.2 Convex programming problems | p. 19 |
1.3.3 Duality theory | p. 20 |
1.3.4 Optimality conditions | p. 20 |
1.4 Convex Programming with Generalized Inequality Constraints in Euclidean Space | p. 21 |
1.4.1 Convex programming with generalized inequality constraints | p. 21 |
1.4.1.1 Cones | p. 21 |
1.4.1.2 Generalized inequalities | p. 21 |
1.4.1.3 Convex programming with generalized inequality constraints | p. 22 |
1.4.2 Duality theory | p. 23 |
1.4.2.1 Dual cones | p. 23 |
1.4.2.2 Derivation of the dual problem | p. 23 |
1.4.2.3 Duality theory | p. 25 |
1.4.3 Optimality conditions | p. 26 |
1.4.4 Second-order cone programming | p. 27 |
1.4.4.1 Second-order cone programming and its dual problem | p. 27 |
1.4.4.2 Software for second-order cone programming | p. 30 |
1.4.5 Semidefinite programming | p. 31 |
1.4.5.1 Semidefinite programming and its dual problem | p. 31 |
1.4.5.2 Software for semidefinite programming | p. 35 |
1.5 Convex Programming with Generalized Inequality Constraints in Hilbert Space | p. 36 |
1.5.1 K-convex function and Fréchet derivative | p. 36 |
1.5.2 Convex programming | p. 36 |
1.5.3 Duality theory | p. 37 |
1.5.4 Optimality conditions | p. 38 |
2 Linear Classification | p. 41 |
2.1 Presentation of Classification Problems | p. 41 |
2.1.1 A sample (diagnosis of heart disease) | p. 41 |
2.1.2 Classification problems and classification machines | p. 43 |
2.2 Support Vector Classification (SVC) for Linearly Separable Problems | p. 45 |
2.2.1 Maximal margin method | p. 45 |
2.2.1.1 Derivation of the maximal margin method | p. 45 |
2.2.1.2 Properties of the maximal margin method | p. 48 |
2.2.2 Linearly separable support vector classification | p. 50 |
2.2.2.1 Relationship between the primal and dual problems | p. 50 |
2.2.2.2 Linearly separable support vector classification | p. 53 |
2.2.3 Support vector | p. 54 |
2.3 Linear C-Support Vector Classification | p. 56 |
2.3.1 Maximal margin method | p. 56 |
2.3.1.1 Derivation of the maximal margin method | p. 56 |
2.3.1.2 Properties of the maximal margin method | p. 57 |
2.3.2 Linear C-support vector classification | p. 59 |
2.3.2.1 Relationship between the primal and dual problems | p. 59 |
2.3.2.2 Linear C-support vector classification | p. 60 |
3 Linear Regression | p. 63 |
3.1 Regression Problems and Linear Regression Problems | p. 63 |
3.2 Hard ε-Band Hyperplane | p. 65 |
3.2.1 Linear regression problem and hard ε-band hyperplane | p. 65 |
3.2.2 Hard ε-band hyperplane and linear classification | p. 66 |
3.2.3 Optimization problem of constructing a hard ε-band hyperplane | p. 68 |
3.3 Linear Hard ε-Band Support Vector Regression | p. 70 |
3.3.1 Primal problem | p. 70 |
3.3.2 Dual problem and relationship between the primal and dual problems | p. 70 |
3.3.3 Linear hard ε-band support vector regression | p. 74 |
3.4 Linear ε-Support Vector Regression | p. 76 |
3.4.1 Primal problem | p. 76 |
3.4.2 Dual problem and relationship between the primal and dual problems | p. 77 |
3.4.3 Linear ε-support vector regression | p. 79 |
4 Kernels and Support Vector Machines | p. 81 |
4.1 From Linear Classification to Nonlinear Classification | p. 81 |
4.1.1 An example of nonlinear classification | p. 81 |
4.1.2 Classification machine based on nonlinear separation | p. 82 |
4.1.3 Regression machine based on nonlinear separation | p. 87 |
4.2 Kernels | p. 92 |
4.2.1 Properties | p. 93 |
4.2.2 Construction of kernels | p. 93 |
4.2.2.1 Basic kernels | p. 94 |
4.2.2.2 Operations keeping kernels | p. 94 |
4.2.2.3 Commonly used kernels | p. 96 |
4.2.2.4 Graph kernel | p. 97 |
4.3 Support Vector Machines and Their Properties | p. 101 |
4.3.1 Support vector classification | p. 101 |
4.3.1.1 Algorithm | p. 101 |
4.3.1.2 Support vector | p. 103 |
4.3.1.3 Properties | p. 104 |
4.3.1.4 Soft margin loss function | p. 105 |
4.3.1.5 Probabilistic outputs | p. 106 |
4.3.2 Support vector regression | p. 109 |
4.3.2.1 Algorithm | p. 109 |
4.3.2.2 Support vector | p. 110 |
4.3.2.3 Properties | p. 112 |
4.3.2.4 ε-Insensitive loss function | p. 112 |
4.3.3 Flatness of support vector machines | p. 113 |
4.3.3.1 Runge phenomenon | p. 114 |
4.3.3.2 Flatness of ε-support vector regression | p. 115 |
4.3.3.3 Flatness of C-support vector classification | p. 118 |
4.4 Meaning of Kernels | p. 120 |
5 Basic Statistical Learning Theory of C-Support Vector Classification | p. 127 |
5.1 Classification Problems on Statistical Learning Theory | p. 127 |
5.1.1 Probability distribution | p. 127 |
5.1.2 Description of classification problems | p. 131 |
5.2 Empirical Risk Minimization | p. 134 |
5.3 Vapnik-Chervonenkis (VC) Dimension | p. 135 |
5.4 Structural Risk Minimization | p. 138 |
5.5 An Implementation of Structural Risk Minimization | p. 140 |
5.5.1 Primal problem | p. 140 |
5.5.2 Quasi-dual problem and relationship between quasi-dual problem and primal problem | p. 141 |
5.5.3 Structural risk minimization classification | p. 144 |
5.6 Theoretical Foundation of C-Support Vector Classification on Statistical Learning Theory | p. 145 |
5.6.1 Linear C-support vector classification | p. 145 |
5.6.2 Relationship between dual problem and quasi-dual problem | p. 146 |
5.6.3 Interpretation of C-support vector classification | p. 148 |
6 Model Construction | p. 151 |
6.1 Data Generation | p. 151 |
6.1.1 Orthogonal encoding | p. 152 |
6.1.2 Spectrum profile encoding | p. 153 |
6.1.3 Positional weighted matrix encoding | p. 154 |
6.2 Data Preprocessing | p. 155 |
6.2.1 Representation of nominal features | p. 155 |
6.2.2 Feature selection | p. 156 |
6.2.2.1 F-score method | p. 156 |
6.2.2.2 Recursive feature elimination method | p. 157 |
6.2.2.3 Methods based on p-norm support vector classification (0 < p ≤ 1) | p. 158 |
6.2.3 Feature extraction | p. 164 |
6.2.3.1 Linear dimensionality reduction | p. 164 |
6.2.3.2 Nonlinear dimensionality reduction | p. 165 |
6.2.4 Data compression | p. 168 |
6.2.5 Data rebalancing | p. 169 |
6.3 Model Selection | p. 171 |
6.3.1 Algorithm evaluation | p. 171 |
6.3.1.1 Some evaluation measures for a decision function | p. 172 |
6.3.1.2 Some evaluation measures for a concrete algorithm | p. 175 |
6.3.2 Selection of kernels and parameters | p. 178 |
6.4 Rule Extraction | p. 180 |
6.4.1 A toy example | p. 180 |
6.4.2 Rule extraction | p. 182 |
7 Implementation | p. 187 |
7.1 Stopping Criterion | p. 188 |
7.1.1 The first stopping criterion | p. 188 |
7.1.2 The second stopping criterion | p. 189 |
7.1.3 The third stopping criterion | p. 190 |
7.2 Chunking | p. 192 |
7.3 Decomposing | p. 194 |
7.4 Sequential Minimal Optimization | p. 197 |
7.4.1 Main steps | p. 198 |
7.4.2 Selecting the working set | p. 198 |
7.4.3 Analytical solution of the two-variables problem | p. 199 |
7.5 Software | p. 201 |
8 Variants and Extensions of Support Vector Machines | p. 203 |
8.1 Variants of Binary Support Vector Classification | p. 203 |
8.1.1 Support vector classification with homogeneous decision function | p. 203 |
8.1.2 Bounded support vector classification | p. 206 |
8.1.3 Least squares support vector classification | p. 209 |
8.1.4 Proximal support vector classification | p. 211 |
8.1.5 ν-Support vector classification | p. 213 |
8.1.5.1 ν-Support vector classification | p. 213 |
8.1.5.2 Relationship between ν-SVC and C-SVC | p. 215 |
8.1.5.3 Significance of the parameter ν | p. 215 |
8.1.6 Linear programming support vector classifications (LPSVC) | p. 216 |
8.1.6.1 LPSVC corresponding to C-SVC | p. 216 |
8.1.6.2 LPSVC corresponding to ν-SVC | p. 218 |
8.1.7 Twin support vector classification | p. 218 |
8.2 Variants of Support Vector Regression | p. 224 |
8.2.1 Least squares support vector regression | p. 225 |
8.2.2 ν-Support vector regression | p. 226 |
8.2.2.1 ν-Support vector regression | p. 227 |
8.2.2.2 Relationship between ν-SVR and ε-SVR | p. 229 |
8.2.2.3 The significance of the parameter ν | p. 229 |
8.2.2.4 Linear programming support vector regression (LPSVR) | p. 229 |
8.3 Multiclass Classification | p. 232 |
8.3.1 Approaches based on binary classifiers | p. 232 |
8.3.1.1 One versus one | p. 232 |
8.3.1.2 One versus the rest | p. 232 |
8.3.1.3 Error-correcting output coding | p. 234 |
8.3.2 Approach based on ordinal regression machines | p. 236 |
8.3.2.1 Ordinal regression machine | p. 237 |
8.3.2.2 Approach based on ordinal regression machines | p. 240 |
8.3.3 Crammer-Singer multiclass support vector classification | p. 243 |
8.3.3.1 Basic idea | p. 243 |
8.3.3.2 Primal problem | p. 243 |
8.3.3.3 Crammer-Singer support vector classification | p. 245 |
8.4 Semisupervised Classification | p. 247 |
8.4.1 PU classification problem | p. 247 |
8.4.2 Biased support vector classification [101] | p. 247 |
8.4.2.1 Optimization problem | p. 247 |
8.4.2.2 The selection of the parameters C+ and C- | p. 249 |
8.4.3 Classification problem with labeled and unlabeled inputs | p. 250 |
8.4.4 Support vector classification by semidefinite programming | p. 251 |
8.4.4.1 Optimization problem | p. 251 |
8.4.4.2 Approximate solution via semidefinite programming | p. 252 |
8.4.4.3 Support vector classification by semidefinite programming | p. 254 |
8.5 Universum Classification | p. 255 |
8.5.1 Universum classification problem | p. 255 |
8.5.2 Primal problem and dual problem | p. 256 |
8.5.2.1 Algorithm and its relationship with three-class classification | p. 258 |
8.5.2.2 Construction of Universum | p. 258 |
8.6 Privileged Classification | p. 259 |
8.6.1 Linear privileged support vector classification | p. 259 |
8.6.2 Nonlinear privileged support vector classification | p. 261 |
8.6.3 A variation | p. 263 |
8.7 Knowledge-based Classification | p. 265 |
8.7.1 Knowledge-based linear support vector classification | p. 265 |
8.7.2 Knowledge-based nonlinear support vector classification | p. 268 |
8.8 Robust Classification | p. 272 |
8.8.1 Robust classification problem | p. 272 |
8.8.2 The solution when the input sets are polyhedrons | p. 273 |
8.8.2.1 Linear robust support vector classification | p. 273 |
8.8.2.2 Robust support vector classification | p. 277 |
8.8.3 The solution when the input sets are superspheres | p. 277 |
8.8.3.1 Linear robust support vector classification | p. 277 |
8.8.3.2 Robust support vector classification | p. 281 |
8.9 Multi-instance Classification | p. 282 |
8.9.1 Multi-instance classification problem | p. 282 |
8.9.2 Multi-instance linear support vector classification | p. 283 |
8.9.2.1 Optimization problem | p. 283 |
8.9.2.2 Linear support vector classification | p. 286 |
8.9.3 Multi-instance support vector classification | p. 288 |
8.10 Multi-label Classification | p. 292 |
8.10.1 Problem transformation methods | p. 292 |
8.10.2 Algorithm adaptation methods | p. 294 |
8.10.2.1 A ranking system | p. 294 |
8.10.2.2 Label set size prediction | p. 296 |
8.10.3 Algorithm | p. 297 |
Bibliography | p. 299 |
Index | p. 315 |