Support Vector Machines for Pattern Classification

by: Shigeo Abe

Springer-Verlag, 2010

ISBN: 9781849960984, 473 pages

2nd edition

Format: PDF, online reading

Copy protection: watermark

Suitable for: Windows PC, Mac OS X, and all DRM-capable e-readers, Apple iPad, Android tablet PCs. Online reading for: Windows PC, Mac OS X, Linux

Price: 149.79 EUR

More about the content


Preface  6
Acknowledgments  12
Contents  14
Symbols  20

1 Introduction  21
1.1 Decision Functions  22
1.1.1 Decision Functions for Two-Class Problems  22
1.1.2 Decision Functions for Multiclass Problems  24
1.2 Determination of Decision Functions  28
1.3 Data Sets Used in the Book  29
1.4 Classifier Evaluation  33
References  36

2 Two-Class Support Vector Machines  40
2.1 Hard-Margin Support Vector Machines  40
2.2 L1 Soft-Margin Support Vector Machines  47
2.3 Mapping to a High-Dimensional Space  50
2.3.1 Kernel Tricks  50
2.3.2 Kernels  52
2.3.3 Normalizing Kernels  62
2.3.4 Properties of Mapping Functions Associated with Kernels  63
2.3.5 Implicit Bias Terms  66
2.3.6 Empirical Feature Space  69
2.4 L2 Soft-Margin Support Vector Machines  75
2.5 Advantages and Disadvantages  77
2.5.1 Advantages  77
2.5.2 Disadvantages  78
2.6 Characteristics of Solutions  79
2.6.1 Hessian Matrix  79
2.6.2 Dependence of Solutions on C  81
2.6.3 Equivalence of L1 and L2 Support Vector Machines  86
2.6.4 Nonunique Solutions  89
2.6.5 Reducing the Number of Support Vectors  97
2.6.6 Degenerate Solutions  100
2.6.7 Duplicate Copies of Data  102
2.6.8 Imbalanced Data  104
2.6.9 Classification for the Blood Cell Data  104
2.7 Class Boundaries for Different Kernels  107
2.8 Developing Classifiers  112
2.8.1 Model Selection  112
2.8.2 Estimating Generalization Errors  112
2.8.3 Sophistication of Model Selection  116
2.8.4 Effect of Model Selection by Cross-Validation  117
2.9 Invariance for Linear Transformation  121
References  125

3 Multiclass Support Vector Machines  132
3.1 One-Against-All Support Vector Machines  133
3.1.1 Conventional Support Vector Machines  133
3.1.2 Fuzzy Support Vector Machines  135
3.1.3 Equivalence of Fuzzy Support Vector Machines and Support Vector Machines with Continuous Decision Functions  138
3.1.4 Decision-Tree-Based Support Vector Machines  141
3.2 Pairwise Support Vector Machines  146
3.2.1 Conventional Support Vector Machines  146
3.2.2 Fuzzy Support Vector Machines  147
3.2.3 Performance Comparison of Fuzzy Support Vector Machines  148
3.2.4 Cluster-Based Support Vector Machines  151
3.2.5 Decision-Tree-Based Support Vector Machines  152
3.2.6 Pairwise Classification with Correcting Classifiers  162
3.3 Error-Correcting Output Codes  163
3.3.1 Output Coding by Error-Correcting Codes  164
3.3.2 Unified Scheme for Output Coding  165
3.3.3 Equivalence of ECOC with Membership Functions  166
3.3.4 Performance Evaluation  166
3.4 All-at-Once Support Vector Machines  168
3.5 Comparisons of Architectures  171
3.5.1 One-Against-All Support Vector Machines  171
3.5.2 Pairwise Support Vector Machines  171
3.5.3 ECOC Support Vector Machines  172
3.5.4 All-at-Once Support Vector Machines  172
3.5.5 Training Difficulty  172
3.5.6 Training Time Comparison  176
References  177

4 Variants of Support Vector Machines  181
4.1 Least-Squares Support Vector Machines  181
4.1.1 Two-Class Least-Squares Support Vector Machines  182
4.1.2 One-Against-All Least-Squares Support Vector Machines  184
4.1.3 Pairwise Least-Squares Support Vector Machines  186
4.1.4 All-at-Once Least-Squares Support Vector Machines  187
4.1.5 Performance Comparison  188
4.2 Linear Programming Support Vector Machines  192
4.2.1 Architecture  193
4.2.2 Performance Evaluation  196
4.3 Sparse Support Vector Machines  198
4.3.1 Several Approaches for Sparse Support Vector Machines  199
4.3.2 Idea  201
4.3.3 Support Vector Machines Trained in the Empirical Feature Space  202
4.3.4 Selection of Linearly Independent Data  205
4.3.5 Performance Evaluation  207
4.4 Performance Comparison of Different Classifiers  210
4.5 Robust Support Vector Machines  214
4.6 Bayesian Support Vector Machines  215
4.6.1 One-Dimensional Bayesian Decision Functions  217
4.6.2 Parallel Displacement of a Hyperplane  218
4.6.3 Normal Test  219
4.7 Incremental Training  219
4.7.1 Overview  219
4.7.2 Incremental Training Using Hyperspheres  222
4.8 Learning Using Privileged Information  231
4.9 Semi-Supervised Learning  234
4.10 Multiple Classifier Systems  235
4.11 Multiple Kernel Learning  236
4.12 Confidence Level  237
4.13 Visualization  238
References  238

5 Training Methods  245
5.1 Preselecting Support Vector Candidates  245
5.1.1 Approximation of Boundary Data  246
5.1.2 Performance Evaluation  248
5.2 Decomposition Techniques  249
5.3 KKT Conditions Revisited  252
5.4 Overview of Training Methods  257
5.5 Primal-Dual Interior-Point Methods  260
5.5.1 Primal-Dual Interior-Point Methods for Linear Programming  260
5.5.2 Primal-Dual Interior-Point Methods for Quadratic Programming  264
5.5.3 Performance Evaluation  266
5.6 Steepest Ascent Methods and Newton's Methods  270
5.6.1 Solving Quadratic Programming Problems Without Constraints  270
5.6.2 Training of L1 Soft-Margin Support Vector Machines  272
5.6.3 Sequential Minimal Optimization  277
5.6.4 Training of L2 Soft-Margin Support Vector Machines  278
5.6.5 Performance Evaluation  279
5.7 Batch Training by Exact Incremental Training  280
5.7.1 KKT Conditions  281
5.7.2 Training by Solving a Set of Linear Equations  282
5.7.3 Performance Evaluation  290
5.8 Active Set Training in Primal and Dual  291
5.8.1 Training Support Vector Machines in the Primal  291
5.8.2 Comparison of Training Support Vector Machines in the Primal and the Dual  294
5.8.3 Performance Evaluation  297
5.9 Training of Linear Programming Support Vector Machines  299
5.9.1 Decomposition Techniques  300
5.9.2 Decomposition Techniques for Linear Programming Support Vector Machines  307
5.9.3 Computer Experiments  315
References  317

6 Kernel-Based Methods  322
6.1 Kernel Least Squares  322
6.1.1 Algorithm  322
6.1.2 Performance Evaluation  325
6.2 Kernel Principal Component Analysis  328
6.3 Kernel Mahalanobis Distance  331
6.3.1 SVD-Based Kernel Mahalanobis Distance  332
6.3.2 KPCA-Based Mahalanobis Distance  335
6.4 Principal Component Analysis in the Empirical Feature Space  336
6.5 Kernel Discriminant Analysis  337
6.5.1 Kernel Discriminant Analysis for Two-Class Problems  338
6.5.2 Linear Discriminant Analysis for Two-Class Problems in the Empirical Feature Space  341
6.5.3 Kernel Discriminant Analysis for Multiclass Problems  342
References  344

7 Feature Selection and Extraction  347
7.1 Selecting an Initial Set of Features  347
7.2 Procedure for Feature Selection  348
7.3 Feature Selection Using Support Vector Machines  349
7.3.1 Backward or Forward Feature Selection  349
7.3.2 Support Vector Machine-Based Feature Selection  352
7.3.3 Feature Selection by Cross-Validation  353
7.4 Feature Extraction  355
References  356

8 Clustering  358
8.1 Domain Description  358
8.2 Extension to Clustering  364
References  366

9 Maximum-Margin Multilayer Neural Networks  368
9.1 Approach  368
9.2 Three-Layer Neural Networks  369
9.3 CARVE Algorithm  372
9.4 Determination of Hidden-Layer Hyperplanes  373
9.4.1 Rotation of Hyperplanes  374
9.4.2 Training Algorithm  377
9.5 Determination of Output-Layer Hyperplanes  378
9.6 Determination of Parameter Values  378
9.7 Performance Evaluation  379
References  380

10 Maximum-Margin Fuzzy Classifiers  382
10.1 Kernel Fuzzy Classifiers with Ellipsoidal Regions  383
10.1.1 Conventional Fuzzy Classifiers with Ellipsoidal Regions  383
10.1.2 Extension to a Feature Space  384
10.1.3 Transductive Training  385
10.1.4 Maximizing Margins  390
10.1.5 Performance Evaluation  393
10.2 Fuzzy Classifiers with Polyhedral Regions  397
10.2.1 Training Methods  398
10.2.2 Performance Evaluation  406
References  408

11 Function Approximation  410
11.1 Optimal Hyperplanes  410
11.2 L1 Soft-Margin Support Vector Regressors  414
11.3 L2 Soft-Margin Support Vector Regressors  416
11.4 Model Selection  418
11.5 Training Methods  418
11.5.1 Overview  418
11.5.2 Newton's Methods  420
11.5.3 Active Set Training  437
11.6 Variants of Support Vector Regressors  444
11.6.1 Linear Programming Support Vector Regressors  445
11.6.2 ν-Support Vector Regressors  446
11.6.3 Least-Squares Support Vector Regressors  447
11.7 Variable Selection  450
11.7.1 Overview  450
11.7.2 Variable Selection by Block Deletion  451
11.7.3 Performance Evaluation  452
References  453

A Conventional Classifiers  458
A.1 Bayesian Classifiers  458
A.2 Nearest-Neighbor Classifiers  459
References  460

B Matrices  462
B.1 Matrix Properties  462
B.2 Least-Squares Methods and Singular Value Decomposition  464
B.3 Covariance Matrices  467
References  469

C Quadratic Programming  470
C.1 Optimality Conditions  470
C.2 Properties of Solutions  471

D Positive Semidefinite Kernels and Reproducing Kernel Hilbert Space  474
D.1 Positive Semidefinite Kernels  474
D.2 Reproducing Kernel Hilbert Space  478
References  480

Index  482