Figure: Dialog window for SVM – Classification
Figure: Example of graphical output
SVM-C – SVM Classification models
SVM models minimize a suitably defined error (the misclassification rate in classification, or a deviation in some metric in regression). For example, consider a linearly separable classification task in two dimensions, with two independent numerical variables and one two-level factor response defining one of two classes (such as "A" and "B") for each observation. We look for a line that separates (discriminates) the two classes while maximizing the distance of each class from the separating line, which generally minimizes the risk of misclassifying new data; see the next figure. The fitted SVM model can then be used to predict the class for a given set of independent variable values, including a probability for each class.
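The workflow above can be sketched with scikit-learn's `SVC` (a minimal illustration with synthetic, well-separated data, not the QC-Expert interface itself):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Two separable clusters: class "A" around (1, 1), class "B" around (5, 5)
X = np.vstack([rng.normal([1, 1], 0.3, size=(20, 2)),
               rng.normal([5, 5], 0.3, size=(20, 2))])
y = np.array(["A"] * 20 + ["B"] * 20)

# Linear kernel corresponds to a separating line in two dimensions;
# probability=True enables class-probability estimates for predictions
model = SVC(kernel="linear", probability=True).fit(X, y)

new_point = [[1.5, 1.5]]
print(model.predict(new_point)[0])        # predicted class
print(model.predict_proba(new_point))     # probability for each class
```

The probabilities are obtained by an internal calibration step, so they are estimates rather than exact posterior probabilities.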

In a non-separable case (like the one in the next figure), a line is sought that minimizes the total "distance" of the misclassified data from the separating line. In the next figure, the separating line minimizes the sum of distances of one incorrectly classified point "A" and one incorrectly classified point "B" from the separating line, while maximizing the distance of the correctly classified data from it.

In the case of separable data with binary response (yi = –1 or +1), the length of the normal vector w to the separating line (or, in general, the separating hyperplane) is minimized:

\[ \min_{\mathbf{w},\,b}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} \]

subject to

\[ y_i\left(\mathbf{w}^{\mathsf T}\mathbf{x}_i - b\right) \ge 1, \quad i = 1, \dots, n, \]
which maximizes the width of the gap between the two classes (lines H1 and H2 in the next figure). In the case of non-separable data, a misclassification penalty term with a user-defined tuning "cost" parameter C is added:

\[ \min_{\mathbf{w},\,b,\,\boldsymbol{\xi}}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} + C\sum_{i=1}^{n}\xi_i \]

subject to

\[ y_i\left(\mathbf{w}^{\mathsf T}\mathbf{x}_i - b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n. \]
The geometrical interpretation of the non-separable case is illustrated in the next figure. The points that lie on (or "support") the margin lines H1 and H2 are called "support vectors", hence the name of the whole method. The support vectors are circled in the figure.
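The role of the cost parameter C and of the support vectors can be illustrated with scikit-learn (synthetic overlapping data; the values of C are arbitrary examples):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Two overlapping clusters, labels coded as -1 and +1
X = np.vstack([rng.normal([0, 0], 1.0, size=(30, 2)),
               rng.normal([3, 3], 1.0, size=(30, 2))])
y = np.array([-1] * 30 + [1] * 30)

n_sv = {}
for C in (0.1, 10.0):
    model = SVC(kernel="linear", C=C).fit(X, y)
    # Points on or inside the margin zone (lines H1/H2) become
    # support vectors; a smaller C widens the margin and therefore
    # typically recruits more of them.
    n_sv[C] = len(model.support_vectors_)
    print(f"C = {C}: {n_sv[C]} support vectors")
```

Only the support vectors determine the fitted separating line; the remaining points could be moved (without crossing the margin) with no effect on the model.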

Alternatively, instead of the cost coefficient C, a ratio ν (0 < ν < 1) may be employed:

\[ \min_{\mathbf{w},\,b,\,\boldsymbol{\xi},\,\rho}\ \tfrac{1}{2}\|\mathbf{w}\|^{2} - \nu\rho + \frac{1}{n}\sum_{i=1}^{n}\xi_i \]

subject to

\[ y_i\left(\mathbf{w}^{\mathsf T}\mathbf{x}_i - b\right) \ge \rho - \xi_i, \quad \xi_i \ge 0, \quad \rho \ge 0, \quad i = 1, \dots, n. \]

Here ν corresponds to an expected ratio of misclassified cases: it is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors.
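The ν-parameterization is available in scikit-learn as `NuSVC` (an illustrative sketch on synthetic data; the value ν = 0.2 is an arbitrary example):

```python
import numpy as np
from sklearn.svm import NuSVC

rng = np.random.default_rng(2)
# Two overlapping clusters, labels coded as -1 and +1
X = np.vstack([rng.normal([0, 0], 1.0, size=(50, 2)),
               rng.normal([2, 2], 1.0, size=(50, 2))])
y = np.array([-1] * 50 + [1] * 50)

# nu bounds the fraction of margin errors from above and the
# fraction of support vectors from below
model = NuSVC(nu=0.2, kernel="linear").fit(X, y)

frac_sv = len(model.support_vectors_) / len(X)
print(f"fraction of support vectors: {frac_sv:.2f}")  # at least ~0.2
```

Choosing ν directly as a tolerated error ratio is often more intuitive than tuning the unbounded cost parameter C.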

Last Updated ( 31.05.2013 )
 
powered by www.trilobyte.cz