Plot the maximum margin hyperplane with scikits.learn

Suppose some given data points each belong to one of two classes, and the goal is to decide which class a new data point will be in. In the case of support vector machines, a data point is viewed as a p-dimensional vector (2-dimensional in this example), and we want to know whether we can separate such points with a p -1-dimensional hyperplane (a line in our case). There are many hyperplanes that might classify the data. One reasonable choice as the best hyperplane is the one that represents the largest separation, or margin, between the two classes. So we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane and the linear classifier it defines is known as a maximum margin classifier.

Using the new svm module in scikits.learn, you can easily plot the maximum margin hyperplane. If clf is an instance of svm.SVC(), the coefficients in the decision function are stored in clf.coef_ , and the independent term in clf.rho_ . The complete source code is:

import numpy as np
import pylab as pl
from scikits.learn import svm

# we create 40 separable points
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2,2], np.random.randn(20, 2) + [2, 2]]
Y = [0]*20 + [1]*20

# fit the model
clf = svm.SVC(kernel=‘linear’)
clf.fit(X, Y)

# get the separating hyperplane
w = np.dot(clf.coef_[0], clf.support_)
a = -w[0]/w[1]
xx = np.linspace(-5, 5)
yy = a*xx + (clf.rho_[0])/w[1]

# plot the paralels to the separating hyperplane that pass through the
# support vectors
b = clf.support_[0]
yy_down = a*xx + (b[1] - a*b[0])
b = clf.support_[-1]
yy_up = a*xx + (b[1] - a*b[0])

# plot the line, the points, and the nearest vectors to the plane
pl.set_cmap(pl.cm.Paired)
pl.plot(xx, yy, ‘k-’)
pl.plot(xx, yy_down, ‘k–’)
pl.plot(xx, yy_up, ‘k–’)
pl.scatter(X[:,0], X[:,1], c=Y)
pl.scatter(clf.support_[:,0], clf.support_[:,1], marker=‘+’)

pl.axis(‘tight’)
pl.show()

And the result is

Separating hyperplane

where the vectors that are closest to the separating line are highlighted with a small ‘+’.

Up-to date code of this can be found in directory examples/ of scikits.learn

4 Responses to “Plot the maximum margin hyperplane with scikits.learn”

  1. Tim Says:

    Thanks for the post. I am wondering if it is possible to visualize the plane in 3 or more dimensions using scikits.learn?

  2. fabian Says:

    Sure,

    <w, x> + clf.rho_ = 0,

    where w = np.dot(clf.coef_[0], clf.support_) is the equation of your hyperplane (in arbitrary dimension) so you just would have to solve the linear equation and plot the result.

  3. Winston Says:

    Yo opino que deberías de dormir de vez en cuando. Aunque sea una vez por semana.

  4. Renton Says:

    No Winston, dormir es una perdida de tiempo. Déjalo que trabaje.

Leave a Reply