SVM / SVC in sklearn

SVC in sklearn is the support vector machine (SVM) implementation for classification.

With a linear kernel, the dividing hyperplane is w*x + b = 0. The fitted weights w and bias b are available as coef_ and intercept_.

So to predict manually, just check whether weights * x + b > 0: if yes, the prediction is 1, otherwise 0.


import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['y'] = iris.target

# keep only classes 0 and 1
df = df.loc[df['y'].isin([0, 1])]
X = df[iris.feature_names]
y = df['y']

# standardize the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# gamma is ignored by the linear kernel
svc = SVC(kernel='linear', C=1, gamma='auto')
svc.fit(X_scaled, y)

# use the svc built-in prediction function
pred1 = svc.predict(X_scaled)

# use the weights and bias to construct the prediction function:
# if weights * x + b > 0 then prediction = 1, else 0
weights = svc.coef_
bias = svc.intercept_
distances = np.sum(weights * X_scaled, axis=1) + bias
pred2 = np.where(distances > 0, 1, 0)

# the two results should be the same
(pred1 == pred2).all()
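
Side note: for a linear kernel, svc.decision_function returns exactly w*x + b, so it can double-check the manual distances.

# decision_function gives w*x + b directly for a linear-kernel SVC
distances2 = svc.decision_function(X_scaled)
np.allclose(distances, distances2)  # expect True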


Plot the decision boundary / dividing plane.

This works only when the model uses exactly two features. With more than two features, the dividing plane depends on all of them, so projecting it onto a 2D plane of two of the features may not look right: the values of the remaining features cannot be reflected in the w*x + b = 0 equation. One workaround is to fix the other features at their average values, though the result still only shows the boundary at those fixed values.
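
A minimal sketch of that workaround, reusing the four-feature svc and X_scaled fitted above (for illustration only; the names b_eff, x0s, x1s are made up here): hold features 2 and 3 at their mean values, fold them into an effective bias, and draw the projected line.

import matplotlib.pyplot as plt

# project the 4-feature hyperplane onto the first two features,
# holding features 2 and 3 at their (scaled) mean values
w = svc.coef_[0]          # shape (4,)
b = svc.intercept_[0]
means = X_scaled.mean(axis=0)
# w0*x0 + w1*x1 + (w2*m2 + w3*m3 + b) = 0
b_eff = b + np.dot(w[2:], means[2:])
x0s = np.linspace(X_scaled[:, 0].min(), X_scaled[:, 0].max(), 50)
x1s = -w[0] / w[1] * x0s - b_eff / w[1]
plt.plot(x0s, x1s, color='green')   # the projected boundary
plt.show()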

Note the dividing plane is w*x + b = 0, i.e. w0*x0 + w1*x1 + bias = 0 in the two-feature case.

Therefore the decision boundary is the line x1 = -w0/w1 * x0 - bias/w1.

import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.svm import SVC
import matplotlib.pyplot as plt

iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['y'] = iris.target

# keep only classes 0 and 1, and only the first two features
df = df.loc[df['y'].isin([0, 1])]
X = df[iris.feature_names[:2]]
y = df['y']

svc = SVC(kernel='linear', C=1, gamma='auto')
svc.fit(X, y)

weights = svc.coef_[0]
bias = svc.intercept_[0]

# plot the data and the dividing boundary
fig, ax = plt.subplots(1, 1, figsize=(6, 6))
df1 = df.loc[df['y'] == 1]
df0 = df.loc[df['y'] == 0]
feature0 = iris.feature_names[0]
feature1 = iris.feature_names[1]
df1.plot(feature0, feature1, color='red', ax=ax, kind='scatter')
df0.plot(feature0, feature1, color='blue', ax=ax, kind='scatter')

w0 = weights[0]
w1 = weights[1]

# the dividing boundary is w*x + b = 0;
# in the 2D feature space: w0 * x0 + w1 * x1 + b = 0,
# i.e. x1 = -w0/w1 * x0 - b/w1
xx = range(int(df[feature0].min()), int(df[feature0].max()) + 1)
yy = [-w0 / w1 * i - bias / w1 for i in xx]
plt.plot(xx, yy, color='green')

plt.show()


Plot the decision boundary and color the areas

Again, this works for two features only. Simply create a mesh over the (x, y) plane, run the prediction for every single point of the mesh, and color the regions with contourf.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20

fig, ax = plt.subplots()
clf2 = svm.LinearSVC(C=1).fit(X, Y)

# get the separating hyperplane
w = clf2.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - clf2.intercept_[0] / w[1]

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx2, yy2 = np.meshgrid(np.arange(x_min, x_max, .2),
                       np.arange(y_min, y_max, .2))

# predict every mesh point and color the regions
Z = clf2.predict(np.c_[xx2.ravel(), yy2.ravel()])
Z = Z.reshape(xx2.shape)
ax.contourf(xx2, yy2, Z, cmap=plt.cm.coolwarm, alpha=0.3)
ax.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.coolwarm, s=25)
ax.plot(xx, yy)

ax.axis([x_min, x_max, y_min, y_max])
plt.show()
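
Side note: newer scikit-learn versions also ship a helper that does the mesh-predict-contour steps in one call. A minimal sketch, assuming sklearn >= 1.1 is installed:

# same region coloring via sklearn's built-in helper (sklearn >= 1.1)
from sklearn.inspection import DecisionBoundaryDisplay

fig2, ax2 = plt.subplots()
DecisionBoundaryDisplay.from_estimator(
    clf2, X, response_method='predict',
    cmap=plt.cm.coolwarm, alpha=0.3, ax=ax2)
ax2.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.coolwarm, s=25)
plt.show()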