SVM / SVC in sklearn
sklearn's SVC is the support-vector-machine implementation for classification. With a linear kernel, the dividing hyperplane is w*x + b = 0, and the fitted weights w and bias b are exposed as coef_ and intercept_. To predict manually, just check whether w*x + b > 0: if so, the prediction is 1, otherwise 0. For example, with w = (2, -1), b = 0.5, and x = (1, 1), w*x + b = 2 - 1 + 0.5 = 1.5 > 0, so the prediction is 1.
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns = iris.feature_names)
df['y'] = iris.target
# keep only classes 0 and 1 to make the problem binary
df = df.loc[df['y'].isin([0,1])]
X = df[iris.feature_names]
y = df['y']
#standardize data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
svc = SVC(kernel='linear', C=1)  # gamma is ignored by the linear kernel
svc.fit(X_scaled, y)
# use the svc built-in prediction function
pred1 = svc.predict(X_scaled)
# use the weights and bias to construct the prediction function
# weights * x + b>0 then prediction = 1 else 0
weights = svc.coef_
bias = svc.intercept_
distances = np.sum(weights * X_scaled, axis=1) + bias
pred2 = np.where(distances > 0, 1, 0)
# the two predictions should be identical
print(np.array_equal(pred1, pred2))  # expect True
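sklearn also exposes w*x + b directly through decision_function, so the manual computation can be sanity-checked against it. A minimal sketch, reusing svc, X_scaled, and distances from above:
# for a binary linear SVC, decision_function returns exactly w*x + b
print(np.allclose(svc.decision_function(X_scaled), distances))  # expect True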
Plot the decision boundary / dividing plane.
This works only when the model uses two features. With more than two features, the dividing hyperplane depends on all of them, so projecting it onto a 2D plane of two features can look wrong: the values of the remaining features cannot be reflected in the w*x + b = 0 equation. One workaround is to fix the other features at their average values, as sketched below, though the projected line is still only an approximation.
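A minimal sketch of that workaround, reusing the svc fitted on all four standardized features above (X_scaled, y). Features 2 and 3 are fixed at their column means, which are roughly zero after StandardScaler:
import numpy as np
import matplotlib.pyplot as plt
w = svc.coef_[0]                 # shape (4,) for the four-feature model above
b = svc.intercept_[0]
means = X_scaled.mean(axis=0)    # ~0 after StandardScaler
x0 = np.linspace(X_scaled[:, 0].min(), X_scaled[:, 0].max(), 50)
# w[0]*x0 + w[1]*x1 + w[2]*means[2] + w[3]*means[3] + b = 0, solved for x1
x1 = -(w[0] * x0 + w[2] * means[2] + w[3] * means[3] + b) / w[1]
plt.scatter(X_scaled[:, 0], X_scaled[:, 1], c=y, cmap=plt.cm.coolwarm, s=25)
plt.plot(x0, x1, color='green')
plt.show()
The two-feature case below avoids this issue entirely.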
Note the dividing plane is w*x + b = 0, i.e. w1*x1 + w2*x2 + b = 0.
Therefore the decision boundary is x2 = -w1/w2 * x1 - b/w2, i.e. y = -w1/w2 * x - b/w2.
import numpy as np
import pandas as pd
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
import matplotlib.pyplot as plt
iris = datasets.load_iris()
df = pd.DataFrame(iris.data, columns = iris.feature_names)
df['y'] = iris.target
# keep only classes 0 and 1
# use only the first two features
df = df.loc[df['y'].isin([0,1])]
X = df[iris.feature_names[:2]]
y = df['y']
svc = SVC(kernel='linear', C=1)  # gamma is ignored by the linear kernel
svc.fit(X, y)
weights = svc.coef_[0]
bias = svc.intercept_[0]
#plot the data and the dividing boundary
fig, ax = plt.subplots(1,1,figsize=(6,6))
df1 = df.loc[df['y']==1]
df0 = df.loc[df['y']==0]
feature0 = iris.feature_names[0]
feature1 = iris.feature_names[1]
df1.plot(feature0, feature1, color='red', ax=ax, kind='scatter')
df0.plot(feature0, feature1, color='blue', ax=ax, kind='scatter')
w0 = weights[0]
w1 = weights[1]
#dividing boundary is wx+b = 0
#to draw the decision boundary in the 2D space of 2 of the features
# w0 * x0 + w1 * x1 + b = 0
# x1 = -w0/w1 * x0 - b/w1
xx = range(int(df[feature0].min()), int(df[feature0].max())+1)
yy = [-w0/w1 * i - bias/w1 for i in xx]
plt.plot(xx, yy, color = 'green')
plt.show()
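SVC also stores the fitted support vectors in support_vectors_. As a small sketch, adding these two lines just before plt.show() above would circle them on the same axes (assumes the svc and ax from that block):
# highlight the support vectors with hollow black circles
sv = svc.support_vectors_
ax.scatter(sv[:, 0], sv[:, 1], s=120, facecolors='none', edgecolors='black')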
Plot the decision boundary and color the areas
Again, this is for two features only. Simply create a mesh over the (x, y) plane, run the prediction for every point of the mesh, and color the regions with contourf.
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20
fig, ax = plt.subplots()
clf2 = svm.LinearSVC(C=1).fit(X, Y)
# get the separating hyperplane
w = clf2.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (clf2.intercept_[0]) / w[1]
# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx2, yy2 = np.meshgrid(np.arange(x_min, x_max, .2),
                       np.arange(y_min, y_max, .2))
Z = clf2.predict(np.c_[xx2.ravel(), yy2.ravel()])
Z = Z.reshape(xx2.shape)
ax.contourf(xx2, yy2, Z, cmap=plt.cm.coolwarm, alpha=0.3)
ax.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.coolwarm, s=25)
ax.plot(xx, yy)
ax.axis([x_min, x_max, y_min, y_max])
plt.show()
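For reference, newer scikit-learn (1.1+) wraps this mesh-and-contourf recipe in sklearn.inspection.DecisionBoundaryDisplay. A minimal sketch reusing clf2, X, and Y from above:
from sklearn.inspection import DecisionBoundaryDisplay
import matplotlib.pyplot as plt
# builds the mesh, predicts on it, and draws the filled contours in one call
disp = DecisionBoundaryDisplay.from_estimator(
    clf2, X, response_method="predict",
    plot_method="contourf", cmap=plt.cm.coolwarm, alpha=0.3)
disp.ax_.scatter(X[:, 0], X[:, 1], c=Y, cmap=plt.cm.coolwarm, s=25)
plt.show()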