227. Polynomial Features

Adding Non-Linear Complexity

When training a model, straight lines alone often cannot capture the patterns in the training data. Polynomial features are useful when you want to add more complexity to your model by converting the original features into their higher-order terms.
Here is one way to implement polynomial features using scikit-learn.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Generate noisy samples from a cubic function
np.random.seed(0)  # fix the seed so the example is reproducible
x = np.linspace(-5, 5, 20)
y = 2 * x - 2 * (x ** 2) + 0.5 * (x ** 3) + np.random.normal(loc=0, scale=10, size=len(x))

# Change Data Structure: [x1,x2,x3,..] to [[x1], [x2], [x3],...]
x = x[:, np.newaxis]
y = y[:, np.newaxis]
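# scikit-learn estimators expect a 2D feature array of shape (n_samples, n_features)
print(x.shape)  # (20, 1)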


"""Fit model without polynomial features"""
# Estimate b0 and b1 in the equation y = b0 + b1*x
model = LinearRegression()
model.fit(x, y)
y_pred = model.predict(x)

"""Add dimensions and fit model with polynomial features"""
# Expand each x into polynomial terms up to degree 3
polynomial_features = PolynomialFeatures(degree=3)
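# Note: include_bias=True (the default) prepends a constant column of 1s;
# LinearRegression fits its own intercept, so include_bias=False would also work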

# x_poly maps [[x1], [x2], ...] to [[1, x1, x1^2, x1^3], [1, x2, x2^2, x2^3], ...]
# Compared to the original data, each sample now carries one column per polynomial term
x_poly = polynomial_features.fit_transform(x)
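print(x_poly.shape)  # (20, 4): the bias column plus x, x^2, x^3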

# Estimate b0~b3 in the equation y = b0 + b1*x + b2*x^2 + b3*x^3
model = LinearRegression()
model.fit(x_poly, y)
poly_y_pred = model.predict(x_poly)

# Visualize both fits against the raw data
plt.scatter(x, y)
plt.plot(x, y_pred, color='r', label="Without Poly Features")
plt.plot(x, poly_y_pred, color='g', label="With Poly Features")
plt.legend(loc="lower right")
plt.show()
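
As a quick sanity check (a minimal sketch, assuming you also want a numeric comparison rather than just the plot), you can score both fits with sklearn.metrics.r2_score:

from sklearn.metrics import r2_score

print("R^2 without poly features:", r2_score(y, y_pred))
print("R^2 with poly features:", r2_score(y, poly_y_pred))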