Linear Regression - Nori12/Machine-Learning-Tutorial GitHub Wiki
Linear Regression
Linear regression is a statistical method for finding the relationship between independent and dependent variables. This can be done in different ways, but the most classic linear method for regression is Ordinary Least Squares (OLS), which is assumed here.
Linear regression finds the parameters w and b that minimize the mean squared error between predictions and the true regression targets, y, on the training set. The mean squared error is the average of the squared differences between the predictions and the true values. Linear regression has no hyperparameters to tune, which is a benefit, but it also has no way to control model complexity.
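As a quick illustration of the metric itself, the mean squared error can be computed directly with NumPy (the arrays below are made-up values, just to show the formula):

```python
import numpy as np

# Hypothetical true targets and model predictions
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 1.5, 3.5])

# Mean squared error: average of the squared differences
mse = np.mean((y_pred - y_true) ** 2)
print(mse)
```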
```python
from sklearn.linear_model import LinearRegression

lr = LinearRegression().fit(X_train, y_train)
```
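The snippet above assumes `X_train` and `y_train` already exist. A self-contained sketch looks like the following; the synthetic one-feature data here is an assumption standing in for the wave dataset used later in the text:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic one-feature regression data (stand-in for the wave dataset)
rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = 0.4 * X[:, 0] + rng.normal(scale=0.3, size=60)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lr = LinearRegression().fit(X_train, y_train)
print("training set R^2:", lr.score(X_train, y_train))
print("test set R^2:", lr.score(X_test, y_test))
```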
To get the slope (coef_) and intercept (intercept_):

```python
print("lr.coef_: {}".format(lr.coef_))
print("lr.intercept_: {}".format(lr.intercept_))
# lr.coef_: [ 0.394]
# lr.intercept_: -0.031804343026759746
```
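These two attributes fully determine the model: a prediction is just the dot product of the input features with coef_, plus intercept_. A small check (on made-up data) confirms this matches predict:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up single-feature data, just to demonstrate the relationship
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.1, 0.5, 0.9, 1.3])
lr = LinearRegression().fit(X, y)

# Manual prediction: X @ coef_ + intercept_
manual = X @ lr.coef_ + lr.intercept_
print(np.allclose(manual, lr.predict(X)))  # True
```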
The intercept_ attribute is always a single float number, while the coef_ attribute is a NumPy array with one entry per input feature. As we only have a single input feature in the wave dataset, lr.coef_ only has a single entry.
OBS: coef_ and intercept_: scikit-learn always stores anything derived from the training data in attributes that end with a trailing underscore. This separates them from parameters that are set by the user.
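A practical consequence of this convention: fitted attributes do not exist until fit has been called, and using an unfitted estimator raises an error. A minimal sketch:

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression

lr = LinearRegression()  # not fitted yet, so coef_ / intercept_ don't exist

try:
    lr.predict(np.array([[1.0]]))
except NotFittedError:
    print("predict() before fit() raises NotFittedError")
```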