ICP 5 - PavankumarManchala/Python-and-Deep-Learning-Programming-ICPs GitHub Wiki

Submitted By:

Pavankumar Manchala – 22

**Tasks:**

  1. Regression techniques

a) Linear Regression b) Multiple Regression

Technologies Used: PyCharm

  1. Delete all the outlier data for the GarageArea field (for the same data set in the use case: House Prices)

Results:

The code above displays a scatter plot of the GarageArea field against the SalePrice column. The resulting plot is as follows.

From the plot, it is clear that the data are anomalous above 1000 and below 200. These points have to be removed before the data is plotted again. The code is as shown below.

The conditions train.GarageArea < 200 and train.GarageArea > 1000 select the outlier rows, which are then filtered out of the data. The result is as follows.
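The filtering step described above can be sketched as follows. This is a minimal illustration, not the original submission's code: the helper name `drop_garage_area_outliers` and the small sample frame are assumptions made for the example, and the bounds 200 and 1000 are the thresholds read off the plot.

```python
import pandas as pd


def drop_garage_area_outliers(train: pd.DataFrame,
                              low: float = 200,
                              high: float = 1000) -> pd.DataFrame:
    """Keep only the rows whose GarageArea lies within [low, high]."""
    in_range = (train["GarageArea"] >= low) & (train["GarageArea"] <= high)
    return train[in_range].reset_index(drop=True)


# Small made-up frame standing in for the House Prices training data.
sample = pd.DataFrame({
    "GarageArea": [0, 150, 200, 480, 1000, 1390],
    "SalePrice": [90000, 120000, 135000, 180000, 310000, 160000],
})

filtered = drop_garage_area_outliers(sample)
print(filtered)
```

Keeping the rows inside the range is equivalent to dropping the rows where GarageArea is below 200 or above 1000; the boundary values themselves are retained.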

  1. Create a Multiple Regression model for the “wine quality” dataset. In this dataset, “quality” is the target label. Evaluate the model using RMSE and the R2 score.

Choose the dataset available in the Link.

Part-1

You need to find the top 3 features most correlated with the target label (quality).

The .corr() function computes the pairwise correlations between the columns. The correlations with the target variable are read from this result, and the top 3 features are printed. The code then uses the train_test_split function to split the data before fitting the model.
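A minimal sketch of these steps is given below. It is not the original submission's code: the helper `top_correlated_features`, the synthetic columns `f1` to `f5`, and the split settings (`test_size=0.2`, `random_state=42`) are assumptions chosen for illustration, and ranking by absolute correlation is one possible reading of "most correlated". With the real data, the frame would be loaded from the wine quality CSV instead.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def top_correlated_features(df: pd.DataFrame, target: str, k: int = 3) -> list:
    """Return the k columns with the largest absolute correlation to target."""
    corr_with_target = df.corr()[target].drop(target)
    ranked = corr_with_target.abs().sort_values(ascending=False)
    return ranked.head(k).index.tolist()


# Synthetic stand-in data (hypothetical columns, not the wine quality set).
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(200, 5)),
                    columns=["f1", "f2", "f3", "f4", "f5"])
noise = rng.normal(scale=0.1, size=200)
data["quality"] = 2.0 * data["f1"] + 1.5 * data["f2"] - 1.0 * data["f3"] + noise

# Pick the three features most correlated with the target.
top3 = top_correlated_features(data, "quality")
print(top3)

# Split the data, then fit a multiple regression on the selected features.
X = data[top3]
y = data["quality"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_train, y_train)
```

Fitting on the training split only keeps the held-out rows available for an unbiased evaluation of the model.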

The RMSE should be as small as possible, and the R2 score should be close to 1. The scores for the given dataset are as follows.
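To make the two metrics concrete, here is a small sketch that computes them directly from their definitions. The function names `rmse` and `r2` and the example values are made up for illustration and are not results from the wine quality dataset.

```python
import numpy as np


def rmse(y_true, y_pred) -> float:
    """Root mean squared error: square root of the mean squared residual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r2(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up target values and predictions, one prediction off by 1.
y_true = [5, 6, 7, 5]
y_pred = [5, 6, 6, 5]

print(rmse(y_true, y_pred))  # 0.5
print(r2(y_true, y_pred))
```

A perfect prediction gives an RMSE of 0 and an R2 of 1, which is why a lower RMSE and an R2 closer to 1 indicate a better fit. scikit-learn provides `mean_squared_error` and `r2_score` in `sklearn.metrics` for the same purpose.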

Video Explanation: https://drive.google.com/open?id=1Av1g5rBxYeFJfI6-BRZG9fX8l3b_euBE