ICP 4 - awais546/Python-and-Deep-Learning GitHub Wiki

Introduction to Python and Deep Learning

Introduction

In this ICP we covered the basics of machine learning and different ways of building machine learning models with the scikit-learn library. We analyzed different datasets and drew conclusions from the results.

Tasks

The tasks performed in this lab are as follows.

  • Find the correlation between two columns
  • Develop a Naive Bayes classifier
  • Develop an SVM classifier

Correlation between columns

The correlation between two columns can be computed with the pandas library. Its built-in corr method returns a coefficient between -1 and 1. As a rough rule of thumb, an absolute value between 0.3 and 0.5 is usually read as a moderate correlation, and a larger absolute value indicates a stronger one. Because corr works on numeric data, a text column such as 'Sex' has to be encoded as numbers first. The following screenshot shows the correlation between the two columns 'Survived' and 'Sex'.
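
The correlation step can be sketched as follows. This is a minimal example, not the code from the screenshot: the eight rows are made up for illustration, and the male=0 / female=1 encoding and the Sex_code column name are assumptions.

```python
import pandas as pd

# Made-up sample standing in for the real dataset (illustrative values only).
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0, 0, 1],
    "Sex": ["male", "female", "female", "male", "female", "male", "female", "male"],
})

# corr needs numeric input, so encode the text column first.
# Assumed encoding: male -> 0, female -> 1.
df["Sex_code"] = df["Sex"].map({"male": 0, "female": 1})

# Pearson correlation coefficient between the two columns, in [-1, 1].
r = df["Survived"].corr(df["Sex_code"])
print(round(r, 3))
```

The sign of the result depends on the chosen encoding, so only the absolute value says how strong the relationship is.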

Classifiers

The classifiers can be built with the scikit-learn library, which provides several models. The models used in this lab are the Naive Bayes and SVM classifiers. The libraries to import are shown in the following screenshot.

The data is loaded into a data frame and then split into training and testing data. The testing data is 20% of the whole dataset. The following screenshot shows how both classifiers are created and trained.
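
The split and training steps might look like the sketch below. It uses the Iris dataset bundled with scikit-learn as a stand-in, since the lab's own data file is not reproduced here. The choice of GaussianNB, the linear SVC kernel, and random_state=0 are assumptions and not necessarily what the screenshot shows.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Stand-in dataset (assumption): 150 samples, 4 numeric features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 20% of the rows for testing, as described in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train both classifiers on the same training split.
nb_model = GaussianNB().fit(X_train, y_train)
svm_model = SVC(kernel="linear").fit(X_train, y_train)

# Predictions on the held-out rows, ready for evaluation.
nb_pred = nb_model.predict(X_test)
svm_pred = svm_model.predict(X_test)
print(len(X_train), len(X_test))
```

Fixing random_state makes the split repeatable, so both models are compared on exactly the same test rows.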

The accuracy of both models is shown in the following screenshot.

The classification report is shown in the following screenshot.
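
For reference, accuracy and the classification report can be produced with scikit-learn's metrics module. The sketch below uses hypothetical label lists in place of the lab's real test labels and predictions, so the numbers it prints are not the lab's results.

```python
from sklearn.metrics import accuracy_score, classification_report

# Hypothetical labels; in the lab these would be y_test and model.predict(X_test).
y_true = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]
y_pred = [0, 0, 1, 0, 1, 2, 2, 1, 2, 0]

# Fraction of predictions that match the true labels.
acc = accuracy_score(y_true, y_pred)

# Per-class precision, recall, F1-score and support as a formatted string.
report = classification_report(y_true, y_pred)

print(acc)
print(report)
```

Accuracy gives a single overall number, while the report shows how each class fares, which matters when classes are imbalanced.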

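We can see that the SVM performed better. Naive Bayes assumes that the features are conditionally independent given the class, and it works best when that assumption roughly holds. The features in this dataset are likely correlated with each other, which may explain why Naive Bayes did not perform as well here.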