ICP_4 - acvc279/Python_Deeplearning GitHub Wiki

VIDEO LINK: https://drive.google.com/file/d/1YZig1C4SFh1r0-VS6D9dRY24fmRver0l/view?usp=drivesdk

Q1: Find the correlation between the ‘survived’ (target) column and the ‘sex’ column for the Titanic use case from class. Do you think we should keep this feature?

First, we imported pandas to read the file. We examined the relationship between 'Survived' and 'Sex' by computing the mean survival rate for each sex: data[['Sex', 'Survived']].groupby(['Sex'], as_index=False).mean().sort_values(by='Survived', ascending=False). Yes, we should keep this feature, because it is highly correlated with survival.
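The groupby step above can be sketched as follows. This is a minimal illustration, not the original notebook: the six rows are made-up stand-in data (the real exercise reads the Titanic CSV), and the numeric encoding of 'Sex' (male = 0, female = 1) is an assumption added here so that a correlation coefficient can be computed alongside the per-group means.

```python
import pandas as pd

# Hypothetical stand-in rows; the real exercise loads the Titanic training file.
data = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Survived": [0, 1, 1, 0, 1, 1],
})

# Mean survival rate per sex, highest rate first.
survival_by_sex = (
    data[["Sex", "Survived"]]
    .groupby(["Sex"], as_index=False)
    .mean()
    .sort_values(by="Survived", ascending=False)
)
print(survival_by_sex)

# Assumed encoding (male=0, female=1) to obtain a Pearson correlation coefficient.
sex_code = data["Sex"].map({"male": 0, "female": 1})
correlation = sex_code.corr(data["Survived"])
print(correlation)
```

The per-group means show how survival differs between the sexes, while the coefficient gives a single number for the strength of the relationship. A large gap between the group means is what motivates keeping the feature.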

Q2: Implement the Naïve Bayes method using the scikit-learn library

First, we imported pandas and several modules from sklearn, read the file, and stored it in the data variable. Using iloc, we stored the feature attributes in x and the type attribute in y. We then split the data into 80% training and 20% testing. Using Naive Bayes, we predicted ypred on the test set and computed the accuracy against ytest. The final accuracy is 0.37, which is 37%.
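The workflow described above might look roughly like this. It is a sketch under stated assumptions: the dataset is a synthetic stand-in generated with make_classification (the original reads a CSV that is not named here), the column names are hypothetical, and GaussianNB is assumed as the Naive Bayes variant since the write-up does not say which one was used. The accuracy on this stand-in data will not match the 0.37 reported above.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the CSV file; the last column plays the role of "type".
features, labels = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_classes=3, random_state=0
)
data = pd.DataFrame(features, columns=[f"f{i}" for i in range(6)])
data["Type"] = labels

# Feature attributes in x, target (type) attribute in y, selected with iloc.
x = data.iloc[:, :-1]
y = data.iloc[:, -1]

# 80% training, 20% testing.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# Fit Naive Bayes (Gaussian variant assumed) and score the test predictions.
model = GaussianNB()
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
nb_accuracy = accuracy_score(y_test, y_pred)
print(nb_accuracy)
```

Fixing random_state makes the split reproducible, which helps when comparing this score against another classifier trained on the same split.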

Q3: Implement the linear SVM method using the scikit-learn library. Use the same dataset.

First, we imported pandas and several modules from sklearn, read the file, and stored it in the data variable. Using iloc, we stored the feature attributes in x and the type attribute in y. We then split the data into 80% training and 20% testing. Using SVM, we predicted ypred on the test set and computed the accuracy against ytest. The final accuracy is 0.51, which is 51%.
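A comparable sketch for the linear SVM step is shown below. As before, the data is a synthetic stand-in with hypothetical column names, not the dataset used in class, so the resulting accuracy will differ from the 0.51 reported above. SVC with kernel="linear" is one way to get a linear SVM in scikit-learn; LinearSVC is another, and the write-up does not say which was used.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the CSV file; the last column plays the role of "type".
features, labels = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_classes=3, random_state=0
)
data = pd.DataFrame(features, columns=[f"f{i}" for i in range(6)])
data["Type"] = labels

# Feature attributes in x, target (type) attribute in y, selected with iloc.
x = data.iloc[:, :-1]
y = data.iloc[:, -1]

# Same 80/20 split settings as for any other model being compared.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)

# Linear-kernel SVM (one possible choice of linear SVM implementation).
model = SVC(kernel="linear")
model.fit(x_train, y_train)
y_pred = model.predict(x_test)
svm_accuracy = accuracy_score(y_test, y_pred)
print(svm_accuracy)
```

Using the same test_size and random_state as the other model keeps the comparison fair, since both classifiers are then evaluated on identical test rows.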

Which algorithm gave you better accuracy? Can you justify why?

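SVM achieved the higher accuracy (0.51 versus 0.37 for Naive Bayes). Its separating hyperplane works in three or more dimensions, so it can use all of the features together, and it generally performs well on classification problems. Naive Bayes, by contrast, assumes the features are independent of each other given the class, which may not hold for this dataset.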