ICP_4 - acvc279/Python_Deeplearning GitHub Wiki
VIDEO LINK: https://drive.google.com/file/d/1YZig1C4SFh1r0-VS6D9dRY24fmRver0l/view?usp=drivesdk

Q1: Find the correlation between 'Survived' (target column) and 'Sex' column for the Titanic use case in class. Do you think we should keep this feature?
First, pandas was imported to read the file. We found the correlation between 'Survived' and 'Sex' using `data[['Sex', 'Survived']].groupby(['Sex'], as_index=False).mean().sort_values(by='Survived', ascending=False)`. Yes, we should keep this feature, because it shows a strong correlation with survival.
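A minimal sketch of this step, assuming the Titanic training data is in a file named `train.csv` (the filename and the 0/1 encoding of `Sex` used for the numeric correlation are assumptions, not part of the original write-up):

```python
import pandas as pd

# Assumption: Titanic training data with 'Sex' and 'Survived' columns
data = pd.read_csv('train.csv')

# Mean survival rate per sex, sorted so the higher-survival group comes first
correlation = (data[['Sex', 'Survived']]
               .groupby(['Sex'], as_index=False)
               .mean()
               .sort_values(by='Survived', ascending=False))
print(correlation)

# Numeric view: encode 'Sex' as 0/1 and compute its Pearson correlation with 'Survived'
data['Sex_code'] = data['Sex'].map({'female': 0, 'male': 1})
print(data['Sex_code'].corr(data['Survived']))
```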
Q2: Implement Naïve Bayes method using scikit-learn library
First, pandas and the required scikit-learn modules were imported. The file was read and stored in the `data` variable. The feature attributes were stored in X and the 'Type' attribute in y using `iloc`, and the data was split with `train_test_split` into 80% training and 20% testing. Then a Naïve Bayes classifier was fit, and the accuracy was computed from `y_test` and `y_pred`. The final accuracy is 0.37, i.e. 37%.
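A minimal sketch of this workflow, assuming the dataset is a CSV file named `glass.csv` with the 'Type' target as its last column (the filename and column layout are assumptions; the original only says the file was read into `data` and the Type attribute used as y):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Assumption: CSV with feature columns followed by a 'Type' target column
data = pd.read_csv('glass.csv')
X = data.iloc[:, :-1]   # feature attributes
y = data.iloc[:, -1]    # 'Type' target attribute

# 80% training / 20% testing split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Fit Gaussian Naive Bayes and evaluate accuracy on the held-out test set
model = GaussianNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Naive Bayes accuracy:", accuracy_score(y_test, y_pred))
```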
Q3: Implement the linear SVM method using the scikit-learn library. Use the same dataset.
First, pandas and the required scikit-learn modules were imported. The file was read and stored in the `data` variable. The feature attributes were stored in X and the 'Type' attribute in y using `iloc`, and the data was split with `train_test_split` into 80% training and 20% testing. Then a linear SVM classifier was fit, and the accuracy was computed from `y_test` and `y_pred`. The final accuracy is 0.51, i.e. 51%.
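A minimal sketch of the SVM step under the same assumptions as above (same hypothetical `glass.csv` file and 'Type' target column):

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

# Assumption: same CSV as in Q2, with 'Type' as the target column
data = pd.read_csv('glass.csv')
X = data.iloc[:, :-1]
y = data.iloc[:, -1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Linear-kernel SVM classifier
model = SVC(kernel='linear')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Linear SVM accuracy:", accuracy_score(y_test, y_pred))
```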
Which algorithm gave better accuracy? Can you justify why?
SVM got the higher accuracy because of its ability to separate classes with a hyperplane, which works even in higher-dimensional feature spaces, and because it generally performs well on this kind of classification problem.