ICP4 - SaranAkkiraju/Python_and_Deep_Learning_Programming_ICP GitHub Wiki

1. Find the correlation between ‘survived’ (target column) and the ‘sex’ column for the Titanic use case in class. Do you think we should keep this feature?

  • Correlation value: q1

  • Should we keep the feature: Yes. The female value is highly correlated with Survived, so the column carries useful signal for predicting the target and should be kept.
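The correlation check described above can be sketched as follows. This is a minimal illustration, not the code used in class: the rows in the DataFrame are hypothetical stand-ins for the Titanic data, and the column names `Survived` and `Sex` and the 0/1 encoding are assumptions.

```python
import pandas as pd

# Hypothetical toy rows standing in for the Titanic training data;
# the real values come from the dataset used in class.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 1, 0, 0, 1, 0],
    "Sex": ["male", "female", "female", "male",
            "male", "male", "female", "female"],
})

# Encode the categorical column numerically (assumed encoding: male=0, female=1).
df["Sex_encoded"] = df["Sex"].map({"male": 0, "female": 1})

# Pearson correlation between the encoded feature and the target column.
corr = df["Survived"].corr(df["Sex_encoded"])
print(f"Correlation between Survived and Sex: {corr:.3f}")
```

A clearly non-zero value, positive under this encoding, suggests the feature is worth keeping.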

2. Implement the Naïve Bayes method using the scikit-learn library. Use the dataset available at https://umkc.box.com/s/ea6wn1cidukan67t02j60nmp1ljln3kd

  • Use train_test_split to create the training and testing parts
  • Evaluate the model on the testing part of the dataset

pgm1

Out1
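For readers without access to the screenshots, the Naïve Bayes workflow can be sketched roughly as below. The data here is synthetic (generated with `make_classification`) because the assignment dataset is not reproduced on this page. The choice of `GaussianNB`, the 80/20 split, and the random seeds are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the assignment dataset (assumption: numeric
# features and a multi-class target).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5,
    n_classes=3, random_state=0,
)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit Gaussian Naive Bayes on the training part only.
nb_model = GaussianNB()
nb_model.fit(X_train, y_train)

# Evaluate on the held-out testing part.
nb_accuracy = accuracy_score(y_test, nb_model.predict(X_test))
print(f"Naive Bayes test accuracy: {nb_accuracy:.3f}")
```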

3. Implement the linear SVM method using the scikit-learn library

  • Use the same dataset as above
  • Use train_test_split to create the training and testing parts
  • Evaluate the model on the testing part

pgm2

Out2
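A linear SVM version of the same workflow might look like the sketch below. Again the data is a synthetic stand-in, and the use of `SVC(kernel="linear")` with feature scaling is an assumption. `LinearSVC` would be an equally reasonable choice.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the assignment dataset (assumption).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# SVMs are sensitive to feature scale, so standardize before fitting.
svm_linear = make_pipeline(StandardScaler(), SVC(kernel="linear"))
svm_linear.fit(X_train, y_train)

# Evaluate on the held-out testing part.
svm_accuracy = accuracy_score(y_test, svm_linear.predict(X_test))
print(f"Linear SVM test accuracy: {svm_accuracy:.3f}")
```

Using the same `random_state` for the split as in the Naïve Bayes run keeps the two accuracies comparable.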

  • Which algorithm gave better accuracy? Can you justify why?

SVM gave better accuracy: its accuracy on the test data is higher than that of Naive Bayes. A likely reason is that Naive Bayes assumes the input features are independent of one another given the class, and that assumption probably does not hold for this dataset, whereas SVM makes no such assumption.

4. Use the SVM with an RBF kernel on the same dataset. How did the result change?

pgm3

Out3

Change in results: the results improved with the RBF kernel.
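One way to see why an RBF kernel can improve on a linear one is a small experiment on data that is not linearly separable. The sketch below uses scikit-learn's `make_circles` as a hypothetical stand-in (it is not the assignment dataset), and the default `C` and `gamma` values are assumptions rather than tuned settings.

```python
from sklearn.datasets import make_circles
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line separates the classes.
X, y = make_circles(n_samples=400, noise=0.05, factor=0.4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Linear kernel: restricted to a straight decision boundary.
linear_svm = SVC(kernel="linear").fit(X_train, y_train)
linear_accuracy = accuracy_score(y_test, linear_svm.predict(X_test))

# RBF kernel: can learn a curved boundary around the inner ring.
rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)
rbf_accuracy = accuracy_score(y_test, rbf_svm.predict(X_test))

print(f"Linear SVM accuracy: {linear_accuracy:.3f}")
print(f"RBF SVM accuracy:    {rbf_accuracy:.3f}")
```

On a real dataset the gap depends on how non-linear the class boundaries are, so the size of the improvement will vary.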