In Class Programming 13 - sirisha1206/Spark GitHub Wiki

Name: Naga Sirisha Sunkara

Class ID: 21

Team ID: 5

Technical partner's details:

Name: Vinay Santhosham

Class ID: 17


The objective of this assignment is to compare different machine learning algorithms: Naive Bayes, decision tree, and random forest.

Comparison of different machine learning algorithms

For the implementation, we used the Immunotherapy dataset.

We trained the models on columns such as age, area, and induration diameter, and predicted the output.
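
The results in the following sections are reported for two train/test splits, 80/20 and 70/30. As a minimal sketch of how such a split can be produced, the plain-Python helper below shuffles the rows with a fixed seed and cuts them at the requested fraction. The function name `train_test_split`, the seed, and the sample rows are assumptions made for illustration; they are not taken from the assignment's code or from the real dataset.

```python
import random


def train_test_split(rows, train_fraction, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Placeholder rows: (age, area, induration_diameter, label).
# These values are invented for illustration, not real dataset records.
rows = [(20 + i, 10 * i, 2 + i % 7, i % 2) for i in range(10)]

for fraction in (0.8, 0.7):
    train, test = train_test_split(rows, fraction)
    print(fraction, len(train), len(test))
```

Using the same seed for every algorithm means all three models are trained and tested on identical rows, so their accuracies can be compared directly.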


Naive Bayes Algorithm:

Output for 80% training data and 20% testing data:

Output for 70% training data and 30% testing data:

Decision Tree Algorithm:

Output for 80% training data and 20% testing data:

Output for 70% training data and 30% testing data:

Random Forest Algorithm:

Output for 80% training data and 20% testing data:

Output for 70% training data and 30% testing data:


Among the machine learning algorithms compared on the Immunotherapy dataset, the decision tree algorithm gave the best accuracy, 0.69. Depending on the implementation, decision trees can handle missing values much like any other value of a variable. They run fast even with many observations and variables, and tree-based methods can be used for supervised and, in some variants, unsupervised learning.
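
Accuracy, the metric compared above, is the fraction of test rows whose predicted label matches the true label. The sketch below shows that calculation in plain Python. The function name `accuracy` and the sample label lists are hypothetical and chosen only to illustrate the formula; they do not reproduce the figures reported here.

```python
def accuracy(predictions, labels):
    """Return the fraction of predictions that equal the true labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical labels for five test rows (not real results).
predicted = [1, 0, 1, 1, 0]
actual = [1, 0, 0, 1, 0]

# 4 of the 5 predictions match, so this prints 0.8.
print(accuracy(predicted, actual))
```

With a small test set, a single misclassified row shifts the accuracy noticeably, so differences between the 80/20 and 70/30 runs should be read with that in mind.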