Decision Tree in R: Heart Failure Prediction
- R code
- Python code
- Data Analysis Results
Decision Tree
We have implemented the following decision tree algorithms to compare their accuracy:
- CART
- C 4.5
- C 5.0
We used two-thirds of the Heart Failure Prediction dataset as the training set and the remaining one-third as the test set, as sketched below.
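A minimal sketch of the 2/3 : 1/3 split in R. The file name, the `DEATH_EVENT` target column, and the seed are assumptions for illustration, not the repository's exact code.

```r
# Assumed file and target column for the Heart Failure Prediction dataset
heart <- read.csv("heart_failure_clinical_records_dataset.csv")
heart$DEATH_EVENT <- as.factor(heart$DEATH_EVENT)

set.seed(42)                                  # reproducible split (assumed seed)
n          <- nrow(heart)
train_idx  <- sample(n, size = floor(2 * n / 3))
train_data <- heart[train_idx, ]              # ~2/3 used for training
test_data  <- heart[-train_idx, ]             # ~1/3 used for testing
```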
CART - Classification & Regression Trees
- depth = 4
- leaf nodes = 4
Confusion Matrix: Prediction on Test Dataset
Accuracy = (55+23)/(55+23+8+11) = 80.41 %
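A hedged sketch of how the CART model, confusion matrix, and accuracy could be produced with the `rpart` package. The `maxdepth` control setting is an assumption made to match the reported depth; `train_data` and `test_data` come from the split sketched above.

```r
library(rpart)

# Grow a classification tree; maxdepth = 4 is an assumed setting
cart_fit <- rpart(DEATH_EVENT ~ ., data = train_data, method = "class",
                  control = rpart.control(maxdepth = 4))

# Predict classes on the test set
cart_pred <- predict(cart_fit, newdata = test_data, type = "class")

# Confusion matrix and accuracy on the test set
cart_cm  <- table(Predicted = cart_pred, Actual = test_data$DEATH_EVENT)
cart_acc <- sum(diag(cart_cm)) / sum(cart_cm)
print(cart_cm)
print(cart_acc)
```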
Decision Tree (C 4.5)
- depth = 7
- leaf nodes = 7
Confusion Matrix: Prediction on Test Dataset
Accuracy = (56+20)/(56+20+7+14) = 78.35 %
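A hedged sketch of a C 4.5 model using `J48` from the `RWeka` package (which requires a Java installation); the choice of `RWeka` is an assumption about how C 4.5 was run here.

```r
library(RWeka)

# J48 is RWeka's implementation of the C 4.5 algorithm
c45_fit  <- J48(DEATH_EVENT ~ ., data = train_data)
c45_pred <- predict(c45_fit, newdata = test_data)

# Confusion matrix and accuracy on the test set
c45_cm  <- table(Predicted = c45_pred, Actual = test_data$DEATH_EVENT)
c45_acc <- sum(diag(c45_cm)) / sum(c45_cm)
print(c45_cm)
print(c45_acc)
```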
Decision Tree (C 5.0)
- depth = 6
- leaf nodes = 6
Confusion Matrix: Prediction on Test Dataset
Accuracy = (55+23)/(55+23+8+11) = 80.41 %
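A hedged sketch of the C 5.0 model using the `C50` package with default settings; the package choice and settings are assumptions, not the repository's exact code.

```r
library(C50)

# C5.0 classification tree with default options (assumed)
c50_fit  <- C5.0(DEATH_EVENT ~ ., data = train_data)
c50_pred <- predict(c50_fit, newdata = test_data)

# Confusion matrix and accuracy on the test set
c50_cm  <- table(Predicted = c50_pred, Actual = test_data$DEATH_EVENT)
c50_acc <- sum(diag(c50_cm)) / sum(c50_cm)
print(c50_cm)
print(c50_acc)
```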
There is a slight decrease in accuracy for C 4.5 compared with CART (and C 5.0), which might be due to the higher number of leaf nodes in the C 4.5 model (7) than in the CART model (4). Since no pruning was performed on any of the models, the larger number of leaf nodes lets C 4.5 overfit the training set, so it generalizes slightly worse to the test set; a possible pruning step is sketched below.
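A hedged sketch of cost-complexity pruning for the CART model above, which is one way such overfitting could be reduced; `cart_fit`, `test_data`, and the use of the cross-validated error in `cptable` are assumptions carried over from the earlier sketches.

```r
library(rpart)

# Pick the complexity parameter with the lowest cross-validated error
best_cp     <- cart_fit$cptable[which.min(cart_fit$cptable[, "xerror"]), "CP"]
cart_pruned <- prune(cart_fit, cp = best_cp)

# Re-evaluate the pruned tree on the test set
pruned_pred <- predict(cart_pruned, newdata = test_data, type = "class")
pruned_cm   <- table(Predicted = pruned_pred, Actual = test_data$DEATH_EVENT)
sum(diag(pruned_cm)) / sum(pruned_cm)   # accuracy after pruning
```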