Decision Tree in R: Heart Failure Prediction

  • R code
  • Python code
  • Data Analysis Results


Decision Tree

We have implemented the following decision tree algorithms to compare their accuracy:

  • CART
  • C 4.5
  • C 5.0

We used two-thirds of the Heart Failure Prediction dataset as the training set and the remaining one-third as the testing set, as sketched below.
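A minimal sketch of such a split in R. The CSV file name, the seed, and the use of `DEATH_EVENT` as the target column are assumptions; the project's actual split code may differ.

```r
# Minimal sketch of the 2/3 training / 1/3 testing split (file name and seed are assumptions)
set.seed(42)
heart <- read.csv("heart_failure_clinical_records_dataset.csv")
heart$DEATH_EVENT <- as.factor(heart$DEATH_EVENT)   # target as a factor for classification

train_idx <- sample(seq_len(nrow(heart)), size = floor(2/3 * nrow(heart)))
train_set <- heart[train_idx, ]    # ~2/3 of the rows: training set
test_set  <- heart[-train_idx, ]   # remaining ~1/3: testing set
```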


CART - Classification & Regression Trees

  • depth = 4
  • leaf nodes = 4

[Figure: CART decision tree (Decision_Tree_R_Cart)]

Confusion Matrix: Prediction on Test Dataset

Accuracy = (55+23)/(55+23+8+11) = 80.41 %
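A hedged sketch of how the CART model and its test-set confusion matrix could be produced with the `rpart` package. The formula, the `DEATH_EVENT` target, and the depth setting follow the figures above, but the exact call used in the project's R code may differ.

```r
library(rpart)

# CART classification tree; maxdepth = 4 mirrors the depth reported above
cart_fit <- rpart(DEATH_EVENT ~ ., data = train_set,
                  method = "class",
                  control = rpart.control(maxdepth = 4))

# Confusion matrix and accuracy on the testing set
cart_pred <- predict(cart_fit, newdata = test_set, type = "class")
cart_cm   <- table(Predicted = cart_pred, Actual = test_set$DEATH_EVENT)
cart_cm
sum(diag(cart_cm)) / sum(cart_cm)   # accuracy = correct predictions / all predictions
```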


Decision Tree (C 4.5)

  • depth = 7
  • leaf nodes = 7

[Figure: C 4.5 decision tree (Decision_Tree_C4 5)]

Confusion Matrix: Prediction on Test Dataset

Accuracy = (56+20)/(56+20+7+14) = 78.35 %
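C 4.5 is commonly run in R through Weka's J48 implementation (the `RWeka` package, which needs a Java runtime). The sketch below is written under that assumption and reuses the split from above; it is not necessarily the project's exact code.

```r
library(RWeka)   # J48 is Weka's C4.5 implementation; requires a Java runtime

c45_fit  <- J48(DEATH_EVENT ~ ., data = train_set)
c45_pred <- predict(c45_fit, newdata = test_set)

c45_cm <- table(Predicted = c45_pred, Actual = test_set$DEATH_EVENT)
sum(diag(c45_cm)) / sum(c45_cm)   # accuracy on the testing set
```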


Decision Tree (C 5.0)

  • depth = 6
  • leaf nodes = 6

[Figure: C 5.0 decision tree (Decision_Tree_R_C5 0)]

Confusion Matrix: Prediction on Test Dataset

Accuracy = (55+23)/(55+23+8+11) = 80.41 %
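C 5.0 is available in R through the `C50` package; a sketch mirroring the other two models, again under the assumption of the split and target column introduced above.

```r
library(C50)

c50_fit  <- C5.0(DEATH_EVENT ~ ., data = train_set)
c50_pred <- predict(c50_fit, newdata = test_set)   # class predictions by default

c50_cm <- table(Predicted = c50_pred, Actual = test_set$DEATH_EVENT)
sum(diag(c50_cm)) / sum(c50_cm)   # accuracy on the testing set
```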


C 4.5 is slightly less accurate than CART (and C 5.0), which might be due to the higher number of leaf nodes in the C 4.5 model (7) compared with CART (4) and C 5.0 (6). Since no pruning was performed on any of the models, the larger number of leaf nodes causes the C 4.5 model to overfit the training set and generalize poorly to the testing set.
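As an illustration only (not part of the project code), cost-complexity pruning could be applied to the CART model fitted with `rpart`; the `cp` value below is a hypothetical example, not one derived from this dataset.

```r
# Illustrative only: cost-complexity pruning of the CART model
printcp(cart_fit)                          # cross-validated error for each complexity value
cart_pruned <- prune(cart_fit, cp = 0.05)  # the cp value here is a hypothetical example
```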