Tutorial 02 - clairedavid/ml_in_hep GitHub Wiki

Questions

Validation samples are never used.

Proba: is the reasoning legitimate? It gets the probability only from the last leaf ...
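To check the leaf question: a minimal sketch, assuming scikit-learn's DecisionTreeClassifier and a synthetic dataset from make_classification (neither is fixed by these notes). It recomputes, by hand, the class fractions of the training samples in each leaf and compares them to predict_proba.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: not the tutorial dataset).
X, y = make_classification(n_samples=500, n_features=5, random_state=0)

tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Index of the leaf each training sample ends up in.
leaf_ids = tree.apply(X)
proba = tree.predict_proba(X)

# Recompute the probabilities as class fractions inside each leaf.
manual = np.zeros_like(proba)
for leaf in np.unique(leaf_ids):
    in_leaf = leaf_ids == leaf
    manual[in_leaf] = [(y[in_leaf] == c).mean() for c in tree.classes_]

proba_matches_leaf_fractions = np.allclose(proba, manual)
print(proba_matches_leaf_fractions)
```

So for a single unweighted tree, the probability is indeed just the class fraction of the training samples in the final leaf; a shallow tree therefore gives only a handful of distinct probability values.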

Scikit: ?

Should be BDTs

Use the GridSearch example: https://colab.research.google.com/github/ageron/handson-ml2/blob/master/06_decision_trees.ipynb
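A possible shape for the grid search, as a sketch only: the toy data (make_moons) and the grid values below are assumptions for illustration, not copied from the linked notebook.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Toy dataset (assumption: placeholder until the real dataset is plugged in).
X, y = make_moons(n_samples=1000, noise=0.4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Hypothetical grid; the values are illustrative only.
param_grid = {
    "max_leaf_nodes": [2, 4, 8, 16, 32],
    "min_samples_split": [2, 3, 4],
}

# 3-fold cross-validation over every combination in the grid.
search = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=3)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # refit best model, scored on held-out data
print(best_params, test_accuracy)
```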

Using Harrison's dataset

  1. Intro
  2. The manual decision stump
  3. Decision Tree
  4. Random Forest
  5. Make a ROC curve: compare performance
  6. AdaBoost
  7. Add to ROC curve
  8. Bonus: XGBoost (?)
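
Rough sketch of how steps 2 to 7 could fit together, assuming scikit-learn estimators and synthetic data in place of the real dataset. Hyperparameters are placeholders, and the stump here is a depth-1 tree rather than the manual one from step 2.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# One model per outline step; settings are illustrative only.
models = {
    "stump": DecisionTreeClassifier(max_depth=1, random_state=0),
    "tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "adaboost": AdaBoostClassifier(n_estimators=100, random_state=0),
}

roc = {}  # name -> (false positive rate, true positive rate)
auc = {}  # name -> area under the ROC curve
for name, model in models.items():
    model.fit(X_train, y_train)
    scores = model.predict_proba(X_test)[:, 1]  # probability of class 1
    fpr, tpr, _ = roc_curve(y_test, scores)
    roc[name] = (fpr, tpr)
    auc[name] = roc_auc_score(y_test, scores)

for name, value in auc.items():
    print(f"{name}: AUC = {value:.3f}")
```

Each (fpr, tpr) pair in roc can be overlaid on one matplotlib axis to get the comparison plot for steps 5 and 7.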

When to add the validation - overlay - training curves?
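
One option for the overlay, sketched under assumptions (AdaBoost from scikit-learn, synthetic data, a simple train/validation split): staged_score gives the accuracy after each boosting round, so the training and validation curves come from a single fit. This would also put the validation sample to use.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# Accuracy after each boosting round, on both samples.
train_curve = np.array(list(model.staged_score(X_train, y_train)))
val_curve = np.array(list(model.staged_score(X_val, y_val)))

# Number of rounds at which the validation accuracy peaks.
best_n = int(np.argmax(val_curve)) + 1
print(best_n, val_curve[best_n - 1])
```

Plotting train_curve and val_curve against the round number shows where the two start to diverge.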

Seaborn? That would be good for them. Or assignment? Feature importance!
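
For feature importance, a minimal sketch assuming a scikit-learn random forest and synthetic data; the feature names are hypothetical. The ranking is printed rather than plotted, but the same list could feed a seaborn bar plot.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (assumption); names are made up for illustration.
X, y = make_classification(
    n_samples=1000, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = forest.feature_importances_
order = np.argsort(importances)[::-1]  # most important first
ranking = [(feature_names[i], float(importances[i])) for i in order]

for name, value in ranking:
    print(f"{name}: {value:.3f}")
```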

Gradient Boosting in Python/Scikit-Learn

Excellent source: https://towardsdatascience.com/gradient-boosting-classification-explained-through-python-60cc980eeb3d
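
A minimal gradient boosting sketch with scikit-learn's GradientBoostingClassifier, on synthetic data. The hyperparameter values are placeholders, not taken from the linked article.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Shallow trees added one at a time, each scaled by the learning rate.
gbc = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.1, max_depth=3, random_state=0
)
gbc.fit(X_train, y_train)

gbc_scores = gbc.predict_proba(X_test)[:, 1]  # probability of class 1
gbc_auc = roc_auc_score(y_test, gbc_scores)
print(f"Gradient boosting AUC = {gbc_auc:.3f}")
```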