ML2 ‐ Lec (5)

🧠 Ensemble Learning

Definition: Combining multiple models (base learners) to improve accuracy and generalization.

Key Benefits:

  • ✅ More robust predictions
  • ✅ Reduces overfitting
  • ✅ Improves stability

🏆 Types of Ensemble Learning

1️⃣ Homogeneous Ensembles

🔹 Use the same algorithm but train on different data.
Examples:

  • Bagging (Bootstrap Aggregating)
  • Boosting (Sequential Learning)

2️⃣ Heterogeneous Ensembles

🔹 Use different algorithms on the same data.
Examples:

  • Voting (Majority decision)
  • Stacking (Meta-learning)
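
For a concrete picture of a heterogeneous ensemble, here is a minimal hard-voting sketch in scikit-learn; the dataset, base models, and hyperparameters are illustrative choices, not from the lecture.

```python
# Minimal hard-voting sketch: different algorithms trained on the same data.
# Dataset, base models, and hyperparameters are illustrative.
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("tree", DecisionTreeClassifier(random_state=42)),
    ],
    voting="hard",  # majority vote over the predicted class labels
)
voter.fit(X_train, y_train)
print("Voting accuracy:", voter.score(X_test, y_test))
```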

🏗 Bagging (Bootstrap Aggregating)

Goal: Reduce variance, stabilize predictions.
How it works:

  1. Bootstrap: Create random subsets of data (with replacement).
  2. Train multiple models independently on these subsets.
  3. Aggregate predictions (majority voting for classification, averaging for regression).

🔹 Common Algorithm: Random Forest 🌳
🔹 Reduces overfitting, works well for high-variance models (e.g., decision trees).

📌 Formula:
For classification, the final prediction is the majority vote.
For regression, the final prediction is the average:

$$\hat{y} = \frac{1}{M} \sum_{m=1}^{M} G_m(x)$$

where $G_m(x)$ is the prediction of the $m$-th model and $M$ is the number of models.
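
The recipe and formula above can be written out by hand in a few lines. This is a minimal sketch, assuming scikit-learn decision trees as base learners and a synthetic regression dataset; M and all settings are illustrative.

```python
# Hand-rolled bagging for regression: bootstrap, train M trees independently, average.
# Dataset, M, and hyperparameters are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
rng = np.random.default_rng(0)
M = 25  # number of base models

models = []
for _ in range(M):
    # 1. Bootstrap: sample indices with replacement.
    idx = rng.integers(0, len(X), size=len(X))
    # 2. Train a base learner G_m independently on its bootstrap sample.
    models.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

# 3. Aggregate: y_hat = (1/M) * sum_m G_m(x)  (averaging for regression).
y_hat = np.mean([m.predict(X) for m in models], axis=0)
print("Ensemble prediction for first sample:", y_hat[0])
```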


🚀 Boosting (Sequential Learning)

Goal: Reduce bias & variance by focusing on misclassified samples.
How it works:

  1. Train a weak learner.
  2. Identify misclassified samples and increase their weight.
  3. Train the next model to correct these mistakes.
  4. Final prediction: weighted combination of all models.

🔹 Common Algorithms:

  • AdaBoost (Adjusts sample weights)
  • Gradient Boosting (Optimizes loss function)
  • XGBoost, LightGBM, CatBoost (Advanced versions)

📌 Formula:
Final prediction:

$$F(x) = \sum_{m=1}^{M} \alpha_m G_m(x)$$

where $\alpha_m$ is the weight of the $m$-th model and $G_m(x)$ its prediction.

🚨 Boosting can overfit! Needs careful tuning.
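
A minimal AdaBoost sketch with scikit-learn, using depth-1 decision stumps as weak learners; the dataset and hyperparameters are illustrative assumptions (the parameter is named `estimator` in scikit-learn ≥ 1.2, `base_estimator` in older releases).

```python
# AdaBoost sketch: weak learners trained sequentially, each round reweighting
# the samples the previous models got wrong. Settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

booster = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # decision stump weak learner
    n_estimators=100,
    learning_rate=0.5,  # shrinks each model's contribution; helps against overfitting
    random_state=0,
)
booster.fit(X_train, y_train)
print("AdaBoost accuracy:", booster.score(X_test, y_test))
```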


🔄 Comparison of Bagging vs Boosting vs Stacking

| Criteria | Bagging 🏗 | Boosting | Stacking 🏆 |
|---|---|---|---|
| Approach | Parallel training | Sequential training | Meta-learning |
| Goal | Reduce variance | Reduce bias & variance | Improve accuracy |
| Base Models | Homogeneous | Homogeneous | Heterogeneous |
| Final Prediction | Voting/Averaging | Weighted sum | Meta-model |

🎯 Key Takeaways

Bagging → Best for reducing overfitting
Boosting → Best for improving accuracy
Stacking → Best for combining different models

  • Use Bagging (Random Forest) when overfitting is a problem.
  • Use Boosting (AdaBoost, XGBoost) for high accuracy but watch for overfitting.
  • Use Stacking when different models capture different aspects of data.


1. Ensemble Learning 🤝

  • What?: Combine multiple models (weak learners) to create a stronger model.
  • Goal: Improve accuracy, reduce overfitting, and increase robustness.
  • Types:
    • Homogeneous: Same algorithm, different data (e.g., Bagging, Boosting).
    • Heterogeneous: Different algorithms, same data (e.g., Stacking).

2. Bagging (Bootstrap Aggregating) 🎒

  • What?: Train multiple models on different subsets of data (sampled with replacement).
  • Aggregation: Average (regression) or majority vote (classification).
  • Example: Random Forest 🌳 (ensemble of decision trees).
  • Advantages:
    • Reduces variance and overfitting.
    • Improves accuracy and stability.
    • Easy to parallelize.
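
A minimal Random Forest sketch (the canonical bagging ensemble of decision trees); `n_jobs=-1` illustrates the "easy to parallelize" point, and the dataset plus other settings are illustrative assumptions.

```python
# Random Forest = bagging over decision trees, plus a random feature subset per split.
# n_jobs=-1 trains the independent trees in parallel; settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

forest = RandomForestClassifier(
    n_estimators=200,     # M independent trees
    max_features="sqrt",  # random feature subset at each split adds diversity
    n_jobs=-1,            # trees are independent, so training parallelizes easily
    random_state=0,
)
print("CV accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```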

3. Boosting 🚀

  • What?: Sequentially train models, focusing on misclassified samples.
  • How?: Increase weights of misclassified samples in each iteration.
  • Example: AdaBoost (Adaptive Boosting).
  • Advantages:
    • Reduces bias and variance.
    • Improves accuracy by correcting errors.
  • ⚠️ Caveat: sensitive to noisy data and outliers, so it can overfit without careful tuning.
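
To make the weight-update step above concrete, here is a tiny one-round sketch using the standard AdaBoost formulas; the toy labels and predictions are made up for illustration.

```python
# One AdaBoost round on a toy example: misclassified samples get heavier weights.
# The formulas are the standard AdaBoost ones; the toy labels/predictions are made up.
import numpy as np

y_true = np.array([ 1,  1, -1, -1,  1])   # true labels
y_pred = np.array([ 1, -1, -1, -1, -1])   # weak learner's predictions (2 mistakes)
w = np.full(len(y_true), 1 / len(y_true)) # start with uniform sample weights

miss = y_pred != y_true
eps = np.sum(w[miss])                      # weighted error rate
alpha = 0.5 * np.log((1 - eps) / eps)      # model weight alpha_m

# Misclassified samples are up-weighted, correct ones down-weighted, then normalized.
w = w * np.exp(alpha * np.where(miss, 1.0, -1.0))
w = w / w.sum()
print("error:", eps, "alpha:", round(alpha, 3))
print("new weights:", np.round(w, 3))
```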

4. Stacking 🥞

  • What?: Combine predictions of multiple models using a meta-model.
  • Steps:
    1. Train base models (level-0).
    2. Use their predictions as input to train a meta-model (level-1).
  • Advantages:
    • Leverages model diversity.
    • Often improves performance over individual models.
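
A minimal stacking sketch with scikit-learn: heterogeneous level-0 base models feeding a logistic-regression level-1 meta-model; the models, dataset, and settings are illustrative choices.

```python
# Stacking sketch: level-0 base models feed their predictions to a level-1 meta-model.
# Model choices and settings are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=800, n_features=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[  # level-0: diverse base learners
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # level-1 meta-model
    cv=5,  # meta-model is trained on out-of-fold base-model predictions
)
stack.fit(X_train, y_train)
print("Stacking accuracy:", stack.score(X_test, y_test))
```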

5. Key Concepts 🔑

  • Base Learners: Individual models in the ensemble.
  • Diversity: Ensures models make different errors.
  • Aggregation: Combine predictions (e.g., averaging, voting).
  • Random Forest: Bagging + Decision Trees.
  • AdaBoost: Boosting + Decision Stumps.

Mind Map 🧠

```
Ensemble Learning
├── Bagging (Bootstrap Aggregating)
│   ├── Random Forest (Decision Trees)
│   ├── Reduces Variance
│   └── Parallel Training
├── Boosting
│   ├── AdaBoost (Sequential Training)
│   ├── Focuses on Misclassified Samples
│   └── Reduces Bias
└── Stacking
    ├── Combines Predictions with Meta-Model
    └── Leverages Model Diversity
```

Key Symbols 🔑

  • M: Number of models.
  • D: Dataset.
  • G_m(x): Model m's prediction.
  • w_i: Weight of sample i (increased when misclassified in Boosting).
  • α_m: Weight of model m in the final boosted prediction.

You’re ready! 🎉 Just remember Bagging = parallel training, Boosting = sequential training, and Stacking = meta-model! 🚀