ML2 ‐ Lec (3)
1. Decision Trees 🌳
- What?: A tree-like model for classification/regression.
- Goal: Build the smallest possible tree that fits the data.
- Nodes: Test attributes.
- Branches: Attribute values.
- Leaves: Class labels or predictions.
2. ID3 Algorithm 🛠️
- Steps:
- Start at the root.
- Choose the best attribute (max info gain).
- Split data based on attribute values.
- Repeat for each branch.
 
- Stopping Criteria:
- All examples in a branch are the same class.
- No more attributes to split.
- No data left in a branch → assign the majority class (see the sketch below).
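A minimal Python sketch of these steps (the data layout, function names, and toy examples are my own assumptions, not from the lecture): examples are dicts with a `label` key, and the best attribute is chosen by the information gain defined in the next section.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum_i p_i log2 p_i over the class proportions."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    n = len(rows)
    remainder = 0.0
    for v in {r[attr] for r in rows}:
        subset = [r["label"] for r in rows if r[attr] == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy([r["label"] for r in rows]) - remainder

def id3(rows, attrs):
    labels = [r["label"] for r in rows]
    if len(set(labels)) == 1:          # stopping: branch is pure
        return labels[0]
    if not attrs:                      # stopping: no attributes left
        return Counter(labels).most_common(1)[0][0]  # majority class
    best = max(attrs, key=lambda a: information_gain(rows, a))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest)
                   for v in {r[best] for r in rows}}}

# Toy run (made-up weather examples):
data = [
    {"outlook": "sunny",    "windy": "no",  "label": "play"},
    {"outlook": "sunny",    "windy": "yes", "label": "stay"},
    {"outlook": "rainy",    "windy": "no",  "label": "stay"},
    {"outlook": "overcast", "windy": "no",  "label": "play"},
]
print(id3(data, ["outlook", "windy"]))
```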
 
3. Entropy & Information Gain 📊
- Entropy: Measures impurity/uncertainty.
Entropy(S) = -p_+ \log_2 p_+ - p_- \log_2 p_-
- p_+: Proportion of positive examples.
- p_-: Proportion of negative examples.
 
- Information Gain:
Gain(S, A) = Entropy(S) - \sum_{v} \frac{|S_v|}{|S|} Entropy(S_v)
- A: Attribute.
- S_v: Subset of data for attribute value v.
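A quick worked example (illustrative counts, not from the lecture): a set with 9 positive and 5 negative examples has

Entropy(S) = -\frac{9}{14} \log_2 \frac{9}{14} - \frac{5}{14} \log_2 \frac{5}{14} \approx 0.940

A pure subset scores 0 and a 50/50 subset scores 1, so an attribute's gain is how much of this 0.940 its split removes.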
 
4. Overfitting & Pruning ✂️
- Overfitting: Tree too complex → fits noise.
- Pruning:
- Pre-pruning: Stop early (e.g., min samples per leaf).
- Post-pruning: Grow full tree, then remove nodes.
 
- Goal: Simplify the tree to improve generalization (scikit-learn sketch below).
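A short scikit-learn sketch of both pruning styles (assuming scikit-learn is installed; its built-in post-pruning is CART-style cost-complexity pruning rather than anything ID3-specific):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Pre-pruning: stop growth early with structural limits.
pre = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5).fit(X, y)

# Post-pruning: grow the full tree, then cut it back with
# cost-complexity pruning; larger ccp_alpha => smaller tree.
path = DecisionTreeClassifier().cost_complexity_pruning_path(X, y)
alpha = path.ccp_alphas[len(path.ccp_alphas) // 2]  # mid-path alpha, chosen arbitrarily
post = DecisionTreeClassifier(ccp_alpha=alpha).fit(X, y)

print(pre.get_depth(), post.get_depth())
```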
5. Extensions 🔄
- Continuous Attributes: Discretize using thresholds (threshold-search sketch after this list).
- Missing Values: Use most frequent value or probability estimates.
- Cost-Sensitive Attributes: Modify gain to account for feature costs.
- Regression Trees: Predict numeric values (average in leaves).
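For continuous attributes, a sketch of the usual threshold search: try the midpoint between each pair of adjacent sorted values and keep the split with maximum gain. Function names are mine; the temperatures are illustrative values chosen so the best midpoint is 54, matching the Temperature > 54 example used later in these notes.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Try midpoints between adjacent sorted values; keep the max-gain split."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_t = -1.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between equal values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(pairs)
        if gain > best_gain:
            best_gain, best_t = gain, t
    return best_t, best_gain

# Illustrative temperatures with play ("y") / no-play ("n") labels:
print(best_threshold([40, 48, 60, 72, 80, 90], ["n", "n", "y", "y", "y", "n"]))
# -> (54.0, 0.459...): the best cut is Temperature > 54
```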
6. Key Concepts 🔑
- Gini Index: Alternative to entropy for impurity.
Gini(S) = 1 - \sum p_i^2
- Gain Ratio: Adjusts info gain to penalize many-valued attributes (small sketch after this list).
GainRatio(S, A) = \frac{Gain(S, A)}{SplitInformation(S, A)}
SplitInformation(S, A) = -\sum_{v} \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|}
- Multivariate Trees: Use linear combinations of attributes.
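A small sketch of both measures (function names are mine). Note how a unique-per-row attribute like a day ID gets a large split information, which is exactly what gain ratio penalizes:

```python
import math
from collections import Counter

def gini(labels):
    """Gini(S) = 1 - sum_i p_i^2 over the class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_information(values):
    """SplitInformation(S, A) = -sum_v |S_v|/|S| log2(|S_v|/|S|),
    computed from the attribute values of the examples."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

labels = ["y", "y", "n", "n", "n"]        # illustrative class labels
day_ids = ["d1", "d2", "d3", "d4", "d5"]  # many-valued attribute, unique per row
print(gini(labels))                # 0.48
print(split_information(day_ids))  # log2(5) ~ 2.32 -> heavy gain-ratio penalty
```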
Mind Map 🧠
Decision Trees
├── ID3 Algorithm
│   ├── Entropy (impurity measure)
│   ├── Information Gain (choose best attribute)
│   └── Stopping Criteria (pure branch, no attributes)
├── Overfitting
│   ├── Pre-pruning (stop early)
│   └── Post-pruning (grow full, then cut)
└── Extensions
    ├── Continuous Attributes (discretize)
    ├── Missing Values (use most frequent)
    ├── Regression Trees (predict numeric values)
    └── Multivariate Trees (linear combinations)
Key Symbols 🔑
- S: Dataset.
- A: Attribute.
- p_+: Proportion of positive examples.
- p_-: Proportion of negative examples.
- Gain(S, A): Information gain for attribute A.
- Gini(S): Gini impurity for dataset S.
You’re ready! 🎉 Just remember Decision Trees = split data based on attributes, Entropy = measure of impurity, and Pruning = avoid overfitting! 🚀
1. Decision Trees Extensions 🌳
- Gain Ratio: Adjusts info gain to penalize attributes with many values.
GainRatio(S, A) = \frac{Gain(S, A)}{SplitInformation(S, A)}
- Continuous Attributes: Discretize using thresholds (e.g., Temperature > 54).
- Missing Values: Use most frequent value or probability estimates.
- Cost-Sensitive Attributes: Modify gain to account for feature costs.
Gain2(S, A) = \frac{Gain(S, A)^2}{Cost(S, A)}
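A tiny sketch of the cost-sensitive criterion (assuming Cost is the cost of measuring the attribute; names and numbers are mine):

```python
def cost_sensitive_gain(gain, cost):
    """Gain2(S, A) = Gain(S, A)^2 / Cost: among equally informative
    attributes, prefer the cheaper one to measure."""
    return gain ** 2 / cost

# Equal gains, different measurement costs (illustrative numbers):
print(cost_sensitive_gain(0.5, 1.0))   # 0.25  -> cheap attribute wins
print(cost_sensitive_gain(0.5, 10.0))  # 0.025
```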
2. Multiclass Classification 🎯
- Entropy for Multiple Classes:
Entropy(S) = -\sum_{i=1}^c p_i \log_2 p_i
- c: Number of classes.
- p_i: Proportion of class i.
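The entropy code sketched earlier already handles any number of classes; a minimal three-class check (illustrative labels):

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum_{i=1}^c p_i log2 p_i, for any number of classes c."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

print(entropy(["a", "a", "b", "c"]))  # 1.5 bits for proportions 1/2, 1/4, 1/4
```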
 
3. Regression Trees 📈
- Goal: Predict continuous values.
- Splitting Criterion: Minimize variance (standard deviation reduction).
SDR(S, A) = SD(S) - \sum_{v} \frac{|S_v|}{|S|} SD(S_v)
- Prediction: Mean value in leaf nodes.
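A minimal SDR sketch using the population standard deviation (data and names are illustrative):

```python
import statistics

def sdr(targets, branches):
    """SDR(S, A) = SD(S) - sum_v |S_v|/|S| * SD(S_v)."""
    n = len(targets)
    weighted = sum(len(b) / n * statistics.pstdev(b) for b in branches)
    return statistics.pstdev(targets) - weighted

targets = [10, 12, 30, 32]          # illustrative numeric targets
left, right = [10, 12], [30, 32]    # candidate split
print(sdr(targets, [left, right]))  # ~9.05: big reduction -> good split
print(statistics.mean(left))        # 11: the leaf predicts the mean
```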
4. CART (Classification and Regression Trees) 🛠️
- Gini Index: Measures impurity.
Gini(S) = 1 - \sum_{i=1}^c p_i^2
- Weighted Gini:
Gini_{split} = \frac{N_1}{N} Gini(S_1) + \frac{N_2}{N} Gini(S_2)
- Regression: Use Mean Squared Error (MSE) for splitting.
MSE = \frac{1}{N} \sum (y_i - \hat{y})^2
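A compact sketch of both CART criteria (illustrative data, my own function names):

```python
from collections import Counter

def gini(labels):
    """Gini(S) = 1 - sum_i p_i^2."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_split(left, right):
    """Gini_split = N1/N * Gini(S1) + N2/N * Gini(S2) for a binary split."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def mse(ys):
    """MSE = 1/N * sum (y_i - y_hat)^2, with y_hat the mean (the leaf prediction)."""
    y_hat = sum(ys) / len(ys)
    return sum((y - y_hat) ** 2 for y in ys) / len(ys)

print(gini_split(["y", "y", "n"], ["n", "n"]))  # ~0.267
print(mse([10, 12, 30, 32]))                    # 101.0
```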
5. Key Concepts 🔑
- Gain Ratio: Penalizes attributes with many values.
- Continuous Attributes: Discretize using thresholds.
- Missing Values: Use most frequent value or probability estimates.
- Regression Trees: Predict numeric values (mean in leaves).
- CART: Uses Gini Index for classification, MSE for regression.
Mind Map 🧠
Decision Trees Extensions
├── Gain Ratio (penalize many-valued attributes)
├── Continuous Attributes (discretize using thresholds)
├── Missing Values (use most frequent or probability)
├── Cost-Sensitive Attributes (modify gain with cost)
├── Multiclass Classification (entropy for multiple classes)
└── Regression Trees
    ├── Splitting Criterion (minimize variance)
    ├── Prediction (mean in leaves)
    └── CART (Gini Index for classification, MSE for regression)
Key Symbols 🔑
- S: Dataset.
- A: Attribute.
- Gain(S, A): Information gain for attribute A.
- Gini(S): Gini impurity for dataset S.
- MSE: Mean Squared Error (for regression).
You’re ready! 🎉 Just remember Decision Trees = split data based on attributes, Gain Ratio = penalize many-valued attributes, and Regression Trees = predict numeric values! 🚀