Random Forest Algorithm (ID3): A Complete Step-by-Step Guide
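Random Forest is an ensemble machine-learning algorithm that builds many decision trees on random subsets of the training data and combines their predictions. For classification it takes a majority vote; for regression it averages the outputs. Training each tree on a different bootstrap sample reduces the variance, and therefore the overfitting, of a single deep tree while keeping accuracy high. Standard Random Forest also restricts each split to a random subset of the features; to keep the arithmetic easy to follow, this guide lets every ID3 tree consider all four features at each split.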
The Core Idea: Why Multiple Trees?
A single decision tree is prone to overfitting — it memorises the training data and performs poorly on unseen data. Random Forest fixes this by:
Bootstrapping — sampling the training data with replacement to create different subsets for each tree.
Aggregating — combining predictions from all trees so individual errors cancel out.
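The bootstrapping step can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used for the hand-built samples later in this guide; the helper name `bootstrap_sample` and the fixed seed are assumptions made for the example.

```python
import random

def bootstrap_sample(records, rng):
    """Draw len(records) items with replacement: some repeat, some are left out."""
    return rng.choices(records, k=len(records))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
days = [f"D{i}" for i in range(1, 15)]

for tree_index in range(3):
    sample = bootstrap_sample(days, rng)
    # Days that were never drawn for this tree, listed in dataset order.
    left_out = sorted(set(days) - set(sample), key=days.index)
    print(f"tree {tree_index + 1}: {sample}")
    print(f"  never drawn: {left_out}")
```

Because each draw is made with replacement, every tree sees some days more than once and misses others entirely, which is what makes the trees differ from one another.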
The Dataset
Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
D1 | Sunny | Hot | High | Weak | No |
D2 | Sunny | Hot | High | Strong | No |
D3 | Overcast | Mild | High | Weak | Yes |
D4 | Rain | Cool | High | Weak | Yes |
D5 | Rain | Cool | Normal | Weak | Yes |
D6 | Rain | Cool | Normal | Strong | No |
D7 | Overcast | Cool | Normal | Strong | Yes |
D8 | Sunny | Mild | High | Weak | No |
D9 | Sunny | Cool | Normal | Weak | Yes |
D10 | Rain | Mild | Normal | Weak | Yes |
D11 | Sunny | Mild | Normal | Strong | Yes |
D12 | Overcast | Mild | High | Strong | Yes |
D13 | Overcast | Hot | Normal | Weak | Yes |
D14 | Rain | Mild | High | Strong | No |
Unseen Data Point (to classify)
Outlook | Temp | Humidity | Wind |
|---|---|---|---|
Overcast | Mild | Normal | Weak |
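We will build 3 trees on 3 different samples of 10 records each and take a majority vote. To keep the tables easy to follow, each sample covers a contiguous run of days (D1 to D10, D3 to D12, D5 to D14) rather than a true random draw with replacement.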
Entropy and Information Gain — Quick Recap
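Entropy measures the impurity of a set $S$, where $p_c$ is the fraction of records in $S$ that belong to class $c$:
$$H(S) = -\sum_{c} p_c \log_2 p_c$$
Information Gain measures how much splitting on an attribute $A$ reduces impurity, where $S_v$ is the subset of $S$ in which $A$ takes the value $v$:
$$\text{Gain}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)$$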
The attribute with the highest gain becomes the splitting node.
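Both measures translate directly into code. The sketch below is one possible implementation; the function names `entropy` and `information_gain`, the dict-per-record layout, and the `"Play"` key are assumptions made for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target="Play"):
    """Entropy of the whole set minus the size-weighted entropy of each subset."""
    labels = [row[target] for row in rows]
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

# Tiny check: a 50/50 set carries 1 bit of entropy, and an attribute that
# separates the classes perfectly recovers all of it as gain.
rows = [
    {"Wind": "Weak", "Play": "Yes"},
    {"Wind": "Weak", "Play": "Yes"},
    {"Wind": "Strong", "Play": "No"},
    {"Wind": "Strong", "Play": "No"},
]
print(entropy([row["Play"] for row in rows]))
print(information_gain(rows, "Wind"))
```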
Model 1 — Bootstrap Sample (D1–D10)
Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
D1 | Sunny | Hot | High | Weak | No |
D2 | Sunny | Hot | High | Strong | No |
D3 | Overcast | Mild | High | Weak | Yes |
D4 | Rain | Cool | High | Weak | Yes |
D5 | Rain | Cool | Normal | Weak | No ← differs from the full dataset, where D5 is Yes |
D6 | Rain | Cool | Normal | Strong | Yes ← differs from the full dataset, where D6 is No |
D7 | Overcast | Cool | Normal | Strong | Yes |
D8 | Sunny | Mild | High | Weak | No |
D9 | Sunny | Cool | Normal | Weak | Yes |
D10 | Rain | Mild | Normal | Weak | Yes |
6 Yes, 4 No (10 records)
Step 1 — Root Entropy
$$H(S) = -\tfrac{6}{10}\log_2\tfrac{6}{10} - \tfrac{4}{10}\log_2\tfrac{4}{10} \approx 0.971$$
Step 2 — Information Gain for Each Attribute
Outlook
Value | Records | Entropy |
|---|---|---|
Sunny | D1, D2, D8, D9 → [No, No, No, Yes] | 0.811 |
Overcast | D3, D7 → [Yes, Yes] | 0 (pure) |
Rain | D4, D5, D6, D10 → [Yes, No, Yes, Yes] | 0.811 |
Temp
Value | Records | Entropy |
|---|---|---|
Hot | D1, D2 → [No, No] | 0 (pure) |
Mild | D3, D8, D10 → [Yes, No, Yes] | 0.918 |
Cool | D4, D5, D6, D7, D9 → [Yes, No, Yes, Yes, Yes] | 0.722 |
Humidity
Value | Records | Entropy |
|---|---|---|
High | D1, D2, D3, D4, D8 → [No, No, Yes, Yes, No] | 0.971 |
Normal | D5, D6, D7, D9, D10 → [No, Yes, Yes, Yes, Yes] | 0.722 |
Wind
Value | Records | Entropy |
|---|---|---|
Weak | D1, D3, D4, D5, D8, D9, D10 → [No, Yes, Yes, No, No, Yes, Yes] | 0.985 |
Strong | D2, D6, D7 → [No, Yes, Yes] | 0.918 |
Summary of Gains:
Attribute | Gain |
|---|---|
Outlook | 0.322 |
Temp | 0.335 ← Max |
Humidity | 0.124 |
Wind | 0.006 |
✅ Temp has the highest gain → Root Node = Temp
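The four gains can be cross-checked with a short script. This is a sketch under stated assumptions: the tuple layout, `SAMPLE_1`, and the helper names are illustrative, and the labels are copied from the Model 1 table exactly as shown. No intermediate rounding is applied, so the third decimal can differ slightly from hand-rounded values.

```python
from collections import Counter, defaultdict
from math import log2

COLUMNS = ("Outlook", "Temp", "Humidity", "Wind", "Play")
# Model 1 sample as tabulated above (D1 to D10); the last field is the label.
SAMPLE_1 = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "No"),
    ("Rain", "Cool", "Normal", "Strong", "Yes"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
]

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())

def gain(rows, column):
    """Information gain of splitting `rows` on the attribute at index `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder

gains = {name: gain(SAMPLE_1, i) for i, name in enumerate(COLUMNS[:-1])}
for name, value in gains.items():
    print(f"{name:<9}{value:.3f}")

root = max(gains, key=gains.get)  # attribute with the largest gain
print("root:", root)
```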
Expanding the Temp = Hot Branch
Records: D1 [No], D2 [No] → Pure leaf: No
Expanding the Temp = Mild Branch (S1)
Records: D3 [Overcast, Yes], D8 [Sunny, No], D10 [Rain, Yes]
Outlook on S1
Value | Records | Entropy |
|---|---|---|
Overcast | [Yes] | 0 (pure) |
Sunny | [No] | 0 (pure) |
Rain | [Yes] | 0 (pure) |
Humidity on S1
Value | Records | Entropy |
|---|---|---|
High | D3, D8 → [Yes, No] | 1.0 |
Normal | D10 → [Yes] | 0 |
Wind on S1
All records have Wind = Weak → only one value, no split possible.
✅ Outlook has the highest gain (0.918) → Split Temp=Mild on Outlook
Overcast → Yes
Sunny → No
Rain → Yes
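Written out, the comparison behind this choice looks as follows, using the entropies tabulated above for the three Temp = Mild records (2 Yes, 1 No). The symbol $S_1$ for this subset follows the section heading; Wind is omitted because it takes a single value here.

$$H(S_1) = -\tfrac{2}{3}\log_2\tfrac{2}{3} - \tfrac{1}{3}\log_2\tfrac{1}{3} \approx 0.918$$

$$\text{Gain}(S_1, \text{Outlook}) = 0.918 - \left[\tfrac{1}{3}(0) + \tfrac{1}{3}(0) + \tfrac{1}{3}(0)\right] = 0.918$$

$$\text{Gain}(S_1, \text{Humidity}) = 0.918 - \left[\tfrac{2}{3}(1.0) + \tfrac{1}{3}(0)\right] \approx 0.252$$

Outlook removes all remaining impurity, so every branch below it is a pure leaf.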
Expanding the Temp = Cool Branch (S2)
Records: D4 [Rain, High, Weak, Yes], D5 [Rain, Normal, Weak, No], D6 [Rain, Normal, Strong, Yes], D7 [Overcast, Normal, Strong, Yes], D9 [Sunny, Normal, Weak, Yes]
Outlook on S2
Value | Records | Entropy |
|---|---|---|
Overcast | D7 → [Yes] | 0 (pure) |
Sunny | D9 → [Yes] | 0 (pure) |
Rain | D4, D5, D6 → [Yes, No, Yes] | 0.918 |
Humidity on S2
Value | Records | Entropy |
|---|---|---|
High | D4 → [Yes] | 0 (pure) |
Normal | D5, D6, D7, D9 → [No, Yes, Yes, Yes] | 0.811 |
Wind on S2
Value | Records | Entropy |
|---|---|---|
Weak | D4, D5, D9 → [Yes, No, Yes] | 0.918 |
Strong | D6, D7 → [Yes, Yes] | 0 (pure) |
Summary for S2:
Attribute | Gain |
|---|---|
Outlook | 0.171 ← tied max |
Humidity | 0.073 |
Wind | 0.171 ← tied max |
Outlook and Wind tie at 0.171. ID3 does not prescribe a tie-break, so we pick Outlook, the alphabetically first of the two.
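An alphabetical tie-break is easy to make deterministic in code. The snippet below is a small sketch that uses the rounded gains from the summary table; the variable names are illustrative.

```python
# Rounded gains for the Temp = Cool subset, copied from the summary above.
gains = {"Outlook": 0.171, "Humidity": 0.073, "Wind": 0.171}

# max() returns the first maximal item it encounters, so iterating the names
# in sorted order lets the alphabetically earliest attribute win any tie.
chosen = max(sorted(gains), key=gains.get)
print(chosen)  # Outlook
```

Whatever rule is chosen, applying it consistently matters more than the rule itself, since it keeps the tree reproducible from the same sample.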
✅ Split Temp=Cool on Outlook:
Overcast → Yes (pure)
Sunny → Yes (pure)
Rain → [D4: Yes, D5: No, D6: Yes] — needs further split
Further split: Temp=Cool, Outlook=Rain
Records: D4 [High, Weak, Yes], D5 [Normal, Weak, No], D6 [Normal, Strong, Yes]
Entropy = 0.918
Wind:
Value | Records | Entropy |
|---|---|---|
Weak | D4, D5 → [Yes, No] | 1.0 |
Strong | D6 → [Yes] | 0 (pure) |
Humidity:
Value | Records | Entropy |
|---|---|---|
High | D4 → [Yes] | 0 (pure) |
Normal | D5, D6 → [No, Yes] | 1.0 |
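Both attributes give a gain of 0.918 - (2/3)(1.0) ≈ 0.252, so they tie. Pick Wind:
Strong → Yes
Weak → [D4: Yes, D5: No] has one Yes and one No, so there is no majority; split once more on Humidity, the only attribute left: High → Yes (D4), Normal → No (D5)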
Model 1 Classification
Unseen point: Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak
Root: Temp = Mild → go to Mild branch
Mild → Outlook = Overcast → Yes
Model 1 Prediction: ✅ Yes
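The same walk down the tree can be expressed in code. In this sketch the nested-dict encoding, the name `MODEL_1`, and the `classify` helper are assumptions made for illustration; the Temp = Cool, Outlook = Rain subtree is left out because the query never reaches it.

```python
# An inner node is {"attribute": {value: subtree}}; a leaf is a plain label string.
MODEL_1 = {
    "Temp": {
        "Hot": "No",
        "Mild": {"Outlook": {"Overcast": "Yes", "Sunny": "No", "Rain": "Yes"}},
        # Cool / Rain is omitted from this sketch; the query below never visits it.
        "Cool": {"Outlook": {"Overcast": "Yes", "Sunny": "Yes"}},
    }
}

def classify(tree, point):
    """Follow the branch matching each tested attribute until a leaf label is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[point[attribute]]
    return tree

unseen = {"Outlook": "Overcast", "Temp": "Mild", "Humidity": "Normal", "Wind": "Weak"}
print(classify(MODEL_1, unseen))  # Yes
```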
Model 2 — Bootstrap Sample (D3–D12)
Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
D3 | Overcast | Mild | High | Weak | Yes |
D4 | Rain | Cool | High | Weak | Yes |
D5 | Rain | Cool | Normal | Weak | Yes |
D6 | Rain | Cool | Normal | Strong | No |
D7 | Overcast | Cool | Normal | Strong | Yes |
D8 | Sunny | Mild | High | Weak | No |
D9 | Sunny | Cool | Normal | Weak | Yes |
D10 | Rain | Mild | Normal | Weak | Yes |
D11 | Sunny | Mild | Normal | Strong | Yes |
D12 | Overcast | Mild | High | Strong | Yes |
8 Yes, 2 No (10 records)
Information Gain for Each Attribute
Outlook
Value | Records | Entropy |
|---|---|---|
Sunny | D8, D9, D11 → [No, Yes, Yes] | 0.918 |
Overcast | D3, D7, D12 → [Yes, Yes, Yes] | 0 (pure) |
Rain | D4, D5, D6, D10 → [Yes, Yes, No, Yes] | 0.811 |
Temp
Value | Records | Entropy |
|---|---|---|
Mild | D3, D8, D10, D11, D12 → [Yes, No, Yes, Yes, Yes] | 0.722 |
Cool | D4, D5, D6, D7, D9 → [Yes, Yes, No, Yes, Yes] | 0.722 |
Humidity
Value | Records | Entropy |
|---|---|---|
High | D3, D4, D8, D12 → [Yes, Yes, No, Yes] | 0.811 |
Normal | D5, D6, D7, D9, D10, D11 → [Yes, No, Yes, Yes, Yes, Yes] | 0.650 |
Wind
Value | Records | Entropy |
|---|---|---|
Weak | D3, D4, D5, D8, D9, D10 → [Yes, Yes, Yes, No, Yes, Yes] | 0.650 |
Strong | D6, D7, D11, D12 → [No, Yes, Yes, Yes] | 0.811 |
Summary of Gains (Model 2), measured against a root entropy of 0.722 (8 Yes, 2 No in the table above):
Attribute | Gain |
|---|---|
Outlook | 0.122 ← Max |
Temp | 0.000 |
Humidity | 0.007 |
Wind | 0.007 |
✅ Outlook has the highest gain → Root Node = Outlook
Outlook = Overcast → Pure Yes
All 3 records are Yes → Leaf: Yes
Outlook = Sunny
Records: D8 [Mild, High, Weak, No], D9 [Cool, Normal, Weak, Yes], D11 [Mild, Normal, Strong, Yes]
Entropy = 0.918
Humidity:
Value | Records | Entropy |
|---|---|---|
High | D8 → [No] | 0 (pure) |
Normal | D9, D11 → [Yes, Yes] | 0 (pure) |
Split on Humidity:
High → No
Normal → Yes
Outlook = Rain
Records: D4 [Cool, High, Weak, Yes], D5 [Cool, Normal, Weak, Yes], D6 [Cool, Normal, Strong, No], D10 [Mild, Normal, Weak, Yes]
Entropy = 0.811 (3 Yes, 1 No)
Wind:
Value | Records | Entropy |
|---|---|---|
Weak | D4, D5, D10 → [Yes, Yes, Yes] | 0 (pure) |
Strong | D6 → [No] | 0 (pure) |
Split on Wind:
Weak → Yes
Strong → No
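The whole Model 2 tree can also be grown recursively. The following is a compact sketch of ID3 under stated assumptions: records are tuples with the label last, the names `SAMPLE_2`, `split`, `gain`, and `id3` are illustrative, and ties (none arise in this sample) would go to the first attribute in list order.

```python
from collections import Counter
from math import log2

ATTRS = ("Outlook", "Temp", "Humidity", "Wind")
# Model 2 sample (D3 to D12); the last field is the Can Play label.
SAMPLE_2 = [
    ("Overcast", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
]

def entropy(rows):
    """Shannon entropy (base 2) of the labels in `rows`."""
    counts = Counter(row[-1] for row in rows)
    return -sum(n / len(rows) * log2(n / len(rows)) for n in counts.values())

def split(rows, i):
    """Group rows by the value of the attribute at index i."""
    groups = {}
    for row in rows:
        groups.setdefault(row[i], []).append(row)
    return groups

def gain(rows, i):
    """Information gain of splitting `rows` on attribute index i."""
    parts = split(rows, i).values()
    return entropy(rows) - sum(len(g) / len(rows) * entropy(g) for g in parts)

def id3(rows, attrs):
    """Return a leaf label, or {attribute: {value: subtree}} for the best split."""
    labels = Counter(row[-1] for row in rows)
    if len(labels) == 1 or not attrs:
        return labels.most_common(1)[0][0]  # pure subset, or nothing left to test
    best = max(attrs, key=lambda i: gain(rows, i))
    rest = [i for i in attrs if i != best]
    return {ATTRS[best]: {v: id3(g, rest) for v, g in split(rows, best).items()}}

tree = id3(SAMPLE_2, list(range(len(ATTRS))))
print(tree)
```

The printed structure has Outlook at the root, a pure Yes leaf under Overcast, a Humidity split under Sunny, and a Wind split under Rain, matching the hand-built tree.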
Model 2 Classification
Unseen point: Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak
Root: Outlook = Overcast → Yes
Model 2 Prediction: ✅ Yes
Model 3 — Bootstrap Sample (D5–D14)
Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
D5 | Rain | Cool | Normal | Weak | Yes |
D6 | Rain | Cool | Normal | Strong | No |
D7 | Overcast | Cool | Normal | Strong | Yes |
D8 | Sunny | Mild | High | Weak | No |
D9 | Sunny | Cool | Normal | Weak | Yes |
D10 | Rain | Mild | Normal | Weak | Yes |
D11 | Sunny | Mild | Normal | Strong | Yes |
D12 | Overcast | Mild | High | Strong | Yes |
D13 | Overcast | Hot | Normal | Weak | Yes |
D14 | Rain | Mild | High | Strong | No |
7 Yes, 3 No (10 records)
Information Gain for Each Attribute
Outlook
Value | Records | Entropy |
|---|---|---|
Sunny | D8, D9, D11 → [No, Yes, Yes] | 0.918 |
Overcast | D7, D12, D13 → [Yes, Yes, Yes] | 0 (pure) |
Rain | D5, D6, D10, D14 → [Yes, No, Yes, No] | 1.0 |
Temp
Value | Records | Entropy |
|---|---|---|
Hot | D13 → [Yes] | 0 |
Mild | D8, D10, D11, D12, D14 → [No, Yes, Yes, Yes, No] | 0.971 |
Cool | D5, D6, D7, D9 → [Yes, No, Yes, Yes] | 0.811 |
Humidity
Value | Records | Entropy |
|---|---|---|
High | D8, D12, D14 → [No, Yes, No] | 0.918 |
Normal | D5, D6, D7, D9, D10, D11, D13 → [Yes, No, Yes, Yes, Yes, Yes, Yes] | 0.592 |
Wind
Value | Records | Entropy |
|---|---|---|
Weak | D5, D8, D9, D10, D13 → [Yes, No, Yes, Yes, Yes] | 0.722 |
Strong | D6, D7, D11, D12, D14 → [No, Yes, Yes, Yes, No] | 0.971 |
Summary of Gains (Model 3):
Attribute | Gain |
|---|---|
Outlook | 0.206 ← Max |
Temp | 0.071 |
Humidity | 0.191 |
Wind | 0.034 |
✅ Outlook has the highest gain → Root Node = Outlook
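As a worked check of the winning attribute, the Outlook gain for this sample can be written out from the counts (7 Yes, 3 No) and the subset entropies in the Outlook table above:

$$H(S) = -\tfrac{7}{10}\log_2\tfrac{7}{10} - \tfrac{3}{10}\log_2\tfrac{3}{10} \approx 0.881$$

$$\text{Gain}(S, \text{Outlook}) = 0.881 - \left[\tfrac{3}{10}(0.918) + \tfrac{3}{10}(0) + \tfrac{4}{10}(1.0)\right] = 0.881 - 0.675 \approx 0.206$$

The Rain subset is maximally impure (entropy 1.0), yet the pure Overcast subset keeps the weighted remainder low enough for Outlook to come out ahead.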
Outlook = Overcast
Records: D7, D12, D13 → all Yes → Leaf: Yes
Outlook = Sunny
Records: D8 [Mild, High, Weak, No], D9 [Cool, Normal, Weak, Yes], D11 [Mild, Normal, Strong, Yes]
Entropy = 0.918
Humidity splits perfectly (same as Model 2):
High → No
Normal → Yes
Outlook = Rain
Records: D5 [Cool, Normal, Weak, Yes], D6 [Cool, Normal, Strong, No], D10 [Mild, Normal, Weak, Yes], D14 [Mild, High, Strong, No]
Entropy = 1.0 (2 Yes, 2 No)
Wind:
Value | Records | Entropy |
|---|---|---|
Weak | D5, D10 → [Yes, Yes] | 0 (pure) |
Strong | D6, D14 → [No, No] | 0 (pure) |
Split on Wind:
Weak → Yes
Strong → No
Model 3 Classification
Unseen point: Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak
Root: Outlook = Overcast → Yes
Model 3 Prediction: ✅ Yes
Final Prediction — Majority Vote
Model | Prediction |
|---|---|
Model 1 (Temp root) | Yes |
Model 2 (Outlook root) | Yes |
Model 3 (Outlook root) | Yes |
✅ Final Answer: The unseen data point (Overcast, Mild, Normal, Weak) is classified as Yes — the person can play.
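The vote itself is a one-line count. This sketch assumes the three predictions are collected in a dict named `votes`, which is an illustrative choice rather than part of the method.

```python
from collections import Counter

# Predictions of the three trees for (Overcast, Mild, Normal, Weak), from the table above.
votes = {"Model 1": "Yes", "Model 2": "Yes", "Model 3": "Yes"}

label, count = Counter(votes.values()).most_common(1)[0]
print(f"{label} ({count} of {len(votes)} votes)")  # Yes (3 of 3 votes)
```

Using an odd number of trees avoids ties in a two-class problem like this one.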
Summary — Key Takeaways
Concept | Explanation |
|---|---|
Bootstrap sampling | Each tree trains on a random subset (with replacement) of the data |
ID3 algorithm | Uses entropy and information gain to pick the best splitting attribute |
Root node selection | The attribute with the highest information gain becomes the split |
Pure leaf | When all records in a subset belong to one class, stop splitting |
Majority vote | Final prediction = class chosen by most trees |
Ensemble benefit | Individual tree errors cancel out, giving better overall accuracy |
Random Forest works well because the trees differ: each sees a different sample of the data and grows differently, so their individual errors are only partly correlated. Voting averages much of that error away, which is why the ensemble usually generalises better than any single tree.
