Random Forest (ID3 algorithm)

Aug 7, 2025

Random Forest Algorithm (ID3): A Complete Step-by-Step Guide

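Random Forest is an ensemble machine-learning algorithm that builds many decision trees on random subsets of the training data and combines their predictions. For classification it takes a majority vote; for regression it averages the outputs. Training each tree on a different bootstrap sample reduces the variance (overfitting) of a single tree, usually without giving up much accuracy. Standard Random Forests also restrict each split to a random subset of the attributes; this walkthrough omits that step and lets every ID3 tree consider all attributes, so the calculations stay easy to follow by hand.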


The Core Idea: Why Multiple Trees?

A single decision tree is prone to overfitting: it can memorise the training data and then perform poorly on unseen data. Random Forest counters this by:

  1. Bootstrapping: sampling the training data with replacement to create a different sample for each tree.

  2. Aggregating: combining the predictions of all trees so that individual errors tend to cancel out.
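
The bootstrapping step can be sketched in a few lines of Python. This is a minimal illustration, not part of the original walkthrough: the helper name `bootstrap_sample` and the fixed seed are assumptions made for the example.

```python
import random

def bootstrap_sample(records, rng):
    """Draw len(records) items with replacement (one bootstrap sample)."""
    return [rng.choice(records) for _ in records]

rng = random.Random(0)                      # fixed seed so the sketch is repeatable
days = [f"D{i}" for i in range(1, 15)]      # D1..D14, the record ids used below

for tree_id in range(1, 4):                 # one sample per tree
    sample = bootstrap_sample(days, rng)
    # Same size as the original data; repeated draws mean fewer distinct days.
    print(tree_id, len(sample), len(set(sample)))
```

Because records are drawn with replacement, each sample usually repeats some days and omits others, which is what makes the trees differ.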


The Dataset

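| Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
| D1 | Sunny | Hot | High | Weak | No |
| D2 | Sunny | Hot | High | Strong | No |
| D3 | Overcast | Mild | High | Weak | Yes |
| D4 | Rain | Cool | High | Weak | Yes |
| D5 | Rain | Cool | Normal | Weak | Yes |
| D6 | Rain | Cool | Normal | Strong | No |
| D7 | Overcast | Cool | Normal | Strong | Yes |
| D8 | Sunny | Mild | High | Weak | No |
| D9 | Sunny | Cool | Normal | Weak | Yes |
| D10 | Rain | Mild | Normal | Weak | Yes |
| D11 | Sunny | Mild | Normal | Strong | Yes |
| D12 | Overcast | Mild | High | Strong | Yes |
| D13 | Overcast | Hot | Normal | Weak | Yes |
| D14 | Rain | Mild | High | Strong | No |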

Unseen Data Point (to classify)

| Outlook | Temp | Humidity | Wind |
|---|---|---|---|
| Overcast | Mild | Normal | Weak |

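We will build 3 trees on 3 different samples of 10 records each and take a majority vote. To keep the hand calculations traceable, each sample is a contiguous window of the dataset (D1–D10, D3–D12, D5–D14) rather than a true random draw with replacement.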


Entropy and Information Gain — Quick Recap

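Entropy measures the impurity of a set $S$, where $p_i$ is the proportion of records in $S$ that belong to class $i$:

$$H(S) = -\sum_{i} p_i \log_2 p_i$$

Information Gain measures how much splitting on an attribute $A$ reduces impurity, where $S_v$ is the subset of $S$ for which $A = v$:

$$\text{Gain}(S, A) = H(S) - \sum_{v \in \text{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)$$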

The attribute with the highest gain becomes the splitting node.
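
As a rough sketch, entropy and information gain can be written as two small Python functions. The function names, the dictionary-per-row layout, and the `"Can Play"` default for the target column are assumptions chosen for this example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    # p * log2(1/p) is the same as -p * log2(p) and avoids printing -0.0.
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())

def information_gain(rows, attribute, target="Can Play"):
    """Entropy of `rows` minus the size-weighted entropy after splitting on `attribute`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder

print(round(entropy(["Yes"] * 6 + ["No"] * 4), 3))   # 0.971

toy = [{"Wind": "Weak", "Can Play": "Yes"}, {"Wind": "Strong", "Can Play": "No"}]
print(information_gain(toy, "Wind"))                 # 1.0 (a perfect split)
```

A pure set has entropy 0 and an evenly mixed two-class set has entropy 1, so the gain of a perfect split equals the entropy of the parent set.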


Model 1 — Bootstrap Sample (D1–D10)

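| Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
| D1 | Sunny | Hot | High | Weak | No |
| D2 | Sunny | Hot | High | Strong | No |
| D3 | Overcast | Mild | High | Weak | Yes |
| D4 | Rain | Cool | High | Weak | Yes |
| D5 | Rain | Cool | Normal | Weak | No * |
| D6 | Rain | Cool | Normal | Strong | Yes * |
| D7 | Overcast | Cool | Normal | Strong | Yes |
| D8 | Sunny | Mild | High | Weak | No |
| D9 | Sunny | Cool | Normal | Weak | Yes |
| D10 | Rain | Mild | Normal | Weak | Yes |

* Marked "resampled" in the source: the labels of D5 and D6 are swapped relative to the original dataset. All Model 1 calculations below use the labels shown in this table.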

6 Yes, 4 No (10 records)

Step 1 — Root Entropy
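
The root entropy follows directly from the class counts above (6 Yes, 4 No). Writing $H(S)$ for the entropy of the sample, a worked version of the arithmetic is:

```latex
H(S) = -\tfrac{6}{10}\log_2\tfrac{6}{10} - \tfrac{4}{10}\log_2\tfrac{4}{10}
     \approx 0.442 + 0.529 = 0.971
```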

Step 2 — Information Gain for Each Attribute

Outlook

| Value | Records | Entropy |
|---|---|---|
| Sunny | D1, D2, D8, D9 → [No, No, No, Yes] | 0.811 |
| Overcast | D3, D7 → [Yes, Yes] | 0 (pure) |
| Rain | D4, D5, D6, D10 → [Yes, No, Yes, Yes] | 0.811 |

Temp

| Value | Records | Entropy |
|---|---|---|
| Hot | D1, D2 → [No, No] | 0 (pure) |
| Mild | D3, D8, D10 → [Yes, No, Yes] | 0.918 |
| Cool | D4, D5, D6, D7, D9 → [Yes, No, Yes, Yes, Yes] | 0.722 |

Humidity

| Value | Records | Entropy |
|---|---|---|
| High | D1, D2, D3, D4, D8 → [No, No, Yes, Yes, No] | 0.971 |
| Normal | D5, D6, D7, D9, D10 → [No, Yes, Yes, Yes, Yes] | 0.722 |

Wind

| Value | Records | Entropy |
|---|---|---|
| Weak | D1, D3, D4, D5, D8, D9, D10 → [No, Yes, Yes, No, No, Yes, Yes] | 0.985 |
| Strong | D2, D6, D7 → [No, Yes, Yes] | 0.918 |

Summary of Gains:

| Attribute | Gain |
|---|---|
| Outlook | 0.322 |
| Temp | 0.335 ← Max |
| Humidity | 0.124 |
| Wind | 0.006 |

✅ Temp has the highest gain → Root Node = Temp

Model 1 – Step 1: Initial ID3 decision tree setup for Random Forest Model 1 showing root node selection. Temp is selected as the root attribute with the highest information gain (0.335), branching into Hot, Mild, and Cool.
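
The Model 1 gains can be recomputed from the sample with a short script. This is a sketch under assumptions: the tuple layout and the names `SAMPLE`, `entropy`, and `gain` are made up for the example, and the rows use the Model 1 labels exactly as tabulated (including its D5 and D6 labels). Because the script does not round intermediate entropies, the third decimal can differ by 0.001 from the hand-computed values.

```python
from collections import Counter
from math import log2

COLUMNS = ("Outlook", "Temp", "Humidity", "Wind", "Can Play")
SAMPLE = [
    ("Sunny", "Hot", "High", "Weak", "No"),            # D1
    ("Sunny", "Hot", "High", "Strong", "No"),          # D2
    ("Overcast", "Mild", "High", "Weak", "Yes"),       # D3
    ("Rain", "Cool", "High", "Weak", "Yes"),           # D4
    ("Rain", "Cool", "Normal", "Weak", "No"),          # D5
    ("Rain", "Cool", "Normal", "Strong", "Yes"),       # D6
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),   # D7
    ("Sunny", "Mild", "High", "Weak", "No"),           # D8
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),        # D9
    ("Rain", "Mild", "Normal", "Weak", "Yes"),         # D10
]

def entropy(labels):
    """Shannon entropy (bits) of a list of labels."""
    total = len(labels)
    return sum(n / total * log2(total / n) for n in Counter(labels).values())

def gain(rows, col):
    """Information gain of splitting `rows` on column index `col`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[col], []).append(row[-1])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[-1] for row in rows]) - remainder

gains = {name: gain(SAMPLE, i) for i, name in enumerate(COLUMNS[:-1])}
for name, value in gains.items():
    print(f"{name:<9}{value:.3f}")

best = max(gains, key=gains.get)
print("Root:", best)   # Root: Temp
```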

Expanding the Temp = Hot Branch

Records: D1 [No], D2 [No] → Pure leaf: No

Model 1 – Step 2: Expansion of the Temp = Hot branch in Model 1. The Hot branch becomes a pure leaf node classified as “No,” while Mild and Cool remain unexpanded.

Expanding the Temp = Mild Branch (S1)

Records: D3 [Overcast, Yes], D8 [Sunny, No], D10 [Rain, Yes]

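Entropy of S1 (2 Yes, 1 No) = 0.918

Outlook on S1

| Value | Records | Entropy |
|---|---|---|
| Overcast | D3 → [Yes] | 0 (pure) |
| Sunny | D8 → [No] | 0 (pure) |
| Rain | D10 → [Yes] | 0 (pure) |

Gain = 0.918 − 0 = 0.918

Humidity on S1

| Value | Records | Entropy |
|---|---|---|
| High | D3, D8 → [Yes, No] | 1.0 |
| Normal | D10 → [Yes] | 0 (pure) |

Gain = 0.918 − (2/3)(1.0) = 0.252

Wind on S1

All three records have Wind = Weak, so a split on Wind separates nothing (gain = 0).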

✅ Outlook has the highest gain (0.918) → Split Temp=Mild on Outlook

  • Overcast → Yes

  • Sunny → No

  • Rain → Yes

Model 1 – Step 3: Expansion of the Temp = Mild branch in Model 1. Outlook is selected as the best splitting attribute (gain = 0.918), creating Overcast → Yes, Sunny → No, and Rain → Yes branches.

Expanding the Temp = Cool Branch (S2)

Records: D4 [Rain, High, Weak, Yes], D5 [Rain, Normal, Weak, No], D6 [Rain, Normal, Strong, Yes], D7 [Overcast, Normal, Strong, Yes], D9 [Sunny, Normal, Weak, Yes]

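Entropy of S2 (4 Yes, 1 No) = 0.722

Outlook on S2

| Value | Records | Entropy |
|---|---|---|
| Overcast | D7 → [Yes] | 0 (pure) |
| Sunny | D9 → [Yes] | 0 (pure) |
| Rain | D4, D5, D6 → [Yes, No, Yes] | 0.918 |

Humidity on S2

| Value | Records | Entropy |
|---|---|---|
| High | D4 → [Yes] | 0 (pure) |
| Normal | D5, D6, D7, D9 → [No, Yes, Yes, Yes] | 0.811 |

Wind on S2

| Value | Records | Entropy |
|---|---|---|
| Weak | D4, D5, D9 → [Yes, No, Yes] | 0.918 |
| Strong | D6, D7 → [Yes, Yes] | 0 (pure) |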

Summary for S2:

| Attribute | Gain |
|---|---|
| Outlook | 0.171 ← tied max |
| Humidity | 0.073 |
| Wind | 0.171 ← tied max |

Outlook and Wind tie at 0.171. Information gain cannot separate them, so a tie-break rule is needed; here we pick Outlook, which comes first both alphabetically and in column order.

✅ Split Temp=Cool on Outlook:

  • Overcast → Yes (pure)

  • Sunny → Yes (pure)

  • Rain → [D4: Yes, D5: No, D6: Yes], which needs a further split
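
One way to make such a tie-break explicit in code is to rely on iteration order. The sketch below is illustrative only: it copies the rounded gains from the summary above and uses the fact that Python's built-in `max()` returns the first maximal item it encounters.

```python
# Rounded gains for the Temp = Cool subset, copied from the summary above.
gains = {"Outlook": 0.171, "Humidity": 0.073, "Wind": 0.171}

# max() keeps the first maximum it sees, so the order of the candidates
# is the tie-break rule.
column_order = ["Outlook", "Humidity", "Wind"]
print(max(column_order, key=gains.get))    # Outlook

alphabetical = sorted(gains)               # ['Humidity', 'Outlook', 'Wind']
print(max(alphabetical, key=gains.get))    # Outlook
```

When gains are recomputed as floats rather than copied, two mathematically equal values can differ in their last bits, so rounding them before the comparison keeps the tie-break predictable.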

Further split: Temp=Cool, Outlook=Rain

Records: D4 [High, Weak, Yes], D5 [Normal, Weak, No], D6 [Normal, Strong, Yes]

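Entropy = 0.918 (2 Yes, 1 No)

Wind:

| Value | Records | Entropy |
|---|---|---|
| Weak | D4, D5 → [Yes, No] | 1.0 |
| Strong | D6 → [Yes] | 0 |

Humidity:

| Value | Records | Entropy |
|---|---|---|
| High | D4 → [Yes] | 0 |
| Normal | D5, D6 → [No, Yes] | 1.0 |

Both attributes give the same gain, 0.918 − (2/3)(1.0) = 0.252, so they tie. Pick Wind:

  • Strong → Yes

  • Weak → [D4: Yes, D5: No] is still impure, and with one record of each class there is no majority inside this subset. This walkthrough labels the leaf Yes, which matches the majority class of the parent node (2 Yes, 1 No). A full ID3 run could instead split once more on Humidity (High → Yes, Normal → No).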

Model 1 Classification

Unseen point: Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak

  • Root: Temp = Mild → go to Mild branch

  • Mild → Outlook = Overcast → Yes

Model 1 Prediction: ✅ Yes
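
The finished Model 1 tree can be written down as a nested dictionary and walked programmatically. The representation below (internal nodes as `{attribute: {value: subtree}}`, leaves as plain labels) and the helper name `classify` are assumptions made for this sketch.

```python
# Model 1 as derived above: internal nodes are {attribute: {value: subtree}},
# leaves are plain class labels.
MODEL_1 = {
    "Temp": {
        "Hot": "No",
        "Mild": {"Outlook": {"Overcast": "Yes", "Sunny": "No", "Rain": "Yes"}},
        "Cool": {
            "Outlook": {
                "Overcast": "Yes",
                "Sunny": "Yes",
                "Rain": {"Wind": {"Strong": "Yes", "Weak": "Yes"}},
            }
        },
    }
}

def classify(tree, point):
    """Follow the branch matching each tested attribute until a leaf is reached."""
    while isinstance(tree, dict):
        attribute, branches = next(iter(tree.items()))
        tree = branches[point[attribute]]
    return tree

unseen = {"Outlook": "Overcast", "Temp": "Mild", "Humidity": "Normal", "Wind": "Weak"}
print(classify(MODEL_1, unseen))   # Yes
```

A point whose attribute value never appeared on a branch would raise a `KeyError` in this sketch; a fuller implementation would need a fallback such as the majority class at that node.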

Model 1 – Step 4: Final expansion of the Temp = Cool branch in Model 1. Outlook is selected as the next split, and the Rain subset further splits on Wind into Strong → Yes and Weak → Yes (majority leaf).

Model 2 — Bootstrap Sample (D3–D12)

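| Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
| D3 | Overcast | Mild | High | Weak | Yes |
| D4 | Rain | Cool | High | Weak | Yes |
| D5 | Rain | Cool | Normal | Weak | Yes |
| D6 | Rain | Cool | Normal | Strong | No |
| D7 | Overcast | Cool | Normal | Strong | Yes |
| D8 | Sunny | Mild | High | Weak | No |
| D9 | Sunny | Cool | Normal | Weak | Yes |
| D10 | Rain | Mild | Normal | Weak | Yes |
| D11 | Sunny | Mild | Normal | Strong | Yes |
| D12 | Overcast | Mild | High | Strong | Yes |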

8 Yes, 2 No (10 records)

Root entropy: $-\tfrac{8}{10}\log_2\tfrac{8}{10} - \tfrac{2}{10}\log_2\tfrac{2}{10} \approx 0.722$

Information Gain for Each Attribute

Outlook

| Value | Records | Entropy |
|---|---|---|
| Sunny | D8, D9, D11 → [No, Yes, Yes] | 0.918 |
| Overcast | D3, D7, D12 → [Yes, Yes, Yes] | 0 (pure) |
| Rain | D4, D5, D6, D10 → [Yes, Yes, No, Yes] | 0.811 |

Temp

| Value | Records | Entropy |
|---|---|---|
| Mild | D3, D8, D10, D11, D12 → [Yes, No, Yes, Yes, Yes] | 0.722 |
| Cool | D4, D5, D6, D7, D9 → [Yes, Yes, No, Yes, Yes] | 0.722 |

Humidity

| Value | Records | Entropy |
|---|---|---|
| High | D3, D4, D8, D12 → [Yes, Yes, No, Yes] | 0.811 |
| Normal | D5, D6, D7, D9, D10, D11 → [Yes, No, Yes, Yes, Yes, Yes] | 0.650 |

Wind

| Value | Records | Entropy |
|---|---|---|
| Weak | D3, D4, D5, D8, D9, D10 → [Yes, Yes, Yes, No, Yes, Yes] | 0.650 |
| Strong | D6, D7, D11, D12 → [No, Yes, Yes, Yes] | 0.811 |

Summary of Gains (Model 2):

| Attribute | Gain |
|---|---|
| Outlook | 0.122 ← Max |
| Temp | 0.000 |
| Humidity | 0.007 |
| Wind | 0.007 |

✅ Outlook has the highest gain → Root Node = Outlook

Model 2 – Step 1: Initial ID3 tree construction for Random Forest Model 2 showing root node selection. Outlook is chosen as the root attribute with the highest information gain (0.122), branching into Overcast, Sunny, and Rain.

Outlook = Overcast → Pure Yes

All 3 records are Yes → Leaf: Yes

Model 2 – Step 2: Expansion of the Outlook = Overcast branch in Model 2. Since all records are positive, the branch becomes a pure “Yes” leaf node.

Outlook = Sunny

Records: D8 [Mild, High, Weak, No], D9 [Cool, Normal, Weak, Yes], D11 [Mild, Normal, Strong, Yes]

Entropy = 0.918

Humidity:

| Value | Records | Entropy |
|---|---|---|
| High | D8 → [No] | 0 |
| Normal | D9, D11 → [Yes, Yes] | 0 |

Split on Humidity:

  • High → No

  • Normal → Yes

Model 2 – Step 3: Expansion of the Outlook = Sunny branch in Model 2. Humidity is selected as the best splitting attribute (gain = 0.918), producing High → No and Normal → Yes leaf nodes.

Outlook = Rain

Records: D4 [Cool, High, Weak, Yes], D5 [Cool, Normal, Weak, Yes], D6 [Cool, Normal, Strong, No], D10 [Mild, Normal, Weak, Yes]

Entropy = 0.811 (3 Yes, 1 No)

Wind:

| Value | Records | Entropy |
|---|---|---|
| Weak | D4, D5, D10 → [Yes, Yes, Yes] | 0 |
| Strong | D6 → [No] | 0 |

Split on Wind:

  • Weak → Yes

  • Strong → No

Model 2 Classification

Unseen point: Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak

  • Root: Outlook = Overcast → Yes

Model 2 Prediction: ✅ Yes

Model 2 – Step 4: Final expansion of the Outlook = Rain branch in Model 2. Wind is selected as the splitting attribute (gain = 0.811), producing Weak → Yes and Strong → No classifications.
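
The whole Model 2 construction can be reproduced with a compact recursive ID3 sketch. Several details here are assumptions rather than anything the walkthrough prescribes: the names `id3`, `split`, and `gain`, the rule that ties go to the first attribute in column order, and the fallback to the majority label when no attributes remain.

```python
from collections import Counter
from math import log2

ATTRS = ["Outlook", "Temp", "Humidity", "Wind"]
RAW = """
Overcast Mild High Weak Yes
Rain Cool High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
"""  # D3..D12 in table order
ROWS = [dict(zip(ATTRS + ["label"], line.split())) for line in RAW.split("\n") if line]

def entropy(rows):
    counts = Counter(r["label"] for r in rows)
    return sum(n / len(rows) * log2(len(rows) / n) for n in counts.values())

def split(rows, attr):
    """Group rows by their value of `attr`."""
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r)
    return groups

def gain(rows, attr):
    parts = split(rows, attr).values()
    return entropy(rows) - sum(len(p) / len(rows) * entropy(p) for p in parts)

def id3(rows, attrs):
    labels = Counter(r["label"] for r in rows)
    if len(labels) == 1 or not attrs:        # pure node, or nothing left to test
        return labels.most_common(1)[0][0]
    # Round so float noise cannot break exact ties; max() keeps the first maximum.
    best = max(attrs, key=lambda a: round(gain(rows, a), 9))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3(part, rest) for v, part in split(rows, best).items()}}

tree = id3(ROWS, ATTRS)
print(tree)
```

The printed dictionary has Outlook at the root, a Yes leaf under Overcast, a Humidity split under Sunny, and a Wind split under Rain, matching the tree built by hand above.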

Model 3 — Bootstrap Sample (D5–D14)

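| Day | Outlook | Temp | Humidity | Wind | Can Play |
|---|---|---|---|---|---|
| D5 | Rain | Cool | Normal | Weak | Yes |
| D6 | Rain | Cool | Normal | Strong | No |
| D7 | Overcast | Cool | Normal | Strong | Yes |
| D8 | Sunny | Mild | High | Weak | No |
| D9 | Sunny | Cool | Normal | Weak | Yes |
| D10 | Rain | Mild | Normal | Weak | Yes |
| D11 | Sunny | Mild | Normal | Strong | Yes |
| D12 | Overcast | Mild | High | Strong | Yes |
| D13 | Overcast | Hot | Normal | Weak | Yes |
| D14 | Rain | Mild | High | Strong | No |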

7 Yes, 3 No (10 records)

Information Gain for Each Attribute

Outlook

| Value | Records | Entropy |
|---|---|---|
| Sunny | D8, D9, D11 → [No, Yes, Yes] | 0.918 |
| Overcast | D7, D12, D13 → [Yes, Yes, Yes] | 0 (pure) |
| Rain | D5, D6, D10, D14 → [Yes, No, Yes, No] | 1.0 |

Temp

| Value | Records | Entropy |
|---|---|---|
| Hot | D13 → [Yes] | 0 |
| Mild | D8, D10, D11, D12, D14 → [No, Yes, Yes, Yes, No] | 0.971 |
| Cool | D5, D6, D7, D9 → [Yes, No, Yes, Yes] | 0.811 |

Humidity

| Value | Records | Entropy |
|---|---|---|
| High | D8, D12, D14 → [No, Yes, No] | 0.918 |
| Normal | D5, D6, D7, D9, D10, D11, D13 → [Yes, No, Yes, Yes, Yes, Yes, Yes] | 0.592 |

Wind

| Value | Records | Entropy |
|---|---|---|
| Weak | D5, D8, D9, D10, D13 → [Yes, No, Yes, Yes, Yes] | 0.722 |
| Strong | D6, D7, D11, D12, D14 → [No, Yes, Yes, Yes, No] | 0.971 |

Summary of Gains (Model 3):

| Attribute | Gain |
|---|---|
| Outlook | 0.206 ← Max |
| Temp | 0.071 |
| Humidity | 0.191 |
| Wind | 0.034 |

✅ Outlook has the highest gain → Root Node = Outlook
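
For reference, the Outlook gain can be worked out step by step from the class counts (7 Yes, 3 No) and the per-value entropies listed above. Here $H(S)$ denotes the entropy of the sample and $S_v$ the subset with Outlook $= v$; the symbols are notation chosen for this worked example.

```latex
\begin{aligned}
H(S) &= -\tfrac{7}{10}\log_2\tfrac{7}{10} - \tfrac{3}{10}\log_2\tfrac{3}{10} \approx 0.881 \\
\sum_{v} \tfrac{|S_v|}{|S|}\, H(S_v)
     &= \tfrac{3}{10}(0.918) + \tfrac{3}{10}(0) + \tfrac{4}{10}(1.0) \approx 0.675 \\
\text{Gain}(S, \text{Outlook}) &\approx 0.881 - 0.675 = 0.206
\end{aligned}
```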

Model 3 – Step 1: Initial ID3 tree construction for Random Forest Model 3 showing root node selection. Outlook is selected as the root node with the highest information gain (0.206).

Outlook = Overcast

Records: D7, D12, D13 → all Yes → Leaf: Yes

Model 3 – Step 2: Expansion of the Outlook = Overcast branch in Model 3. The branch becomes a pure “Yes” leaf because all training examples are positive.

Outlook = Sunny

Records: D8 [Mild, High, Weak, No], D9 [Cool, Normal, Weak, Yes], D11 [Mild, Normal, Strong, Yes]

Entropy = 0.918

Humidity splits perfectly (same as Model 2):

  • High → No

  • Normal → Yes

Model 3 – Step 3: Expansion of the Outlook = Sunny branch in Model 3. Humidity is selected as the best splitting attribute, producing High → No and Normal → Yes outcomes.

Outlook = Rain

Records: D5 [Cool, Normal, Weak, Yes], D6 [Cool, Normal, Strong, No], D10 [Mild, Normal, Weak, Yes], D14 [Mild, High, Strong, No]

Entropy = 1.0 (2 Yes, 2 No)

Wind:

| Value | Records | Entropy |
|---|---|---|
| Weak | D5, D10 → [Yes, Yes] | 0 |
| Strong | D6, D14 → [No, No] | 0 |

Split on Wind:

  • Weak → Yes

  • Strong → No
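
The claim that Wind splits the Rain records perfectly is easy to check by grouping the labels. The tuple layout and variable names below are assumptions for this small sketch.

```python
from collections import defaultdict

# Rain-branch records from the Model 3 sample: (day, wind, label).
rain = [("D5", "Weak", "Yes"), ("D6", "Strong", "No"),
        ("D10", "Weak", "Yes"), ("D14", "Strong", "No")]

by_wind = defaultdict(list)
for day, wind, label in rain:
    by_wind[wind].append(label)

for wind, labels in by_wind.items():
    status = "pure" if len(set(labels)) == 1 else "impure"
    print(wind, labels, status)
# Weak ['Yes', 'Yes'] pure
# Strong ['No', 'No'] pure
```

Since both groups are pure, the weighted entropy after the split is 0 and the gain equals the entropy of the Rain subset, 1.0.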

Model 3 Classification

Unseen point: Outlook=Overcast, Temp=Mild, Humidity=Normal, Wind=Weak

  • Root: Outlook = Overcast → Yes

Model 3 Prediction: ✅ Yes

Model 3 – Step 4: Final expansion of the Outlook = Rain branch in Model 3. Wind is selected as the optimal split with perfect information gain (1.000), producing Weak → Yes and Strong → No leaf nodes.

Final Prediction — Majority Vote

| Model | Prediction |
|---|---|
| Model 1 (Temp root) | Yes |
| Model 2 (Outlook root) | Yes |
| Model 3 (Outlook root) | Yes |

✅ Final Answer: The unseen data point (Overcast, Mild, Normal, Weak) is classified as Yes — the person can play.
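
The vote itself is a simple tally. A minimal sketch, assuming the three predictions are collected in a dictionary named `votes`:

```python
from collections import Counter

votes = {"Model 1": "Yes", "Model 2": "Yes", "Model 3": "Yes"}

tally = Counter(votes.values())
winner, count = tally.most_common(1)[0]
print(f"{winner} ({count}/{len(votes)} votes)")   # Yes (3/3 votes)
```

With two classes, using an odd number of trees guarantees that the vote cannot end in a tie.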


Summary — Key Takeaways

| Concept | Explanation |
|---|---|
| Bootstrap sampling | Each tree trains on a random sample of the data, drawn with replacement |
| ID3 algorithm | Uses entropy and information gain to pick the best splitting attribute |
| Root node selection | The attribute with the highest information gain becomes the root; the same rule picks every later split |
| Pure leaf | When all records in a subset belong to one class, stop splitting |
| Majority vote | Final prediction = class chosen by most trees |
| Ensemble benefit | Individual tree errors tend to cancel out, giving better overall accuracy |

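Random Forest is powerful because its trees differ: each one sees a different sample of the data and can grow differently. Here, Model 1 chose Temp as its root while Models 2 and 3 chose Outlook. As long as the trees' errors are not strongly correlated, they tend to cancel out in the vote, leaving mostly the signal.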