Created
Dec 8, 2025
Last Modified
3 months ago

Optimal Binary Search Tree (OBST)

Optimal Binary Search Tree (Optimal BST)

  • Let be the distinct keys ordered from the smallest to the longest and

  • Let be the probability of searching their

  • be the smallest average number of comparisons made for a successful search in a binary search tree

  • (B.S.T) made of keys

    • Where are integer indices

    • and,

We only want but to compute it, we must compute all smaller subproblems.


Recurrence Relation (Dynamic Programming)

To build the optimal BST for keys
:

We try choosing each key ​ (where ) as the root.

If ​ is root:

  • Left subtree:

  • Right subtree:

Both must be optimally arranged.

Cost for choosing ​ as a root:

The extra sum is because when a subtree becomes deeper, every key inside gains +1 comparison.

Final Recurrence

Boundary case:


Expected Cost Formula

If a tree assigns levels:

Then expected search cost is:


Construction OF OBST

When building an Optimal Binary Search Tree, the goal is to minimize the average search cost using the given probabilities of each key.

To compute this efficiently, we use dynamic programming, specifically two tables:


1. Cost Table

  • The table C[i][j] stores the minimum cost of constructing an optimal BST using keys from index i to j.

  • Main diagonal = 0, because when i == j, there’s no key → empty tree → cost is zero.

  • The given probabilities P[i] are placed just above the diagonal, since each key alone is its own OBST.

Table Initialization:

  • C[i][i] = 0

  • C[i][i+1] = P[i]

Filling the Table:

  • We compute values diagonally, moving from lower-left to upper-right.

  • Each entry C[i][j] represents the optimal cost of keys i…j.

  • The final value C[1][N] gives the minimum average search cost of the entire OBST.


2. Root Table

To reconstruct the actual tree, we maintain another table:

  • K[i][j] stores the index of the key that gives the minimum cost for the subproblem (i, j).

  • This value represents the optimal root for that range.

Why K[i][j] is important?

Because once we know the root for each subproblem:

  • The left subtree is built from keys i to K[i][j] - 1

  • The right subtree is built from keys K[i][j] + 1 to j

Using this, we can recursively rebuild the full Optimal Binary Search Tree.

i/j

1

2

...

j

...

n

1

2

0

0

0

i

0

0

0

0

0

0

0

n

0

0

0

0

0


Find O.B.S.T for the data (Problem 1)

Given Data

Keys = [A, B, C, D]
Prob = [0.1, 0.2, 0.4, 0.3]

Step 1 — Sort Keys

Keys = [A, B, C, D]
Prob = [0.1, 0.2, 0.3, 0.4]

Step 2 — Fill (length = 1)

i/j

1

2

3

4

1

0.1

2

0

0.2

3

0

0

0.3

4

0

0

0

0.4

Step 2 — Fill Table Using Formula

Base (main diagonal)


C[1,2]


C[2,3]


C[3,4]


C[1,3]


C[2,4]


C[1,4] (final)


Final 4×4 cost matrix (showing entries)

i/j

1

2

3

4

1

2

0

3

0

0

4

0

0

0

("0" denotes invalid entries; used in computations.)


Find O.B.S.T for the data (Problem 2)

A = [5, 20, 10, 3]

P = [20, 7, 12, 9]

Step 1 — Sort A

A = [3, 5, 10, 20]

P = [9, 20, 12, 7]

Step 2 — Fill (length = 1)

i/j

1

2

3

4

1

9

2

0

20

3

0

0

12

4

0

0

0

7

Step 2 — Fill Table Using Formula

Base (main diagonal)


C[1,2]


C[2,3]


C[3,4]


C[1,3]


C[2,4]


C[1,4] (final)


Final 4×4 cost matrix (showing entries)

i/j

1

2

3

4

1

2

0

3

0

0

4

0

0

0

("0" denotes invalid entries; used in computations.)