DB Scan Clustering Numerical

May 9, 2025

Updated 1 month ago

4 min read

DBSCAN Clustering Algorithm: A Numerical Solution Explained

DBSCAN clustering (Density-Based Spatial Clustering of Applications with Noise) is a powerful unsupervised machine learning algorithm that groups data points based on density rather than distance to a centroid. Unlike K-Means clustering, DBSCAN clustering can discover clusters of arbitrary shape and naturally handles noise points — outliers that don't belong to any cluster.

In this numerical solution, we'll work through a 12-point dataset step by step, classifying every point as a core point, border point, or noise point using ε = 1.9 and MinPts = 4.

DBSCAN Concepts Recap

Before jumping into the data, here are the three key definitions that drive every classification decision:

ε (epsilon): Radius of the neighborhood around a point
MinPts: Minimum number of points (including the point itself) required within ε to qualify as a core point
Core Point: Has ≥ MinPts neighbors within ε distance — the backbone of any cluster in data clustering
Border Point: Fewer than MinPts neighbors, but lies within ε of a core point
Noise Point: Neither a core point nor reachable from any core point — treated as an outlier

These three categories are what DBSCAN clustering uses to separate meaningful structure from noise in any dataset.

Given Parameters

ε = 1.9
MinPts = 4

Dataset: 12 Points on a 2D Grid

Point	x	y
P1	7	4
P2	6	4
P3	5	6
P4	4	2
P5	6	3
P6	5	2
P7	3	3
P8	4	5
P9	6	5
P10	3	6
P11	4	4
P12	8	2

DBSCAN clustering numerical example showing labeled data points P1 to P12 plotted on a 2D coordinate grid for density-based clustering analysis

Step 1: Euclidean Distance Matrix

The Euclidean distance between any two points A and B is:

d (A, B) = (x_{2} - x_{1})^{2} + (y_{2} - y_{1})^{2}

We compute this for all 12×12 pairs to determine which points fall within ε = 1.9 of each other.

12×12 Distance Matrix (rounded to 2 decimals)

	P1	P2	P3	P4	P5	P6	P7	P8	P9	P10	P11	P12
P1	0.00	1.00	2.83	3.61	1.41	2.83	4.12	3.16	1.41	4.47	3.00	2.24
P2	1.00	0.00	2.24	2.83	1.00	2.24	3.16	2.24	1.00	3.61	2.00	2.83
P3	2.83	2.24	0.00	4.12	3.16	4.00	3.61	1.41	1.41	2.00	2.24	5.00
P4	3.61	2.83	4.12	0.00	2.24	1.00	1.41	3.00	3.61	4.12	2.00	4.00
P5	1.41	1.00	3.16	2.24	0.00	1.41	3.00	2.83	2.00	4.24	2.24	2.24
P6	2.83	2.24	4.00	1.00	1.41	0.00	2.24	3.16	3.16	4.47	2.24	3.00
P7	4.12	3.16	3.61	1.41	3.00	2.24	0.00	2.24	3.61	3.00	1.41	5.10
P8	3.16	2.24	1.41	3.00	2.83	3.16	2.24	0.00	2.00	1.41	1.00	5.00
P9	1.41	1.00	1.41	3.61	2.00	3.16	3.61	2.00	0.00	3.16	2.24	3.61
P10	4.47	3.61	2.00	4.12	4.24	4.47	3.00	1.41	3.16	0.00	2.24	6.40
P11	3.00	2.00	2.24	2.00	2.24	2.24	1.41	1.00	2.24	2.24	0.00	4.47
P12	2.24	2.83	5.00	4.00	2.24	3.00	5.10	5.00	3.61	6.40	4.47	0.00

Any cell value ≤ 1.9 means those two points are neighbors. Each point also counts itself, so a point needs 3 external neighbors to reach MinPts = 4.

Step 2: Classify Each Point

Core Point Condition

≥ 4 neighbors (including itself) within distance ≤ 1.9

Border Point Condition

Fewer than 4 neighbors, but lies within ε of at least one core point

Noise Point Condition

Neither a core point nor within ε of any core point

P1 (7, 4)

Neighbors within ε = 1.9: P2 (1.00), P5 (1.41), P9 (1.41) → plus itself = 4

✅ 4 ≥ MinPts → Core Point

P2 (6, 4)

Neighbors within ε = 1.9: P1 (1.00), P5 (1.00), P9 (1.00) → plus itself = 4

✅ 4 ≥ MinPts → Core Point

P3 (5, 6)

Neighbors within ε = 1.9: P8 (1.41), P9 (1.41) → plus itself = 3

⚠️ 3 < MinPts → not a core point. But P3 lies within ε of P8 and P9, both core points.

Border Point (of P8, P9)

P4 (4, 2)

Neighbors within ε = 1.9: P6 (1.00), P7 (1.41) → plus itself = 3

❌ 3 < MinPts. P4's neighbors (P6, P7) are not core points, so P4 is unreachable from any core.

❌ Noise Point

P5 (6, 3)

Neighbors within ε = 1.9: P1 (1.41), P2 (1.00), P6 (1.41) → plus itself = 4

✅ 4 ≥ MinPts → Core Point

P6 (5, 2)

Neighbors within ε = 1.9: P4 (1.00), P5 (1.41) → plus itself = 3

⚠️ 3 < MinPts → not a core point. But P6 lies within ε of P5, a core point.

Border Point (of P5)

P7 (3, 3)

Neighbors within ε = 1.9: P4 (1.41), P11 (1.41) → plus itself = 3

❌ 3 < MinPts. Neither P4 nor P11 are core points, so P7 is not reachable from any core.

❌ Noise Point

P8 (4, 5)

Neighbors within ε = 1.9: P3 (1.41), P10 (1.41), P11 (1.00) → plus itself = 4

✅ 4 ≥ MinPts → Core Point

P9 (6, 5)

Neighbors within ε = 1.9: P1 (1.41), P2 (1.00), P3 (1.41) → plus itself = 4

✅ 4 ≥ MinPts → Core Point

P10 (3, 6)

Neighbors within ε = 1.9: P8 (1.41) → plus itself = 2

⚠️ 2 < MinPts → not a core point. But P10 lies within ε of P8, a core point.

Border Point (of P8)

P11 (4, 4)

Neighbors within ε = 1.9: P7 (1.41), P8 (1.00) → plus itself = 3

⚠️ 3 < MinPts → not a core point. But P11 lies within ε of P8, a core point.

Border Point (of P8)

P12 (8, 2)

Neighbors within ε = 1.9: none → plus itself = 1

❌ 1 < MinPts, and no core point is within ε of P12.

❌ Noise Point

Final Classification Summary

Type	Points
Core	P1, P2, P5, P8, P9
Border	P3, P6, P10, P11
Noise	P4, P7, P12

Full Point-wise Table

Point	Neighbors (within ε=1.9)	Count	Type
P1	P2, P5, P9	4	✅ Core
P2	P1, P5, P9	4	✅ Core
P3	P8, P9	3	⚠️ Border (P8, P9)
P4	P6, P7	3	❌ Noise
P5	P1, P2, P6	4	✅ Core
P6	P4, P5	3	⚠️ Border (P5)
P7	P4, P11	3	❌ Noise
P8	P3, P10, P11	4	✅ Core
P9	P1, P2, P3	4	✅ Core
P10	P8	2	⚠️ Border (P8)
P11	P7, P8	3	⚠️ Border (P8)
P12	None	1	❌ Noise

Key Takeaway

This numerical solution demonstrates how DBSCAN clustering builds clusters purely from density reachability — no preset number of clusters required. The five core points (P1, P2, P5, P8, P9) anchor two overlapping density regions, the four border points attach to those regions at the edges, and three noise points (P4, P7, P12) remain isolated outliers.

This behavior is what makes DBSCAN clustering especially useful in real-world machine learning tasks where cluster shapes are irregular and outliers are expected. To understand the theory behind this algorithm before working numericals, see the DBSCAN Clustering concept note. For comparison with centroid-based data clustering, refer to K-Means Clustering.

Related Notes:
DBSCAN Clustering (Theory) → · K-Means Clustering → · Dimensionality Reduction