DB Scan Clustering Numerical

May 9, 2025
Updated 1 day ago
4 min read

DBSCAN Clustering Algorithm: A Numerical Solution Explained

DBSCAN clustering (Density-Based Spatial Clustering of Applications with Noise) is a powerful unsupervised machine learning algorithm that groups data points based on density rather than distance to a centroid. Unlike K-Means clustering, DBSCAN clustering can discover clusters of arbitrary shape and naturally handles noise points — outliers that don't belong to any cluster.

In this numerical solution, we'll work through a 12-point dataset step by step, classifying every point as a core point, border point, or noise point using ε = 1.9 and MinPts = 4.


DBSCAN Concepts Recap

Before jumping into the data, here are the three key definitions that drive every classification decision:

  • ε (epsilon): Radius of the neighborhood around a point

  • MinPts: Minimum number of points (including the point itself) required within ε to qualify as a core point

  • Core Point: Has ≥ MinPts neighbors within ε distance — the backbone of any cluster in data clustering

  • Border Point: Fewer than MinPts neighbors, but lies within ε of a core point

  • Noise Point: Neither a core point nor reachable from any core point — treated as an outlier

These three categories are what DBSCAN clustering uses to separate meaningful structure from noise in any dataset.


Given Parameters

  • ε = 1.9

  • MinPts = 4


Dataset: 12 Points on a 2D Grid

Point

x

y

P1

7

4

P2

6

4

P3

5

6

P4

4

2

P5

6

3

P6

5

2

P7

3

3

P8

4

5

P9

6

5

P10

3

6

P11

4

4

P12

8

2

DBSCAN clustering numerical example showing labeled data points P1 to P12 plotted on a 2D coordinate grid for density-based clustering analysis

Step 1: Euclidean Distance Matrix

The Euclidean distance between any two points A and B is:

We compute this for all 12×12 pairs to determine which points fall within ε = 1.9 of each other.

12×12 Distance Matrix (rounded to 2 decimals)

P1

P2

P3

P4

P5

P6

P7

P8

P9

P10

P11

P12

P1

0.00

1.00

2.83

3.61

1.41

2.83

4.12

3.16

1.41

4.47

3.00

2.24

P2

1.00

0.00

2.24

2.83

1.00

2.24

3.16

2.24

1.00

3.61

2.00

2.83

P3

2.83

2.24

0.00

4.12

3.16

4.00

3.61

1.41

1.41

2.00

2.24

5.00

P4

3.61

2.83

4.12

0.00

2.24

1.00

1.41

3.00

3.61

4.12

2.00

4.00

P5

1.41

1.00

3.16

2.24

0.00

1.41

3.00

2.83

2.00

4.24

2.24

2.24

P6

2.83

2.24

4.00

1.00

1.41

0.00

2.24

3.16

3.16

4.47

2.24

3.00

P7

4.12

3.16

3.61

1.41

3.00

2.24

0.00

2.24

3.61

3.00

1.41

5.10

P8

3.16

2.24

1.41

3.00

2.83

3.16

2.24

0.00

2.00

1.41

1.00

5.00

P9

1.41

1.00

1.41

3.61

2.00

3.16

3.61

2.00

0.00

3.16

2.24

3.61

P10

4.47

3.61

2.00

4.12

4.24

4.47

3.00

1.41

3.16

0.00

2.24

6.40

P11

3.00

2.00

2.24

2.00

2.24

2.24

1.41

1.00

2.24

2.24

0.00

4.47

P12

2.24

2.83

5.00

4.00

2.24

3.00

5.10

5.00

3.61

6.40

4.47

0.00

Any cell value ≤ 1.9 means those two points are neighbors. Each point also counts itself, so a point needs 3 external neighbors to reach MinPts = 4.


Step 2: Classify Each Point

Core Point Condition

≥ 4 neighbors (including itself) within distance ≤ 1.9

Border Point Condition

Fewer than 4 neighbors, but lies within ε of at least one core point

Noise Point Condition

Neither a core point nor within ε of any core point


P1 (7, 4)

Neighbors within ε = 1.9: P2 (1.00), P5 (1.41), P9 (1.41) → plus itself = 4

✅ 4 ≥ MinPts → Core Point


P2 (6, 4)

Neighbors within ε = 1.9: P1 (1.00), P5 (1.00), P9 (1.00) → plus itself = 4

✅ 4 ≥ MinPts → Core Point


P3 (5, 6)

Neighbors within ε = 1.9: P8 (1.41), P9 (1.41) → plus itself = 3

⚠️ 3 < MinPts → not a core point. But P3 lies within ε of P8 and P9, both core points.

Border Point (of P8, P9)


P4 (4, 2)

Neighbors within ε = 1.9: P6 (1.00), P7 (1.41) → plus itself = 3

❌ 3 < MinPts. P4's neighbors (P6, P7) are not core points, so P4 is unreachable from any core.

Noise Point


P5 (6, 3)

Neighbors within ε = 1.9: P1 (1.41), P2 (1.00), P6 (1.41) → plus itself = 4

✅ 4 ≥ MinPts → Core Point


P6 (5, 2)

Neighbors within ε = 1.9: P4 (1.00), P5 (1.41) → plus itself = 3

⚠️ 3 < MinPts → not a core point. But P6 lies within ε of P5, a core point.

Border Point (of P5)


P7 (3, 3)

Neighbors within ε = 1.9: P4 (1.41), P11 (1.41) → plus itself = 3

❌ 3 < MinPts. Neither P4 nor P11 are core points, so P7 is not reachable from any core.

Noise Point


P8 (4, 5)

Neighbors within ε = 1.9: P3 (1.41), P10 (1.41), P11 (1.00) → plus itself = 4

✅ 4 ≥ MinPts → Core Point


P9 (6, 5)

Neighbors within ε = 1.9: P1 (1.41), P2 (1.00), P3 (1.41) → plus itself = 4

✅ 4 ≥ MinPts → Core Point


P10 (3, 6)

Neighbors within ε = 1.9: P8 (1.41) → plus itself = 2

⚠️ 2 < MinPts → not a core point. But P10 lies within ε of P8, a core point.

Border Point (of P8)


P11 (4, 4)

Neighbors within ε = 1.9: P7 (1.41), P8 (1.00) → plus itself = 3

⚠️ 3 < MinPts → not a core point. But P11 lies within ε of P8, a core point.

Border Point (of P8)


P12 (8, 2)

Neighbors within ε = 1.9: none → plus itself = 1

❌ 1 < MinPts, and no core point is within ε of P12.

Noise Point


Final Classification Summary

Type

Points

Core

P1, P2, P5, P8, P9

Border

P3, P6, P10, P11

Noise

P4, P7, P12

Full Point-wise Table

Point

Neighbors (within ε=1.9)

Count

Type

P1

P2, P5, P9

4

✅ Core

P2

P1, P5, P9

4

✅ Core

P3

P8, P9

3

⚠️ Border (P8, P9)

P4

P6, P7

3

❌ Noise

P5

P1, P2, P6

4

✅ Core

P6

P4, P5

3

⚠️ Border (P5)

P7

P4, P11

3

❌ Noise

P8

P3, P10, P11

4

✅ Core

P9

P1, P2, P3

4

✅ Core

P10

P8

2

⚠️ Border (P8)

P11

P7, P8

3

⚠️ Border (P8)

P12

None

1

❌ Noise


Key Takeaway

This numerical solution demonstrates how DBSCAN clustering builds clusters purely from density reachability — no preset number of clusters required. The five core points (P1, P2, P5, P8, P9) anchor two overlapping density regions, the four border points attach to those regions at the edges, and three noise points (P4, P7, P12) remain isolated outliers.

This behavior is what makes DBSCAN clustering especially useful in real-world machine learning tasks where cluster shapes are irregular and outliers are expected. To understand the theory behind this algorithm before working numericals, see the DBSCAN Clustering concept note. For comparison with centroid-based data clustering, refer to K-Means Clustering.


Related Notes:

DBSCAN Clustering (Theory) → · K-Means Clustering → · Dimensionality Reduction