Statistics

Feb 20, 2025

Statistics for Machine Learning: Practice Numericals With Solutions

Topic: A complete statistics practice set for Data Science and Machine Learning students — covering mean, median, mode, standard deviation, variance, quartiles, grouped data, and skewness with step-by-step solutions.

Before you can understand how a machine learning model learns, you need statistics. Linear regression, naive Bayes, neural networks, and clustering all rest on statistical concepts. Mean and variance describe your data's center and spread. Standard deviation gives you a yardstick for spotting values that sit unusually far from the mean. Skewness tells you whether your distribution is lopsided and whether a transformation is worth considering.

This practice set covers everything from ungrouped data calculations to grouped frequency distributions — the exact problems that appear in university exams and data science interviews. Every question includes a full step-by-step solution.


Important Formulas — Keep These Open While Solving

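| Formula | Expression |
|---|---|
| Arithmetic Mean | $\bar{x} = \frac{\sum x}{n}$ |
| Mean (Frequency Distribution) | $\bar{x} = \frac{\sum fx}{\sum f}$ |
| Standard Deviation | $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$ |
| Variance | $\sigma^2 = \frac{\sum (x - \bar{x})^2}{n}$ |
| Median (Grouped Data) | $\text{Median} = L + \frac{\frac{N}{2} - CF}{f} \times h$ |
| Mode (Grouped Data) | $\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h$ |
| Coefficient of Variation | $CV = \frac{\sigma}{\bar{x}} \times 100$ |
| Karl Pearson's Skewness | $Sk = \frac{\bar{x} - \text{Mode}}{\sigma}$ or $Sk = \frac{3(\bar{x} - \text{Median})}{\sigma}$ |

In the grouped-data formulas, $L$ is the lower boundary of the median (or modal) class, $N = \sum f$, $CF$ is the cumulative frequency before the median class, $f$ is the frequency of the median class, $f_1$, $f_0$, and $f_2$ are the frequencies of the modal class, the class before it, and the class after it, and $h$ is the class width. All variance and standard deviation calculations in this set divide by $n$ (population form).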


Part 1: Mean, Median, Mode (Ungrouped Data)


Q1. Arithmetic Mean

Find the mean of: 12, 15, 18, 20, 25, 30, 35

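Solution:

$\bar{x} = \frac{12 + 15 + 18 + 20 + 25 + 30 + 35}{7} = \frac{155}{7} \approx 22.14$

Why it matters in ML: The mean is the foundation of linear regression. A least-squares regression line fitted with an intercept always passes through $(\bar{x}, \bar{y})$.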


Q2. Median

Find the median of: 7, 12, 15, 18, 21, 24, 27, 30

Solution:

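Data is already sorted. n = 8 (even), so the median is the average of the 4th and 5th values:

$\text{Median} = \frac{18 + 21}{2} = 19.5$

Why it matters in ML: The median is robust to outliers, which is why the median absolute deviation is often preferred over the standard deviation for outlier detection in real datasets.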


Q3. Mode

Find the mode of: 4, 5, 7, 8, 5, 9, 5, 10, 7, 8

Solution:

Count frequencies:

  • 4 → 1 time

  • 5 → 3 times

  • 7 → 2 times

  • 8 → 2 times

  • 9 → 1 time

  • 10 → 1 time

Mode = 5, since it occurs most often (3 times).

Why it matters in ML: In classification problems, predicting the most frequent class is the baseline — called the "zero-rule classifier." If your model can't beat the mode, it's not learning anything.
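The three ungrouped calculations above can be cross-checked with Python's standard `statistics` module. This is a minimal sketch, not part of the original practice set; the variable names are illustrative only.

```python
from statistics import mean, median, multimode

# Datasets from Q1, Q2 and Q3
q1_data = [12, 15, 18, 20, 25, 30, 35]
q2_data = [7, 12, 15, 18, 21, 24, 27, 30]
q3_data = [4, 5, 7, 8, 5, 9, 5, 10, 7, 8]

q1_mean = mean(q1_data)        # sum of values divided by n (155 / 7)
q2_median = median(q2_data)    # even n: average of the two middle values
q3_modes = multimode(q3_data)  # list of every value tied for highest frequency

print(round(q1_mean, 2))  # 22.14
print(q2_median)          # 19.5
print(q3_modes)           # [5]
```

`multimode` is used instead of `mode` so that a dataset with several equally frequent values returns all of them rather than just the first.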


Q4. Mean and Median

Find both mean and median of: 22, 25, 28, 30, 32, 35, 40, 45

Solution:

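Mean:

$\bar{x} = \frac{22 + 25 + 28 + 30 + 32 + 35 + 40 + 45}{8} = \frac{257}{8} = 32.125$

Median (n = 8, even), the average of the 4th and 5th values:

$\text{Median} = \frac{30 + 32}{2} = 31$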


Q5. Missing Value

The mean of 10, 15, 20, 25, x is 18. Find x.

Solution:

$\frac{10 + 15 + 20 + 25 + x}{5} = 18 \;\Rightarrow\; 70 + x = 90 \;\Rightarrow\; x = 20$


Part 2: Standard Deviation & Variance

Standard deviation measures how spread out your data is around the mean. A small σ means the data clusters tightly. A large σ means it's scattered. In ML, high-variance features can dominate distance-based algorithms like KNN — which is why feature scaling (normalization/standardization) matters.
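To make the scaling point concrete, here is a small sketch of z-score standardization using only the standard library. The `salary` and `age` values are hypothetical toy numbers chosen for illustration; they do not come from the practice set.

```python
from statistics import mean, pstdev

def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical features on very different scales
salary = [30000, 45000, 60000, 75000]
age = [25, 35, 45, 55]

z_salary = standardize(salary)
z_age = standardize(age)

# Before scaling, salary's spread is over a thousand times larger than age's
print(pstdev(salary), pstdev(age))
# After scaling, both features have mean 0 and standard deviation 1
print(round(pstdev(z_salary), 6), round(pstdev(z_age), 6))
```

After standardization, neither feature dominates a Euclidean distance simply because of its units.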


Q6. Variance and Standard Deviation

Find variance and standard deviation of: 5, 7, 9, 11, 13

Solution:

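Step 1. Find the mean:

$\bar{x} = \frac{5 + 7 + 9 + 11 + 13}{5} = \frac{45}{5} = 9$

Step 2. Find the deviations and squared deviations:

| x | $x - \bar{x}$ | $(x - \bar{x})^2$ |
|---|---|---|
| 5 | −4 | 16 |
| 7 | −2 | 4 |
| 9 | 0 | 0 |
| 11 | +2 | 4 |
| 13 | +4 | 16 |
| Total | | 40 |

Step 3. Divide by n and take the square root:

$\sigma^2 = \frac{40}{5} = 8$

$\sigma = \sqrt{8} \approx 2.83$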


Q7. Standard Deviation

Find the standard deviation of: 2, 4, 4, 4, 5, 5, 7, 9

Solution:

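Mean:

$\bar{x} = \frac{2 + 4 + 4 + 4 + 5 + 5 + 7 + 9}{8} = \frac{40}{8} = 5$

Squared deviations:

| x | $x - \bar{x}$ | $(x - \bar{x})^2$ |
|---|---|---|
| 2 | −3 | 9 |
| 4 | −1 | 1 |
| 4 | −1 | 1 |
| 4 | −1 | 1 |
| 5 | 0 | 0 |
| 5 | 0 | 0 |
| 7 | +2 | 4 |
| 9 | +4 | 16 |
| Total | | 32 |

$\sigma^2 = \frac{32}{8} = 4$

$\sigma = \sqrt{4} = 2$

Note: This dataset is a popular teaching example because the numbers work out cleanly: the mean is exactly 5 and the population standard deviation is exactly 2. The values are not symmetric around the mean, though. The two large deviations above 5 (+2 and +4) are balanced by four smaller deviations below it (−3, −1, −1, −1).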


Q8. Coefficient of Variation

Find mean, standard deviation, and CV for: 10, 12, 15, 18, 20

Solution:

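Mean:

$\bar{x} = \frac{10 + 12 + 15 + 18 + 20}{5} = \frac{75}{5} = 15$

Squared deviations:

| x | $x - \bar{x}$ | $(x - \bar{x})^2$ |
|---|---|---|
| 10 | −5 | 25 |
| 12 | −3 | 9 |
| 15 | 0 | 0 |
| 18 | +3 | 9 |
| 20 | +5 | 25 |
| Total | | 68 |

$\sigma^2 = \frac{68}{5} = 13.6$

$\sigma = \sqrt{13.6} \approx 3.69$

$CV = \frac{\sigma}{\bar{x}} \times 100 = \frac{3.69}{15} \times 100 \approx 24.59\%$

Why it matters in ML: Because CV is unit-free, it lets you compare variability across features with different units (e.g., salary in thousands vs age in years). Features measured on very different scales should still be normalized or standardized before using distance-based models.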
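The answers to Q6, Q7 and Q8 can be checked with the population functions in Python's `statistics` module. This is a sketch under the assumption that the population form (divide by n) is intended, which matches the totals in the tables above; `coefficient_of_variation` is a helper name introduced here for illustration.

```python
from statistics import mean, pstdev, pvariance

def coefficient_of_variation(values):
    """Population CV as a percentage: (sigma / mean) * 100."""
    return pstdev(values) / mean(values) * 100

q6 = [5, 7, 9, 11, 13]
q7 = [2, 4, 4, 4, 5, 5, 7, 9]
q8 = [10, 12, 15, 18, 20]

q6_var = float(pvariance(q6))  # 40 / 5
q6_sd = pstdev(q6)             # square root of the variance
q7_sd = pstdev(q7)             # sqrt(32 / 8)
q8_cv = coefficient_of_variation(q8)

print(q6_var, round(q6_sd, 2))  # 8.0 2.83
print(round(q7_sd, 2))          # 2.0
print(round(q8_cv, 2))          # 24.59
```

Use `variance` and `stdev` instead if you need the sample form, which divides by n − 1 and gives slightly larger values.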


Part 3: Range, Quartiles, and Dispersion


Q9. Range

Find the range of: 18, 25, 12, 30, 45, 28, 35

Solution:

$\text{Range} = \text{Max} - \text{Min} = 45 - 12 = 33$


Q10. Quartiles

Find Q1, Q2 (Median), and Q3 for: 5, 8, 10, 12, 15, 18, 20, 22, 25

Solution:

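Sorted data (n = 9): 5, 8, 10, 12, 15, 18, 20, 22, 25

  • Q2 (Median) = 5th value = 15

  • Q1 = median of the lower half (5, 8, 10, 12) = $\frac{8 + 10}{2} = 9$

  • Q3 = median of the upper half (18, 20, 22, 25) = $\frac{20 + 22}{2} = 21$

Interquartile Range (IQR):

$IQR = Q_3 - Q_1 = 21 - 9 = 12$

Why it matters in ML: The IQR is how box plots detect outliers. Any value below $Q_1 - 1.5 \times IQR$ or above $Q_3 + 1.5 \times IQR$ is flagged. This is one of the most common data cleaning steps in any ML pipeline.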
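Here is a short sketch of the quartile and box-plot-rule calculation for the Q10 data. It assumes the "exclusive" quartile convention of `statistics.quantiles`, which places quartiles at positions based on n + 1; other conventions (for example, linear interpolation over n − 1 positions, the default in some numerical libraries) give slightly different quartiles for the same data.

```python
from statistics import quantiles

data = [5, 8, 10, 12, 15, 18, 20, 22, 25]

# Three cut points that split the sorted data into four equal parts
q1, q2, q3 = quantiles(data, n=4, method="exclusive")
iqr = q3 - q1

# Box plot rule: flag anything beyond 1.5 * IQR from the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3)                 # 9.0 15.0 21.0
print(iqr)                        # 12.0
print(lower_fence, upper_fence)   # -9.0 39.0
print(outliers)                   # []
```

No value in this dataset falls outside the fences, so nothing is flagged.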


Part 4: Grouped Data / Class Interval Questions

These are the most important question types for university exams. The key difference from ungrouped data: you use the midpoint of each class interval as your representative value.


Q11. Mean from Frequency Distribution (Direct Method)

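| Class Interval | Frequency (f) | Midpoint (x) | fx |
|---|---|---|---|
| 0–10 | 5 | 5 | 25 |
| 10–20 | 8 | 15 | 120 |
| 20–30 | 12 | 25 | 300 |
| 30–40 | 10 | 35 | 350 |
| 40–50 | 5 | 45 | 225 |
| Total | 40 | | 1020 |

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1020}{40} = 25.5$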


Q12. Median from Grouped Data

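| Class Interval | Frequency | Cumulative Frequency |
|---|---|---|
| 0–10 | 4 | 4 |
| 10–20 | 6 | 10 |
| 20–30 | 10 | 20 |
| 30–40 | 8 | 28 |
| 40–50 | 2 | 30 |

$N = 30$, so $\frac{N}{2} = 15$

The cumulative frequency first reaches 15 in the 20–30 class → Median class = 20–30

  • $L = 20$, $CF = 10$, $f = 10$, $h = 10$

$\text{Median} = L + \frac{\frac{N}{2} - CF}{f} \times h = 20 + \frac{15 - 10}{10} \times 10 = 25$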


Q13. Mode from Grouped Data

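| Class Interval | Frequency |
|---|---|
| 0–10 | 3 |
| 10–20 | 7 |
| 20–30 | 12 ← Modal class |
| 30–40 | 9 |
| 40–50 | 4 |

  • $L = 20$, $f_1 = 12$, $f_0 = 7$, $f_2 = 9$, $h = 10$

$\text{Mode} = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h = 20 + \frac{12 - 7}{24 - 7 - 9} \times 10 = 20 + \frac{5}{8} \times 10 = 26.25$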
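The grouped-data formulas for mean, median and mode from Q11 to Q13 can be sketched as three small functions. The function names and the `(lower, upper)` tuple representation of class intervals are assumptions made for this example; the sketch also assumes equal-width, contiguous classes and a single modal class.

```python
def grouped_mean(classes, freqs):
    """Mean using class midpoints: sum(f * x) / sum(f)."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)

def grouped_median(classes, freqs):
    """Median = L + ((N/2 - CF) / f) * h for the median class."""
    half = sum(freqs) / 2
    cf_before = 0  # cumulative frequency before the current class
    for (lo, hi), f in zip(classes, freqs):
        if cf_before + f >= half:
            return lo + (half - cf_before) / f * (hi - lo)
        cf_before += f

def grouped_mode(classes, freqs):
    """Mode = L + ((f1 - f0) / (2*f1 - f0 - f2)) * h for the modal class."""
    i = freqs.index(max(freqs))
    lo, hi = classes[i]
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after the modal class
    return lo + (f1 - f0) / (2 * f1 - f0 - f2) * (hi - lo)

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]

q11_mean = grouped_mean(classes, [5, 8, 12, 10, 5])
q12_median = grouped_median(classes, [4, 6, 10, 8, 2])
q13_mode = grouped_mode(classes, [3, 7, 12, 9, 4])

print(q11_mean)    # 25.5
print(q12_median)  # 25.0
print(q13_mode)    # 26.25
```

These are estimates: grouping discards the individual values, so each formula assumes the observations are spread evenly within their class.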


Q14. Standard Deviation from Grouped Data

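| Class | f | Midpoint (x) | fx | $d = x - 35$ | $d^2$ | $fd^2$ |
|---|---|---|---|---|---|---|
| 10–20 | 5 | 15 | 75 | −20 | 400 | 2000 |
| 20–30 | 8 | 25 | 200 | −10 | 100 | 800 |
| 30–40 | 15 | 35 | 525 | 0 | 0 | 0 |
| 40–50 | 10 | 45 | 450 | +10 | 100 | 1000 |
| 50–60 | 7 | 55 | 385 | +20 | 400 | 2800 |
| Total | 45 | | 1635 | | | 6600 |

Mean:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1635}{45} \approx 36.33$

The deviations in the table are measured from the assumed mean $A = 35$ (the midpoint of the 30–40 class), not from $\bar{x}$, so the assumed-mean form of the variance is needed. With $\sum fd = 5(-20) + 8(-10) + 15(0) + 10(10) + 7(20) = 60$:

$\sigma^2 = \frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2 = \frac{6600}{45} - \left(\frac{60}{45}\right)^2 \approx 146.67 - 1.78 = 144.89$

$\sigma = \sqrt{144.89} \approx 12.04$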
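As a cross-check on Q14, the sketch below computes the grouped standard deviation two ways: directly from deviations about the true mean, and from deviations about an assumed mean of 35 with the usual correction term. The function names are illustrative, and the population form (divide by N) is assumed.

```python
from math import sqrt

def grouped_mean_and_pstdev(midpoints, freqs):
    """Mean and population standard deviation from class midpoints."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
    variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
    return mean, sqrt(variance)

def pstdev_from_assumed_mean(midpoints, freqs, assumed_mean):
    """Shortcut method: sigma^2 = sum(f*d^2)/N - (sum(f*d)/N)^2, d = x - A."""
    n = sum(freqs)
    sum_fd = sum(f * (x - assumed_mean) for f, x in zip(freqs, midpoints))
    sum_fd2 = sum(f * (x - assumed_mean) ** 2 for f, x in zip(freqs, midpoints))
    return sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

midpoints = [15, 25, 35, 45, 55]
freqs = [5, 8, 15, 10, 7]

mean, sigma = grouped_mean_and_pstdev(midpoints, freqs)
sigma_shortcut = pstdev_from_assumed_mean(midpoints, freqs, assumed_mean=35)

print(round(mean, 2))            # 36.33
print(round(sigma, 2))           # 12.04
print(round(sigma_shortcut, 2))  # 12.04
```

Both routes agree. Dropping the correction term in the shortcut method would overstate the spread, because squared deviations are smallest when measured from the true mean.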


Part 5: Previous-Year Style Mixed Numericals


Q15. Mean, Median & Mode from Frequency Distribution

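| Marks | No. of Students (f) | Midpoint (x) | fx | CF |
|---|---|---|---|---|
| 0–10 | 5 | 5 | 25 | 5 |
| 10–20 | 9 | 15 | 135 | 14 |
| 20–30 | 12 | 25 | 300 | 26 |
| 30–40 | 8 | 35 | 280 | 34 |
| 40–50 | 6 | 45 | 270 | 40 |
| Total | 40 | | 1010 | |

Mean:

$\bar{x} = \frac{\sum fx}{\sum f} = \frac{1010}{40} = 25.25$

Median:

$\frac{N}{2} = \frac{40}{2} = 20$ → Median class = 20–30 (CF crosses 20)

With $L = 20$, $CF = 14$, $f = 12$, $h = 10$:

$\text{Median} = 20 + \frac{20 - 14}{12} \times 10 = 20 + 5 = 25$

Mode: Highest frequency = 12 → Modal class = 20–30

With $L = 20$, $f_1 = 12$, $f_0 = 9$, $f_2 = 8$, $h = 10$:

$\text{Mode} = 20 + \frac{12 - 9}{24 - 9 - 8} \times 10 = 20 + \frac{30}{7} \approx 24.29$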


Q16. Mean Deviation About Mean

Data: 14, 18, 20, 22, 25, 30

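Mean:

$\bar{x} = \frac{14 + 18 + 20 + 22 + 25 + 30}{6} = \frac{129}{6} = 21.5$

| x | $\lvert x - \bar{x} \rvert$ |
|---|---|
| 14 | 7.5 |
| 18 | 3.5 |
| 20 | 1.5 |
| 22 | 0.5 |
| 25 | 3.5 |
| 30 | 8.5 |
| Total | 25 |

$\text{Mean Deviation} = \frac{\sum \lvert x - \bar{x} \rvert}{n} = \frac{25}{6} \approx 4.17$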
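The mean deviation about the mean is just the average absolute distance from the mean, which can be sketched in a few lines. The function name `mean_deviation` is introduced here for illustration.

```python
from statistics import mean

def mean_deviation(values):
    """Average absolute deviation of the values from their mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)

data = [14, 18, 20, 22, 25, 30]

abs_devs = [abs(x - mean(data)) for x in data]
md = mean_deviation(data)

print(abs_devs)      # [7.5, 3.5, 1.5, 0.5, 3.5, 8.5]
print(round(md, 2))  # 4.17
```

Unlike the standard deviation, this measure does not square the deviations, so large deviations are not weighted more heavily than small ones.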


Q17. Average Salary Problem

Given: Average salary of 8 employees = ₹24,000. One leaves, a new one joins at ₹30,000, new average = ₹25,000. Find the salary of the employee who left.

Solution:

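Total salary (original 8 employees):

$8 \times 24{,}000 = ₹1{,}92{,}000$

Total salary (new 8 employees):

$8 \times 25{,}000 = ₹2{,}00{,}000$

Difference after swap:

$2{,}00{,}000 - 1{,}92{,}000 = ₹8{,}000$

The total rose by ₹8,000 when a ₹30,000 salary replaced the departing one, so the employee who left earned $30{,}000 - 8{,}000 = ₹22{,}000$.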


Q18. Karl Pearson's Coefficient of Skewness

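Given: Mean = 45, Median = 42, σ = 6

$Sk = \frac{3(\text{Mean} - \text{Median})}{\sigma} = \frac{3(45 - 42)}{6} = 1.5$

Since $Sk > 0$, the distribution is positively skewed (tail on the right).

Why it matters in ML: Skewed features can hurt linear models and distance-based models, while tree-based models are largely insensitive to monotonic transformations of a feature. A positive skew above 1 is often taken as a signal to consider a log transformation before feeding data into such models. This is one of the first checks in any exploratory data analysis (EDA).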
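The median-based Pearson coefficient used in Q18 is simple enough to wrap in a helper. This is a minimal sketch; the function name and the sign-based labels are choices made for this example.

```python
def pearson_median_skewness(mean, median, sigma):
    """Karl Pearson's median-based coefficient: 3 * (mean - median) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 3 * (mean - median) / sigma

def describe_skew(sk):
    """Label the direction of skew from the sign of the coefficient."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "symmetric"

sk = pearson_median_skewness(mean=45, median=42, sigma=6)

print(sk)                 # 1.5
print(describe_skew(sk))  # positively skewed
```

A mean above the median gives a positive coefficient, which matches the intuition that a long right tail pulls the mean upward more than the median.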


Quick Revision: What Each Measure Tells You

| Measure | What It Captures | ML Relevance |
|---|---|---|
| Mean | Center of data | Regression baseline, feature scaling |
| Median | Robust center | Outlier-resistant imputation |
| Mode | Most frequent value | Baseline classifier, categorical imputation |
| Variance / σ | Spread around mean | Feature importance, noise detection |
| IQR | Middle 50% spread | Outlier detection (box plot rule) |
| CV | Relative variability | Comparing features with different units |
| Skewness | Symmetry of distribution | Signals need for log/sqrt transformation |




Written by Abhijeet Singh Rajput · Published on Notehub