Bayesian Classifier (Naive Bayes)
The Bayesian Classifier, commonly known as the Naive Bayes classifier, is a supervised machine learning algorithm based on Bayes’ Theorem. It is widely used for classification tasks such as spam detection, sentiment analysis, and document categorization.
At its core, the algorithm calculates the probability that a given data point belongs to a particular class based on prior knowledge and observed features. It applies Bayes’ Theorem to compute the posterior probability of a class given the input features.
The term “naive” comes from the simplifying assumption that all features are conditionally independent of each other given the class label. Although this assumption is rarely true in real-world data, the classifier still performs surprisingly well in many practical applications.
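The two ideas above, Bayes' Theorem and the independence assumption, can be written compactly. The symbols used here (C for a class label, x_1, ..., x_n for the observed feature values) are notation chosen for this sketch and do not appear in the original note.

$$
P(C \mid x_1, \dots, x_n) = \frac{P(C)\, P(x_1, \dots, x_n \mid C)}{P(x_1, \dots, x_n)}
$$

Under the naive assumption the likelihood factorizes, and the denominator is the same for every class, so it can be dropped when comparing classes:

$$
P(C \mid x_1, \dots, x_n) \propto P(C) \prod_{i=1}^{n} P(x_i \mid C),
\qquad
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)
$$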
Example of Naive Bayes Classification
Suppose we are given a training dataset that describes different species by three features: swimming ability, flying ability, and crawling behavior. Using the Naive Bayes algorithm, we want to classify a new instance with the following features:
Swim = Slow
Fly = Rarely
Crawl = No
The possible class labels are:
Animal
Bird
Fish
We will use prior probability and conditional probability to determine the most likely class for the given test instance.
Given the training dataset below, use the Naive Bayes algorithm to classify a species whose features are (Slow, Rarely, No).
| S.No | Swim | Fly | Crawl | Class |
|---|---|---|---|---|
| 1 | Fast | No | No | Fish |
| 2 | Fast | No | Yes | Animal |
| 3 | Slow | No | No | Animal |
| 4 | Fast | No | No | Animal |
| 5 | No | Short | No | Bird |
| 6 | No | Short | No | Bird |
| 7 | No | Rarely | No | Animal |
| 8 | Slow | No | Yes | Animal |
| 9 | Slow | No | No | Fish |
| 10 | Slow | No | Yes | Fish |
| 11 | No | Long | No | Bird |
| 12 | Fast | No | No | Bird |
The class labels are Animal, Bird, and Fish.
Construct a frequency table that summarizes the data (this step is not part of the algorithm itself):
| Class | Swim (F1): Fast | Swim (F1): Slow | Swim (F1): No | Fly (F2): Long | Fly (F2): Short | Fly (F2): Rarely | Fly (F2): No | Crawl (F3): Yes | Crawl (F3): No | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Animal | 2 | 2 | 1 | 0 | 0 | 1 | 4 | 2 | 3 | 5 |
| Bird | 1 | 0 | 3 | 1 | 2 | 0 | 1 | 0 | 4 | 4 |
| Fish | 1 | 2 | 0 | 0 | 0 | 0 | 3 | 1 | 2 | 3 |
| Total | 4 | 4 | 4 | 1 | 2 | 1 | 8 | 3 | 9 | 12 |
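The frequency table above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the original note: the names `DATA`, `FEATURES`, `class_counts`, and `feature_counts` are illustrative, and the Fly value of row 11 is written as "Long" to match the frequency table.

```python
from collections import Counter

# Training rows copied from the training table: (swim, fly, crawl, class).
# Illustrative names; row 11 uses Fly = "Long" as in the frequency table.
DATA = [
    ("Fast", "No", "No", "Fish"),
    ("Fast", "No", "Yes", "Animal"),
    ("Slow", "No", "No", "Animal"),
    ("Fast", "No", "No", "Animal"),
    ("No", "Short", "No", "Bird"),
    ("No", "Short", "No", "Bird"),
    ("No", "Rarely", "No", "Animal"),
    ("Slow", "No", "Yes", "Animal"),
    ("Slow", "No", "No", "Fish"),
    ("Slow", "No", "Yes", "Fish"),
    ("No", "Long", "No", "Bird"),
    ("Fast", "No", "No", "Bird"),
]
FEATURES = ("Swim", "Fly", "Crawl")

# Number of training rows per class (the "Total" column).
class_counts = Counter(row[-1] for row in DATA)

# feature_counts[(class, feature, value)] = number of matching rows.
# zip() stops after the three feature columns, so the class column is skipped.
feature_counts = Counter(
    (row[-1], name, value)
    for row in DATA
    for name, value in zip(FEATURES, row)
)

print(dict(class_counts))
print(feature_counts[("Animal", "Swim", "Slow")])
```

A `Counter` returns 0 for combinations that never occur, such as `("Bird", "Fly", "Rarely")`, which mirrors the zero cells in the table.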
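Step 1: Compute the prior probability of each class from the class totals (12 training instances in all)
P(Animal) = 5/12
P(Bird) = 4/12
P(Fish) = 3/12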
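As a rough check of Step 1, the priors can be computed as class count divided by the number of training rows. The sketch below assumes only the Class column of the training table; `LABELS` and `priors` are hypothetical names chosen for illustration.

```python
from collections import Counter
from fractions import Fraction

# Class column of the 12 training rows, in table order (illustrative name).
LABELS = ["Fish", "Animal", "Animal", "Animal", "Bird", "Bird",
          "Animal", "Animal", "Fish", "Fish", "Bird", "Bird"]

counts = Counter(LABELS)

# Prior P(class) = rows in that class / total rows, kept as exact fractions.
priors = {label: Fraction(n, len(LABELS)) for label, n in counts.items()}

for label, p in sorted(priors.items()):
    print(f"P({label}) = {p} (about {float(p):.3f})")
```

`Fraction` reduces its results automatically, so 4/12 is shown as 1/3 and 3/12 as 1/4; the values are the same as in Step 1.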
Step 2: Construct the table of conditional probabilities
| Class | Swim: Fast | Swim: Slow | Swim: No | Fly: Long | Fly: Short | Fly: Rarely | Fly: No | Crawl: Yes | Crawl: No | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| Animal | 2/5 | 2/5 | 1/5 | 0/5 | 0/5 | 1/5 | 4/5 | 2/5 | 3/5 | 5 |
| Bird | 1/4 | 0/4 | 3/4 | 1/4 | 2/4 | 0/4 | 1/4 | 0/4 | 4/4 | 4 |
| Fish | 1/3 | 2/3 | 0/3 | 0/3 | 0/3 | 0/3 | 3/3 | 1/3 | 2/3 | 3 |
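Each conditional probability is the number of instances of a class that have a given feature value, divided by the total number of instances of that class. For the test instance (Slow, Rarely, No), the relevant values are:
Animal: P(Swim = Slow | Animal) = 2/5, P(Fly = Rarely | Animal) = 1/5, P(Crawl = No | Animal) = 3/5
Bird: P(Swim = Slow | Bird) = 0/4, P(Fly = Rarely | Bird) = 0/4, P(Crawl = No | Bird) = 4/4
Fish: P(Swim = Slow | Fish) = 2/3, P(Fly = Rarely | Fish) = 0/3, P(Crawl = No | Fish) = 2/3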
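The relative-frequency estimate behind the conditional probability table can be sketched as follows. The helper `conditional` and the constants `DATA` and `FEATURE_INDEX` are hypothetical names introduced for this example, and row 11 uses Fly = "Long" to match the frequency table.

```python
from fractions import Fraction

# Training rows: (swim, fly, crawl, class). Illustrative names throughout.
DATA = [
    ("Fast", "No", "No", "Fish"),
    ("Fast", "No", "Yes", "Animal"),
    ("Slow", "No", "No", "Animal"),
    ("Fast", "No", "No", "Animal"),
    ("No", "Short", "No", "Bird"),
    ("No", "Short", "No", "Bird"),
    ("No", "Rarely", "No", "Animal"),
    ("Slow", "No", "Yes", "Animal"),
    ("Slow", "No", "No", "Fish"),
    ("Slow", "No", "Yes", "Fish"),
    ("No", "Long", "No", "Bird"),
    ("Fast", "No", "No", "Bird"),
]
FEATURE_INDEX = {"Swim": 0, "Fly": 1, "Crawl": 2}


def conditional(feature, value, label):
    """Estimate P(feature = value | class = label) by relative frequency."""
    col = FEATURE_INDEX[feature]
    in_class = [row for row in DATA if row[3] == label]
    matches = sum(1 for row in in_class if row[col] == value)
    return Fraction(matches, len(in_class))


print(conditional("Swim", "Slow", "Animal"))
print(conditional("Fly", "Rarely", "Bird"))
```

Using `Fraction` keeps the results exact, which makes them easy to compare with the table entries (reduced to lowest terms).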
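Step 3: For each class, multiply the prior probability by the conditional probabilities of the test instance's feature values (Swim = Slow, Fly = Rarely, Crawl = No)
Animal: 5/12 × 2/5 × 1/5 × 3/5 = 1/50 = 0.02
Bird: 4/12 × 0/4 × 0/4 × 4/4 = 0
Fish: 3/12 × 2/3 × 0/3 × 2/3 = 0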
Step 4: Find the maximum of these values: max(0.02, 0, 0) = 0.02
Step 5: The maximum is 0.02, which corresponds to the class Animal, so we assign the class label "Animal" to the test instance.
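The five steps can be put together in one short function. This is a minimal sketch under the same assumptions as the worked example (relative-frequency estimates, no smoothing); `DATA` and `classify` are illustrative names, not something defined in the original note.

```python
from collections import Counter
from fractions import Fraction

# Training rows: (swim, fly, crawl, class). Row 11 uses Fly = "Long".
DATA = [
    ("Fast", "No", "No", "Fish"),
    ("Fast", "No", "Yes", "Animal"),
    ("Slow", "No", "No", "Animal"),
    ("Fast", "No", "No", "Animal"),
    ("No", "Short", "No", "Bird"),
    ("No", "Short", "No", "Bird"),
    ("No", "Rarely", "No", "Animal"),
    ("Slow", "No", "Yes", "Animal"),
    ("Slow", "No", "No", "Fish"),
    ("Slow", "No", "Yes", "Fish"),
    ("No", "Long", "No", "Bird"),
    ("Fast", "No", "No", "Bird"),
]


def classify(instance):
    """Return (best_label, scores) for a (swim, fly, crawl) tuple."""
    class_counts = Counter(row[3] for row in DATA)
    scores = {}
    for label, n in class_counts.items():
        score = Fraction(n, len(DATA))  # Step 1: prior P(class)
        for col, value in enumerate(instance):
            matches = sum(
                1 for row in DATA if row[3] == label and row[col] == value
            )
            score *= Fraction(matches, n)  # Steps 2-3: P(value | class)
        scores[label] = score
    best = max(scores, key=scores.get)  # Steps 4-5: pick the largest score
    return best, scores


best, scores = classify(("Slow", "Rarely", "No"))
print(best, {label: float(s) for label, s in scores.items()})
```

For this test instance the Bird and Fish scores are exactly zero, because a single feature value that never occurs within a class (here Fly = Rarely) zeroes the whole product.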
Conclusion
The Bayesian Classifier, also known as the Naive Bayes algorithm, is a simple yet powerful supervised learning technique used for classification tasks in machine learning. By applying Bayes’ Theorem and assuming that features are conditionally independent given the class, it can efficiently classify data into different categories. Despite its “naive” assumption, the algorithm performs well in many real-world applications such as spam filtering, sentiment analysis, and document classification. In this example, the test instance (Slow, Rarely, No) was classified as “Animal” because that class had the highest score among the three classes.
To compare with other classification approaches, see the ID3 Algorithm and Decision Tree notes in the AI & ML collection.
This note is part of the AI & ML collection on NoteHub.
