R Programming for Machine Learning with Examples
Introduction
R is one of the most popular languages for data analysis, statistics, and machine learning. Its packages and built-in datasets support data visualization, predictive modeling, clustering, classification, and association rule mining.
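In this tutorial, you will learn important machine learning algorithms in R with practical code examples, including:
Linear Regression
Logistic Regression
K-Nearest Neighbors (KNN)
Apriori Algorithm
K-Means Clustering
Decision Tree using C5.0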
These examples are useful for students, beginners, data analysts, and machine learning enthusiasts.
What is R Programming?
R is an open-source programming language designed for:
Statistical computing
Data visualization
Machine learning
Data mining
Predictive analytics
R programming is widely used in:
Data Science
Artificial Intelligence
Research
Business Analytics
Machine Learning projects
1. Linear Regression in R
Introduction to Linear Regression
Linear Regression is a supervised machine learning algorithm used to predict continuous values.
Examples:
Predicting height using weight
Predicting house prices
Predicting sales revenue
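To make the idea concrete, here is a minimal sketch that computes the least-squares line by hand for the same height and weight data used in this section and compares it with lm(). The names slope, intercept, and fit are illustrative and not part of the original example.

```r
# Same sample data as the linear regression example in this section
height = c(140, 142, 150, 147, 139, 162, 164, 136, 148, 147)
weight = c(59, 61, 66, 62, 57, 68, 69, 58, 63, 62)

# Closed-form least-squares estimates for a single predictor
slope = cov(weight, height) / var(weight)
intercept = mean(height) - slope * mean(weight)

# lm() fits the same line, so both printouts should agree
fit = lm(height ~ weight)
print(c(intercept = intercept, slope = slope))
print(coef(fit))
```

The slope says how much the predicted height changes for each one-unit increase in weight.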
R Code for Linear Regression
# Sample data: heights and weights of ten observations
height = c(140, 142, 150, 147, 139, 162, 164, 136, 148, 147)
weight = c(59, 61, 66, 62, 57, 68, 69, 58, 63, 62)
model_data = data.frame(height, weight)
print(model_data)
# Fit height as a linear function of weight
linear_model = lm(height ~ weight, data = model_data)
summary(linear_model)
# Intercept and slope of the fitted line
coefficients(linear_model)
2. Logistic Regression in R
Introduction to Logistic Regression
Logistic Regression is used for binary classification problems.
Examples:
Spam detection
Disease prediction
Pass or fail prediction
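Logistic regression turns a linear score into a probability with the logistic (sigmoid) function and then applies a threshold to pick a class. The sketch below shows only that step; the scores vector is hypothetical and chosen purely for illustration.

```r
# Logistic (sigmoid) function: maps any real-valued score into (0, 1)
sigmoid = function(z) 1 / (1 + exp(-z))

# Hypothetical linear scores, chosen only for illustration
scores = c(-4, -1, 0, 1, 4)
probabilities = sigmoid(scores)
print(round(probabilities, 3))

# Predict the positive class when the probability exceeds 0.5
predicted_positive = probabilities > 0.5
print(predicted_positive)

# Base R's plogis() computes the same function
print(all.equal(probabilities, plogis(scores)))
```

A score of 0 maps to a probability of exactly 0.5, which is why 0.5 is the usual default threshold.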
R Code for Logistic Regression
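# Load the built-in iris dataset and inspect its structure
data(iris)
str(iris)
# Keep two species so the problem is binary
iris_subset = subset(iris, Species != "virginica")
# Drop the unused "virginica" factor level
iris_subset$Species = factor(iris_subset$Species)
# Fit the model; glm treats the second factor level ("versicolor") as the positive class
logistic_model = glm(
  Species ~ Sepal.Length + Sepal.Width,
  data = iris_subset,
  family = binomial
)
summary(logistic_model)
# Predicted probability of "versicolor" for each row
predicted_prob = predict(logistic_model, type = "response")
# Convert probabilities to class labels with a 0.5 threshold
predicted_class = ifelse(predicted_prob > 0.5, "versicolor", "setosa")
# Confusion matrix of predicted vs. actual species
table(Predicted = predicted_class, Actual = iris_subset$Species)
3. K-Nearest Neighbors (KNN) in R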
Introduction to KNN
KNN is a supervised machine learning algorithm used for classification and regression.
It classifies a data point by majority vote among its k nearest neighbors in the training data.
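The nearest-neighbor vote can be sketched by hand for a single point. This is a minimal illustration using base R only, assuming Euclidean distance on unscaled features; the query vector is a hypothetical flower and the variable names are not from the original example.

```r
data(iris)
features = as.matrix(iris[, 1:4])
species = iris$Species

# Hypothetical new flower to classify (values chosen for illustration)
query = c(5.0, 3.4, 1.5, 0.2)
k = 5

# Euclidean distance from the query to every row of the data
distances = sqrt(rowSums(sweep(features, 2, query)^2))

# Species of the k closest rows, followed by a majority vote
nearest = order(distances)[1:k]
votes = table(species[nearest])
predicted = names(votes)[which.max(votes)]
print(votes)
print(predicted)
```

In practice the features are usually rescaled first so that no single column dominates the distance.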
R Code for KNN Algorithm
library(class)
data(iris)
# Separate the four numeric features from the species labels
features = iris[, 1:4]
class_labels = iris$Species
# Min-max scaling puts every feature in [0, 1] so each contributes equally to distances
normalize = function(x) {
  (x - min(x)) / (max(x) - min(x))
}
features_normalized = as.data.frame(lapply(features, normalize))
# 70/30 train/test split with a fixed seed for reproducibility
set.seed(123)
n = nrow(features_normalized)
train_index = sample(1:n, round(0.7 * n))
train_data = features_normalized[train_index, ]
test_data = features_normalized[-train_index, ]
train_label = class_labels[train_index]
test_label = class_labels[-train_index]
# Classify each test row by a vote among its 5 nearest training rows
knn_prediction = knn(
  train = train_data,
  test = test_data,
  cl = train_label,
  k = 5
)
print(knn_prediction)
# Confusion matrix and overall accuracy on the test set
confusion_matrix = table(Predicted = knn_prediction, Actual = test_label)
print(confusion_matrix)
accuracy = sum(diag(confusion_matrix)) / sum(confusion_matrix)
print(paste("Accuracy:", round(accuracy * 100, 2), "%"))
4. Apriori Algorithm in R
Introduction to Apriori Algorithm
Apriori is an association rule mining algorithm used in market basket analysis.
Examples:
Product recommendation
Shopping pattern analysis
Customer behavior prediction
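Association rules are usually judged by three measures: support (how often the items appear together), confidence (how often the rule holds when its left-hand side is present), and lift (confidence relative to how common the right-hand side is). The sketch below computes them by hand in base R for one rule, {milk} => {butter}, on the same baskets used in this section. The helper names has_items and support are hypothetical, not arules functions.

```r
# Same seven baskets as the Apriori example in this section
transactions_list = list(
  c("milk", "bread", "butter"),
  c("milk", "bread"),
  c("milk", "bread", "diaper", "butter"),
  c("bread", "butter", "diaper"),
  c("milk", "bread", "diaper", "butter"),
  c("bread", "butter"),
  c("milk", "bread", "butter", "diaper")
)

# TRUE when a basket contains every item in `items`
has_items = function(transaction, items) all(items %in% transaction)

# Fraction of baskets that contain all of `items`
support = function(items) {
  mean(sapply(transactions_list, has_items, items = items))
}

# Measures for the rule {milk} => {butter}
rule_support = support(c("milk", "butter"))
rule_confidence = rule_support / support("milk")
rule_lift = rule_confidence / support("butter")
print(c(support = rule_support, confidence = rule_confidence, lift = rule_lift))
```

A lift below 1 means the two items occur together slightly less often than they would if they were independent.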
R Code for Apriori Algorithm
library(arules)
# Seven example shopping baskets
transactions_list = list(
  c("milk", "bread", "butter"),
  c("milk", "bread"),
  c("milk", "bread", "diaper", "butter"),
  c("bread", "butter", "diaper"),
  c("milk", "bread", "diaper", "butter"),
  c("bread", "butter"),
  c("milk", "bread", "butter", "diaper")
)
# Convert the list to the arules "transactions" class
trans = as(transactions_list, "transactions")
inspect(trans)
# Itemsets that appear in at least 40% of the transactions
frequent_itemsets = apriori(
  trans,
  parameter = list(supp = 0.4, target = "frequent itemsets")
)
inspect(frequent_itemsets)
# Rules with support of at least 0.4 and confidence of at least 0.7
rules = apriori(
  trans,
  parameter = list(supp = 0.4, conf = 0.7, target = "rules")
)
# Show the rules with the highest lift first
rules_sorted = sort(rules, by = "lift", decreasing = TRUE)
inspect(rules_sorted)
# Interactive graph of the rules (the package name is case-sensitive: arulesViz)
library(arulesViz)
plot(rules_sorted, method = "graph", engine = "htmlwidget")
5. K-Means Clustering in R
Introduction to K-Means Clustering
K-Means is an unsupervised machine learning algorithm that partitions data into k groups (clusters), assigning each point to the cluster with the nearest center.
Applications:
Customer segmentation
Pattern recognition
Data grouping
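The grouping can be illustrated with a simplified assign-and-update loop (Lloyd's algorithm). This is a teaching sketch under stated assumptions: a fixed 10 iterations, random starting centers, and base R only. R's kmeans() uses the Hartigan-Wong algorithm by default, so this is not what kmeans() does internally.

```r
data(iris)
x = scale(iris[, 1:4])
k = 3

# Start from k randomly chosen rows as the initial centers
set.seed(123)
centers = x[sample(nrow(x), k), , drop = FALSE]

for (iteration in 1:10) {
  # Assignment step: squared distance from every point to every center
  dist_sq = sapply(1:k, function(j) rowSums(sweep(x, 2, centers[j, ])^2))
  # Column index of the smallest distance in each row = nearest center
  cluster = max.col(-dist_sq, ties.method = "first")

  # Update step: move each center to the mean of its assigned points
  for (j in 1:k) {
    if (any(cluster == j)) {
      centers[j, ] = colMeans(x[cluster == j, , drop = FALSE])
    }
  }
}

print(table(cluster))
print(centers)
```

A production implementation would stop when the assignments no longer change and would try several random starts, which is what the nstart argument of kmeans() does.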
R Code for K-Means Clustering
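data(iris)
# Use only the four numeric features; the species labels are not given to the algorithm
iris_data = iris[, 1:4]
# Standardize each feature to mean 0 and standard deviation 1
iris_scaled = scale(iris_data)
k = 3
set.seed(123)
# nstart = 25 tries 25 random initializations and keeps the best result
kmeans_result = kmeans(iris_scaled, centers = k, nstart = 25)
cluster = kmeans_result$cluster
centers = kmeans_result$centers
print(cluster)
print(centers)
# Compare the clusters with the true species
table(Cluster = cluster, Species = iris$Species)
# Plot two of the scaled features, colored by cluster
library(ggplot2)
iris_plot = data.frame(iris_scaled, cluster = factor(cluster))
ggplot(iris_plot, aes(x = Sepal.Length, y = Sepal.Width, color = cluster)) +
  geom_point(size = 3)
6. Decision Tree in R using C5.0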
Introduction to Decision Tree
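A decision tree is a supervised learning algorithm used for classification and regression. It predicts by applying a sequence of tests on feature values, one at each branch; the C5.0 implementation used here builds classification trees.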
Applications:
Fraud detection
Medical diagnosis
Customer analysis
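A fitted tree is essentially a set of nested if/else rules. The sketch below writes two such rules by hand for the iris data to show the shape of the result. The thresholds 2.5 and 1.75 and the function classify_flower are illustrative assumptions, not output from C5.0.

```r
data(iris)

# Hand-written rules with illustrative thresholds (not produced by C5.0)
classify_flower = function(petal_length, petal_width) {
  ifelse(
    petal_length < 2.5, "setosa",
    ifelse(petal_width < 1.75, "versicolor", "virginica")
  )
}

predicted = classify_flower(iris$Petal.Length, iris$Petal.Width)

# Compare the hand-written rules with the true species
print(table(Predicted = predicted, Actual = iris$Species))
accuracy = mean(predicted == iris$Species)
print(accuracy)
```

A tree learner such as C5.0 searches for the features and thresholds automatically instead of relying on hand-picked values.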
R Code for Decision Tree
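library(C50)
data(iris)
# Species is already a factor in iris; C5.0 requires a factor outcome
iris$Species = as.factor(iris$Species)
str(iris)
# 70/30 train/test split with a fixed seed for reproducibility
set.seed(123)
n = nrow(iris)
train_index = sample(1:n, round(0.7 * n))
train_data = iris[train_index, ]
test_data = iris[-train_index, ]
# Train a C5.0 tree that predicts Species from all other columns
tree_model = C5.0(Species ~ ., data = train_data)
summary(tree_model)
# Predict the species of the held-out rows
predictions = predict(tree_model, test_data)
# Confusion matrix and overall accuracy on the test set
confusion_matrix = table(Predicted = predictions, Actual = test_data$Species)
accuracy = sum(diag(confusion_matrix)) / sum(confusion_matrix)
print(accuracy)
print(confusion_matrix)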