Example 1: Classification
This example shows how to use the R nnet package for classification with user-defined data.
Step 1: Prepare the Data
In this step, we generate a sample dataset for classification. It includes two numeric features, x1 and x2, and a target variable, class. We generate random values for x1 and x2 using the rnorm() function. The class variable is assigned “Class A” if x1 + x2 is greater than 0, and “Class B” otherwise. The features and target variable are combined into a data frame, my_data, and the target variable class is converted to a factor using as.factor().
R
# Step 1: Prepare the Data
# Creating a sample dataset for classification
set.seed(123)

# Generating random data
n <- 200   # Number of observations
x1 <- rnorm(n, mean = 0, sd = 1)
x2 <- rnorm(n, mean = 0, sd = 1)

# Creating two classes based on a linear separation
class <- ifelse(x1 + x2 > 0, "Class A", "Class B")

# Combining the features and target variable into a data frame
my_data <- data.frame(x1, x2, class)

# Converting the target variable to a factor
my_data$class <- as.factor(my_data$class)
Let’s break down each line of code:
- set.seed(123): This sets the seed for random number generation, ensuring the reproducibility of the results. Setting the same seed generates the same random numbers each time the code is run.
- n <- 200: This assigns the number of observations we want to generate in our dataset. In this case, we generate 200 observations.
- x1 <- rnorm(n, mean = 0, sd = 1): The rnorm() function generates random numbers from a normal distribution. It takes the arguments n (number of observations), mean (mean of the distribution), and sd (standard deviation of the distribution). Here, we generate n random numbers from a normal distribution with a mean of 0 and a standard deviation of 1 and assign them to the variable x1.
- x2 <- rnorm(n, mean = 0, sd = 1): Similarly, we generate n random numbers from a normal distribution with mean 0 and standard deviation 1 and assign them to the variable x2.
- class <- ifelse(x1 + x2 > 0, "Class A", "Class B"): The ifelse() function creates the target variable class based on a condition. If the sum of x1 and x2 is greater than 0, “Class A” is assigned; otherwise, “Class B” is assigned. This creates two distinct classes separated by a linear boundary.
- my_data <- data.frame(x1, x2, class): We combine the features x1, x2, and the target variable class into a data frame called my_data. This creates our dataset with the features and corresponding target values.
- my_data$class <- as.factor(my_data$class): Finally, we convert the target variable class to a factor using the as.factor() function. This is necessary for classification models, as factors represent categorical variables with levels that can be used for classification.
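If you want to sanity-check the generated data before modeling (an optional step, not part of the walkthrough above), a quick look at the first rows and the class balance is enough:
R
# Optional: quick checks on the generated dataset
head(my_data)          # first few rows of x1, x2, and class
str(my_data)           # confirms that class is stored as a factor
table(my_data$class)   # number of observations in each class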
Step 2: Split the Data
In this step, we split the data into training and testing sets. We load the caret package, and the createDataPartition() function randomly splits the data based on the target variable my_data$class. We specify p = 0.7 to allocate 70% of the data for training and 30% for testing. The list = FALSE parameter ensures that the output is a vector of indices. The training set is assigned to train_data, and the testing set is assigned to test_data.
R
# Step 2: Split the Data
# Splitting the data into training and testing sets
library(caret)

set.seed(123)
split <- createDataPartition(my_data$class, p = 0.7, list = FALSE)
train_data <- my_data[split, ]
test_data <- my_data[-split, ]

# Ensure that the levels of the target variable are the same in train and test data
train_data$class <- factor(train_data$class, levels = levels(my_data$class))
test_data$class <- factor(test_data$class, levels = levels(my_data$class))
Let’s break down each line of code:
- library(caret): We load the caret package, which provides useful functions for data splitting, modeling, and evaluation.
- set.seed(123): This sets the seed for random number generation, ensuring the reproducibility of the results. Setting the same seed generates the same random numbers each time the code is run.
- split <- createDataPartition(my_data$class, p = 0.7, list = FALSE): The createDataPartition() function from the caret package splits the data. It takes the target variable my_data$class and splits the data into two groups based on the proportion specified by p. Here, we specify p = 0.7, which means that 70% of the data will be allocated for training. The list = FALSE parameter ensures that the output is a vector of indices.
- train_data <- my_data[split, ]: We create the training dataset by subsetting my_data using the indices obtained from the split variable. This assigns the rows corresponding to the indices to train_data.
- test_data <- my_data[-split, ]: We create the testing dataset by subsetting my_data using the negative of those indices. This assigns the rows not included in the training set to test_data.
- train_data$class <- factor(train_data$class, levels = levels(my_data$class)): The factor() function converts the class variable in the train_data dataset to a factor. We specify the levels argument to be the same as the levels of the original my_data$class variable, ensuring that the levels of train_data$class match the levels in the original dataset.
- test_data$class <- factor(test_data$class, levels = levels(my_data$class)): Similarly, the class variable in the test_data dataset is converted to a factor using the factor() function, with the levels argument set to the levels of the original my_data$class variable to ensure consistency.
By performing this step, we ensure that both the training and testing datasets have the same levels for the target variable. This is important for consistent modeling and evaluation.
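As an optional sanity check (not part of the original steps), you can confirm that the 70/30 split preserved both classes:
R
# Optional: verify the split sizes and class balance
nrow(train_data); nrow(test_data)     # roughly 70% / 30% of the 200 rows
prop.table(table(train_data$class))   # class proportions in the training set
prop.table(table(test_data$class))    # class proportions in the test set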
Step 3: Model Training
In this step, we train the neural network using the nnet() function from the nnet package. We specify the formula class ~ x1 + x2, which represents the relationship between the features x1 and x2 and the target variable class. The data parameter is set to the training data train_data. The size parameter specifies the number of nodes in the hidden layer, set to 5 in this example. The maxit parameter sets the maximum number of iterations for the training process, set to 1000.
R
# Step 3: Model Training
# Training the neural network using the nnet function
library(nnet)

# Define the formula representing the relationship between features and target
formula <- class ~ x1 + x2

# Train the neural network
model <- nnet(formula, data = train_data, size = 5, maxit = 1000)
Output:
weights: 21
initial value 101.155606
iter 10 value 0.969817
iter 20 value 0.000362
final value 0.000089
converged
Let’s break down each line of code:
- library(nnet): We load the nnet package, which provides functions for building and training neural networks.
- formula <- class ~ x1 + x2: We define the formula that represents the relationship between the features x1 and x2 and the target variable class. A formula has the format target_variable ~ predictor_variable1 + predictor_variable2 + .... In this case, class ~ x1 + x2 indicates that class is the target variable and x1 and x2 are the predictor variables.
- model <- nnet(formula, data = train_data, size = 5, maxit = 1000): We use the nnet() function to train the neural network model. The function takes several arguments:
  - formula: The formula representing the relationship between the features and the target variable.
  - data: The training data, which is train_data in this case.
  - size: The number of nodes in the hidden layer of the neural network. Here, we set it to 5, but you can choose a different value based on your specific problem.
  - maxit: The maximum number of iterations for the training process. We set it to 1000, but you can adjust it based on the convergence of the model.
The nnet() function fits a neural network model to the training data using the specified formula and parameters. The resulting model is stored in the model object.
In summary, Step 3 uses the nnet() function from the nnet package to train a neural network model. It takes the formula representing the relationship between the features and the target variable, the training data, and parameters such as the number of nodes in the hidden layer and the maximum number of iterations.
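As an optional follow-up (not part of the original steps), you can inspect the fitted object directly; summary(), n, and wts are standard components of an nnet object:
R
# Optional: inspect the fitted network
summary(model)      # network structure and fitted weights
model$n             # units per layer: 2 input, 5 hidden, 1 output
length(model$wts)   # total number of weights, 21 here: (2+1)*5 + (5+1)*1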
Step 4: Model Evaluation
In this step, we evaluate the trained model using the testing data. The predict() function is used to make predictions on test_data using the trained model. The newdata parameter specifies the data to predict on, and we set type = "class" to obtain class predictions. Next, we use the confusionMatrix() function from the caret package to create a confusion matrix. It takes the predicted values predictions and the actual values test_data$class as inputs and computes various evaluation metrics such as accuracy, precision, recall, etc. Finally, we extract the accuracy from confusion_matrix using the $overall component and print it using the print() function.
R
# Step 4: Model Evaluation
# Making predictions on the test data
predictions <- predict(model, newdata = test_data, type = "class")

# Ensure that the predicted class has the same levels as the reference class
predictions <- factor(predictions, levels = levels(test_data$class))

# Evaluating the model's performance
library(caret)
confusion_matrix <- confusionMatrix(predictions, test_data$class)
confusion_matrix

accuracy <- confusion_matrix$overall["Accuracy"]

# Print the accuracy
print(paste("Accuracy:", accuracy))
Output:
Confusion Matrix and Statistics
Reference
Prediction Class A Class B
Class A 31 0
Class B 0 28
Accuracy : 1
95% CI : (0.9394, 1)
No Information Rate : 0.5254
P-Value [Acc > NIR] : < 2.2e-16
Kappa : 1
Mcnemar's Test P-Value : NA
Sensitivity : 1.0000
Specificity : 1.0000
Pos Pred Value : 1.0000
Neg Pred Value : 1.0000
Prevalence : 0.5254
Detection Rate : 0.5254
Detection Prevalence : 0.5254
Balanced Accuracy : 1.0000
'Positive' Class : Class A
Let’s break down each line of code:
- library(caret): We load the caret package, which provides functions for model evaluation and performance metrics.
- predictions <- predict(model, newdata = test_data, type = "class"): We use the predict() function to make predictions on the testing data using the trained model. The function takes several arguments:
  - model: The trained neural network model obtained from Step 3.
  - newdata: The data on which we want to make predictions. Here, we use test_data, which is the testing dataset.
  - type: The type of prediction we want. Since this is a classification problem, we set it to "class" to obtain class predictions.
  The predicted class labels are stored in the predictions variable.
- confusion_matrix <- confusionMatrix(predictions, test_data$class): We use the confusionMatrix() function from the caret package to create a confusion matrix. The function takes the predicted values (predictions) and the actual values (test_data$class) as inputs, computes various evaluation metrics such as accuracy, precision, recall, etc., and returns a confusionMatrix object, which is stored in the confusion_matrix variable.
- accuracy <- confusion_matrix$overall["Accuracy"]: We extract the accuracy from the confusion_matrix object using the $overall component, which contains overall performance metrics; "Accuracy" selects the accuracy value.
- print(paste("Accuracy:", accuracy)): We print the accuracy value using the print() function. The paste() function concatenates the string "Accuracy:" with the actual accuracy value, producing the output below.
[1] "Accuracy: 1"
In summary, Step 4 evaluates the trained neural network model by making predictions on the testing data and computing the accuracy. The predict() function is used to obtain class predictions, the confusionMatrix() function creates a confusion matrix to evaluate model performance, and the accuracy value is extracted from the confusion matrix and printed.
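As an optional extension beyond the steps above, the same confusionMatrix object also carries per-class metrics, and the fitted model can score new observations; the new_point values below are made up purely for illustration:
R
# Optional: per-class metrics from the confusion matrix
confusion_matrix$byClass["Sensitivity"]   # recall for the positive class ("Class A")
confusion_matrix$byClass["Specificity"]

# Optional: predict the class of a hypothetical new observation
new_point <- data.frame(x1 = 0.8, x2 = -0.2)
predict(model, newdata = new_point, type = "class")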
This is how we can perform classification using the nnet package in R.
Neural Networks Using the R nnet Package
A neural network is a computational model inspired by the structure and function of the human brain. It consists of interconnected nodes, called neurons, organized into layers. The network receives input data, processes it through multiple layers of neurons, and produces an output or prediction.
The basic building block of a neural network is the neuron, which represents a computational unit. Each neuron takes input from other neurons or from the input data, performs a computation, and produces an output. The output of a neuron is typically determined by applying an activation function to the weighted sum of its inputs.
A neural network typically consists of three types of layers:
- Input Layer: This layer receives the input data and passes it to the next layer. Each neuron in the input layer corresponds to a feature or attribute of the input data.
- Hidden Layers: These layers are placed between the input and output layers and perform computations on the data. Each neuron in a hidden layer takes input from the neurons in the previous layer and produces an output that is passed to the neurons in the next layer. Hidden layers enable the network to learn complex patterns and relationships in the data.
- Output Layer: This layer produces the final output or prediction of the neural network. The number of neurons in the output layer depends on the nature of the problem. For example, in a binary classification problem, there may be one neuron representing the probability of one class and another neuron representing the probability of the other class. In a regression problem, there may be a single neuron representing the predicted numerical value.
During training, the neural network adjusts the weights and biases associated with each neuron to minimize the difference between the predicted output and the true output. This is achieved using an optimization algorithm, such as gradient descent, which iteratively updates the weights and biases based on the error or loss between the predicted and actual outputs.
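To make the iterative weight update concrete, here is a minimal, self-contained toy example of gradient descent on a single weight with a squared-error loss. It is purely illustrative of the update rule and is not the optimizer nnet uses internally:
R
# Toy illustration: fit one weight w so that w * x approximates y = 2 * x
set.seed(1)
x <- rnorm(50)
y <- 2 * x

w  <- 0     # initial weight
lr <- 0.1   # learning rate
for (i in 1:100) {
  grad <- mean(2 * x * (w * x - y))   # gradient of the mean squared error w.r.t. w
  w <- w - lr * grad                  # gradient descent update
}
w   # converges close to the true value 2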
The choice of activation function for the neurons is important, as it introduces non-linearity into the network. Common activation functions include the sigmoid function, ReLU (Rectified Linear Unit), and softmax. The activation function determines the output range of a neuron and affects the network’s ability to model complex relationships.
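For reference, these common activation functions are simple to write down in R. The definitions below are illustrative only; nnet itself applies a logistic (sigmoid) activation in its single hidden layer:
R
# Illustrative activation functions
sigmoid <- function(z) 1 / (1 + exp(-z))      # squashes any input into (0, 1)
relu    <- function(z) pmax(0, z)             # zero for negative inputs, identity otherwise
softmax <- function(z) exp(z) / sum(exp(z))   # turns a score vector into probabilities

sigmoid(0)            # 0.5
relu(c(-1, 0, 2))     # 0 0 2
softmax(c(1, 2, 3))   # probabilities that sum to 1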
Neural networks can be applied to a wide range of tasks, including classification, regression, image recognition, natural language processing, and more. They have shown great success in many domains, but their performance depends on the quality and size of the training data, the network architecture, and the appropriate selection of hyperparameters.