Train the model
In this section, we'll put He initialization to work by training a small neural network on a custom-defined dataset. We'll generate a synthetic dataset for binary classification and train a network to solve the task.
Here’s a step-by-step implementation using PyTorch:
In this example, we construct a binary classification dataset, define a basic neural network with He initialization, and train it using stochastic gradient descent (SGD) as the optimizer. We use the same library imports as before.
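For reference, these are the imports used throughout this example (the same ones discussed earlier in the article):

```python
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
```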
Generating Dataset
We define a generate_dataset function that generates random 2D features and assigns binary labels, giving us a custom-defined dataset for binary classification in the PyTorch framework.
```python
# Create a custom-defined dataset
def generate_dataset(num_samples=100):
    np.random.seed(0)
    features = np.random.rand(num_samples, 2)  # Random 2D features
    labels = (features[:, 0] + features[:, 1] > 1).astype(int)  # Binary classification
    return torch.tensor(features, dtype=torch.float32), torch.tensor(labels, dtype=torch.float32)
```
Defining the Neural Network with He Initialization
Here, we define a SimpleClassifier class with two fully connected layers and apply He initialization (kaiming_normal_) to the weights of both layers.
```python
# Define the neural network with He initialization
class SimpleClassifier(nn.Module):
    def __init__(self):
        super(SimpleClassifier, self).__init__()
        self.fc1 = nn.Linear(2, 64)
        self.fc2 = nn.Linear(64, 1)
        # Apply He initialization to the layers
        nn.init.kaiming_normal_(self.fc1.weight)
        nn.init.kaiming_normal_(self.fc2.weight)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.sigmoid(self.fc2(x))
        return x
```
Setting Hyperparameters
We set the hyperparameters for the model. Choosing hyperparameters is an important step in the ML workflow, since it allows you to fine-tune how the model trains.
```python
# Hyperparameters
learning_rate = 0.01
epochs = 1000
batch_size = 16
```
Feel free to change the hyperparameters, dataset size, or model architecture to explore further and tailor the example to your own needs.
Creating the Dataset and DataLoader
In this code, batch_size determines how many samples are processed in each iteration. We create a DataLoader, which lets us iterate through the dataset in shuffled batches.
```python
# Create the dataset and dataloader
features, labels = generate_dataset()
dataset = torch.utils.data.TensorDataset(features, labels)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)
```
Initializing the Model and Optimizer
Here, we initialize the SimpleClassifier model, an optimizer, and a binary cross-entropy loss function using the PyTorch framework. The optimizer chosen for training is stochastic gradient descent (SGD); the learning rate determines how large each weight update is.
Since this is a binary classification model, the loss function is set to binary cross-entropy, which measures the dissimilarity between the predicted probabilities and the actual binary labels.
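For a concrete sense of what this loss computes: for a predicted probability p and target label y, binary cross-entropy is -[y·log(p) + (1-y)·log(1-p)]. A quick illustrative check (the values here are made up for demonstration):

```python
# Illustrative check: BCE for a single prediction p = 0.8 with target y = 1
# Expected value: -log(0.8) ≈ 0.2231
bce = nn.BCELoss()
print(bce(torch.tensor([0.8]), torch.tensor([1.0])))  # tensor(0.2231)
```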
```python
# Initialize the model and optimizer
model = SimpleClassifier()
optimizer = optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.BCELoss()  # Binary Cross-Entropy Loss
```
After setting up these components, let's proceed to train the model.
Training Loop
In this code snippet, we write a loop that trains the model for the specified number of epochs, using the DataLoader to iterate through the dataset:
- optimizer.zero_grad(): resets the gradients of the model's parameters to zero at the start of each batch
- model(inputs): generates predictions for the input batch (the forward pass)
- criterion(outputs, targets.view(-1, 1)): calculates the binary cross-entropy loss
- loss.backward(): computes gradients via backpropagation
- optimizer.step(): updates the weights
- loss.item(): keeps track of the cumulative loss for the entire epoch
```python
# Training loop
for epoch in range(epochs):
    total_loss = 0
    for inputs, targets in dataloader:
        optimizer.zero_grad()                           # Zero the gradients
        outputs = model(inputs)                         # Forward pass
        loss = criterion(outputs, targets.view(-1, 1))  # Calculate the loss
        loss.backward()                                 # Backpropagation
        optimizer.step()                                # Update weights
        total_loss += loss.item()

    # Print the average loss every 100 epochs
    if (epoch + 1) % 100 == 0:
        average_loss = total_loss / len(dataloader)
        print(f"Epoch [{epoch + 1}/{epochs}] - Loss: {average_loss:.4f}")
```
Output:
Epoch [100/1000] - Loss: 0.4184
Epoch [200/1000] - Loss: 0.2807
Epoch [300/1000] - Loss: 0.2209
Epoch [400/1000] - Loss: 0.1875
Epoch [500/1000] - Loss: 0.1531
Epoch [600/1000] - Loss: 0.1704
Epoch [700/1000] - Loss: 0.1382
Epoch [800/1000] - Loss: 0.1160
Epoch [900/1000] - Loss: 0.1246
Epoch [1000/1000] - Loss: 0.1028
Evaluation
Lastly, we evaluate the model's performance on a test dataset and print the accuracy, which gives an idea of how well the model performs. Note that because generate_dataset seeds NumPy with 0 on every call, these 20 "test" samples are identical to the first 20 training samples, which explains the perfect accuracy reported below; a proper evaluation would use a held-out split (see the sketch after the output).
- model.eval(): sets the model to evaluation mode
- with torch.no_grad(): temporarily disables gradient computation
- predictions = model(test_samples).round().squeeze().numpy(): passes the test samples through the trained model to get predictions
- model(test_samples): forward pass on the test samples
- .round(): rounds the predicted probabilities to 0 or 1
- .squeeze(): removes unnecessary dimensions
- .numpy(): converts the predictions from a PyTorch tensor to a NumPy array
Finally, we print the accuracy.
```python
# Evaluate the trained model
model.eval()
with torch.no_grad():
    test_samples, test_labels = generate_dataset(num_samples=20)
    predictions = model(test_samples).round().squeeze().numpy()
    accuracy = (predictions == test_labels.numpy()).mean()
    print(f"Test Accuracy: {accuracy * 100:.2f}%")
```
Output:
Test Accuracy: 100.00%
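Because of the fixed seed noted above, a genuinely held-out evaluation needs fresh data. Here is a minimal sketch, assuming a hypothetical seeded variant of the dataset generator (generate_dataset_seeded is an illustrative helper, not part of the code above):

```python
# Hypothetical variant: accept a seed so evaluation data differs from training data
def generate_dataset_seeded(num_samples=100, seed=0):
    rng = np.random.default_rng(seed)
    features = rng.random((num_samples, 2))
    labels = (features[:, 0] + features[:, 1] > 1).astype(int)
    return (torch.tensor(features, dtype=torch.float32),
            torch.tensor(labels, dtype=torch.float32))

model.eval()
with torch.no_grad():
    # A different seed produces samples the model has never seen
    test_samples, test_labels = generate_dataset_seeded(num_samples=200, seed=42)
    predictions = model(test_samples).round().squeeze().numpy()
    accuracy = (predictions == test_labels.numpy()).mean()
    print(f"Held-out Accuracy: {accuracy * 100:.2f}%")
```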
Selecting a weight initialization method depends on the activation function, the network architecture, and the nature of the problem. A recommended approach is to try out various weight initialization techniques and closely observe the training process, including metrics such as training loss and convergence speed. This way, you can identify the most suitable initialization method for your particular problem, as the sketch below illustrates.
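As one way to run such a comparison, here is a minimal sketch that reuses the dataloader, criterion, and learning_rate defined above and retrains the same architecture under two schemes (He normal and Xavier uniform; the choice of schemes and the shortened loop are illustrative):

```python
# Compare weight-initialization schemes by their training loss
init_schemes = {
    "he_normal": nn.init.kaiming_normal_,
    "xavier_uniform": nn.init.xavier_uniform_,
}

for name, init_fn in init_schemes.items():
    torch.manual_seed(0)                 # same RNG state for a fair comparison
    net = SimpleClassifier()             # note: __init__ already applies He init
    init_fn(net.fc1.weight)              # overwrite with the scheme under test
    init_fn(net.fc2.weight)
    opt = optim.SGD(net.parameters(), lr=learning_rate)
    for epoch in range(100):             # short run, enough to see a trend
        for inputs, targets in dataloader:
            opt.zero_grad()
            loss = criterion(net(inputs), targets.view(-1, 1))
            loss.backward()
            opt.step()
    print(f"{name}: final-batch loss {loss.item():.4f}")
```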
Choosing an appropriate weight initialization technique is an important step in building successful deep neural networks with PyTorch, as it can have a considerable influence on your model's convergence speed and overall performance. You can make an informed choice by taking into account factors such as the activation functions, network depth, and the presence of batch normalization. Remember that experimentation and fine-tuning are essential for determining the best weight initialization for your particular scenario. With PyTorch's flexibility at your disposal, you can build DNNs that learn effectively and deliver strong results.