Visualizing Training Progress Using TensorBoard
To visualize the training process of a deep learning model, we can use the SummaryWriter class from the torch.utils.tensorboard module. It writes logs in the format read by TensorBoard, a visualization tool developed as part of the TensorFlow project.
- Integration: PyTorch provides a SummaryWriter class in the torch.utils.tensorboard module, which integrates seamlessly with TensorBoard for visualization.
- Logging: Inside the training loop, you can use SummaryWriter to log various metrics like loss, accuracy, etc., for visualization.
- Visualization: TensorBoard provides interactive and real-time visualizations of the logged metrics, allowing you to monitor the training progress dynamically.
- Monitoring: TensorBoard enables you to monitor multiple aspects of training, such as learning curves, model graphs, and histograms of weights, providing insights for optimizing your model.
Install the TensorBoard library using the following command:
pip install tensorboard
Step 1: Import Libraries
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
Step 2: Define a Simple Neural Network
Let's define SimpleNN, a simple neural network with two fully connected layers, along with a forward function that defines the forward pass of the network.
# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x
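Before training, it can help to confirm that the network accepts MNIST-shaped input and returns one score per class. The following is a minimal sketch, assuming a random dummy batch of four 1x28x28 images in place of real data; the names dummy_batch and num_params are illustrative only.

```python
import torch
import torch.nn as nn


class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)       # (N, 1, 28, 28) -> (N, 784)
        x = torch.relu(self.fc1(x))
        return self.fc2(x)            # raw class scores (logits)


model = SimpleNN()
dummy_batch = torch.randn(4, 1, 28, 28)  # assumption: stand-in for 4 MNIST images
logits = model(dummy_batch)

# Count trainable parameters: 784*128 + 128 + 128*10 + 10
num_params = sum(p.numel() for p in model.parameters())

print(logits.shape)  # torch.Size([4, 10])
print(num_params)    # 101770
```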
Step 3: Load MNIST Dataset
Let us load the MNIST training data, apply preprocessing transforms (conversion to tensors and normalization), and split it into batches. To keep the run short, only the first 1000 samples are used.
# Load a smaller subset of MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)
small_train_dataset = torch.utils.data.Subset(train_dataset, range(1000)) # Subset of first 1000 samples
train_loader = DataLoader(small_train_dataset, batch_size=64, shuffle=True)
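The number of batches per epoch follows from the subset size and the batch size, and it matters later because the per-batch logging step depends on it. A small arithmetic sketch, assuming the DataLoader default of keeping the final partial batch (drop_last=False):

```python
import math

num_samples = 1000   # size of the subset used in this step
batch_size = 64

# With drop_last=False, the last (smaller) batch is kept
batches_per_epoch = math.ceil(num_samples / batch_size)
last_batch_size = num_samples - (batches_per_epoch - 1) * batch_size

print(batches_per_epoch)  # 16
print(last_batch_size)    # 40
```

So len(train_loader) is 16 here, and the final batch of each epoch holds 40 samples instead of 64.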
Step 4: Initialize Model, Loss Function, and Optimizer
Now initialize the model. We use the cross-entropy loss function and the Adam optimizer to update the model parameters.
# Initialize model, loss function, and optimizer
model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
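nn.CrossEntropyLoss takes the raw scores (logits) from the network, which is why the forward pass has no softmax layer. As a rough sanity check, a model that scores all 10 classes equally should give a loss of ln(10), so a first-epoch loss near or below that value suggests the setup is wired correctly. The sketch below uses made-up all-zero logits as an assumption to imitate such an uninformed model.

```python
import math

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()

# Assumption: all-zero logits stand in for a model with no preference between classes
logits = torch.zeros(4, 10)
labels = torch.tensor([0, 1, 2, 3])

loss = criterion(logits, labels).item()
expected = math.log(10)  # -log(1/10) for a uniform prediction over 10 classes

print(f"{loss:.4f} vs {expected:.4f}")  # both are about 2.3026
```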
Step 5: Initialize SummaryWriter for Logging
SummaryWriter is the class imported from torch.utils.tensorboard; an instance of it writes the logs that TensorBoard visualizes. Here the logs are written to the logs_small directory.
# Initialize SummaryWriter for logging
writer = SummaryWriter('logs_small')
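If you plan to compare several training runs, one common approach is to give each run its own subdirectory under the log directory, since TensorBoard treats each subdirectory as a separate run. The helper below is a hypothetical sketch: make_run_dir and the run name "adam-lr0.001" are illustrative and not part of PyTorch or TensorBoard.

```python
import os
from datetime import datetime


def make_run_dir(base_dir, run_name, now=None):
    """Build a per-run log directory path such as logs_small/<name>-<timestamp>."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return os.path.join(base_dir, f"{run_name}-{stamp}")


# A fixed timestamp is used here only to make the result reproducible
run_dir = make_run_dir("logs_small", "adam-lr0.001", datetime(2024, 1, 2, 3, 4, 5))
print(run_dir)

# The path would then be passed to the writer:
# writer = SummaryWriter(run_dir)
```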
Step 6: Training Loop
- Training Loop: Iterates over the epochs and batches, performs the forward pass, computes the loss, runs the backward pass, and updates the model parameters.
- Log Loss and Accuracy: The training loss is logged for every batch, and the training accuracy is logged once per epoch.
# Training loop
epochs = 5
for epoch in range(epochs):
    running_loss = 0.0
    correct = 0
    total = 0
    for i, (inputs, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        # Calculate accuracy
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        # Log loss
        writer.add_scalar('Loss/train', loss.item(), epoch * len(train_loader) + i)
    # Log accuracy
    accuracy = 100 * correct / total
    writer.add_scalar('Accuracy/train', accuracy, epoch)
    print(f'Epoch [{epoch+1}/{epochs}], Loss: {running_loss / len(train_loader)}, Accuracy: {accuracy}%')
print('Finished Training')
writer.close()
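The third argument of add_scalar is the step shown on the x-axis in TensorBoard. For the per-batch loss, the expression epoch * len(train_loader) + i turns the (epoch, batch) pair into a single counter that keeps increasing across epochs. A minimal sketch of that arithmetic, assuming 16 batches per epoch (1000 samples at batch size 64); the function name global_step is illustrative:

```python
def global_step(epoch, batch_idx, batches_per_epoch):
    """Step index for per-batch logging: counts batches across all epochs."""
    return epoch * batches_per_epoch + batch_idx


epochs = 5
batches_per_epoch = 16  # assumption: ceil(1000 / 64) for this tutorial's subset

steps = [
    global_step(epoch, i, batches_per_epoch)
    for epoch in range(epochs)
    for i in range(batches_per_epoch)
]

print(steps[0], steps[15], steps[16], steps[-1])  # 0 15 16 79
```

The loss curve therefore has 80 points (one per batch), while the accuracy curve has 5 points (one per epoch), so the two plots use different x-axis scales.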
Complete Code:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.flatten(x, 1)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x
# Load a smaller subset of MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)
small_train_dataset = torch.utils.data.Subset(train_dataset, range(1000)) # Subset of first 1000 samples
train_loader = DataLoader(small_train_dataset, batch_size=64, shuffle=True)
# Initialize model, loss function, and optimizer
model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Initialize SummaryWriter for logging
writer = SummaryWriter('logs_small')
# Training loop
epochs = 5
for epoch in range(epochs):
    running_loss = 0.0
    correct = 0
    total = 0
    for i, (inputs, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        # Calculate accuracy
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
        # Log loss
        writer.add_scalar('Loss/train', loss.item(), epoch * len(train_loader) + i)
    # Log accuracy
    accuracy = 100 * correct / total
    writer.add_scalar('Accuracy/train', accuracy, epoch)
    print(f'Epoch [{epoch+1}/{epochs}], Loss: {running_loss / len(train_loader)}, Accuracy: {accuracy}%')
print('Finished Training')
writer.close()
Output:
Epoch [1/5], Loss: 1.9477481991052628, Accuracy: 37.9%
Epoch [2/5], Loss: 1.207241453230381, Accuracy: 73.4%
Epoch [3/5], Loss: 0.8120987266302109, Accuracy: 80.9%
Epoch [4/5], Loss: 0.6294657941907644, Accuracy: 84.7%
Epoch [5/5], Loss: 0.5223450381308794, Accuracy: 86.5%
Finished Training
Visualizing Training Progress in PyTorch
To run TensorBoard, open a terminal and go to the folder from which you ran the training script (the logs_small directory created in Step 5 is located there). Then start TensorBoard and point it at that log directory:
tensorboard --logdir=logs_small
To access TensorBoard, open a browser and go to the web address printed by TensorBoard (normally http://localhost:6006/).
TensorBoard provides a web-based dashboard with tabs and visualizations representing various training aspects. Scalar plots show values like loss or accuracy against the logged step (batch or epoch), offering different perspectives on training dynamics. Additionally, TensorBoard can display histograms, embeddings, and specialized visualizations, provided the corresponding data has been logged.
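Histograms only appear in TensorBoard if they are logged explicitly; the training script in this article logs scalars only. The following is a minimal sketch of logging weight histograms with SummaryWriter.add_histogram. It assumes a stand-in nn.Sequential model with the same layer sizes as SimpleNN and a temporary log directory, both chosen for illustration.

```python
import os
import tempfile

import torch.nn as nn
from torch.utils.tensorboard import SummaryWriter

# Assumption: stand-in model with the same layer sizes as SimpleNN
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)

log_dir = tempfile.mkdtemp(prefix="logs_hist_")  # hypothetical log directory
writer = SummaryWriter(log_dir)

epoch = 0  # in a real run, use the epoch counter of the training loop
logged_tags = []
for name, param in model.named_parameters():
    tag = f"Weights/{name}"
    # One histogram per parameter tensor, indexed by epoch
    writer.add_histogram(tag, param.detach().cpu(), epoch)
    logged_tags.append(tag)

writer.close()

# SummaryWriter stores its data in an events.out.tfevents.* file
event_files = [f for f in os.listdir(log_dir) if f.startswith("events.out.tfevents")]
print(logged_tags)
```

Calling this once per epoch inside the training loop would let you watch how the weight distributions shift as training proceeds.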
In this blog, we have covered how to visualize training progress in a deep learning framework using Matplotlib and TensorBoard.
How to visualize training progress in PyTorch?
In deep learning, understanding how a model learns and progresses during training is vital for optimizing performance and diagnosing problems such as underfitting or overfitting. Visualizing training progress offers valuable insights into the dynamics of learning and allows us to make sound decisions. In this article, we will learn how to visualize the training progress in PyTorch.
Two methods by which training progress can be visualized are:
- Using Matplotlib
- Using TensorBoard