CIFAR10 Dataset - Using PyTorch to Build a CNN, Enable the GPU, and Output the Results to TensorBoard
Introduction
I recently enrolled in an AI course, and this is the third assignment. It mainly refers to the following resources:
- Teaching how to build a CNN using PyTorch: PyTorch Tutorial
- Teaching how to use TensorBoard with PyTorch: PyTorch TensorBoard Tutorial
- Tutorial on using TensorBoard in Colab: TensorBoard in Colab Tutorial
The main purposes of this article are to understand CNNs, try building a deeper network, use the GPU to speed up training, and finally display the training loss and mispredicted examples on TensorBoard.
Environment Setup and Homework Requirements
Environment setup:
- Python 3.10.9
- PyTorch 2.0.1
Homework Requirements
Task:
- First build a CNN: Train the same network as in the PyTorch CNN tutorial.
- Build a CNN that meets the following requirements: Change the network architecture as follows and train the network:
- Conv layer with 3x3 kernel and depth = 8, ReLU activation
- Conv layer with 3x3 kernel and depth = 16, ReLU activation
- Max pooling with 2x2 kernel
- Conv layer with 3x3 kernel and depth = 32, ReLU activation
- Conv layer with 3x3 kernel and depth = 64, ReLU activation
- Max pooling with 2x2 kernel
- Fully connected with 4096 nodes, ReLU activation
- Fully connected with 1000 nodes, ReLU activation
- Fully connected with 10 nodes, no activation
- Use GPU and compare with CPU results: Run the training on the GPU and compare the training time to CPU.
- Log Training Loss to TensorBoard: Log the training loss in TensorBoard.
- Modify the criterion for correctness to include predictions in the top three outputs: Change the test metric as follows: a prediction is considered "correct" if the true label is within the top three outputs of the network. Print the accuracy on the test data (with respect to this new definition).
- Randomly select five examples of incorrect predictions and display them on TensorBoard: Randomly take 5 examples on which the network was wrong on the test data (according to the new definition of correct) and plot them to TensorBoard together with the true label.
- Display TensorBoard in the notebook: Show the TensorBoard widget at the end of your notebook.
- Bonus: See if you can improve results by using a deeper network (or another architecture).
Preliminary Preparation
- First, load the necessary packages
```python
import torch
```
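The code block above is truncated in the original; a plausible set of imports for the rest of this notebook (the exact list is my assumption) might look like this:

```python
import time                                        # to time GPU vs. CPU training
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.tensorboard import SummaryWriter  # for logging to TensorBoard
```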
Task 1+2 Build a CNN
- First build a CNN: Train the same network as in the PyTorch CNN tutorial.
- Build a CNN that meets the following requirements: Change the network architecture as follows and train the network.
Build a CNN
Task 2. Build a CNN that meets the following requirements: Change the network architecture as follows and train the network:
- Conv layer with 3x3 kernel and depth = 8, ReLU activation
- Conv layer with 3x3 kernel and depth = 16, ReLU activation
- Max pooling with 2x2 kernel
- Conv layer with 3x3 kernel and depth = 32, ReLU activation
- Conv layer with 3x3 kernel and depth = 64, ReLU activation
- Max pooling with 2x2 kernel
- Fully connected with 4096 nodes, ReLU activation
- Fully connected with 1000 nodes, ReLU activation
- Fully connected with 10 nodes, no activation
```python
import torch.nn as nn
```
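The rest of the block is truncated; here is a minimal sketch of a network matching the required architecture. The layer names are my own, and the flattened size `64 * 5 * 5` follows from feeding 32x32 CIFAR10 images through unpadded 3x3 convolutions (32 → 30 → 28 → pool → 14 → 12 → 10 → pool → 5):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3)    # 3x3 kernel, depth 8
        self.conv2 = nn.Conv2d(8, 16, 3)   # 3x3 kernel, depth 16
        self.conv3 = nn.Conv2d(16, 32, 3)  # 3x3 kernel, depth 32
        self.conv4 = nn.Conv2d(32, 64, 3)  # 3x3 kernel, depth 64
        self.pool = nn.MaxPool2d(2, 2)     # 2x2 max pooling
        self.fc1 = nn.Linear(64 * 5 * 5, 4096)
        self.fc2 = nn.Linear(4096, 1000)
        self.fc3 = nn.Linear(1000, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = self.pool(F.relu(self.conv2(x)))
        x = F.relu(self.conv3(x))
        x = self.pool(F.relu(self.conv4(x)))
        x = torch.flatten(x, 1)            # flatten all dims except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return self.fc3(x)                 # no activation on the final layer
```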
Task 3 + 4 GPU and Loss on TensorBoard
- Use GPU and Compare CPU Results: Run the training on the GPU and compare the training time to CPU.
- Log Training Loss on TensorBoard: Log the training loss in TensorBoard.
Accelerate Network Using GPU
Since I am using a Mac, I pass mps as the device; if you are on a Windows or Linux system with an NVIDIA GPU, pass cuda instead.
Then initialize the loss function and optimizer.
```python
# device = torch.device("cuda" if torch.cuda.is_available() else "cpu")       # NVIDIA GPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")  # Apple Silicon
```
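The initialization itself is truncated in the original; a sketch follows (the SGD hyperparameters mirror the PyTorch CIFAR-10 tutorial and are an assumption here):

```python
net = Net().to(device)             # move the model's parameters onto the chosen device

criterion = nn.CrossEntropyLoss()  # standard loss for multi-class classification
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
```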
Build the Training Loop
Now write the training loop, logging the loss to TensorBoard as we go.
```python
start_time = time.time()
```
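The loop body is truncated above; here is a sketch that matches the printed output below (the epoch count, the 400-batch logging interval, the log directory, and `trainloader` are assumptions):

```python
writer = SummaryWriter('runs/cifar10')  # hypothetical log directory

start_time = time.time()                # as in the snippet above
for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader):
        # move each mini-batch to the same device as the model
        inputs, labels = data[0].to(device), data[1].to(device)

        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 400 == 399:              # log every 400 mini-batches
            avg_loss = running_loss / 400
            elapsed = (time.time() - start_time) / 60
            print(f'[{epoch + 1}, {i + 1}] loss: {avg_loss:.3f} time elapsed: {elapsed:.0f} min')
            writer.add_scalar('training loss', avg_loss, epoch * len(trainloader) + i)
            running_loss = 0.0
```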
Result
```
[1, 3600] loss: 1.977 time elapsed: 1 min
```
Then you can run the same training on the CPU to compare the elapsed time.
```python
# Use CPU
```
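The comparison code is truncated; one way to sketch it is to switch the device and rerun the identical loop:

```python
# Use CPU
device = torch.device("cpu")
net = Net().to(device)
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
# ...then rerun the training loop above and compare the elapsed times
```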
Save Training Results
I save the model to ./model/cifar_net.pth and load it back later, so that I don't have to retrain next time.
```python
PATH = './model/cifar_net.pth'
```
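The rest of the block is truncated; the standard save/load pattern would be something like:

```python
PATH = './model/cifar_net.pth'
torch.save(net.state_dict(), PATH)  # save only the learned parameters

# Later: rebuild the architecture and load the weights back
net = Net().to(device)
net.load_state_dict(torch.load(PATH, map_location=device))
```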
Evaluate the Model Using Test Data
Task 5. Modify the criterion for correctness to include predictions in the top three outputs: Change the test metric as follows: a prediction is considered "correct" if the true label is within the top three outputs of the network. Print the accuracy on the test data (with respect to this new definition).
According to the assignment requirements, we need to do the following:
- TODO 1: Adjust the definition of accuracy so that a prediction counts as correct if the true label is among the top three outputs.
- TODO 2: Print the accuracy; here I print both the accuracy for each category and the overall accuracy.
- TODO 3: Since we need to record the wrong images, predictions, and labels, record all errors first and then randomly select five of them for later use.
```python
correct = 0
```
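The evaluation loop is truncated above; here is a sketch covering all three TODOs (`testloader` and the bookkeeping details are assumptions):

```python
correct = 0
total = 0
all_errors = []  # TODO 3: keep (image, top-3 prediction, true label) of every miss

net.eval()
with torch.no_grad():
    for data in testloader:
        images, labels = data[0].to(device), data[1].to(device)
        outputs = net(images)
        # TODO 1: take the three highest-scoring classes for each example
        _, top3 = torch.topk(outputs, k=3, dim=1)
        for i in range(labels.size(0)):
            is_correct = labels[i] in top3[i]
            # optionally: print(f'Predicted: {top3[i]} Actual: {labels[i]} Correct: {is_correct}')
            total += 1
            if is_correct:
                correct += 1
            else:
                all_errors.append((images[i].cpu(), top3[i].cpu(), labels[i].cpu()))

accuracy = correct / total  # TODO 2: overall top-3 accuracy
```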
Result
```
Predicted: tensor([3, 5, 2]) Actual: 3 Correct: True
```
Print Accuracy
Now we can print the accuracy. For reference, with 10 classes random guessing would reach about 10% under the usual top-1 metric and about 30% under this top-3 metric, so the result below is well above chance.
```python
print(f'Accuracy on test data (top-3): {100 * accuracy:.2f}%')
```
Result
```
Accuracy on test data (top-3): 91.60%
```
Task 6 Randomly Select 5 Mispredicted Images
Task 6. Randomly select five examples that were incorrectly predicted by the model and display them in TensorBoard:
Randomly take 5 examples on which the network was wrong on the test data (according to the new definition of correct) and plot them to TensorBoard together with the true label.
Setting up the Image Transformation Function
In order to display the images later, we need a helper function. The output of torchvision datasets is PILImage images in the range [0, 1], which we normalize to tensors in the range [-1, 1]. To display an image, we therefore reverse the normalization, mapping [-1, 1] back to [0, 1] with the formula $x = x_{\text{norm}} / 2 + 0.5$.
```python
# Functions to show an image
```
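The function body is truncated; a sketch along the lines of the PyTorch tutorial's imshow helper:

```python
import numpy as np
import matplotlib.pyplot as plt

def imshow(img):
    img = img / 2 + 0.5                         # unnormalize: [-1, 1] -> [0, 1]
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib
```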
Randomly Select 5 Errors
We have already collected every mispredicted image together with its predictions and true label. According to the task requirements, we now randomly select 5 of them and plot them.
```python
def plot_classes_preds(all_errors):
```
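The body is truncated above; a sketch of what it might do (the figure layout and the `classes` tuple of CIFAR10 label names are assumptions):

```python
import random

classes = ('plane', 'car', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck')

def plot_classes_preds(all_errors):
    # randomly pick 5 mispredicted examples and plot them with their true labels
    samples = random.sample(all_errors, 5)
    fig = plt.figure(figsize=(12, 4))
    for idx, (img, top3, label) in enumerate(samples):
        ax = fig.add_subplot(1, 5, idx + 1, xticks=[], yticks=[])
        imshow(img)
        ax.set_title(f'true: {classes[label.item()]}')
    return fig
```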
Result
(Figure: five randomly chosen mispredicted test images, each shown with its true label.)
Put the Image into TensorBoard
```python
# put on TensorBoard
```
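The block is truncated; with the helpers above, pushing the figure to TensorBoard is a one-liner via add_figure (the tag string is my choice):

```python
# put the matplotlib figure on TensorBoard
writer.add_figure('five wrong predictions (top-3)', plot_classes_preds(all_errors))
writer.close()
```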
Task 7 Show TensorBoard in the Notebook
- Display TensorBoard in the notebook: Show the TensorBoard widget at the end of your notebook.
```python
# Displaying TensorBoard in the notebook
```
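The block is truncated; in Jupyter or Colab the widget can be shown with the tensorboard magics (the log directory must match the one passed to SummaryWriter):

```python
# Displaying TensorBoard in the notebook
%load_ext tensorboard
%tensorboard --logdir runs
```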
Supplementary Information
Normalization vs. Standardization
Normalization vs. Standardization: What’s the Difference?
- Normalization: scaling data proportionally so that it fits into a small, fixed range such as [0, 1] or [-1, 1]. Formula: $x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
- Standardization: rescaling data proportionally so that it has a mean of 0 and a standard deviation of 1; extreme values may therefore fall outside [0, 1]. Formula: $x' = \dfrac{x - \mu}{\sigma}$
What Both Have in Common
- Both techniques scale individual features (columns) and not the feature vectors of individual samples (rows).
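A tiny numeric illustration of the two formulas (the values are my own):

```python
import torch

x = torch.tensor([2.0, 4.0, 6.0, 8.0])

# Normalization (min-max): squeeze into [0, 1]
x_norm = (x - x.min()) / (x.max() - x.min())
print(x_norm)  # tensor([0.0000, 0.3333, 0.6667, 1.0000])

# Standardization (z-score): mean 0, standard deviation 1
x_std = (x - x.mean()) / x.std()
print(x_std.mean(), x_std.std())  # ~0 and 1
```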
Why Normalize Data?
- Improved Precision: Many machine learning algorithms are based on objective functions that assume all features have zero mean and the same order of magnitude for variances. If the variance of a feature is orders of magnitude larger than that of other features, it will dominate the learning algorithm and prevent it from learning correctly. Therefore, normalization is done to make different dimensions of features comparable, significantly improving the classifier’s accuracy.
- Faster Convergence: After normalization, the process of finding the optimal solution is noticeably smoother, making it easier to converge to the optimal solution.
dim?
The dim parameter in PyTorch determines the dimension along which an operation such as torch.max is applied. Let's explain the difference using an example:
- If you set dim=0, it looks for the maximum values down each column.
- If you set dim=1, it looks for the maximum values across each row.
```python
import torch
```
```
Output tensor:
```
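Both blocks above are truncated; here is a self-contained sketch of the difference (the tensor values are my own):

```python
import torch

t = torch.tensor([[1, 5, 2],
                  [7, 3, 9]])

print(torch.max(t, dim=0).values)  # column-wise maxima: tensor([7, 5, 9])
print(torch.max(t, dim=1).values)  # row-wise maxima:    tensor([5, 9])
```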