Blog - My Blog - Page 14

Research Paper

184. Pyramid Vision Transformers

Background: Traditional CNN-backbone models struggle to adapt dynamically to different inputs, because the weights of their convolutional filters are fixed after training. Vision Transformers attempted to remove the convolution from the backbone, but since it is…

August 14, 2022

Object Detection

183. Additional Parameters To Consider For 3D Object Detection

Compared with 2D object detection, there are several additional parameters to consider when it comes to 3D object detection.

2D Object Detection:
X coordinate for the center of the bounding box
Y coordinate for the center of the bounding box…

August 13, 2022
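As a rough sketch for entry 183 above: the excerpt is cut off after the first two 2D parameters, so everything beyond the center X and Y below is an assumption. It uses one commonly seen convention (width and height for 2D; a depth coordinate, a third size dimension, and a yaw angle for 3D), and the class and field names are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class Box2D:
    cx: float      # X coordinate of the box center (from the excerpt)
    cy: float      # Y coordinate of the box center (from the excerpt)
    width: float   # assumed: box width
    height: float  # assumed: box height


@dataclass
class Box3D:
    cx: float      # X coordinate of the box center
    cy: float      # Y coordinate of the box center
    cz: float      # assumed extra parameter: Z coordinate of the center
    length: float  # assumed extra parameter: third size dimension
    width: float
    height: float
    yaw: float     # assumed extra parameter: rotation around the vertical axis


# List the parameters that exist only in the 3D representation.
names_2d = {f.name for f in fields(Box2D)}
extra_params = [f.name for f in fields(Box3D) if f.name not in names_2d]
print(extra_params)
```

Other datasets and models parameterize 3D boxes differently (for example, with full 3-axis rotation), so treat this layout as one option rather than a standard.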

Pytorch

182. Save/Load Models Using Pytorch

I’d like to share two different ways to save and load a model using PyTorch.

Saving The Entire Model

#save model
torch.save(model, PATH)

#load model
model = torch.load(PATH)
model.eval()

This save/load process has the least amount of code to implement.…

August 12, 2022

Book Review

181. Books for Deep Learning

These are my top 3 books that are helping me learn deep learning! 1. Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville 2. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems…

August 11, 2022

Image Segmentation, Research Paper

180. Polynomial Learning Rate

Polynomial Learning Rate: The learning rate is one of the most important hyperparameters when optimizing a deep neural network. Polynomial learning rate is a technique for decaying the learning rate over the course of training.…

August 10, 2022
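As a minimal sketch for entry 180 above: the excerpt does not show the formula, so this assumes the commonly used "poly" form, lr = (base_lr - end_lr) * (1 - step / max_steps) ** power + end_lr. The function name, the default power of 0.9, and the example values are assumptions for illustration.

```python
def poly_lr(base_lr: float, step: int, max_steps: int,
            power: float = 0.9, end_lr: float = 0.0) -> float:
    """Return the polynomially decayed learning rate at a given step."""
    # Clamp the step so the rate never goes below end_lr after max_steps.
    step = min(max(step, 0), max_steps)
    decay = (1.0 - step / max_steps) ** power
    return (base_lr - end_lr) * decay + end_lr


# Print the schedule at a few points of a hypothetical 1000-step run.
for step in (0, 250, 500, 750, 1000):
    print(step, round(poly_lr(0.01, step, 1000), 6))
```

With power = 1.0 this reduces to a linear decay; smaller powers keep the rate higher for longer and drop it more sharply near the end.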

Image Segmentation

179. Transfer Learning PIDNet

Today I tried transfer learning with PIDNet (since I just learned about it). Compared to my first attempt, the output is slightly better, but still not at a level where it is actually useful.

August 9, 2022

Image Segmentation

178. Bilinear Interpolation For Images

Today I learned about bilinear interpolation for images, so I’d like to share it here. To simplify the concept, here is an example of upscaling an image by a factor of 2 using bilinear interpolation. For semantic…

August 8, 2022
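As a small sketch for entry 178 above, here is bilinear upscaling by a factor of 2 in plain Python. It assumes the "align corners" convention, where the corner pixels of the input and output coincide; libraries differ on this choice, so the exact values may not match a given framework. The function name and the 2x2 test image are hypothetical.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2D list of floats with bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map the output row back to a fractional source row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            # Map the output column back to a fractional source column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            wx = x - x0
            # Interpolate along x on the two neighboring rows, then along y.
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


image = [[10.0, 20.0],
         [30.0, 40.0]]
upscaled = bilinear_resize(image, 2 * len(image), 2 * len(image[0]))
for row in upscaled:
    print([round(v, 2) for v in row])
```

Each output pixel is a weighted average of its four nearest input pixels, with weights given by how close the mapped position is to each of them.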

Research Paper

177. PIDNet

PIDNet: Today I learned about PIDNet, so I’d like to share it here. Previously, I learned about BiSeNet, which uses a two-branch architecture to address high latency. However, this architecture suffers from another problem called “overshoot”, where the boundary of…

August 7, 2022

Image Segmentation

176. CrossEntropyLoss for Segmentation Models

torch.nn.CrossEntropyLoss(): Using torch.nn.CrossEntropyLoss() as a loss function for semantic segmentation models was confusing for me at first, so I’d like to share it here. CrossEntropyLoss is for multi-class models, and it expects at least 2 arguments: one for the model prediction…

August 6, 2022
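As a sketch for entry 176 above: for segmentation, torch.nn.CrossEntropyLoss takes raw logits shaped (N, C, H, W) and integer class targets shaped (N, H, W), and by default averages the loss over all pixels. The plain-Python function below illustrates that per-pixel computation for a single image; it is not the library's implementation, and the function name and toy inputs are hypothetical.

```python
import math


def pixelwise_cross_entropy(logits, target):
    """Mean cross-entropy over all pixels of one image.

    logits: nested list [C][H][W] of raw (unnormalized) class scores.
    target: nested list [H][W] of integer class ids in [0, C).
    """
    num_classes = len(logits)
    height, width = len(target), len(target[0])
    total = 0.0
    for y in range(height):
        for x in range(width):
            scores = [logits[c][y][x] for c in range(num_classes)]
            # Numerically stable log-sum-exp over the class dimension.
            m = max(scores)
            log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
            # -log softmax(scores)[true class]
            total += log_sum_exp - scores[target[y][x]]
    return total / (height * width)


# 3 classes, 2x2 image, all-zero logits: every class is equally likely.
uniform_logits = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
labels = [[0, 1], [2, 1]]
print(pixelwise_cross_entropy(uniform_logits, labels))
```

The key point for segmentation is that the target holds one class index per pixel, with no channel dimension and no one-hot encoding.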

Machine Learning

175. Bagging

Today I learned about bagging, so I’d like to share it here. Bagging (bootstrap aggregating) trains multiple models, each on a random resample of the training data, and lets them vote on the answer. This helps reduce generalization error.

August 5, 2022
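As a toy sketch for entry 175 above, the voting idea can be shown with a few lines of plain Python. Each model is trained on a bootstrap sample (drawn with replacement) and the ensemble predicts by majority vote. The one-dimensional threshold classifier, the function names, and the data are all hypothetical choices made to keep the example short.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a 1-D threshold classifier: predict 1 when x >= threshold."""
    best_threshold, best_correct = None, -1
    for threshold, _ in samples:
        correct = sum((x >= threshold) == bool(label) for x, label in samples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold


def bagging_fit(samples, n_models=25, seed=0):
    """Train n_models stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(samples) points with replacement.
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_stump(bootstrap))
    return models


def bagging_predict(models, x):
    """Return the class that receives the most votes."""
    votes = Counter(int(x >= threshold) for threshold in models)
    return votes.most_common(1)[0][0]


data = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (3.0, 1), (3.5, 1), (4.0, 1), (4.5, 1)]
ensemble = bagging_fit(data)
print(bagging_predict(ensemble, 0.8), bagging_predict(ensemble, 4.2))
```

Individual stumps can land on different thresholds depending on their resample, but the vote averages out that variation, which is where the reduction in generalization error comes from.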