Category AI

180. Polynomial Learning Rate

For deep learning models, the learning rate is one of the most important hyperparameters in the optimization process. Polynomial Learning Rate is a technique that applies learning-rate decay to improve that process.…
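A minimal sketch of polynomial decay (the function name and default power are my own choices, not from the post): the rate is interpolated from a base value down to an end value as training progresses.

```python
def poly_lr(base_lr, step, max_steps, power=0.9, end_lr=0.0):
    """Polynomial learning-rate decay.

    Interpolates from base_lr at step 0 down to end_lr at max_steps,
    following (1 - step/max_steps) ** power.
    """
    frac = min(step, max_steps) / max_steps
    return (base_lr - end_lr) * (1.0 - frac) ** power + end_lr
```

With power=1.0 this reduces to plain linear decay; power < 1 keeps the rate higher for longer before dropping off.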

179. Transfer Learning PIDNet

Today I tried transfer learning with PIDNet (since I just learned about PIDNet). Compared to my first attempt, the output is slightly better, but still not at a level where it is actually useful.

177. PIDNet

Today I learned about PIDNet, so I’d like to share it here. Previously, I learned about BiSeNet, which uses a two-branch architecture to solve the high-latency problem. However, that architecture suffers from another problem called “overshoot,” where the boundary of…

176. CrossEntropyLoss for Segmentation Models

Using torch.nn.CrossEntropyLoss() as a loss function for semantic segmentation models was confusing for me at first, so I’d like to share it here. CrossEntropyLoss is for multi-class models and expects at least two arguments: one for the model prediction…
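The shapes are the confusing part: for segmentation, the prediction is a 4-D tensor of raw per-class scores and the target is a 3-D tensor of integer class indices (no one-hot encoding). A NumPy sketch of the same computation, just to make those shapes concrete:

```python
import numpy as np

def seg_cross_entropy(logits, target):
    """Mean cross-entropy over all pixels.

    logits: (N, C, H, W) raw scores (not softmaxed), target: (N, H, W)
    int class indices -- the same shapes torch.nn.CrossEntropyLoss
    expects for segmentation.
    """
    # log-softmax over the class axis, shifted for numerical stability
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    n, c, h, w = logits.shape
    # pick the log-probability of the correct class at every pixel
    idx_n, idx_h, idx_w = np.meshgrid(
        np.arange(n), np.arange(h), np.arange(w), indexing="ij"
    )
    return -log_probs[idx_n, target, idx_h, idx_w].mean()
```

With uniform logits over C classes, the loss comes out to log(C), a handy sanity check for a freshly initialized model.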

175. Bagging

Today I learned about Bagging, so I’d like to share it here. Bagging trains multiple models (each on a bootstrap sample of the data) and has them vote on the correct answer. This helps decrease generalization error.
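The two ingredients, sketched in plain Python (helper names are mine): bootstrap sampling with replacement, and majority voting across models.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) points with replacement -- one model's training set."""
    return [rng.choice(data) for _ in data]

def bagged_predict(models, x):
    """Each model votes on x; the majority class wins."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]
```

Because each model sees a slightly different resample, their individual errors tend to be less correlated, which is where the variance reduction comes from.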

174. Non-Max Suppression

Non-Max Suppression is a post-processing method for object detection. In most cases, an object detection model predicts multiple boxes for a single object, like the picture in my note. However, we don’t want this crowded output. We instead…
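A minimal sketch of the greedy algorithm (box format and threshold are my assumptions): repeatedly keep the highest-scoring remaining box and discard any box that overlaps it too much.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the best box, drop heavy overlaps, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep
```

In practice you would use a vectorized version (e.g. torchvision.ops.nms), but the logic is the same.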

173. BiSeNet

Background Most previous semantic segmentation architectures can be categorized into two types. Encoder-Decoder Backbone (e.g. FCN, UNet): this architecture requires all information to flow through a deep encoding-decoding structure, leading to high latency, and it also suffers when restoring…

172. Hessian Matrix

The Hessian matrix packages all the second-derivative information of a function. It can be used to determine whether a critical point of the function is a saddle point or a local extremum.
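A small sketch of that use (the finite-difference approximation and function names are mine): approximate the Hessian numerically, then classify the critical point by the signs of its eigenvalues.

```python
import numpy as np

def hessian(f, x, eps=1e-4):
    """Approximate the Hessian of f at x with central differences."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps ** 2)
    return H

def classify_critical_point(H):
    """All eigenvalues > 0: local min; all < 0: local max; mixed: saddle."""
    vals = np.linalg.eigvalsh(H)
    if np.all(vals > 0):
        return "local minimum"
    if np.all(vals < 0):
        return "local maximum"
    return "saddle point"
```

For f(x, y) = x² − y², the Hessian at the origin has eigenvalues of both signs, the classic saddle.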

171. Bayes Theorem

Bayes’ Theorem is about computing the posterior distribution from a prior distribution and the currently available data. Let’s say we want to predict a man’s occupation: is this man a librarian or a farmer, given the following description? – He…
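The arithmetic behind that setup can be sketched as follows; all numbers here are hypothetical illustrations, not from the post.

```python
def posterior(prior_a, like_a, prior_b, like_b):
    """P(A | evidence) via Bayes: prior x likelihood, normalized
    over the two competing hypotheses A and B."""
    num_a = prior_a * like_a
    num_b = prior_b * like_b
    return num_a / (num_a + num_b)

# Hypothetical numbers: farmers outnumber librarians 20:1 (prior),
# but the description fits a librarian 4x better (likelihood).
p_librarian = posterior(1 / 21, 0.8, 20 / 21, 0.2)
```

Even with the description strongly favoring "librarian", the lopsided prior keeps the posterior low, which is exactly the intuition the librarian/farmer example is meant to build.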