Blog - My Blog - Page 16

Pytorch

164. Combining Losses

Here is one way to combine losses using PyTorch. Let’s say we want to train a binary semantic segmentation model using Binary Cross Entropy Loss. When we look at the data, we notice that it is quite imbalanced. So…

July 25, 2022

Deep Learning

163. Why Normalize Inputs?

Why Do We Normalize Inputs? When the input is not normalized, the shape of the cost function can become distorted, like the diagram on the left. This leads to instability when optimizing the model. The training speed decreases depending on…

July 24, 2022

Computer Vision, Research Paper

162. Residual Blocks

Why are residual blocks called “residual” blocks? I was confused because the equation in the diagram explaining residual blocks in the research paper was f(x) + x. So I thought, “Where is the residual?” When…

July 23, 2022

AI, Research Paper

161. ESRGAN

Abstract: Even though SR-GAN made a huge improvement, there was still a gap between the generated image and the ground truth image. The proposed ESR-GAN further enhances the performance. Three Key Modification Components: Network: Remove all batch…

July 22, 2022

Research Paper

160. DeepPose

DeepPose is a research project by Google for human pose estimation. Pose Vector: First, the paper encodes all “k” body joints into a pose vector. To avoid using absolute coordinates for the body joints like right now, the paper…

July 21, 2022

Research Paper

159. M-RNN

M-RNN (Multi-Modal Recurrent Neural Network) is a research project by the University of California and the Baidu Research team that generates captions for images. In this research, a deep recurrent neural network is used for sentences, and a deep convolutional neural network is…

July 20, 2022

Research Paper

158. SR-GAN

Today I learned about SR-GAN (Super-Resolution GAN), so I’d like to share it here. In previous research, super-resolution (enhancing resolution) methods struggled to recover finer texture details at large upscaling factors. SR-GAN is the first to be able to infer images…

July 19, 2022

Research Paper

157. CycleGAN

Today I learned about CycleGAN, so I’d like to share it here. Before this research, image-to-image translation tasks (learning how to map an input image to a different style image) required “paired” datasets for training. Unfortunately, in most cases, you don’t…

July 18, 2022

AI, Computer Vision

156. Highway Networks

Training very deep networks is difficult, even when using variance-preserving initialization. Adding an information highway (learning how to route information through the network) makes it easier to train models even when they are really deep.…

July 17, 2022

Computer Vision, Pytorch

155. S3D (Separable 3D CNN)

I learned about S3D (Separable 3D CNN) today, so I’d like to share it here. S3D helps solve three challenges for video analysis: how to understand spatial information (recognizing the appearance of an object); how to understand temporal information (such as…

July 16, 2022