Category AI

Research Paper

160. DeepPose

DeepPose is research by Google on human pose estimation. Pose Vector: First, the paper encodes all “k” body joints into a pose vector. To avoid using absolute coordinates for the body joints, as the vector does at this point, the paper…

Kyosuke
July 21, 2022

Research Paper

159. M-RNN

M-RNN (Multimodal Recurrent Neural Network) is a model from researchers at the University of California and the Baidu Research team that generates captions for images. In this research, a deep recurrent neural network is used for sentences, and a deep convolutional neural network is…

Kyosuke
July 20, 2022

Research Paper

158. SR-GAN

Today I learned about SR-GAN (Super-Resolution GAN), so I’d like to share it here. In previous research, super-resolution methods (which enhance image resolution) struggled to recover finer texture details at large upscaling factors. SR-GAN is the first to be able to infer images…

Kyosuke
July 19, 2022

Research Paper

157. CycleGAN

Today I learned about CycleGAN, so I’d like to share it here. Before this research, image-to-image translation tasks (learning how to map an input image to an image in a different style) required paired datasets for training. Unfortunately, in most cases, you don’t…

Kyosuke
July 18, 2022

AI, Computer Vision

156. Highway Networks

Training very DEEP networks is difficult, even when using variance-preserving initialization. Adding an information highway (gates that learn how to route information through the network) makes a model easier to train, even when it is really DEEP.…
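As a rough illustration of the routing idea, here is a minimal sketch of a single highway layer in plain Python. It follows the gating form y = T(x) * H(x) + (1 - T(x)) * x from the Highway Networks paper; the helper names (`affine`, `highway_layer`), the tanh activation for H, and the example weights are assumptions made for this sketch, not code from the post.

```python
import math


def affine(weights, bias, x):
    """Compute W @ x + b for a list-of-rows matrix and a list vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway layer: y = T(x) * H(x) + (1 - T(x)) * x.

    H is the usual transform (tanh is an assumption here) and T is a
    sigmoid "transform gate" that decides how much of H(x) to use
    versus how much of the input to carry through unchanged.
    """
    h = [math.tanh(v) for v in affine(w_h, b_h, x)]
    t = [1.0 / (1.0 + math.exp(-v)) for v in affine(w_t, b_t, x)]
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]


# Hypothetical 2-unit example weights.
x = [0.5, -1.0]
w_h = [[1.0, 0.0], [0.0, 1.0]]
w_t = [[0.0, 0.0], [0.0, 0.0]]

# A very negative gate bias closes the gate, so the input is carried through.
carried = highway_layer(x, w_h, [0.0, 0.0], w_t, [-20.0, -20.0])
# A very positive gate bias opens the gate, so the output is close to H(x).
transformed = highway_layer(x, w_h, [0.0, 0.0], w_t, [20.0, 20.0])
print(carried, transformed)
```

With the gate closed the layer behaves almost like an identity mapping, which is the property that lets signals pass through many stacked layers.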

Kyosuke
July 17, 2022

Computer Vision, PyTorch

155. S3D (Separable 3D CNN)

I learned about S3D (Separable 3D CNN) today, so I’d like to share it here. S3D helps solve three challenges in video analysis: how to understand spatial information (recognizing the appearance of an object), how to understand temporal information (such as…

Kyosuke
July 16, 2022

154. Approaches For Tuning Models

There are two main approaches to tuning a model. Panda approach: tune one model at a time. Caviar approach: tune multiple models at once.

Kyosuke
July 14, 2022

Computer Vision, Deep Learning

153. Non-Local Neural Networks

“Local” means only understanding the CURRENT “time” and “space”. To understand “non-local” nuances (What will the person in the image do next? Where is the kicked soccer ball headed?), if we were to use traditional methods such as…

Kyosuke
July 13, 2022

AI, Statistics

152. KL Divergence

KL divergence measures how different two probability distributions are (informally a distance between them, although it is not symmetric). It can be used to understand cross-entropy and deep learning architectures such as the VAE. For example, let’s say there is a coin which has a 50% chance of being HEADS and 50%…
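To make the definition concrete, here is a small sketch that computes the KL divergence between two discrete distributions in plain Python. The fair coin matches the 50/50 coin mentioned above; the biased coin and the function name `kl_divergence` are hypothetical, chosen only for illustration.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) in bits for discrete distributions over the same outcomes.

    Terms with p_i == 0 contribute nothing. This sketch assumes q_i > 0
    wherever p_i > 0; otherwise the divergence is infinite and the
    division below would fail.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


fair = [0.5, 0.5]    # 50% HEADS, 50% TAILS
biased = [0.9, 0.1]  # hypothetical comparison coin (an assumption)

print(kl_divergence(fair, fair))    # identical distributions
print(kl_divergence(fair, biased))  # P = fair, Q = biased
print(kl_divergence(biased, fair))  # swapped arguments give a different value
```

The last two values differ (roughly 0.74 and 0.53 bits), which shows why KL divergence is not a distance in the strict sense.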

Kyosuke
July 12, 2022

AI, Statistics

151. Different Types of Optimizers

There are two main approaches to improving gradient descent: adjusting the learning rate or adjusting the gradients.
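The two families can be sketched side by side on a toy problem. In the example below, momentum stands in for "adjusting the gradients" and RMSProp for "adjusting the learning rate"; picking these two optimizers, the objective f(x) = x**2, and all hyperparameter values are assumptions made for this sketch, not choices taken from the post.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = x**2 (assumed for this demo)."""
    return 2.0 * x


def sgd_momentum(x, lr=0.1, beta=0.9, steps=200):
    """Adjust the gradient: step along a running sum of past gradients."""
    velocity = 0.0
    for _ in range(steps):
        velocity = beta * velocity + grad(x)
        x -= lr * velocity
    return x


def rmsprop(x, lr=0.01, decay=0.9, eps=1e-8, steps=2000):
    """Adjust the learning rate: divide each step by the recent gradient scale."""
    avg_sq = 0.0
    for _ in range(steps):
        g = grad(x)
        avg_sq = decay * avg_sq + (1.0 - decay) * g * g
        x -= lr * g / (math.sqrt(avg_sq) + eps)
    return x


print(sgd_momentum(5.0))  # ends close to the minimum at x = 0
print(rmsprop(5.0))       # ends close to 0, within a few multiples of lr
```

Optimizers such as Adam combine both ideas by keeping a running average of the gradient and of its square.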

Kyosuke
July 11, 2022