Category: Research Paper

173. BiSeNet

Background: Most previous semantic segmentation architectures fall into two types. Encoder-Decoder Backbone (e.g., FCN, UNet): This architecture requires all information to flow through a deep encoding-decoding structure, which leads to high latency and also suffers in restoring…

162. Residual Blocks

Why are residual blocks called “residual” blocks? I was confused because the equation in the diagram explaining residual blocks in the research paper was f(x) + x. So I thought, “Where is the residual..?” When…

161. ESRGAN

Abstract: Even though SR-GAN made a huge improvement, a gap remained between the generated image and the ground truth image. The proposed ESR-GAN further enhances performance. Three Key Modification Components Network Remove all batch…

160. DeepPose

DeepPose is a research project by Google for human pose estimation. Pose Vector: First, the paper encodes all “k” body joints into a pose vector. To avoid using absolute coordinates for the body joints like right now, the paper…

159. M-RNN

Multi-Modal Recurrent Neural Network (M-RNN) is research by the University of California and the Baidu Research team on generating captions for images. In this research, a deep recurrent neural network is used for sentences, and a deep convolutional neural network is…

158. SR-GAN

Today I learned about SR-GAN (Super-Resolution GAN), so I’d like to share it here. In previous research, super-resolution (resolution enhancement) methods struggled to recover finer texture details at large upscaling factors. SR-GAN is the first to be able to infer images…

157. CycleGAN

Today I learned about CycleGAN, so I’d like to share it here. Before this research, image-to-image translation tasks (learning how to map an input image to an image in a different style) required “paired” datasets for training. Unfortunately, in most cases, you don’t…