Category: Image Segmentation

140. Spatial Pyramid Pooling

Spatial Pyramid Pooling lets the network produce an output of the same shape regardless of the input's size and aspect ratio. Instead of pooling with a fixed filter size, it divides the input into bins at several pyramid levels, so the output would not…
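To make the fixed-output-shape idea concrete, here is a minimal sketch in plain Python. It assumes a single-channel feature map stored as a nested list, max pooling inside each bin, and pyramid levels of 1×1, 2×2 and 4×4. The function name `spp_max` and the chosen levels are illustrative assumptions, not taken from the original post.

```python
import math


def spp_max(feature_map, levels=(1, 2, 4)):
    """Max-pool one H x W feature map into a fixed-length vector.

    For each level n the map is split into an n x n grid of bins whose
    boundaries scale with the input size, so the output length is always
    sum(n * n for n in levels), whatever H and W are.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of bin i: floor for the start, ceil for the end.
            top = (i * height) // n
            bottom = math.ceil((i + 1) * height / n)
            for j in range(n):
                left = (j * width) // n
                right = math.ceil((j + 1) * width / n)
                pooled.append(
                    max(
                        feature_map[r][c]
                        for r in range(top, bottom)
                        for c in range(left, right)
                    )
                )
    return pooled


# Two inputs with different sizes and aspect ratios.
small = [[r * 7 + c for c in range(7)] for r in range(5)]  # 5 x 7
large = [[r * 6 + c for c in range(6)] for r in range(8)]  # 8 x 6

print(len(spp_max(small)), len(spp_max(large)))  # both are 1 + 4 + 16 = 21
```

Because the bin boundaries are derived from the input size rather than from a fixed filter size, both inputs yield vectors of the same length, which is what allows a fully connected layer to follow.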

135. UNet

UNet is one of the most fundamental architectures for segmentation tasks. It consists of three parts: Encoding Phase (apply convolutions to classify objects) -> Bridge -> Decoding Phase (restore spatial information so that the output is 388×388). During the final…

134. HRNet

HRNet is an architecture from Microsoft that achieved higher performance than previous state-of-the-art architectures. Traditional segmentation models use skip connections to recover spatial information from earlier layers. The problem with this method is that it can’t…

133. FCN Upscaling

To classify images more precisely, the traditional approach is to apply convolution and pooling to reduce the spatial dimensions of the input so that the model can learn more complex features. This is fine for classification tasks because you…

132. Attention UNet

Attention UNet highlights only the relevant activations during training. This not only improves performance when the target you want to detect is tiny relative to the size of the picture, but also reduces unnecessary computation. The overall…

126. ArgMax Function

Argmax compares the values at the same pixel position across channels and returns the index of the channel with the highest value. This is useful for semantic segmentation. Semantic segmentation models output a map with the same width and height as the input image and create a…
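The per-pixel comparison across channels can be sketched as follows. This is a minimal, dependency-free illustration that assumes the model output is a channels-first nested list (C × H × W) of class scores. The function name `argmax_over_channels` and the sample scores are made up for this example.

```python
def argmax_over_channels(scores):
    """Return an H x W map holding, for each pixel, the index of the
    channel with the highest score at that position.

    scores: list of C channels, each an H x W nested list (assumed layout).
    """
    num_channels = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    label_map = []
    for y in range(height):
        row = []
        for x in range(width):
            # Compare the same (y, x) position across all channels.
            best = max(range(num_channels), key=lambda c: scores[c][y][x])
            row.append(best)
        label_map.append(row)
    return label_map


# Three classes (channels) over a 2 x 2 image.
scores = [
    [[0.1, 0.7], [0.2, 0.3]],  # channel 0
    [[0.8, 0.2], [0.1, 0.3]],  # channel 1
    [[0.1, 0.1], [0.7, 0.4]],  # channel 2
]
label_map = argmax_over_channels(scores)
print(label_map)  # [[1, 0], [2, 2]]
```

With an array library the same operation is typically a single argmax call over the channel axis. The explicit loops here only make the comparison visible.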

124. Preprocessing for Deepstream

I found out why my TensorRT engine was not working as expected: I had misconfigured the preprocessing step for Deepstream. When you use Deepstream to run inference, there are properties called net-scale-factor and offsets which you…
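As a rough illustration of what these two properties do, the sketch below applies the per-channel normalization as I understand it from NVIDIA's Gst-nvinfer documentation: y = net-scale-factor * (x - offset). The function name and the example values (a scale factor of 1/255 with zero offsets) are assumptions chosen for illustration, not the settings from the original post.

```python
def normalize_pixel(pixel, net_scale_factor, offsets):
    """Apply y = net_scale_factor * (x - offset) to each channel of one pixel.

    pixel:   per-channel input values, e.g. [R, G, B] in the 0..255 range
    offsets: per-channel values subtracted before scaling (assumed to be
             in the same channel order as the pixel)
    """
    return [
        net_scale_factor * (value - offset)
        for value, offset in zip(pixel, offsets)
    ]


# Hypothetical settings: map 0..255 inputs to roughly 0..1, no mean subtraction.
scale = 1.0 / 255.0
offsets = [0.0, 0.0, 0.0]
normalized = normalize_pixel([255.0, 0.0, 127.5], scale, offsets)
print(normalized)
```

If the model was trained with a different normalization (for example mean subtraction, or a different input range), these two values have to be adjusted to match. Otherwise the engine receives inputs it was never trained on.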