Pruning
State-of-the-art deep learning techniques rely on over-parameterized models, which are hard to deploy when the target environment has limited resources.
Pruning sparsifies your neural network by dropping connections that contribute little, and it is also used to study the differences between over-parameterized and under-parameterized networks.
In PyTorch, you can use ‘torch.nn.utils.prune’ to prune your model module by module. Instead of physically removing the unnecessary connections, it creates a binary mask that identifies which connections are kept and which have been pruned.
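To make the mask idea concrete, here is a minimal sketch using ‘torch.nn.utils.prune’ on a single layer. The toy ‘nn.Linear’ layer, its sizes, and the 50% pruning amount are my own assumptions for illustration and are not taken from any particular model.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

torch.manual_seed(0)

# Hypothetical toy layer; any module with a `weight` parameter works the same way.
layer = nn.Linear(in_features=8, out_features=4)

# Mask out the 50% of weights with the smallest absolute value (L1 magnitude).
prune.l1_unstructured(layer, name="weight", amount=0.5)

# The untouched tensor is kept as the parameter `weight_orig`,
# and the binary mask is stored as the buffer `weight_mask`.
param_names = [name for name, _ in layer.named_parameters()]
buffer_names = [name for name, _ in layer.named_buffers()]

# `layer.weight` is now recomputed as weight_orig * weight_mask.
sparsity = float((layer.weight == 0).sum()) / layer.weight.nelement()

# Optional: make the pruning permanent. This drops the mask and `weight_orig`
# and leaves a plain `weight` parameter with the zeros baked in.
prune.remove(layer, "weight")
params_after_remove = [name for name, _ in layer.named_parameters()]
```

Note that the masked weights are only set to zero, so the tensor keeps its shape. On its own this does not make the model smaller or faster unless you pair it with sparse storage or a sparse-aware runtime.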
I’ve also made a blog post on the differences between structured and unstructured pruning. If you are interested, please go check it out.
Reference: Official tutorial.