169. Pruning Models

Pruning

“Pruning” means making a network sparse to speed up inference. Many of the weights in a trained network contribute little to its output, so removing them can help when resources are limited, such as when running inference on edge devices.

Methods

There are two main methods to prune a model.

  1. Unstructured Pruning
    This method removes individual unnecessary weights, in practice by setting them to zero. All neurons remain, which means some neurons might be fully connected while others are sparsely connected.

  2. Structured Pruning
    This method removes entire neurons whose weights are unnecessary, so that all remaining neurons stay fully connected. The result is a smaller but still dense network.
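
The difference between the two methods can be sketched on a single weight matrix. This is a minimal illustration in plain Python, not a production implementation. It makes several assumptions: each row of `W` holds the incoming weights of one neuron, "unnecessary" is taken to mean "small in magnitude" (a common criterion, but not the only one), and the function names, the `amount` parameter, and the example values are all hypothetical.

```python
def unstructured_prune(weights, amount):
    """Zero out roughly the `amount` fraction of weights with the smallest magnitude."""
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * amount)
    if k == 0:
        return [row[:] for row in weights]
    # Every weight at or below the k-th smallest magnitude is zeroed.
    # Ties at the threshold can zero slightly more than k weights.
    threshold = magnitudes[k - 1]
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def structured_prune(weights, amount):
    """Drop the `amount` fraction of rows (neurons) with the smallest L1 norm."""
    norms = [sum(abs(w) for w in row) for row in weights]
    n_drop = int(len(weights) * amount)
    weakest_first = sorted(range(len(weights)), key=lambda i: norms[i])
    dropped = set(weakest_first[:n_drop])
    # In a real network, the matching input columns of the next layer
    # would have to be removed as well; this sketch handles one layer only.
    return [row[:] for i, row in enumerate(weights) if i not in dropped]


# Hypothetical layer with 4 neurons (rows) and 3 inputs (columns).
W = [
    [0.9, -0.05, 0.4],
    [0.01, 0.02, -0.03],
    [-0.7, 0.6, 0.08],
    [0.1, -0.8, 0.5],
]

sparse = unstructured_prune(W, amount=0.5)   # same shape, half the entries zeroed
smaller = structured_prune(W, amount=0.25)   # one whole neuron removed

print(sparse)
print(smaller)
```

Under these assumptions, the unstructured result keeps all 4 rows but mixes zero and nonzero entries, while the structured result has only 3 rows and every remaining row is untouched. A dense but smaller matrix tends to be easier to run fast on ordinary hardware, whereas a sparse matrix usually needs sparse-aware kernels to benefit.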