169. Pruning Models

Pruning

“Pruning” means making a network sparse by removing weights, with the aim of faster inference. Many of the weights in a trained network contribute little to its output, so removing them can help when resources are limited, such as when running inference on edge devices.

Methods

There are two main methods to prune a model.

Unstructured Pruning
This method simply removes the individual weights judged unnecessary. All neurons remain, so some neurons may stay fully connected while others become sparsely connected.
Structured Pruning
This method removes entire neurons whose weights are judged unnecessary, so that all remaining neurons stay fully connected.
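
To make the difference concrete, here is a minimal sketch in plain Python on a toy weight matrix. The matrix values, the function names `unstructured_prune` and `structured_prune`, the magnitude threshold, and the use of the L1 norm to rank neurons are all assumptions chosen for illustration. Real frameworks offer their own pruning utilities and selection criteria.

```python
# Toy weight matrix (hypothetical values): each row holds the incoming
# weights of one output neuron, each column is one input connection.
weights = [
    [0.90, -0.02, 0.40],
    [0.01, 0.03, -0.02],
    [-0.70, 0.05, 0.60],
]

def unstructured_prune(w, threshold):
    """Zero out individual weights whose magnitude is below the threshold.

    The matrix keeps its shape, so every neuron remains, but some
    neurons end up with fewer non-zero connections than others.
    """
    return [[x if abs(x) >= threshold else 0.0 for x in row] for row in w]

def structured_prune(w, keep):
    """Keep only the `keep` neurons (rows) with the largest L1 norm.

    Whole rows are dropped, so the result is a smaller matrix in which
    every remaining neuron keeps all of its connections.
    """
    norms = [sum(abs(x) for x in row) for row in w]
    ranked = sorted(range(len(w)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original neuron order
    return [w[i] for i in kept]

sparse = unstructured_prune(weights, threshold=0.1)
smaller = structured_prune(weights, keep=2)

print(sparse)   # same 3x3 shape, small weights replaced by 0.0
print(smaller)  # 2x3 matrix, the weakest neuron removed entirely
```

In this sketch the unstructured result is still a 3x3 matrix, so it only runs faster if the hardware or library can exploit the zeros. The structured result is a genuinely smaller matrix, although in a real network the matching input column of the next layer would have to be removed as well.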
