201. PaDiM

PaDiM

Several methods have been proposed to combine anomaly detection (assigning an anomaly score to a whole image) and anomaly localization (assigning each pixel an anomaly score to produce an anomaly map) in a one-class learning setting, where the model is trained only on normal images and must decide whether a test image is normal or not. However, these methods either require training a deep neural network, which can be cumbersome, or run a K-nearest-neighbour (KNN) search over the entire set of normal training embeddings at test time. The linear complexity of KNN makes the time and space requirements grow with the size of the training set.

PaDiM (Patch Distribution Modeling) was proposed to mitigate the scaling issues mentioned above.

Architecture

  1. Embedding Extraction
    During the training phase, each patch of a normal image is associated with its spatially corresponding activation vectors in the activation maps of a pre-trained CNN. Activation vectors from different layers are then concatenated to obtain embedding vectors that carry information from different semantic levels and resolutions, encoding both fine-grained and global context.

  2. Learning the normality
    To learn the normal image characteristics at position (i, j), we first collect the set of patch embedding vectors at (i, j), X_ij = {x_ij^k, k ∈ [[1, N]]}, from the N normal training images. To summarize the information carried by this set, we assume that X_ij is generated by a multivariate Gaussian distribution N(μ_ij, Σ_ij). PaDiM learns this distribution by estimating the sample mean μ_ij and the sample covariance Σ_ij, with a small regularization term added to the covariance so that it stays full rank and invertible.

  3. Inference
    The paper uses the Mahalanobis distance M(x_ij) = sqrt((x_ij − μ_ij)^T Σ_ij^{-1} (x_ij − μ_ij)) to give an anomaly score to the patch at position (i, j) of a test image. The matrix of these distances forms the anomaly map, and the maximum of the map serves as the image-level anomaly score. A minimal end-to-end sketch of these three steps follows this list.
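
To make these steps concrete, below is a minimal PyTorch sketch of the whole pipeline. It is not the authors' implementation: the ResNet-18 backbone, the choice of layer1-layer3 hooks, the random 100-channel subset, the 0.01 covariance regularization constant, and the random tensors standing in for real images are all illustrative assumptions.

```python
# Minimal PaDiM-style sketch (assumed setup: ResNet-18 backbone, layers 1-3,
# random tensors standing in for a real normal-image dataset).
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

torch.manual_seed(0)
backbone = resnet18(weights=ResNet18_Weights.DEFAULT).eval()

feats = {}
def save(name):
    def hook(_module, _inputs, output):
        feats[name] = output.detach()
    return hook

for layer_name in ("layer1", "layer2", "layer3"):
    getattr(backbone, layer_name).register_forward_hook(save(layer_name))

def embed(images):
    # 1. Embedding extraction: concatenate activation maps from several
    #    layers at the finest spatial resolution (layer1 here).
    with torch.no_grad():
        backbone(images)
    h, w = feats["layer1"].shape[-2:]
    maps = [F.interpolate(feats[n], size=(h, w), mode="nearest")
            for n in ("layer1", "layer2", "layer3")]
    return torch.cat(maps, dim=1)                    # (B, C, H, W)

# 2. Learning the normality: fit N(mu_ij, sigma_ij) at every position (i, j).
train_images = torch.randn(16, 3, 224, 224)          # stand-in for N normal images
emb = embed(train_images)                            # (N, C, H, W)
N, C, H, W = emb.shape
d = 100                                              # assumed random channel subset for dimensionality reduction
idx = torch.randperm(C)[:d]
emb = emb[:, idx].permute(2, 3, 0, 1).reshape(H * W, N, d)

mu = emb.mean(dim=1)                                 # (H*W, d)
centered = emb - mu.unsqueeze(1)
cov = torch.einsum("pnc,pnd->pcd", centered, centered) / (N - 1)
cov += 0.01 * torch.eye(d)                           # regularization keeps covariances invertible
cov_inv = torch.linalg.inv(cov)                      # (H*W, d, d)

# 3. Inference: Mahalanobis distance per patch -> anomaly map and image score.
test_images = torch.randn(2, 3, 224, 224)
test_emb = embed(test_images)[:, idx].permute(2, 3, 0, 1).reshape(H * W, -1, d)
delta = test_emb - mu.unsqueeze(1)                   # (H*W, B, d)
m2 = torch.einsum("pbc,pcd,pbd->pb", delta, cov_inv, delta)
anomaly_map = m2.clamp(min=0).sqrt().permute(1, 0).reshape(-1, H, W)
image_score = anomaly_map.amax(dim=(1, 2))           # max of the map as image-level score
print(anomaly_map.shape, image_score.shape)          # (2, 56, 56), (2,)
```

In a real setting the random tensors would be replaced by a normal-only training set, and the per-position means and covariances would be computed once and stored, so test-time cost does not depend on the number of training images.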

Reference: PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization