Attention U-Net highlights only the relevant activations during training. This not only tends to perform better when the target you want to detect is small relative to the size of the image, but it can also reduce unnecessary computation.
The overall architecture is the same as U-Net, but an attention gate is added inside each skip connection. The feature maps from the shallow and deep layers are combined and passed through a ReLU, so weights where the two are aligned are amplified and weights where they are not aligned are shrunk. A sigmoid then squashes the weights into the range 0 to 1, and the result is multiplied element-wise with the original input, so that the regions most likely related to the target are emphasized.
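As a concrete illustration, here is a minimal sketch of such an additive attention gate in PyTorch. It assumes the shallow and deep feature maps already share the same spatial size (in practice the deeper gating signal is coarser and is resampled first), and the class name, channel sizes, and 1x1-convolution projections are illustrative assumptions rather than fixed details of the architecture.

```python
import torch
import torch.nn as nn


class AttentionGate(nn.Module):
    """Additive attention gate sketch.

    x : skip-connection features from the shallow (encoder) layer
    g : gating features from the deeper (decoder) layer
    Channel sizes are illustrative assumptions, not prescribed values.
    """

    def __init__(self, in_channels, gating_channels, inter_channels):
        super().__init__()
        # 1x1 convolutions project both inputs to a common intermediate size.
        self.theta_x = nn.Conv2d(in_channels, inter_channels, kernel_size=1)
        self.phi_g = nn.Conv2d(gating_channels, inter_channels, kernel_size=1)
        # psi collapses the combined map to a single-channel attention map.
        self.psi = nn.Conv2d(inter_channels, 1, kernel_size=1)
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x, g):
        # Combine shallow (x) and deep (g) feature maps; the ReLU keeps only
        # the positive, aligned responses and suppresses the rest.
        combined = self.relu(self.theta_x(x) + self.phi_g(g))
        # Sigmoid squashes the attention coefficients into (0, 1).
        alpha = self.sigmoid(self.psi(combined))
        # Element-wise multiplication re-weights the original skip features.
        return x * alpha


# Usage sketch: gate 64-channel encoder features with a 64-channel gating signal.
if __name__ == "__main__":
    gate = AttentionGate(in_channels=64, gating_channels=64, inter_channels=32)
    x = torch.randn(1, 64, 128, 128)  # shallow-layer feature map
    g = torch.randn(1, 64, 128, 128)  # deep-layer gating signal (same size here)
    out = gate(x, g)
    print(out.shape)  # torch.Size([1, 64, 128, 128])
```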