Multi-Class = Each image belongs to exactly 1 class, chosen from several possible classes.
Multi-Label = A single image can have multiple labels at the same time.
Softmax = Scales the outputs to 0~1 and makes them sum to 1, so they can be read as probabilities over the classes. Useful for multi-class classification.
Sigmoid = Scales each output to 0~1 independently. It only decides whether the input is that class or not, and the outputs for other classes have no effect on it. Useful for binary classification and for multi-label classification.
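The difference between the two activations is easiest to see on the same raw scores. Below is a minimal sketch in plain Python. The three scores, the class names in the comment, and the 0.5 threshold are assumptions made up for illustration, not values from any particular model.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the result.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    # Squashes a single score into 0~1, independent of any other score.
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical raw scores (logits) for three classes, e.g. cat, dog, bird.
logits = [2.0, 1.0, -1.0]

# Multi-class: softmax outputs sum to 1, so pick the single best class.
probs = softmax(logits)
multi_class_prediction = probs.index(max(probs))

# Multi-label: each class gets its own yes/no decision (assumed threshold 0.5).
scores = [sigmoid(x) for x in logits]
multi_label_prediction = [s >= 0.5 for s in scores]

print(probs, multi_class_prediction)
print(scores, multi_label_prediction)
```

With these scores, softmax picks only the first class, while sigmoid marks both the first and second classes as present. The sigmoid outputs do not sum to 1, which is why they should not be read as one probability distribution over the classes.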