Problem 18 - Entrance Test

Which loss function is most suitable for multi-class classification using a softmax output layer?

A. Mean Squared Error
B. Hinge Loss
C. Categorical Cross-Entropy
D. L1 Loss

Correct: C

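Categorical cross-entropy compares the predicted probability distribution (the softmax output) with the true one-hot distribution. With a one-hot target the loss reduces to the negative log of the probability assigned to the correct class, so its gradients push that probability toward 1.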
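
The explanation can be checked numerically with a minimal sketch in plain Python. The logits, the class count, and the function names below are illustrative assumptions, not part of the original problem. The sketch relies on the standard result that, for softmax followed by cross-entropy, the gradient with respect to the logits is the predicted distribution minus the one-hot target.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, target_index):
    # With a one-hot target, the loss reduces to -log(p_correct).
    return -math.log(probs[target_index])

def grad_wrt_logits(probs, target_index):
    # Gradient of the loss with respect to the logits: p - y (y is one-hot).
    return [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]

# Hypothetical scores for three classes; class 0 is assumed to be correct.
logits = [2.0, 1.0, 0.1]
target_index = 0

probs = softmax(logits)
loss = categorical_cross_entropy(probs, target_index)
grad = grad_wrt_logits(probs, target_index)

# One gradient-descent step on the logits (the step size 0.5 is arbitrary).
stepped_logits = [z - 0.5 * g for z, g in zip(logits, grad)]
stepped_probs = softmax(stepped_logits)

print("probs:", probs)
print("loss:", loss)
print("grad:", grad)
print("probs after one step:", stepped_probs)
```

The gradient entry for the correct class is negative and the others are positive, so a descent step raises the correct logit and lowers the rest, which increases the probability of the correct class.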