Which loss function is most suitable for multi-class classification using a softmax output layer?
Correct: C
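Categorical cross-entropy compares the predicted probability distribution (the softmax output) with the true one-hot distribution. Because only the true class has a nonzero target, the loss reduces to the negative log of the probability assigned to that class, and its gradient pushes that probability toward 1.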
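A minimal sketch of this in plain Python may help make the explanation concrete. The function names, the example logits, and the small eps guard are assumptions chosen for illustration and do not come from any particular library. The sketch computes a softmax, the cross-entropy against a one-hot target, and the gradient of the loss with respect to the logits, which for this pairing works out to (predicted minus target).

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probs, one_hot, eps=1e-12):
    # Only the true class (target 1) contributes to the sum.
    # eps is a hypothetical guard against log(0).
    return -sum(t * math.log(p + eps) for p, t in zip(probs, one_hot))

def grad_wrt_logits(probs, one_hot):
    # For softmax followed by cross-entropy, dL/dz_i = p_i - t_i.
    return [p - t for p, t in zip(probs, one_hot)]

# Illustrative values only: three classes, class 0 is the true label.
logits = [2.0, 1.0, 0.1]
target = [1, 0, 0]

probs = softmax(logits)
loss = categorical_cross_entropy(probs, target)
grad = grad_wrt_logits(probs, target)

print("probs:", probs)
print("loss:", loss)
print("grad:", grad)
```

The gradient entry for the true class is negative, so a gradient descent step raises that logit, while the entries for the other classes are positive, so their logits are lowered. That is the sense in which the loss pushes the correct class probability toward 1.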