Well defined Negative Log Likelihood - SoojungHong/MachineLearning GitHub Wiki

This reference gives an easy explanation of Negative Log Likelihood: https://medium.com/deeplearningmadeeasy/negative-log-likelihood-6bd79b55d8b6

Likelihood refers to the chance that some set of estimated parameters produced some known data.

Why do we wrap everything in a logarithm? Computers can do almost anything except represent numbers exactly: multiplying many small probabilities quickly underflows to zero. Taking the logarithm turns that product into a sum of moderately sized values that a computer can represent easily.
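A minimal sketch of that underflow problem: the product of many small probabilities collapses to 0.0 in floating point, while the equivalent sum of logs stays a well-behaved finite number.

```python
import math

# 100 probabilities of 1e-5 each; their product is 1e-500,
# far below the smallest positive double (~1e-308).
probs = [1e-5] * 100

product = 1.0
for p in probs:
    product *= p
print(product)  # underflows to 0.0

# The same quantity in log space: 100 * ln(1e-5) ~= -1151.3,
# which is perfectly representable.
log_sum = sum(math.log(p) for p in probs)
print(log_sum)
```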

  1. Example of calculating the negative log likelihood. How do likelihood and probabilities play together? To calculate the likelihood we use the predicted probabilities. Imagine that for some input the model produced the probabilities [0.1, 0.3, 0.5, 0.1] over 4 possible classes. If the true answer is the fourth class, encoded as the one-hot vector [0, 0, 0, 1], the likelihood of the current state of the model producing the true class is:

0\*0.1 + 0\*0.3 + 0\*0.5 + 1\*0.1 = 0.1

NLL: -ln(0.1) ≈ 2.3
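The example above can be reproduced in a few lines of plain Python: the likelihood is the dot product of the predicted probabilities with the one-hot target, and the NLL is its negated natural log.

```python
import math

# Predicted class probabilities over 4 classes (from the example above)
probs = [0.1, 0.3, 0.5, 0.1]
# One-hot encoding of the true class (the fourth class)
target = [0, 0, 0, 1]

# Likelihood of the true class: dot product with the one-hot target,
# which just picks out the probability assigned to the true class.
likelihood = sum(p * t for p, t in zip(probs, target))
print(likelihood)  # 0.1

# Negative log likelihood
nll = -math.log(likelihood)
print(round(nll, 1))  # 2.3
```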

Usage: classification. NLL can be used as the loss for any classification task (binary, multi-class, or multi-label classification).
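As a training loss, the NLL is typically averaged over a batch of samples, picking out the probability of each sample's true class. A minimal sketch with made-up predictions and labels:

```python
import math

# Each row: a predicted probability distribution over 4 classes (made-up data).
batch_probs = [
    [0.1, 0.3, 0.5, 0.1],
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.2, 0.2, 0.4],
]
# True class index for each sample
labels = [3, 0, 3]

# Mean NLL over the batch: -ln(p_true_class), averaged over samples
mean_nll = sum(-math.log(p[y]) for p, y in zip(batch_probs, labels)) / len(labels)
print(round(mean_nll, 4))
```

This is the same quantity frameworks compute as cross-entropy loss for one-hot targets; libraries like PyTorch apply it to log-probabilities directly for numerical stability.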

Another very good reference about Likelihood, Log-Likelihood, and Negative Log-Likelihood:

http://d2l.ai/chapter_appendix-mathematics-for-deep-learning/maximum-likelihood.html