Multiclass Classification with Softmax

Softmax

  • Softmax is a popular algorithm when the classification problem has more than 2 possible classes.
  • The fundamental idea of softmax is to convert the raw scores of all classes into a probability distribution:
z_j = W[j] * X + b[j]
a_j = np.exp(z_j) / (np.exp(z_1) + np.exp(z_2) + ... + np.exp(z_n))
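
The formula above can be sketched in NumPy (the function name is illustrative; subtracting the max score first is an optional but standard trick to keep `np.exp` from overflowing):

```python
import numpy as np

def softmax(z):
    """Convert a vector of raw scores z into class probabilities.

    Subtracting max(z) does not change the result but prevents
    overflow in np.exp for large scores.
    """
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()

a = softmax([2.0, 1.0, 0.1])
print(a)        # one probability per class, largest score -> largest probability
print(a.sum())  # probabilities sum to 1
```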

  • Loss calculation: the loss is the negative log of the probability the model assigned to the true class
loss =
-log a1 if y = 1
-log a2 if y = 2
...
-log an if y = n
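
Concretely, only the probability of the true class enters the loss. A minimal sketch (function name is illustrative):

```python
import numpy as np

def sparse_categorical_loss(probs, y):
    """Cross-entropy loss for one example.

    probs: softmax output, one probability per class.
    y: integer index of the true class.
    """
    return -np.log(probs[y])

probs = np.array([0.7, 0.2, 0.1])
loss = sparse_categorical_loss(probs, y=0)  # -log(0.7), about 0.357
```

A confident correct prediction (probability near 1 on the true class) gives a loss near 0; a confident wrong one gives a large loss.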

Implementation with TensorFlow

  • Important to note: when a is computed first and then plugged into the loss as a separate step, round-off error can make the results inaccurate
  • To avoid this round-off error, it is better to pass the z values (the logits) directly into the loss calculation
  • This is achieved in TensorFlow with a few minimal modifications:
    • Instead of activation = 'softmax' in the output layer, use activation = 'linear'
    • In model.compile, pass the additional argument from_logits=True to the loss
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.losses import SparseCategoricalCrossentropy

model = Sequential([
  Dense(units=20, activation='relu'),
  Dense(units=10, activation='relu'),
  Dense(units=5, activation='linear')  # use linear instead of softmax
])

# Loss calculation: from_logits=True makes the loss apply softmax internally
model.compile(loss=SparseCategoricalCrossentropy(from_logits=True))
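
The round-off problem can be seen directly in NumPy: computing softmax first and then taking the log can overflow, while folding the log into the logit computation via log a_j = z_j - log(sum of exp(z_k)) stays stable. This is a sketch of the idea, not TensorFlow's internal code:

```python
import numpy as np

def loss_via_softmax(z, y):
    # Naive two-step version: softmax first, then log.
    # exp(z) can overflow for large logits.
    a = np.exp(z) / np.exp(z).sum()
    return -np.log(a[y])

def loss_from_logits(z, y):
    # Stable version: work from the logits directly
    # (the behavior from_logits=True enables in the loss).
    z = z - np.max(z)  # shifting by a constant does not change the loss
    return -(z[y] - np.log(np.exp(z).sum()))

z = np.array([1000.0, 0.0, -1000.0])  # large logits
print(loss_via_softmax(z, 0))         # nan: exp(1000) overflows
print(loss_from_logits(z, 0))         # finite, near 0 as expected
```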

Multi-label Classification

  • This is very different from multiclass classification
  • The output here is a vector of binary classifications, one per label, each predicted independently
| Multiclass | Multi-label |
| --- | --- |
| Classes are mutually exclusive | Labels are NOT mutually exclusive |
| Output is one of more than 2 classification values | Output is a vector of binary values, e.g. [1], [0], [0] |
| One output impacts the probability of the other outputs | One output does not impact the probability of the other outputs |
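
A minimal sketch of the multi-label case, assuming one sigmoid unit per label (each output is an independent binary decision, unlike softmax, where the class probabilities are coupled through the shared denominator):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One raw score per label, e.g. "has car?", "has bus?", "has person?"
z = np.array([2.0, -1.0, 0.5])
probs = sigmoid(z)                   # each in (0, 1), independent of the others
labels = (probs >= 0.5).astype(int)  # binary vector of decisions
print(probs, labels)
```

Note that unlike softmax output, these probabilities need not sum to 1, since each label is a separate yes/no question.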