softmax

The softmax function is a mathematical function often used in machine learning—especially in classification and reinforcement learning—to convert a vector of raw scores (called logits) into a probability distribution over possible classes or actions.


✅ Definition

Given a vector of real numbers $[z_1, z_2, ..., z_n]$, the softmax function outputs:

$$ \text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^n e^{z_j}} $$

for each $i = 1, 2, \ldots, n$.
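
As a rough sketch, the formula maps directly onto a few lines of NumPy (the function name `softmax` and the choice of NumPy are illustrative, not part of the definition):

```python
import numpy as np

def softmax(z):
    """Naive softmax: exponentiate each score, then normalize so the outputs sum to 1."""
    exp_z = np.exp(z)           # e^{z_i} for each component
    return exp_z / exp_z.sum()  # divide by sum_j e^{z_j}
```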


✅ Intuition

  • It exponentiates each number to make them all positive.
  • Then it normalizes them by dividing by the sum, so they add up to 1.
  • This turns any list of numbers into something that looks like a probability distribution.
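
In practice, implementations commonly subtract the largest score before exponentiating; the shift cancels in the ratio, so the result is unchanged, but it prevents overflow for large scores. This is a standard numerical-stability trick rather than part of the definition above; a sketch:

```python
import numpy as np

def stable_softmax(z):
    """Softmax with the usual max-subtraction trick to avoid overflow."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max()       # shifting by a constant does not change the result
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()
```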



📌 Example

Say you have three scores:

z = [2.0, 1.0, 0.1]

Softmax will compute something like:

softmax(z) ≈ [0.66, 0.24, 0.10]

These numbers sum to 1 and indicate relative confidence or preference—e.g., in classification, which class the model thinks is most likely.
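
For illustration, the example can be reproduced with the `softmax` sketch from the definition section (values rounded to two decimals):

```python
z = [2.0, 1.0, 0.1]
probs = softmax(z)        # using the sketch above
print(probs.round(2))     # -> [0.66 0.24 0.1 ]
print(probs.sum())        # -> 1.0, up to floating-point rounding
```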


✅ Use Cases

  • Classification (e.g., neural networks): The final layer uses softmax to output class probabilities.
  • Policy networks (in RL): The output of the network is passed through softmax to choose an action probabilistically.
  • Temperature scaling: In reinforcement learning or generation, softmax can be modified as:

$$ \text{softmax}(z_i / T) = \frac{e^{z_i / T}}{\sum_{j=1}^n e^{z_j / T}} $$

where $T$ is a temperature parameter that controls exploration:

  • $T \to 0$: becomes greedy (sharp distribution)
  • $T \to \infty$: becomes uniform (random behavior)
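
As a small, illustrative sketch (again assuming NumPy; the function name is made up here), temperature scaling just divides the scores by $T$ before the usual softmax:

```python
import numpy as np

def softmax_with_temperature(z, T=1.0):
    """Softmax over z / T; small T sharpens the distribution, large T flattens it."""
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                # standard shift for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum()

z = [2.0, 1.0, 0.1]
print(softmax_with_temperature(z, T=0.1).round(3))  # nearly one-hot: ~[1. 0. 0.]
print(softmax_with_temperature(z, T=100).round(3))  # close to uniform: ~[0.337 0.333 0.33]
```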
