softmax
The softmax function is widely used in machine learning, especially in classification and reinforcement learning, to convert a vector of raw scores (called logits) into a probability distribution over possible classes or actions.
✅ Definition
Given a vector of real numbers $z = [z_1, z_2, \dots, z_n]$, the softmax function outputs:
$$ \text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^n e^{z_j}} $$
for each $i = 1, 2, \dots, n$.
✅ Intuition
- It exponentiates each score, which makes every entry positive and amplifies the differences between larger and smaller scores.
- It then normalizes by dividing each entry by the sum of all the exponentials, so the results add up to 1.
- This turns any list of real numbers into a valid probability distribution (a minimal code sketch follows below).
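A minimal NumPy sketch of this definition. The max-subtraction step is a standard numerical-stability trick that is not part of the formula above; it cancels in the ratio and leaves the result unchanged.

```python
import numpy as np

def softmax(z):
    """Softmax over a 1-D array of logits.

    Subtracting the max logit before exponentiating avoids overflow for
    large logits; it cancels out in the ratio, so the output is identical.
    """
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z))   # exponentiate: every entry becomes positive
    return exp_z / exp_z.sum()      # normalize: entries now sum to 1
```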
📌 Example
Say you have three scores:
z = [2.0, 1.0, 0.1]
Softmax produces approximately:
softmax(z) ≈ [0.659, 0.242, 0.099]
These numbers sum to 1 and indicate relative confidence or preference—e.g., in classification, which class the model thinks is most likely.
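Reusing the softmax function sketched above, the example can be checked directly:

```python
z = [2.0, 1.0, 0.1]
p = softmax(z)
print(p)         # ≈ [0.659 0.242 0.099]
print(p.sum())   # 1.0
```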
✅ Use Cases
- Classification (e.g., neural networks): The final layer uses softmax to output class probabilities.
- Policy networks (in RL): The output of the network is passed through softmax to choose an action probabilistically.
- Temperature scaling: In reinforcement learning or generation, softmax can be modified as:
$\text{softmax}(z_i / T)$
where $T$ is a temperature parameter that controls exploration (see the sketch after this list):
- $T \to 0$: becomes greedy (sharp distribution)
- $T \to \infty$: becomes uniform (random behavior)
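A short sketch of the temperature effect, reusing the softmax function defined earlier; the helper name, the specific $T$ values, and the sampling step are illustrative assumptions, not part of any particular library API.

```python
def softmax_with_temperature(z, T=1.0):
    """Softmax over logits z scaled by temperature T (T > 0)."""
    return softmax(np.asarray(z, dtype=float) / T)

z = [2.0, 1.0, 0.1]
print(softmax_with_temperature(z, T=0.1))    # ≈ [1.000, 0.000, 0.000] -> nearly greedy
print(softmax_with_temperature(z, T=1.0))    # ≈ [0.659, 0.242, 0.099]
print(softmax_with_temperature(z, T=100.0))  # ≈ [0.337, 0.333, 0.330] -> nearly uniform

# Probabilistic action selection, as a policy network in RL might do:
rng = np.random.default_rng(0)
action = rng.choice(len(z), p=softmax_with_temperature(z, T=1.0))
print(action)
```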