softmax
The softmax function is widely used in machine learning, especially in classification and reinforcement learning, to convert a vector of raw scores (called logits) into a probability distribution over possible classes or actions.
✅ Definition
Given a vector of real numbers $z = [z_1, z_2, \dots, z_n]$, the softmax function outputs:
$$ \text{softmax}(z_i) = \frac{e^{z_i}}{\sum_{j=1}^n e^{z_j}} $$
for each $i = 1, 2, \dots, n$.
✅ Intuition
- It exponentiates each score, which makes every entry positive and amplifies the differences between larger and smaller scores.
- It then normalizes by dividing each entry by the sum of all the exponentials, so the results add up to 1.
- This turns any list of real numbers into a valid probability distribution (a minimal code sketch follows below).
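A minimal NumPy sketch of this definition. The max-subtraction step is a standard numerical-stability trick that is not part of the formula above; it cancels in the ratio and leaves the result unchanged.

```python
import numpy as np

def softmax(z):
    """Softmax over a 1-D array of logits.

    Subtracting the max logit before exponentiating avoids overflow for
    large logits; it cancels out in the ratio, so the output is identical.
    """
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z))   # exponentiate: every entry becomes positive
    return exp_z / exp_z.sum()      # normalize: entries now sum to 1
```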
📌 Example
Say you have three scores:
z = [2.0, 1.0, 0.1]
Softmax produces approximately:
softmax(z) ≈ [0.659, 0.242, 0.099]
These numbers sum to 1 and indicate relative confidence or preference—e.g., in classification, which class the model thinks is most likely.
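Reusing the softmax function sketched above, the example can be checked directly:

```python
z = [2.0, 1.0, 0.1]
p = softmax(z)
print(p)         # ≈ [0.659 0.242 0.099]
print(p.sum())   # 1.0
```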
✅ Use Cases
- Classification (e.g., neural networks): The final layer uses softmax to output class probabilities.
- Policy networks (in RL): The output of the network is passed through softmax to choose an action probabilistically.
- Temperature scaling: In reinforcement learning or generation, softmax can be modified as:
$\text{softmax}(z_i / T)$
where $T$ is a temperature parameter that controls exploration (see the sketch after this list):
- $T \to 0$: becomes greedy (sharp distribution)
- $T \to \infty$: becomes uniform (random behavior)
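A short sketch of the temperature effect, reusing the softmax function defined earlier; the helper name, the specific $T$ values, and the sampling step are illustrative assumptions, not part of any particular library API.

```python
def softmax_with_temperature(z, T=1.0):
    """Softmax over logits z scaled by temperature T (T > 0)."""
    return softmax(np.asarray(z, dtype=float) / T)

z = [2.0, 1.0, 0.1]
print(softmax_with_temperature(z, T=0.1))    # ≈ [1.000, 0.000, 0.000] -> nearly greedy
print(softmax_with_temperature(z, T=1.0))    # ≈ [0.659, 0.242, 0.099]
print(softmax_with_temperature(z, T=100.0))  # ≈ [0.337, 0.333, 0.330] -> nearly uniform

# Probabilistic action selection, as a policy network in RL might do:
rng = np.random.default_rng(0)
action = rng.choice(len(z), p=softmax_with_temperature(z, T=1.0))
print(action)
```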