AI HW 08 Griffin - TheEvergreenStateCollege/upper-division-cs-23-24 GitHub Wiki
AI HW 08
Attention Mechanisms:
Example of self-attention:
Self-attention outputs a context vector, which can be considered an enriched embedding vector.
Q. How is it computed?
A.
In the example above, we calculate an attention score for each embedded token relative to a selected embedded query token, x^(2) (the second token in the input, not a square).
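Dot Product: each attention score is the dot product of an embedded token with the query, i.e. the sum of their element-wise products.
Example, for 3-dimensional vectors x and y: (x1 * y1) + (x2 * y2) + (x3 * y3)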
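The dot-product step can be sketched in plain Python. This is only a minimal illustration: the embedding values and the names `dot`, `inputs`, `query`, and `attention_scores` are made up for this example and are not taken from the homework.

```python
def dot(x, y):
    """Return the sum of element-wise products of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

# Hypothetical 3-dimensional token embeddings (illustrative values only).
inputs = [
    [1.0, 0.0, 2.0],  # x^(1)
    [0.5, 1.0, 0.0],  # x^(2)
    [0.0, 2.0, 1.0],  # x^(3)
]

# Select the second token as the query, matching x^(2) in the notes.
query = inputs[1]

# One attention score per input token: its dot product with the query.
attention_scores = [dot(x, query) for x in inputs]
print(attention_scores)
```

A larger dot product means the two vectors point in more similar directions, so the score can be read as a rough measure of how relevant each token is to the query.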
Next,
Attention weights are obtained by normalizing the attention scores:
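The notes do not say which normalization is used. A common choice is softmax, so the sketch below assumes it. The function name `softmax` and the score values are illustrative assumptions.

```python
import math

def softmax(scores):
    """Normalize scores into positive weights that sum to 1 (assumed softmax)."""
    # Subtracting the maximum keeps exp() from overflowing on large scores
    # and does not change the result.
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical attention scores (illustrative values only).
attention_scores = [0.5, 1.25, 2.0]

attention_weights = softmax(attention_scores)
print(attention_weights)
print(sum(attention_weights))  # approximately 1.0
```

Compared with simply dividing each score by the sum of scores, softmax always yields positive weights, even when some scores are negative.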
Then, we calculate the context vector: the sum of all embedded input tokens, each multiplied by its attention weight.
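The context vector step can be sketched as a weighted sum of the input embeddings. The embeddings, the weights, and the helper name `context_vector` are hypothetical values chosen for illustration, not results from the homework.

```python
def context_vector(inputs, weights):
    """Return the weighted sum of the input vectors, one weight per vector."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input vector")
    dim = len(inputs[0])
    # For each dimension i, add up weight * component over all input vectors.
    return [sum(w * x[i] for w, x in zip(weights, inputs)) for i in range(dim)]

# Hypothetical 3-dimensional token embeddings (illustrative values only).
inputs = [
    [1.0, 0.0, 2.0],
    [0.5, 1.0, 0.0],
    [0.0, 2.0, 1.0],
]

# Hypothetical attention weights for one query token; they sum to 1.
attention_weights = [0.25, 0.25, 0.5]

context = context_vector(inputs, attention_weights)
print(context)
```

The result has the same dimension as a single embedding, which is why it can be treated as an enriched version of the query token's embedding.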