AI HW 08 Griffin - TheEvergreenStateCollege/upper-division-cs-23-24 GitHub Wiki

AI HW 08

Attention Mechanisms

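Example of self-attention: for each input token, self-attention outputs a context vector, which can be considered an enriched embedding vector. Q. How is it computed? A. In three steps: compute attention scores, normalize them into attention weights, then take a weighted sum of the input embeddings.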

In this example, we calculate an attention score for each embedded input token with respect to a selected embedded query token, x^(2) (the second input token; the superscript is an index, not an exponent).

Dot Product

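The dot product multiplies two vectors element-wise and sums the results. Example for three-dimensional vectors x and y: (x1 * y1) + (x2 * y2) + (x3 * y3)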
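A minimal sketch of the dot product in plain Python, and of using it to score every input token against a query. The values in `inputs` are made up for illustration, and the helper name `dot` is hypothetical; neither comes from the assignment.

```python
def dot(x, y):
    """Dot product: multiply element-wise, then sum."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


# Illustrative 3-dimensional embeddings for three tokens (made-up values).
inputs = [
    [1.0, 0.0, 1.0],
    [0.0, 2.0, 1.0],
    [1.0, 1.0, 0.0],
]
query = inputs[1]  # the second token, x^(2), acts as the query

# One attention score per input token: its dot product with the query.
scores = [dot(x_i, query) for x_i in inputs]
print(scores)  # [1.0, 5.0, 2.0]
```

The query scores highest against itself here, which is typical because a vector is maximally aligned with itself.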

Next, attention weights are obtained by normalizing the attention scores so that they sum to 1. Softmax is the usual choice for this normalization.
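The normalization step can be sketched with a softmax in plain Python. This assumes softmax is the normalization used; the `scores` values are made up for illustration, and the function name `softmax` is our own helper, not a library call.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 5.0, 2.0]  # illustrative attention scores (made-up values)
weights = softmax(scores)
print(weights)
print(sum(weights))  # approximately 1.0
```

Larger scores receive larger weights, and every weight stays positive, so each weight can be read as the share of attention given to one input token.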

Then we calculate the context vector: the sum of all input embeddings, each multiplied by its attention weight.
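A short sketch of the weighted sum that produces the context vector. The `inputs` and `weights` values are made up for illustration (the weights are simply chosen to sum to 1), and `context_vector` is a hypothetical helper name.

```python
def context_vector(inputs, weights):
    """Weighted sum of the input vectors, one weight per vector."""
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input vector")
    dim = len(inputs[0])
    context = [0.0] * dim
    for x_i, w_i in zip(inputs, weights):
        for d in range(dim):
            context[d] += w_i * x_i[d]
    return context


# Illustrative embeddings and attention weights (made-up values).
inputs = [
    [1.0, 0.0, 1.0],
    [0.0, 2.0, 1.0],
    [1.0, 1.0, 0.0],
]
weights = [0.25, 0.5, 0.25]

context = context_vector(inputs, weights)
print(context)  # [0.5, 1.25, 0.75]
```

The result has the same dimension as a single input embedding, which is why it can be treated as an enriched version of the query token's embedding.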

3.4 Implementing Self-Attention with Trainable Weights