Lecture 4 - AsyDynamics/CS231n GitHub Wiki

Back propagation

Gradient at a node = local gradient $\times$ upstream gradient (chain rule)
Several operations can be combined into a single gate when the pattern is convenient, e.g. the sigmoid function
add gate - gradient distributor
max gate - gradient router
mul gate - gradient switcher
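Jacobian matrix - the local gradient when a gate's inputs and outputs are vectors
API: each gate implements a forward pass and a backward pass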
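
The gate patterns and the forward/backward API listed above can be sketched in plain Python on scalars. This is only an illustrative sketch: the class names (`AddGate`, `MulGate`, `MaxGate`, `SigmoidGate`), the argument name `dout` for the upstream gradient, and the example function and inputs are assumptions, not code from the lecture.

```python
import math


class AddGate:
    """z = x + y. Local gradients are both 1, so the upstream
    gradient is passed unchanged to both inputs (distributor)."""

    def forward(self, x, y):
        return x + y

    def backward(self, dout):
        return dout, dout


class MulGate:
    """z = x * y. Each input receives the upstream gradient scaled
    by the value of the other input (switcher)."""

    def forward(self, x, y):
        self.x, self.y = x, y  # cache inputs for the backward pass
        return x * y

    def backward(self, dout):
        return dout * self.y, dout * self.x


class MaxGate:
    """z = max(x, y). The upstream gradient goes only to the larger
    input; the other input receives zero (router)."""

    def forward(self, x, y):
        self.x_is_max = x >= y
        return max(x, y)

    def backward(self, dout):
        return (dout, 0.0) if self.x_is_max else (0.0, dout)


class SigmoidGate:
    """s = 1 / (1 + exp(-x)), treated as one gate.
    Its local gradient is (1 - s) * s."""

    def forward(self, x):
        self.s = 1.0 / (1.0 + math.exp(-x))
        return self.s

    def backward(self, dout):
        return dout * (1.0 - self.s) * self.s


# Example graph (hypothetical): f = (x + y) * max(z, w)
add, mx, mul = AddGate(), MaxGate(), MulGate()
x, y, z, w = 1.0, 2.0, 3.0, -1.0

# Forward pass, left to right through the graph
s = add.forward(x, y)
m = mx.forward(z, w)
f = mul.forward(s, m)

# Backward pass, in reverse order, starting from df/df = 1
ds, dm = mul.backward(1.0)
dx, dy = add.backward(ds)
dz, dw = mx.backward(dm)

# Sigmoid as a single combined gate, evaluated at 0
sig = SigmoidGate()
sig_out = sig.forward(0.0)
sig_grad = sig.backward(1.0)
```

Each gate caches what it needs during `forward` and only multiplies its local gradient by the upstream gradient in `backward`, so the whole graph is differentiated by calling the gates in reverse order.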

Neural network

before: linear score function, $f = Wx$
now: multiple layers with a nonlinearity in between, e.g. $f = W_2 \max(0, W_1 x)$
activation function: many types, e.g. sigmoid, tanh, ReLU
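
The step from a linear score function to a multi-layer one can be sketched as below. This is a minimal sketch under stated assumptions: ReLU is picked as the activation (the notes only say there are many types), biases are omitted, and the helper names (`matvec`, `relu`, `linear_scores`, `two_layer_scores`) and all weight values are made up for illustration.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x (list of floats)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    """Elementwise max(0, a), one possible activation function."""
    return [max(0.0, a) for a in v]


def linear_scores(W, x):
    """Before: scores are a single linear map, f = W x."""
    return matvec(W, x)


def two_layer_scores(W1, W2, x):
    """Now: f = W2 * max(0, W1 x), a hidden layer plus a linear layer."""
    h = relu(matvec(W1, x))
    return matvec(W2, h)


x = [1.0, -2.0]                              # input with 2 features

W = [[2.0, 1.0], [0.0, -1.0]]                # 2 classes x 2 inputs
lin = linear_scores(W, x)

W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # 3 hidden units x 2 inputs
W2 = [[1.0, 2.0, 3.0], [0.0, -1.0, 1.0]]     # 2 classes x 3 hidden units
two = two_layer_scores(W1, W2, x)
```

Without the activation between the layers, `W2` and `W1` would collapse into a single matrix and the model would still be a linear score function.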