
Topic: Neural Networks

Theoretical Foundations

In "Becoming a Backprop Ninja" (the latter half of Andrej Karpathy's Hacker's guide to Neural Networks), da, db, dc, and dx denote gradients, not differential operators (this is easy to misread):

da = a.grad
db = b.grad
....
dx = x.grad

That is why the guide arrives at the following derivation:

var x = a * b;
// and given gradient on x (dx), we saw that in backprop we would compute:
var da = b * dx;
var db = a * dx;
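
To spell out the chain-rule step behind those two lines: writing f for the circuit's final output (a symbol introduced here for clarity, not one used in the guide), dx is shorthand for ∂f/∂x. Since x = a·b gives ∂x/∂a = b and ∂x/∂b = a, the chain rule yields

$$
da = \frac{\partial f}{\partial a} = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial a} = dx \cdot b,
\qquad
db = \frac{\partial f}{\partial x} \cdot \frac{\partial x}{\partial b} = dx \cdot a.
$$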

This derivation in fact corresponds to the following code:

var multiplyGate = function(){ };
multiplyGate.prototype = {
  forward: function(u0, u1) {
    // store pointers to input Units u0 and u1 and output unit utop
    this.u0 = u0;  // u0 = a
    this.u1 = u1;  // u1 = b
    this.utop = new Unit(u0.value * u1.value, 0.0);
    return this.utop;
  },
  backward: function() {
    // take the gradient in output unit and chain it with the
    // local gradients, which we derived for multiply gate before
    // then write those gradients to those Units.
    this.u0.grad += this.u1.value * this.utop.grad; // da = b * dx
    this.u1.grad += this.u0.value * this.utop.grad; // db = a * dx
  }
};
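
A minimal, self-contained usage sketch: the Unit constructor below is the one defined earlier in the same guide (a forward value paired with an accumulated gradient); the concrete numbers are only for illustration.

// Unit from the guide: a value together with its accumulated gradient.
var Unit = function(value, grad) {
  this.value = value; // forward value
  this.grad = grad;   // gradient of the final output with respect to this unit
};

var a = new Unit(3.0, 0.0);
var b = new Unit(-2.0, 0.0);

var mulg = new multiplyGate();
var x = mulg.forward(a, b); // x.value = -6, x.grad = 0

x.grad = 1.0;  // seed the output gradient: dx = 1
mulg.backward();

console.log(a.grad); // da = b * dx = -2
console.log(b.grad); // db = a * dx =  3

Note that backward() accumulates with +=, so the grad fields must be reset to zero before running another backward pass.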