Gradient descent method - SilverQ/dl_study GitHub Wiki

Gradient descent (Wiki)

  • The basic idea: keep moving w (the slope of the hypothesis line) toward lower values of the cost function until the w with the lowest cost is found.
  • The first thing to decide is which function to optimize; here that is the cost function defined earlier.
  • The amount w moves by is the gradient (derivative) of the cost function at the current w, multiplied by a chosen ratio, the learning rate; the update rule is written out just below.
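For the linear hypothesis hx = w * x and the mean-squared-error cost defined earlier, the gradient works out as follows (a sketch of the standard derivation; note the code below drops the constant factor 2, which only rescales the learning rate):

    cost(w) = \frac{1}{m} \sum_{i=1}^{m} (w x_i - y_i)^2

    \frac{\partial\, cost}{\partial w} = \frac{2}{m} \sum_{i=1}^{m} (w x_i - y_i)\, x_i

    w \leftarrow w - lr \cdot \frac{\partial\, cost}{\partial w}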
# Derivative = slope = instantaneous rate of change:
# how far y moves when x moves by 1 along the x-axis.
# The arithmetic below relies on NumPy broadcasting, so keep track of how the
# dimensions of the data change at each step.
# epoch is the number of gradient-descent steps to perform.

import numpy as np

def gradient_descent(x, y, w):
    # Returns d(cost)/dw for hx = w * x; the factor 2 from differentiating
    # the square is dropped, since it only rescales the learning rate.
    hx = w * x
    return np.mean((hx - y) * x)

def show_gradient_descent(x, y, w, epoch=10, lr=0.1):
    for i in range(epoch):
        g = gradient_descent(x, y, w)
        w -= lr * g  # step against the gradient, scaled by the learning rate
        print(i, w)
    return w

x = np.array([1, 2, 3])
y = np.array([1, 2, 3])
w = 100
epoch = 100
w = show_gradient_descent(x, y, w, epoch=epoch)  # lr defaults to 0.1
print("when x = 3, y =", w * 3)