DeepLearning_Lec03 - 8BitsCoding/RobotMentor GitHub Wiki


(image)

The hypothesis is given as H(x) above,

and the cost(W, b) function is also given as above; our goal is to minimize this cost.
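Written out explicitly (standard linear-regression notation, with m training pairs (x⁽ⁱ⁾, y⁽ⁱ⁾)), the formulas on the slide are:

```latex
H(x) = Wx + b

\mathrm{cost}(W, b) = \frac{1}{m} \sum_{i=1}^{m} \left( H(x^{(i)}) - y^{(i)} \right)^2
```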


์ข€ ๋” ๋ฌธ์ œ๋ฅผ ๊ฐ„๋‹จํžˆ ํ•ด๋ณด์ž.

(image)

Dropping the bias term (b) and deriving the cost function gives the expression above.
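With b removed, the hypothesis and cost reduce to (a transcription of the slide's formula):

```latex
H(x) = Wx, \qquad \mathrm{cost}(W) = \frac{1}{m} \sum_{i=1}^{m} \left( Wx^{(i)} - y^{(i)} \right)^2
```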


์œ„ ์‹์„ ๋ฐ”ํƒ•์œผ๋กœ cost๋ฅผ ์ตœ์†Œํ™” ํ•ด๋ณด์ž

(image)

W์— ๋”ฐ๋ฅธ cost๊ฐ’์˜ ๊ทธ๋ž˜ํ”„๋ฅผ ๊ทธ๋ ค๋ณด์ž

(image)

(image)
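As a minimal sketch (the function and variable names are my own), the cost curve over W for the toy data x = y = [1, 2, 3] used in this lecture series can be computed like this:

```python
# Sketch: evaluate cost(W) = (1/m) * sum((W*x_i - y_i)^2) over a range of W
# for the bias-free model H(x) = W*x.

def cost(W, xs, ys):
    m = len(xs)
    return sum((W * x - y) ** 2 for x, y in zip(xs, ys)) / m

xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

for W in [-1.0, 0.0, 1.0, 2.0, 3.0]:
    print(f"W = {W:4.1f}  cost = {cost(W, xs, ys):.4f}")
```

The printed values trace out the bowl shape in the plot: the cost is 0 at W = 1 and grows symmetrically on either side.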

The goal is to find the value of W at which the cost is minimized.

The most widely used algorithm for this is the gradient descent algorithm.

Gradient (slope) descent (going down) algorithm = an algorithm that walks down the slope.

(image)

(image)


How is it computed?

(image)

๋ฏธ๋ถ„์„ ์ˆ˜์›”ํ•˜๊ฒŒ ํ•˜๊ธฐ ์œ„ํ•ด์„œ 1/2๋ฅผ ํ•˜๊ณ  ๋ฏธ๋ถ„์„ ์ˆ˜ํ–‰

(image)

(image)

The gradient descent algorithm can be defined as above.
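A minimal Python sketch of the update rule W := W − α·∂cost/∂W (the learning rate, starting point, and toy data here are my own choices, not from the slides):

```python
# Sketch of gradient descent for the bias-free model H(x) = W*x.
# Update rule: W := W - alpha * (1/m) * sum((W*x_i - y_i) * x_i)

def gradient(W, xs, ys):
    m = len(xs)
    return sum((W * x - y) * x for x, y in zip(xs, ys)) / m

xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

W = 5.0       # arbitrary starting point
alpha = 0.1   # learning rate (assumed)
for step in range(100):
    W -= alpha * gradient(W, xs, ys)

print(W)  # converges toward 1.0, the bottom of the cost curve
```

Each step moves W against the slope, so on this convex cost curve the iterates shrink toward the minimum at W = 1.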


Convex function

(image)

์ตœ ์ €์ ์ด ์˜ค์ง ํ•˜๋‚˜์ธ ํ•จ์ˆ˜ -> Gradient descent algorithm์—์„œ ํ•ญ์ƒ ์ข‹์€ ๊ฒฐ๊ณผ๋ฅผ ์ฐพ์„ ์ˆ˜ ์žˆ์Œ.

ํ•ญ์ƒ cost function์˜ ๋ชจ์–‘์ด Convex function์ธ์ง€๋ฅผ ํ™•์ธํ•ด์•ผํ•œ๋‹ค.