
Chapter 9: Deep Learning


1. ๋”ฅ๋Ÿฌ๋‹, ์™œ ํ•„์š”ํ•œ๊ฐ€?

  • ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜์˜ ๋‚œ์ :

    • Viewpoint ๋ณ€ํ™”(์นด๋ฉ”๋ผ ๊ฐ๋„/๊ฑฐ๋ฆฌ/์Šค์ผ€์ผ), ์กฐ๋ช…, ๋ณ€ํ˜•, ๊ฐ€๋ฆผ(occlusion), ๋ฐฐ๊ฒฝ ๋ณต์žก๋„ ๋“ฑ
    • ๊ธฐ์กด ์ „ํ†ต์  ๋ฐฉ์‹(ํ…œํ”Œ๋ฆฟ ๋งค์นญ, ํ”ผ์ฒ˜ ๊ธฐ๋ฐ˜ ๋งค์นญ)์œผ๋กœ๋Š” ํ•œ๊ณ„
  • ๋”ฅ๋Ÿฌ๋‹(Deep Neural Network):

    • ๋Œ€๊ทœ๋ชจ ๋ฐ์ดํ„ฐ/๋ณต์žกํ•œ ๋ฌธ์ œ์—์„œ๋„ โ€˜ํŠน์ง• ์ถ”์ถœ+๋ถ„๋ฅ˜โ€™๊นŒ์ง€ ์ž๋™ ํ•™์Šต
    • ์ธ๊ฐ„ ์ˆ˜์ค€/์ดˆ์›”ํ•˜๋Š” ์ธ์‹ ์ •ํ™•๋„ ๋‹ฌ์„ฑ

2. ๋”ฅ๋Ÿฌ๋‹ ์ด์ „์˜ ์ ‘๊ทผ๋ฒ•

  • ํ…œํ”Œ๋ฆฟ ๊ธฐ๋ฐ˜ ๋งค์นญ:

    • โ€œ์ด ํŒจํ„ด์ด๋ž‘ ์–ผ๋งˆ๋‚˜ ๋น„์Šท?โ€ โ†’ ํ…œํ”Œ๋ฆฟ ์ด๋ฏธ์ง€์™€ ํ”ฝ์…€๋ณ„ ๋น„๊ต
  • ํ”ผ์ฒ˜ ๊ธฐ๋ฐ˜ ๋งค์นญ:

    • ์—์ง€/์ฝ”๋„ˆ ๋“ฑ ๊ฐ•ํ•œ ํŠน์ง• ๊ฒ€์ถœ โ†’ ๋ถ„๋ฅ˜๊ธฐ๋Š” SVM, KNN ๋“ฑ ์ „ํ†ต ML ์‚ฌ์šฉ
  • ํ”ผ์ฒ˜ ๋””์Šคํฌ๋ฆฝํ„ฐ+๋ถ„๋ฅ˜๊ธฐ:

    • HOG, LBP, SIFT ๋“ฑ ํ”ผ์ฒ˜ ์ถ”์ถœ ํ›„ ๋ถ„๋ฅ˜
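
A minimal sketch of template-based matching using OpenCV. The library choice and the file names "scene.png" / "template.png" are assumptions for illustration, not from the chapter:

```python
# Slide a small template over a scene image and report the best-matching
# location, scoring each position with normalized cross-correlation.
import cv2

scene = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)    # placeholder file names
templ = cv2.imread("template.png", cv2.IMREAD_GRAYSCALE)

scores = cv2.matchTemplate(scene, templ, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)

h, w = templ.shape
print(f"best match at {max_loc}, score {max_val:.3f}, box size {w}x{h}")
```

This kind of raw pixel comparison is exactly what breaks down under the viewpoint and illumination changes listed in section 1, which is what motivates learned features.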

3. ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ณธ ๊ตฌ์กฐ(์‹ ๊ฒฝ๋ง ๋„คํŠธ์›Œํฌ)

  • Neural Network:

    • ์ž…๋ ฅ์ธต(Features) โ†’ ์—ฌ๋Ÿฌ ๊ฐœ์˜ ์€๋‹‰์ธต(Hidden Layers) โ†’ ์ถœ๋ ฅ์ธต(Classes)
    • ๊ฐ ์—ฐ๊ฒฐ๋งˆ๋‹ค Weight, Bias
    • Activation Function(ReLU, Sigmoid, Softmax ๋“ฑ)
    • ํ•™์Šต: ์ •๋‹ต(๋ผ๋ฒจ)๊ณผ ์˜ˆ์ธก๊ฐ’์˜ ์ฐจ์ด(Loss)๋ฅผ ์—ญ์ „ํŒŒ(Backpropagation)๋กœ ์ตœ์†Œํ™”
  • Supervised/Unsupervised/Semi-supervised Learning

    • ๋Œ€ํ‘œ์  ๋ฐ์ดํ„ฐ์…‹: MNIST(์†๊ธ€์”จ), CIFAR, ImageNet ๋“ฑ
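
A minimal NumPy sketch of the structure above: input features → one hidden layer with ReLU → softmax class scores, trained by minimizing cross-entropy with gradient descent. The layer sizes and random data are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 20))            # 64 samples, 20 input features
y = rng.integers(0, 3, size=64)          # 3 classes

W1, b1 = rng.normal(scale=0.1, size=(20, 32)), np.zeros(32)
W2, b2 = rng.normal(scale=0.1, size=(32, 3)), np.zeros(3)
lr = 0.1

for step in range(100):
    # Forward pass
    h = np.maximum(0, X @ W1 + b1)                     # hidden layer + ReLU
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)                  # softmax
    loss = -np.log(p[np.arange(len(y)), y]).mean()     # cross-entropy

    # Backward pass (backpropagation)
    dlogits = p.copy()
    dlogits[np.arange(len(y)), y] -= 1
    dlogits /= len(y)
    dW2, db2 = h.T @ dlogits, dlogits.sum(0)
    dh = dlogits @ W2.T
    dh[h <= 0] = 0                                     # ReLU gradient
    dW1, db1 = X.T @ dh, dh.sum(0)

    # Gradient-descent update of every weight and bias
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
```

The three blocks mirror the bullets above: forward pass, loss, and backpropagation, followed by the gradient-descent update of the weights and biases.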

4. CNN (Convolutional Neural Network): Structure and Core Operations

  • Convolution layer:

    • Applies filters (kernels) to the input image to extract local features (edges, textures, object parts, etc.)
    • Stride and padding matter because they control the feature-map size: output size $= (N - F + 2P)/S + 1$ for input size $N$, filter size $F$, padding $P$, stride $S$
  • ReLU (activation):

    • Adds non-linearity; negative values are clamped to 0
  • Pooling layer:

    • Max/average pooling, etc.; shrinks the spatial size and improves invariance (to position, noise, etc.)
  • Fully connected layer:

    • Connects all nodes at the final stage and outputs per-class probabilities (Softmax)
  • Loss function:

    • Cross-entropy loss, MSE, etc.
  • Training:

    • Forward pass → compute loss → update the parameters (W, b) via backpropagation (a minimal end-to-end sketch follows this list)
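
A minimal Conv → ReLU → Pool → FC sketch. PyTorch is an assumed framework choice (the chapter does not prescribe one), and the 1×28×28 input with 10 classes is an MNIST-like placeholder:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1),  # convolution layer
            nn.ReLU(),                                            # non-linearity
            nn.MaxPool2d(2),                                      # 28x28 -> 14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                                      # 14x14 -> 7x7
        )
        self.classifier = nn.Linear(16 * 7 * 7, num_classes)      # fully connected

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))                      # class scores (logits)

model = TinyCNN()
x = torch.randn(4, 1, 28, 28)                                     # dummy batch
loss = nn.CrossEntropyLoss()(model(x), torch.tensor([0, 1, 2, 3]))
loss.backward()                                                   # backpropagation fills .grad
```

A real training loop would follow loss.backward() with an optimizer step, i.e. the same gradient-descent update written out by hand in the section 3 sketch.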

5. ์ฃผ์š” ๋„คํŠธ์›Œํฌ ๊ตฌ์กฐ (์‚ฌ๋ก€)

  • LeNet-5:

    • ์ดˆ์ฐฝ๊ธฐ CNN, MNIST(์†๊ธ€์”จ) ๋ถ„๋ฅ˜์— ์„ฑ๊ณต, Conv-Pool-FC ๊ตฌ์กฐ
  • AlexNet:

    • 2012 ILSVRC์—์„œ ๋Œ€ํ˜์‹ , 8๋ ˆ์ด์–ด(5 Conv+3 FC), ReLU/Dropout/๋ฐ์ดํ„ฐ ์ฆ๊ฐ• ์ ์šฉ
  • ๊ทธ ์ดํ›„:

    • VGGNet, GoogLeNet, ResNet ๋“ฑ โ€œ๋” ๊นŠ๊ณ  ๋„“์€โ€ ๊ตฌ์กฐ ๋“ฑ์žฅ
    • Transfer Learning(์ „์ดํ•™์Šต), Fine-tuning(๋ฏธ์„ธ์กฐ์ •) ๋“ฑ
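
A hedged transfer-learning sketch, assuming torchvision (>= 0.13) is available: reuse an ImageNet-pretrained ResNet-18 as a frozen feature extractor and train only a new classification head for a hypothetical 5-class task:

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(weights="IMAGENET1K_V1")    # pretrained backbone (assumed API)
for p in model.parameters():
    p.requires_grad = False                          # freeze the learned features

model.fc = nn.Linear(model.fc.in_features, 5)        # new head for a hypothetical 5-class task
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
```

Training only model.fc is transfer learning with a frozen backbone; unfreezing some or all backbone layers and training them with a small learning rate is the usual fine-tuning variant.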

6. ๋”ฅ๋Ÿฌ๋‹ ๊ธฐ๋ฐ˜ ์˜์ƒ ๋ถ„์„์˜ ์‹ค์ œ ๋ฌธ์ œ

  • ์ด๋ฏธ์ง€ ๋ถ„๋ฅ˜(Classification):

    • ํ•˜๋‚˜์˜ ์ด๋ฏธ์ง€๋ฅผ โ€˜1๊ฐœ ํด๋ž˜์Šคโ€™๋กœ ๊ตฌ๋ถ„
  • ๊ฐ์ฒด ๊ฒ€์ถœ(Object Detection):

    • ์—ฌ๋Ÿฌ ๊ฐœ ๊ฐ์ฒด ์œ„์น˜(์‚ฌ๊ฐํ˜•) + ์ข…๋ฅ˜ ์˜ˆ์ธก
  • ์„ธ๊ทธ๋ฉ˜ํ…Œ์ด์…˜(Segmentation):

    • ๊ฐ ํ”ฝ์…€ ๋‹จ์œ„๋กœ ํด๋ž˜์Šค ์ง€์ •(์˜ˆ: ์‚ฌ๋žŒ/๋ฐฐ๊ฒฝ)
  • ๊ฐ์ฒด ํŠธ๋ž˜ํ‚น(Tracking):

    • ์˜์ƒ ๋‚ด ๊ฐ์ฒด์˜ ์œ„์น˜ ์ถ”์ 

7. ๋”ฅ๋Ÿฌ๋‹์˜ ํ•œ๊ณ„/์‹ค์ „ ์ ์šฉ์‹œ ๊ณ ๋ ค์‚ฌํ•ญ

  • ๋ฐ์ดํ„ฐ ๋ถ€์กฑ/ํŽธํ–ฅ(Bias), ์˜ค๋ฒ„ํ”ผํŒ…
  • ์—ฐ์‚ฐ๋Ÿ‰/๋ฉ”๋ชจ๋ฆฌ, ์‹ค์‹œ๊ฐ„ ์ฒ˜๋ฆฌ ํ•œ๊ณ„
  • ์„ค๋ช… ๊ฐ€๋Šฅ์„ฑ(XAI), ์œค๋ฆฌ/ํ”„๋ผ์ด๋ฒ„์‹œ ๋ฌธ์ œ
  • ๊ณผ์ ํ•ฉ ๋ฐฉ์ง€๋ฒ•: Dropout, Regularization, Data Augmentation ๋“ฑ

8. Details and Formulas That Come Up Often on Exams

  • Convolution operation (written out in the NumPy sketch after this list):

    $$ y[i, j] = \sum_{m} \sum_{n} x[i+m, j+n] \cdot k[m, n] $$

  • ReLU: $f(x) = \max(0, x)$

  • Pooling: $\max$ or $\text{avg}$ over a local window

  • Backpropagation / gradient descent: parameters are updated as $\theta \leftarrow \theta - \eta \nabla_{\theta} L$ ($\eta$: learning rate)

  • Fully connected computation: $y = Wx + b$

  • Softmax:

    $$ \text{softmax}(z_i) = \frac{e^{z_i}}{\sum_j e^{z_j}} $$

  • Cross-Entropy Loss:

    $$ L = -\sum_{i=1}^{C} y_i \log(\hat{y}_i) $$

    ( $y_i$: ground-truth (one-hot) label, $\hat{y}_i$: predicted probability )
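
The formulas above written out in NumPy as a sanity check (toy single-channel input, no stride/padding; all values are made up). Like most frameworks, the code implements the cross-correlation form shown in the convolution formula:

```python
import numpy as np

def conv2d_valid(x, k):
    """y[i, j] = sum_m sum_n x[i+m, j+n] * k[m, n] (valid region only)."""
    H, W = x.shape
    kh, kw = k.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

relu = lambda x: np.maximum(0, x)                       # f(x) = max(0, x)

def softmax(z):
    e = np.exp(z - z.max())                             # shift for numerical stability
    return e / e.sum()

def cross_entropy(y_true, y_pred):
    return -np.sum(y_true * np.log(y_pred))             # L = -sum_i y_i log(y_hat_i)

x = np.arange(16, dtype=float).reshape(4, 4)            # toy 4x4 "image"
k = np.array([[1.0, 0.0], [0.0, -1.0]])                 # toy 2x2 kernel
print(relu(conv2d_valid(x, k)))                         # 3x3 feature map after ReLU
print(cross_entropy(np.array([0, 1, 0]), softmax(np.array([1.0, 2.0, 0.5]))))
```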


[Keywords for Memorization/Review]

  • Deep Learning = "end-to-end learning" on large-scale data
  • CNN: Convolution, Pooling, ReLU, Fully Connected, Softmax
  • LeNet, AlexNet, VGG, ResNet, Transfer Learning
  • Backpropagation, Gradient Descent, Cross-Entropy
  • Classification, Detection, Segmentation, Tracking
  • Overfitting, Regularization, Data Augmentation, Explainable AI