Chapter 5: Checking the computation results of the Python functions

Deep Learning from Scratch

Chapter 5

We set things up here to test the code written in Python. The final numpy import matters, because these classes are built on top of numpy.

import sys, os
sys.path.append(os.pardir)
from common.layers import *
import numpy as np

We create each class and test it.

Sigmoid

Python code

>>> test_Sigmoid = Sigmoid()
>>> temp = np.array([-0.5, 0, 0.5])
>>> test_Sigmoid.forward(temp)
array([0.37754067, 0.5       , 0.62245933])
>>> test_Sigmoid.backward(temp)
array([-0.11750186,  0.        ,  0.11750186])

R code

> sigmoid <- function(x){
+     return(1 / (1 + exp(-x)))
+ }
> temp <- c(-0.5, 0, 0.5)
> sigmoid(temp)
[1] 0.3775407 0.5000000 0.6224593
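
The R code above only covers the forward pass. A minimal sketch of the corresponding backward pass, using dx = dout * (1 - out) * out where out is the value returned by sigmoid(); the function name sigmoid.backward is our own choice, not something defined in the repository:

# Backward pass of the sigmoid.
# out is the forward-pass output, dout is the gradient flowing in from the next layer.
sigmoid.backward <- function(out, dout){
    return(dout * (1.0 - out) * out)
}

Calling sigmoid.backward(sigmoid(temp), temp) should reproduce the values printed by test_Sigmoid.backward(temp) above.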

Relu

>>> test_Relu = Relu()
>>> temp = np.array([-0.5, 0, 0.5])
>>> test_Relu.forward(temp)
array([0. , 0. , 0.5])
>>> test_Relu.backward(temp)
array([0. , 0. , 0.5])
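
This page has no R counterpart for Relu. A minimal sketch, assuming the same mask idea as the Python class (zero out entries where the input is <= 0 and reuse that mask in the backward pass); the names relu.forward and relu.backward are our own:

# Forward pass: zero out non-positive entries and remember where they were.
relu.forward <- function(x){
    mask <- x <= 0
    out <- x
    out[mask] <- 0
    return(list(out = out, mask = mask))
}

# Backward pass: the gradient does not flow where the input was <= 0.
relu.backward <- function(forward, dout){
    dout[forward$mask] <- 0
    return(dout)
}

With temp <- c(-0.5, 0, 0.5), both relu.forward(temp)$out and relu.backward(relu.forward(temp), temp) should give 0, 0, 0.5, matching the Python output above.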

5.6.1 Affine layer

Computing the forward pass

Python code

>>> X = np.array([[1, 2], [3, 4], [5, 6]])
>>> X
array([[1, 2],
       [3, 4],
       [5, 6]])
>>> X.shape
(3, 2) 
>>> W = np.array([[3, 4, 5], [6, 7, 8]])
>>> W
array([[3, 4, 5],
       [6, 7, 8]])
>>> W.shape
(2, 3)
>>> b = np.array([9,10,11])
>>> np.matmul(X, W) + b
array([[24, 28, 32],
       [42, 50, 58],
       [60, 72, 84]])

>>> test_Affine = Affine(W, b)
>>> test_Affine.forward(X)
array([[24, 28, 32],
       [42, 50, 58],
       [60, 72, 84]])

R code

> X <- matrix(1:6, nrow=3, byrow=TRUE)
> dim(X)
[1] 3 2
> X
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
> W <- matrix(3:8, nrow=2, byrow=TRUE)
> dim(W)
[1] 2 3
> W
     [,1] [,2] [,3]
[1,]    3    4    5
[2,]    6    7    8
> b <- matrix(9:11)
> b
     [,1]
[1,]    9
[2,]   10
[3,]   11
> dim(b)
[1] 3 1
> sweep((X %*% W),2, b,'+')
     [,1] [,2] [,3]
[1,]   24   28   32
[2,]   42   50   58
[3,]   60   72   84

Computing the backward pass

We computed this as a batch.

>>> dout = np.array([[-1, -2, -3], [-4, -5, -6], [-7, -8, -9]])
>>> dout
array([[-1, -2, -3],
       [-4, -5, -6],
       [-7, -8, -9]])
>>> dout.shape
(3, 3)
# The result of test_Affine.backward(dout) below is dX!
>>> test_Affine.backward(dout)
array([[ -26,  -44],
       [ -62, -107],
       [ -98, -170]])
>>> test_Affine.dW
array([[-48, -57, -66],
       [-60, -72, -84]])
>>> test_Affine.db
array([-12, -15, -18])
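
There is no R version of the Affine backward pass on this page. A minimal sketch, assuming X, W and dout as defined above; the function name Affine.backward is our own. It uses dX = dout %*% t(W), dW = t(X) %*% dout, and db = the column-wise sums of dout:

# Backward pass of the Affine layer.
#   dX = dout . t(W), dW = t(X) . dout, db = column sums of dout
Affine.backward <- function(X, W, dout){
    dX <- dout %*% t(W)
    dW <- t(X) %*% dout
    db <- colSums(dout)
    return(list(dX = dX, dW = dW, db = db))
}

With dout <- matrix(c(-1, -2, -3, -4, -5, -6, -7, -8, -9), nrow=3, byrow=TRUE), Affine.backward(X, W, dout) should reproduce the dX, dW and db values printed by the Python code above.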

5.6.3 Softmax With Loss layer

Entering the input values

>>> import numpy as np
>>> x = np.array([[0.3, 2.9, 4.0], [0.3, 2.9, 4.0]])
>>> x.shape
(2, 3)
>>> x
array([[0.3, 2.9, 4. ],
       [0.3, 2.9, 4. ]])

>>> t = np.array([[0, 0, 1], [0, 0, 1]])
>>> t.shape
(2, 3)
>>> t
array([[0, 0, 1],
       [0, 0, 1]])

Computing the forward pass

>>> test_SoftmaxWithLoss = SoftmaxWithLoss()
>>> test_SoftmaxWithLoss.forward(x,t)
0.3057143290530003

Computing the backward pass

>>> test_SoftmaxWithLoss.backward()
array([[ 0.00910564,  0.12259591, -0.13170154],
       [ 0.00910564,  0.12259591, -0.13170154]])

R code


softmax <- function(a){
    # Row-wise softmax; subtract each row's max first for numerical stability
    exp_a <- exp(a - apply(a, 1, max))
    return(sweep(exp_a, 1, rowSums(exp_a), "/"))
}

cross_entropy_error <- function(y, t){
    # Average cross-entropy error over the batch; delta avoids log(0)
    delta <- 1e-7
    batchsize <- dim(y)[1]
    return(-sum(t * log(y + delta)) / batchsize)
}

SoftmaxWithLoss.forward <- function(x, t){
    y <- softmax(x)
    loss <- cross_entropy_error(y, t)
    return(list(loss = loss, y = y, t = t))
}

SoftmaxWithLoss.backward <- function(forward, dout=1){
    # dx = (y - t) / batch size
    dx <- (forward$y - forward$t) / dim(forward$t)[1]
    return(list(dx = dx))
}
> x <- matrix(c(0.3, 2.9, 4.0, 0.3, 2.9, 4.0), nrow=2, byrow=TRUE)
> dim(x)
[1] 2 3
> x
     [,1] [,2] [,3]
[1,]  0.3  2.9    4
[2,]  0.3  2.9    4

> t <- matrix(c(0, 0, 1, 0, 0, 1), nrow=2, byrow=TRUE)
> dim(t)
[1] 2 3
> t
     [,1] [,2] [,3]
[1,]    0    0    1
[2,]    0    0    1

> SoftmaxWithLoss.forward(x,t)
$loss
[1] 0.3057143

$y
           [,1]      [,2]      [,3]
[1,] 0.01821127 0.2451918 0.7365969
[2,] 0.01821127 0.2451918 0.7365969

$t
     [,1] [,2] [,3]
[1,]    0    0    1
[2,]    0    0    1

> SoftmaxWithLoss.backward(SoftmaxWithLoss.forward(x,t))
$dx
            [,1]      [,2]       [,3]
[1,] 0.009105637 0.1225959 -0.1317015
[2,] 0.009105637 0.1225959 -0.1317015