GAN

1. Deep Learning Overview

  • Artificial Intelligence (AI)
    Knowledge (static): recognition and understanding of something
    Intelligence (dynamic): the ability to discover knowledge and respond to situations (learning and inference)
    Intellect (purpose/goal-oriented): the disposition to put intelligence to use
    Artificial intelligence: the ability to discover knowledge and respond to situations mechanically

  • Machine learning: learning from data on its own
    Learning to distinguish and remember features from the patterns and rules inherent in the data.
    Learning: finding features by evaluating the given data, then finding higher-level features from combinations of those features

  • Deep Neural Network: layered artificial neural networks, MLP -> Deep NN

  • Machine learning categories
    Supervised Learning - classification / regression
    Unsupervised Learning - clustering / generative model
    Reinforcement Learning

  • Regression: for a given input, estimating the mean output while accounting for the conditions that influence the output

Linear regression: estimates a value as a linear combination of features
- number of dependent variables y -> univariate / multivariate regression
Logistic regression: classification (maps -∞ ~ +∞ onto a 0~1 probability)
- number of categories -> binomial / multinomial logistic regression

๊ฐ๊ฐ์˜ ๋‰ด๋Ÿฐ node๊ฐ€ regression model๋กœ ๊ตฌ์„ฑ๋œ๋‹ค๊ณ  ์ƒ๊ฐํ•˜๋ฉด ๋จ.
activation function์˜ ์ข…๋ฅ˜๊ฐ€ linear / nonlinear / logistic ๋งŒ๋“ฆ

multinomial logistic regression

  • Outputs a y vector in which the per-class probabilities sum to 1 (softmax)
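
For illustration, a minimal NumPy sketch (my addition, not from the original notes) of how softmax turns arbitrary scores into probabilities summing to 1:

import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # subtract the max for numerical stability
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # -> [0.659 0.242 0.099], sums to 1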

Training terminology

Epoch: one training pass over all the data
Batch size: the number of samples used in one training step
Iteration: one training step on batch_size samples
1 epoch = batch_size * #iterations, where #iterations = #data / batch_size (e.g. 60,000 samples at batch_size 100 -> 600 iterations per epoch)

  • Cost function / Loss function
    Cost function: the function that training tries to minimize
    Loss function: a component of the cost function

  • Optimization Algorithm
    An algorithm that minimizes the value of the loss function
    e.g. SGD, momentum, NAG, Adagrad

  • Learning rate
    A hyperparameter that affects training speed and performance

  • Overfitting
    The model becomes over-fitted to the training data, degrading performance on unseen data
    As model complexity grows, training loss keeps decreasing while validation loss eventually rises again -> a common countermeasure is sketched below


2. Keras

๋ชจ๋ธ ์„ค๊ณ„

  1. Sequential model
model = keras.models.Sequential()
model.add(layers.Flatten(input_shape=(28, 28)))
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
  2. Functional API
input_x = keras.Input(shape=(28, 28))
x0 = layers.Flatten()(input_x)
x1 = layers.Dense(128, activation='relu')(x0)
output_x = layers.Dense(10, activation='softmax')(x1)
model = keras.Model(inputs=input_x, outputs=output_x)
  3. Subclassing API
class MyModel(keras.Model):
    ...  # override __init__() and call(); a full sketch follows below
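
A minimal subclassing sketch (my own filling-in; the layer sizes mirror the Functional example above and are illustrative):

from tensorflow import keras
from tensorflow.keras import layers

class MyModel(keras.Model):
    def __init__(self):
        super().__init__()
        # layers are created once in __init__
        self.flatten = layers.Flatten()
        self.hidden = layers.Dense(128, activation='relu')
        self.out = layers.Dense(10, activation='softmax')

    def call(self, inputs):
        # the forward pass is written imperatively here
        return self.out(self.hidden(self.flatten(inputs)))

model = MyModel()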
  • visualize
model.summary()
keras.utils.plot_model(model, 'model.png', show_shapes=True)

Compile method

  1. Specify built-ins by name
model.compile(loss='categorical_crossentropy', 
              optimizer='Adam', 
              metrics=['accuracy', 'mse'])
  2. Specify built-in objects
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.01, rho=0.9), 
              loss=keras.losses.CategoricalCrossentropy(), 
              metrics=[keras.metrics.CategoricalAccuracy()])
  3. Specify instances of built-ins
opt = keras.optimizers.Adam(learning_rate=0.01)
loss = keras.losses.SparseCategoricalCrossentropy()
metric = keras.metrics.SparseCategoricalAccuracy()
model.compile(loss=loss, optimizer=opt, metrics=[metric])
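
A usage sketch tying compile to the training terminology above (x_train / y_train are placeholders, not names from the original):

# 60,000 samples with batch_size=100 -> 600 iterations per epoch
model.fit(x_train, y_train, batch_size=100, epochs=10, validation_split=0.2)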

๋ชจ๋ธ ์ €์žฅ

model.save('save_model')  # saves in SavedModel format
keras.models.save_model(model, 'save_model')  # saves in SavedModel format

model.save("model_save.h5")  # saves as an HDF5 file

๋ชจ๋ธ ๋ณต์›

model = keras.models.load_model('save_model')

layers API

  • Input
  • Flatten
  • Dense
  • Activation (sigmoid, relu, etc.)
  • Dropout
  • Batch Normalization: each batch has different statistics, so the statistics of a layer's outputs fluctuate from batch to batch -> training slows down => normalizing per batch speeds training back up

CNN-related layers

  • Conv2D: CNN convolution layer
  • MaxPooling2D: downsampling (less computation)
  • Conv2DTranspose: upsampling (improves performance); shape sketch below
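
A shape-level sketch of the three layers (my example, assuming a 28x28 grayscale input):

from tensorflow import keras
from tensorflow.keras import layers

x = keras.Input(shape=(28, 28, 1))
h = layers.Conv2D(16, 3, padding='same', activation='relu')(x)   # -> (28, 28, 16)
h = layers.MaxPooling2D(2)(h)                                    # -> (14, 14, 16): downsampling
h = layers.Conv2DTranspose(16, 3, strides=2, padding='same')(h)  # -> (28, 28, 16): upsampling
keras.Model(x, h).summary()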

3. Generative Model

  • A model that generates data whose distribution is similar to the training data distribution

Categories

  1. Explicit density: generates from an explicit model of the training data distribution
  • PixelRNN, PixelCNN, VAE
  2. Implicit density: generates from random values
  • GAN, GSN

VAE(Variational AutoEncoder)

AutoEncoder: unsupervised learning that compresses the data to find a compact representation and reconstructs the input (maps the input data to a fixed vector)
VAE: finds latent vectors that represent the compressed representation and generates output from a sampled latent vector (maps the input data to a distribution)

GAN(Generative Adversarial Network)

Consists of a generator that is trained indirectly through a discriminator

Two NN models improve their performance through competitive training
The generator learns to fool the discriminator (learns a good distribution)
The discriminator learns to tell the generator's outputs apart (learns a good boundary)

  • Image-to-Image Translation (pix2pix): transfers the characteristics of another image onto the input
  • Semantic-Image-to-Photo Translation: generates a photorealistic image from a semantic segmentation image
  • Super Resolution: increases resolution (applies a Conditional GAN)
  • Photo Inpainting: fills in erased regions of a photo

4. AutoEncoder / Denoising AutoEncoder

Encoder: a function that maps the input to a structured value (the latent space)
Decoder: a function that maps a value in the latent space to another domain
Code: the vector in the latent space

Mathematically an AE is similar to PCA, but it optimizes itself through training
Latent space: the compressed low-dimensional space
Latent variables: the variables stored in that space
Applications: denoising, super-resolution, semantic segmentation

  • Key capabilities (a sketch follows the list):

Discovering low-dimensional features in high-dimensional data
Preserving core attributes (restoring damaged images)
Visualizing the main factors of variation
Nonlinear dimensionality reduction (a powerful tool for handling high-dimensional data)
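
A minimal encoder/code/decoder sketch (my own, assuming flattened 784-dim inputs and a 2-dim latent space):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,))
code = layers.Dense(2, activation='relu')(inputs)        # encoder -> latent code
outputs = layers.Dense(784, activation='sigmoid')(code)  # decoder -> reconstruction
autoencoder = keras.Model(inputs, outputs)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
# autoencoder.fit(x_train, x_train, ...)  # input == target: unsupervised reconstruction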

  • Limitations:

Low quality of decoded results (when latent attributes are represented discretely, overfitting increases)
-> represent latent attributes as distributions instead
Asymmetric mapping into the latent space (imbalanced decoding range)
-> constrain the encoder's output distribution to a normal distribution (design the loss to grow when the encoder output deviates from a normal distribution)
Imbalance among the per-character distributions: wide and narrow distributions are mixed


VAE(Variational AutoEncoder)

The combination of a decoder that expresses well the features held by the latent variables, and an encoder that maps well between data and latent variables
Drawing good latent variables (from P(z)) is what matters, but that is too hard, so look for a good encoder q(z|x) instead -> use Variational Inference

Variational Inference

The decoder should learn p(x|z), but the prior P(z) is unknown, which makes direct training impossible, so q(z|x) is used to approximate P(z)
The complex distribution is approximated with a simpler encoder distribution q(z|x)
KL divergence is used: compute the KL divergence between p(z) and q(z|x), and update the φ of q_φ(z|x) little by little in the direction that shrinks D_KL, yielding a distribution similar to the optimal P(z)
-> training to maximize p(x|z) is replaced by training q(z|x)
VAE_loss = decoder_loss + encoder_loss
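
A sketch of that two-part loss (my own, assuming a Gaussian encoder that outputs z_mean and z_log_var and a Bernoulli decoder over flattened pixels):

import tensorflow as tf

def vae_loss(x, x_recon, z_mean, z_log_var):
    # decoder loss: reconstruction, the negative log-likelihood of x under p(x|z)
    recon = -tf.reduce_sum(x * tf.math.log(x_recon + 1e-7)
                           + (1.0 - x) * tf.math.log(1.0 - x_recon + 1e-7), axis=-1)
    # encoder loss: D_KL(q(z|x) || N(0, I)) in closed form, pulling q toward a normal distribution
    kl = -0.5 * tf.reduce_sum(1.0 + z_log_var - tf.square(z_mean)
                              - tf.exp(z_log_var), axis=-1)
    return tf.reduce_mean(recon + kl)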

  • Entropy: the expected value of information content (average information)
    Information content is inversely related to the probability of occurrence (1/p) -> -log(p)
    Expectation -> the sum of -p·log(p): H(p) = -Σ p·log(p)

  • KL divergence: the expected amount of information loss
    Information loss between probability distributions p and q: -log(q) + log(p)
    Expectation -> the sum of p·(log(p) - log(q)) (= D_KL(p||q))
    Adjust q so that D_KL is minimized (a numeric check follows)
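
A quick numeric check of the formula (illustrative distributions of my choosing):

import numpy as np

p = np.array([0.5, 0.5])
q = np.array([0.9, 0.1])
d_kl = np.sum(p * (np.log(p) - np.log(q)))  # D_KL(p||q)
print(d_kl)  # ~0.511 nats; it is 0 only when q equals p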

  • Cross entropy
    The Σ p·log(p) term of D_KL does not depend on q, so it suffices to minimize the remaining term, -Σ p·log(q) (the cross entropy)

  • Maximum Likelihood Estimation
    Likelihood: inferring the parameters from observations
    Training, seen from a probability standpoint, is finding the maximum likelihood


5. GAN(Generative Adversarial Network)

Can imitate data of any distribution / a structure in which a generative model and a discriminative model compete
The generative model models the distribution of the data classes
The discriminative model models the boundary between the data classes

  1. From noise z, the Generator creates fake data G(z)
  2. The Discriminator compares real data drawn from p(x) with G(z) and outputs the probability D(x) that the input is real

The Discriminator does gradient ascent: max[ log(D(x)) + log(1 - D(G(z))) ]

Probability 1 for real data -> log(D(x)) reaches 0 (its maximum); probability 0 for fake data -> log(1 - D(G(z))) reaches 0 (its maximum)

The Generator does gradient descent: min[ log(1 - D(G(z))) ], equivalent in practice to max[ log(D(G(z))) ]

When D assigns probability 1 to the fake -> log(1 - D(G(z))) goes to -∞

Each model keeps its own loss function and the two are trained separately
Minimizing the loss amounts to using D_JS (Jensen-Shannon Divergence), a symmetric counterpart of D_KL (a sketch of the two losses follows)
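
A sketch of the two separate losses in Keras terms (my framing; it uses the non-saturating generator objective max log(D(G(z))) noted above):

import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy()

def d_loss(real_out, fake_out):
    # gradient ascent on log(D(x)) + log(1 - D(G(z))): push reals toward 1, fakes toward 0
    return bce(tf.ones_like(real_out), real_out) + bce(tf.zeros_like(fake_out), fake_out)

def g_loss(fake_out):
    # non-saturating form: maximize log(D(G(z))) by labeling fakes as real
    return bce(tf.ones_like(fake_out), fake_out)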

Why GAN training is hard

  1. Mode collapse
    The model fails to cover all of a multi-modal (e.g. bimodal) data distribution and loses diversity
    Because it only tries to reduce the loss, it becomes biased toward a single mode

  2. Oscillation: the model swings back and forth between forms biased toward one mode or the other
    The two networks keep learning in opposite directions, repeating the same failure

Remedy: improve the loss function

  • Wasserstein GAN
  • LS_GAN

CGAN(Conditional GAN)

Images generated by a DCGAN are random, so a condition is added to control which image gets generated

Condition: the one-hot code of the label
Discriminator: the condition is reshaped into the same form as the image and concatenated with the image as input
Generator: the latent vector and the label are concatenated as input (wiring sketch below)
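
A sketch of the input wiring (my own; 10 classes, a 100-dim latent vector, and 28x28 images assumed):

from tensorflow import keras
from tensorflow.keras import layers

# generator input: latent vector concatenated with the one-hot label
z = keras.Input(shape=(100,))
label = keras.Input(shape=(10,))          # one-hot condition
g_in = layers.Concatenate()([z, label])   # -> (110,)

# discriminator input: condition reshaped to the image shape, then concatenated
img = keras.Input(shape=(28, 28, 1))
cond = layers.Reshape((28, 28, 1))(layers.Dense(28 * 28)(label))
d_in = layers.Concatenate()([img, cond])  # -> (28, 28, 2)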

์‘์šฉ: text๋ฅผ condition์œผ๋กœ ๋ณ€ํ™˜ํ•˜์—ฌ, text๋ฅผ ์ด๋ฏธ์ง€๋กœ ๋ณ€ํ™˜ํ•˜๋„๋ก ํ•  ์ˆ˜๋„ ์žˆ์Œ..

ACGAN(Auxiliary classifier GAN)

The generator is the same, but the discriminator consists of two models (heads)
Real/fake discrimination (binary) + image label classification (categorical)
Instead of concatenating the label onto the image, the network branches off before the sigmoid output and emits a softmax that is compared against the real label
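
A sketch of the two-headed discriminator (my own; the shared trunk is abbreviated):

from tensorflow import keras
from tensorflow.keras import layers

img = keras.Input(shape=(28, 28, 1))
h = layers.Dense(128, activation='relu')(layers.Flatten()(img))  # shared trunk
real_fake = layers.Dense(1, activation='sigmoid')(h)             # head 1: real/fake (binary)
label_out = layers.Dense(10, activation='softmax')(h)            # head 2: class label (categorical)
disc = keras.Model(img, [real_fake, label_out])
disc.compile(optimizer='adam',
             loss=['binary_crossentropy', 'categorical_crossentropy'])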

  • CGAN and ACGAN cannot generate digits slanted or thickened to a desired degree,
    because the information is entangled in the latent space

InfoGAN

Untangles and organizes the latent-space code, adding an interpretable part to the z-vector
Z = (z, c) — z: noise vector (entangled code), c: latent code (interpretable)
From the generator's point of view, z and c are not distinguished
A mutual-information term is added to the loss function
Mutual information: the dependence between two random variables, their shared entropy, I(X;Y) = D_KL(p(x,y) || p(x)p(y))
Here I(c'; G(z,c)) is used: the mutual information between the image generated from (z, c) and the latent code c' recovered by the discriminator
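
In practice (a common implementation detail, my addition rather than the notes'), the mutual-information term is approximated by an auxiliary head Q that tries to recover c from the generated image; for a categorical code the variational lower bound reduces to a cross-entropy:

import tensorflow as tf

cce = tf.keras.losses.CategoricalCrossentropy()

def mi_loss(c_true, c_recovered):
    # lower bound on I(c'; G(z,c)): how well Q recovers the latent code c
    return cce(c_true, c_recovered)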

Pix2Pix

Image-to-Image Translation with CGAN
Translates the input image into a new domain (sketch -> real object, etc.)