title: Lottery Ticket Hypothesis
date: 2020/10/19 19:32:00
categories:
  • DeepLearning
  • Efficient DeepLearning
  • Pruning
tags:
  • DeepLearning
  • Pruning
  • Sparsity

Network Pruning

Pruning is a technique for making deep learning models lighter: it removes a portion of the model's parameters while keeping the loss in accuracy as small as possible.

Iterative Pruning

Iterative Pruning. Han et al 2015

Iterative Pruning, the most widely used pruning method, is shown in the figure above.
First, the parameters judged unimportant by some criterion are identified and removed. The model is then retrained.
Repeating these steps removes more and more of the model's parameters.
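
The prune-and-retrain loop described above can be sketched as follows. This is a minimal illustration, not Han et al.'s implementation: it assumes weight magnitude as the importance criterion (the text only says "some criterion"), and `prune_smallest`, `retrain`, and `iterative_prune` are hypothetical names, with `retrain` standing in for a real training run.

```python
import random

def prune_smallest(weights, mask, fraction):
    """Deactivate `fraction` of the still-active weights with the smallest magnitude."""
    active = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(active) * fraction)
    new_mask = list(mask)
    # Sort active indices by |weight| and drop the smallest ones.
    for i in sorted(active, key=lambda i: abs(weights[i]))[:n_prune]:
        new_mask[i] = 0
    return new_mask

def retrain(weights, mask):
    """Placeholder for real training (assumption): rescales survivors, zeroes pruned weights."""
    return [w * 1.01 * m for w, m in zip(weights, mask)]

def iterative_prune(weights, rounds, fraction):
    """Alternate pruning and retraining for a fixed number of rounds."""
    mask = [1] * len(weights)
    for _ in range(rounds):
        mask = prune_smallest(weights, mask, fraction)
        weights = retrain(weights, mask)
    return weights, mask

random.seed(0)
w0 = [random.gauss(0.0, 1.0) for _ in range(1000)]
w, mask = iterative_prune(w0, rounds=3, fraction=0.2)
remaining = sum(mask) / len(mask)  # fraction of weights still active
```

Pruning a fixed fraction of the surviving weights each round, rather than everything at once, gives the model a chance to recover between rounds, which is the point of the iterative scheme.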

Problem

However, if the subnetwork obtained this way is randomly initialized and trained from scratch, it performs far worse than the original network.
Iterative Pruning itself has quite a few hyperparameters, and if the subnetwork could be trained directly, training FLOPs could also be cut substantially, so solving this problem mattered.

In The Lottery Ticket Hypothesis [ICLR 2019], the authors propose a way to address this problem.

The Lottery Ticket Hypothesis

First, we define the term Lottery Ticket.

  • Lottery Ticket: a subnetwork that has fewer parameters than the original network and performs at least as well

The authors' method for finding such a lottery ticket is shown below. It is very simple.

Finding Lottery ticket

  • Steps 1, 2, and 3 are the usual Train -> Iterative Pruning procedure.
  • Step 4: the subnetwork found by Iterative Pruning is reset to the initial values used in step 1. In other words, the pruned structure depends heavily on the weights it was initialized with.
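
The four steps can be sketched on a toy problem. This is an illustration under stated assumptions, not the authors' code: the model is a small linear regressor trained by plain gradient descent, the pruning criterion is weight magnitude, and `train`, `prune`, and `find_ticket` are hypothetical helper names. The part that matters is step 4, where surviving weights are reset to the values saved in step 1 instead of being drawn again at random.

```python
import random

def train(weights, mask, data, lr=0.05, epochs=200):
    """Masked gradient descent on squared error for a linear model y = w . x."""
    w = [wi * m for wi, m in zip(weights, mask)]
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            # Pruned weights (mask == 0) receive no update and stay at zero.
            w = [wi - lr * err * xi * m for wi, xi, m in zip(w, x, mask)]
    return w

def prune(weights, mask, fraction):
    """Drop `fraction` of the active weights with the smallest magnitude."""
    active = sorted((i for i, m in enumerate(mask) if m), key=lambda i: abs(weights[i]))
    new_mask = list(mask)
    for i in active[:int(len(active) * fraction)]:
        new_mask[i] = 0
    return new_mask

def find_ticket(w_init, data, rounds, fraction):
    mask = [1] * len(w_init)
    for _ in range(rounds):
        trained = train(w_init, mask, data)    # steps 1-2: train from the saved init
        mask = prune(trained, mask, fraction)  # step 3: prune the trained weights
        # step 4: the next round restarts from w_init under the new mask
    return [wi * m for wi, m in zip(w_init, mask)], mask

random.seed(0)
true_w = [2.0, -3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # toy target (assumption)
data = []
for _ in range(50):
    x = [random.uniform(-1.0, 1.0) for _ in true_w]
    data.append((x, sum(t * xi for t, xi in zip(true_w, x))))

w_init = [random.gauss(0.0, 0.5) for _ in true_w]   # step 1: remember this init
ticket_init, mask = find_ticket(w_init, data, rounds=2, fraction=0.5)

# Control ("reinit"): same mask, but fresh random values instead of w_init.
reinit = [random.gauss(0.0, 0.5) * m for m in mask]

ticket_trained = train(ticket_init, mask, data)
reinit_trained = train(reinit, mask, data)
```

On a convex toy problem like this one, both initializations should train to the same solution, so the sketch only illustrates the bookkeeping (save the init, prune, reset). It should not be expected to reproduce the gap between the lottery ticket and the reinitialized subnetwork that the paper reports for neural networks.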

Experimental Results

results on LeNet

The number next to each cross marker is the percentage of parameters remaining relative to the original network.

  • In the leftmost plot, the lottery tickets train to clearly better results than the original network (100).
  • In the middle plot, the results remain better than the original network even at 3.6% (purple), but degrade sharply at 1.9% (brown).
  • This suggests that there is some lower bound on the number of parameters such a subnetwork needs.
  • In the right plot, the results with the lottery ticket method are clearly better than those without it (reinit).
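
To get a feel for how remaining-parameter percentages such as those in the plots relate to the number of pruning rounds, the following sketch assumes that a fixed fraction of the surviving weights is pruned every round. The rate of 0.2 is an assumption made for illustration and is not stated in this post; `remaining_fraction` and `rounds_to_reach` are hypothetical helpers.

```python
import math

def remaining_fraction(rate, rounds):
    """Fraction of weights left after `rounds` rounds, each pruning `rate` of the survivors."""
    return (1.0 - rate) ** rounds

def rounds_to_reach(target, rate):
    """Smallest number of rounds after which at most `target` of the weights remain."""
    return math.ceil(math.log(target) / math.log(1.0 - rate))

rate = 0.2  # assumed per-round pruning rate
for target in (0.036, 0.019):
    n = rounds_to_reach(target, rate)
    print(f"{target:.1%} remaining needs about {n} rounds "
          f"({remaining_fraction(rate, n):.2%} left after round {n})")
```

Under this assumption, the remaining fraction shrinks geometrically, so very sparse subnetworks take many prune-and-retrain rounds to reach.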

The authors also ran experiments on simple ConvNets and deep ConvNets (VGG, ResNet, ...). A few heuristics were needed, but all of them showed good results.

References

Frankle, Jonathan, and Michael Carbin. "The lottery ticket hypothesis: Finding sparse, trainable neural networks." arXiv preprint arXiv:1803.03635 (2018).