Week18 Day2 - ai-esg/our-history GitHub Wiki

Team NLP Group 11 Week18 Day2

Table of Contents

Date

  • 2021.11.30 (Tue)

Team Members

  • 문석암_T2075
  • 박마루찬_T2078
  • 박아멘_T2090
  • 우원진_T2137
  • 윤영훈_T2142
  • 장동건_T2185
  • 홍현승_T2250

Weekly Schedule

Peer Session

  • Model optimization
    • Each member is working on their own
    • For some reason, applying data transformations does not seem to have any effect
    • Tried to build a SqueezeNet, but adding the module did not work properly

Final Project

Written up under Final Project

Auto ML

-> split up and run -> DB -> everyone connects

Division of Labor

Code -> MRC branch model

  • Preprocessing (Seoul only)
    • raw (exclude only entries with no text)
  • Model (Seoul only)
    • In-batch
      • Context: individual (naive pairing, all combinations), merged
    • Non In-batch
      • Context: individual (naive pairing), merged
  • Experiment -> revise preprocessing and repeat
  • Performance comparison by theme
  1. Validation of the context
    • Conditions: when the score is at or above a certain threshold (acc/f1 of ?? or higher); when it is included in the top 10
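
The In-batch option in the list above is commonly understood in dense retrieval as treating the other contexts in the same batch as negatives for each query. The following is only a minimal pure-Python sketch of that idea, assuming dot-product similarity and one positive context per query; the function name and the toy vectors are hypothetical and this is not the team's MRC branch code.

```python
import math

def in_batch_nll(query_vecs, context_vecs):
    """Mean negative log-likelihood with in-batch negatives.

    query_vecs[i] and context_vecs[i] are assumed to be the positive pair;
    every other context in the batch acts as a negative for query i.
    """
    losses = []
    for i, q in enumerate(query_vecs):
        # Dot-product similarity between query i and every context in the batch.
        scores = [sum(a * b for a, b in zip(q, c)) for c in context_vecs]
        # Numerically stable log-sum-exp over the batch.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        # Negative log-softmax of the positive pair's score.
        losses.append(log_z - scores[i])
    return sum(losses) / len(losses)

# Toy batch: aligned pairs should give a lower loss than swapped pairs.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[5.0, 0.0], [0.0, 5.0]]
swapped = [[0.0, 5.0], [5.0, 0.0]]
print(in_batch_nll(queries, aligned) < in_batch_nll(queries, swapped))
```

In the Non In-batch setting the negatives would instead be chosen explicitly for each query, so the score list would be built from that query's own candidate contexts rather than from the whole batch.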

Things to Experiment With

  • '광명동굴' (Gwangmyeong Cave): this one really has to be retrieved correctly, since there is a lot of data for it

  • Non in-batch / query, context top 10 pair

  • In-batch / query(1), context(1)

  • context format

    1. Individual contexts for one attraction

      • Run the model; the attraction that appears most often among the top N results is the answer
      • Don't we need to build train pairs??
        • The individual texts are completely different from one another
        • google
        • blog
        • If we do this, do we just compute a score for each context? Sum rather than product? With a sum, though, duplicates become a problem. train
    2. One merged context per attraction

      • Run the model; the top 1 result is the answer
        • All blog posts are merged together
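
The two ways of picking an answer for the individual-context format (most frequent attraction among the top N, or a per-attraction sum of scores) can be sketched as follows. This is an illustration under assumptions: the retriever is assumed to return `(attraction, score)` pairs sorted best first, and both function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def vote_top_n(ranked, n):
    """Return the attraction that appears most often among the top-n hits.

    ranked: list of (attraction, score) pairs, best first.
    On a tie, the attraction seen first (the better-ranked one) wins.
    """
    counts = Counter(name for name, _ in ranked[:n])
    return counts.most_common(1)[0][0]

def sum_scores_top_n(ranked, n):
    """Alternative: add up the scores per attraction over the top-n hits."""
    totals = defaultdict(float)
    for name, score in ranked[:n]:
        totals[name] += score
    return max(totals, key=totals.get)

ranked = [("A", 0.9), ("B", 0.8), ("B", 0.7), ("C", 0.6)]
print(vote_top_n(ranked, 3))        # "B" has two of the top three hits
print(sum_scores_top_n(ranked, 3))  # "B" also has the largest score sum
```

Neither variant addresses the duplicate problem raised in the notes: an attraction with many near-identical contexts collects extra votes or extra score. With the merged format there is one context per attraction, so the answer is simply `ranked[0][0]`.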

Preprocessing

google

Criteria (Dense-based)

  • Length
  • Negative reviews (Dense-based)
  • Translated sentences (machine translation)
    • (Google 번역 제공) 아주 멋지다! (원문) Очень классно! [the markers mean "Translated by Google" and "Original"; the translated text reads "Very cool!"]
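
Machine-translated reviews in the format shown in the last bullet could be detected and split with a regular expression. This is a minimal sketch that assumes every such review follows exactly that `(Google 번역 제공) ... (원문) ...` layout; the function name is hypothetical, and the notes do not say whether the translated or the original part should be kept.

```python
import re

# Layout assumed from the example review:
# "(Google 번역 제공) <translation> (원문) <original>"
TRANSLATED_REVIEW = re.compile(
    r"^\(Google 번역 제공\)\s*(?P<translated>.*?)\s*\(원문\)\s*(?P<original>.*)$",
    re.DOTALL,
)

def split_translated_review(text):
    """Return (translated, original) for a machine-translated review.

    Reviews that do not match the assumed layout come back as (text, None).
    """
    match = TRANSLATED_REVIEW.match(text.strip())
    if match is None:
        return text, None
    return match.group("translated"), match.group("original")

sample = "(Google 번역 제공) 아주 멋지다! (원문) Очень классно!"
translated, original = split_translated_review(sample)
print(translated)
print(original)
```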

blog

  • Remove hashtags (Hangul only)
  • Remove @ tags
  • Collapse two spaces into one
  • URL
  • Remove dict-formatted text
  • context = re.sub(r"\.(?=[^ ])", ". ", context)
  • Remove everything except the characters listed below
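
Several of the blog cleaning steps above can be chained into one function. The sketch below is an assumption-laden illustration: the exact patterns are guesses at what each bullet means ("Hangul only" is read as hashtags made of Hangul syllables), the function name is hypothetical, and the dict-format and allowed-character steps are left out because the notes do not specify them.

```python
import re

def clean_blog_text(context):
    """Apply a subset of the listed blog cleaning steps, in order."""
    # Remove URLs first so their dots and fragments are not touched later.
    context = re.sub(r"https?://\S+", " ", context)
    # Remove hashtags made of Hangul syllables, e.g. "#서울".
    context = re.sub(r"#[가-힣]+", " ", context)
    # Remove @ tags (mentions).
    context = re.sub(r"@\w+", " ", context)
    # Insert a space after a period that is directly followed by a non-space.
    context = re.sub(r"\.(?=[^ ])", ". ", context)
    # Collapse runs of spaces into a single space.
    context = re.sub(r" {2,}", " ", context)
    return context.strip()

print(clean_blog_text("좋아요.다음에 또 #서울 @friend  http://example.com 방문"))
```

Note that the period rule also splits decimals such as "3.5" into "3. 5", so it may need a guard if the blog text contains numbers.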

google api crawling

  • Everyone is running it together (reducing time.sleep() makes it go faster)
    • What happens once it is all done?
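
The trade-off mentioned above, where a shorter `time.sleep()` finishes the crawl sooner, can be sketched as a throttled loop. Everything here is hypothetical: `fetch` stands in for whatever function calls the API, and `delay_seconds` is an assumed parameter name, since the notes do not show the actual crawler.

```python
import time

def crawl_all(place_ids, fetch, delay_seconds=1.0):
    """Call fetch(place_id) for every id, pausing between requests."""
    results = {}
    for index, place_id in enumerate(place_ids):
        if index > 0:
            # A smaller delay shortens the total run time; whether the API
            # tolerates it is not stated in the notes.
            time.sleep(delay_seconds)
        results[place_id] = fetch(place_id)
    return results

# Usage with a stand-in fetch function and no delay.
print(crawl_all(["a", "b"], lambda place_id: place_id.upper(), delay_seconds=0))
```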