2021 10 20 - KR-HappyFace/meetup-logs GitHub Wiki

Ice Breaking

  • The new MacBook that just came out looks great... tempted to buy one
  • "A MacBook is the best investment a developer can make"
  • Focal Loss reliably delivers on performance!
  • The KLUE-RE & Image Classification repositories have been switched to public, go fork them

Progress

  • Hyeonsu (현수): read the DPR paper, now running retrieval experiments

  • Junhong (준홍): working through the practice code

  • Sehyeon (세현): working on the retrieval side

  • Jaeyeong (재영): DPR retrieval; experimenting with raising training difficulty by placing similar question-passage pairs in the same batch

  • Yeongjin (영진): running random masking experiments: https://github.com/boostcampaitech2/mrc-level2-nlp-15/tree/snoop

  • Using "#" as a special token appears to cause it to be recognized incorrectly

  • The test dataset contains only questions; the pipeline retrieves related passages from the wiki and then extracts the answer from them

  • Sharing experiment progress through Jupyter notebooks makes it easier for everyone to follow where things stand

  • https://github.com/stanford-futuredata/ColBERT

  • What is BM25 Transformer? https://github.com/arosh/BM25Transformer

  • BM25 comes in many variants

  • Augmentation via question generation

    • Not sure whether it would actually help performance -> bias issue?
    • Why sparse retrieval outperforms dense retrieval on SQuAD
  • PORORO pretrained model: https://kakaobrain.github.io/pororo/tagging/mrc.html

  • khaiii : https://github.com/kakao/khaiii
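
For reference, the sparse retrieval step discussed above (question in, related wiki passages out) can be sketched with a plain Okapi BM25 scorer. This is only a minimal illustration, not the team's baseline: the function names `bm25_scores` and `retrieve`, the defaults `k1=1.5` and `b=0.75`, the smoothed IDF form, and the whitespace tokenization are all assumptions made for the example. The many BM25 variants differ mainly in the IDF and length-normalization terms marked in the comments.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query (Okapi BM25)."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs_tokens for term in set(d))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF (assumed form; BM25 variants differ here).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1.0) / norm
        scores.append(score)
    return scores


def retrieve(question, passages, top_k=2):
    """Return indices of the top_k passages, best first.

    Whitespace tokenization is a placeholder assumption for this sketch.
    """
    docs = [p.lower().split() for p in passages]
    scores = bm25_scores(question.lower().split(), docs)
    ranked = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return ranked[:top_k]


passages = [
    "Seoul is the capital of South Korea",
    "BM25 is a ranking function used in information retrieval",
    "Dense retrieval encodes questions and passages as vectors",
]
print(retrieve("what is the capital of korea", passages, top_k=1))
```

For Korean text, splitting on whitespace is likely too coarse; a morphological analyzer such as khaiii would presumably replace the `.split()` calls, and the retrieved passages would then be handed to the reader model.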

Experiments to run

Baseline code walkthrough video