2021 10 22 - KR-HappyFace/meetup-logs GitHub Wiki

TMI

  • Yeonju's (연주) birthday!!
  • Unicode BERT: emailed Professor Jaegul Choo (주재걸) -> TensorFlow has a Unicode tokenizer.
  • Google Colab Pro allocated a new GPU: A100 32GB
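
A rough illustration of the idea behind the Unicode tokenizer note above: tokenizing at the codepoint level, so no vocabulary file is needed. This is plain Python, not the TensorFlow API, and both function names are hypothetical helpers made up for this sketch.

```python
def codepoint_tokenize(text):
    """Map each character to its Unicode codepoint (hypothetical helper)."""
    return [ord(ch) for ch in text]


def codepoint_detokenize(ids):
    """Inverse of codepoint_tokenize: rebuild the string from codepoints."""
    return "".join(chr(i) for i in ids)


# Each Hangul syllable is a single codepoint, so Korean text needs no
# language-specific vocabulary at this level.
ids = codepoint_tokenize("BERT 생일")
restored = codepoint_detokenize(ids)
```

The trade-off is longer sequences than subword tokenization, since every character becomes its own token.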

Experiments

  • Question Generation + Data Augmentation using Pororo *

  • Passage Retrieval:

    • DPR
    • BM25
    • ElasticSearch
      • Roughly 80% performance at k=3
    • Ensembling the retrievers also looks promising
      • e.g. use ES when picking negative passages
  • Reader:

    • klue/roberta-large vs xlm-roberta
  • Switching the model to roberta-large in the baseline code improved performance
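
The retrieval notes above (top-k performance at k=3, ES for picking negative passages) can be sketched roughly as follows. This is not the team's code: it is a minimal pure-Python BM25 standing in for ElasticSearch (whose default similarity is BM25), assuming whitespace tokenization and a made-up toy English corpus. The class and function names, `k1`, `b`, and the sample data are all assumptions for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization. A Korean pipeline would more
    # likely use a morphological analyzer or a subword tokenizer.
    return text.lower().split()


class BM25:
    def __init__(self, corpus, k1=1.5, b=0.75):
        self.k1, self.b = k1, b
        self.docs = [tokenize(d) for d in corpus]
        self.tfs = [Counter(d) for d in self.docs]
        self.avgdl = sum(len(d) for d in self.docs) / len(self.docs)
        n = len(self.docs)
        # Document frequency: number of passages containing each term.
        df = Counter(t for d in self.docs for t in set(d))
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5))
                    for t, f in df.items()}

    def score(self, query, i):
        tf, dl = self.tfs[i], len(self.docs[i])
        norm = self.k1 * (1 - self.b + self.b * dl / self.avgdl)
        total = 0.0
        for t in tokenize(query):
            if t in tf:
                total += self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
        return total

    def top_k(self, query, k=3):
        ranked = sorted(range(len(self.docs)),
                        key=lambda i: self.score(query, i), reverse=True)
        return ranked[:k]


def top_k_accuracy(retriever, queries, gold_ids, k=3):
    # Fraction of queries whose gold passage appears in the top-k results.
    hits = sum(g in retriever.top_k(q, k) for q, g in zip(queries, gold_ids))
    return hits / len(queries)


def hard_negatives(retriever, query, gold_id, k=3):
    # High-scoring passages that are NOT the gold passage; these could serve
    # as negative passages when training a dense retriever such as DPR.
    return [i for i in retriever.top_k(query, k + 1) if i != gold_id][:k]


# Toy data, made up for illustration only.
corpus = [
    "seoul is the capital of south korea",
    "busan is a port city in south korea",
    "tokyo is the capital of japan",
    "bert is a transformer language model",
]
queries = ["capital of south korea", "capital of japan",
           "transformer language model"]
gold_ids = [0, 2, 3]

bm25 = BM25(corpus)
acc_at_1 = top_k_accuracy(bm25, queries, gold_ids, k=1)
negatives = hard_negatives(bm25, queries[0], gold_ids[0], k=2)
```

Lexically similar but wrong passages make harder negatives than random ones, which is one reason a sparse retriever is a natural source for them.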