2021 10 22 - KR-HappyFace/meetup-logs GitHub Wiki
TMI
- 우주's birthday!!
- Unicode BERT: tried emailing Professor 주재걸 (Jaegul Choo) -> TensorFlow has a unicode tokenizer.
- Google Colab Pro is assigning a new GPU: A100 32GB
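
For context on the unicode tokenizer note above: the idea of a character-level Unicode tokenizer can be sketched in plain Python by mapping each character to its code point. This is only an illustration of the concept, not TensorFlow's implementation; the function names below are hypothetical. In TensorFlow itself, `tf.strings.unicode_decode` should give comparable code-point output.

```python
def unicode_char_tokenize(text):
    """Map each character to its Unicode code point, used as an integer token ID."""
    return [ord(ch) for ch in text]


def unicode_char_detokenize(ids):
    """Invert unicode_char_tokenize by turning code points back into characters."""
    return "".join(chr(i) for i in ids)


# Works for any script without a vocabulary file, Korean included.
ids = unicode_char_tokenize("BERT 한국어")
restored = unicode_char_detokenize(ids)
```

Because the vocabulary is the code-point space itself, there are no out-of-vocabulary tokens, at the cost of longer sequences than subword tokenization.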
Experiments
- Question Generation + Data Augmentation using Pororo *
- Passage Retrieval:
  - DPR
  - BM25
  - ElasticSearch
  - Around 80% performance at k=3
  - Ensembling also looks promising
  - e.g. use ES when picking negative passages
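
The retrieval notes above (top-k performance, and using ES to pick negative passages) can be sketched roughly as follows. This is a minimal illustration, not the team's code: a small in-memory BM25 scorer stands in for ElasticSearch (whose default similarity is BM25), and the passages, questions, and helper names (`BM25`, `top_k_accuracy`, `hard_negatives`) are all made up for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization keeps the sketch dependency-free; a real Korean
    # pipeline would likely use a morphological or subword tokenizer.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 scorer over an in-memory list of passages."""

    def __init__(self, passages, k1=1.5, b=0.75):
        self.docs = [tokenize(p) for p in passages]
        self.k1, self.b = k1, b
        self.avg_len = sum(len(d) for d in self.docs) / len(self.docs)
        self.tfs = [Counter(d) for d in self.docs]
        df = Counter(term for d in self.docs for term in set(d))
        n = len(self.docs)
        # Lucene-style IDF, which stays non-negative for very common terms.
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, idx):
        tf, length = self.tfs[idx], len(self.docs[idx])
        norm = self.k1 * (1 - self.b + self.b * length / self.avg_len)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in tokenize(query)
            if t in tf
        )

    def top_k(self, query, k=3):
        order = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return order[:k]


def top_k_accuracy(bm25, examples, k=3):
    """Fraction of (question, gold_index) pairs whose gold passage is in the top k."""
    hits = sum(gold in bm25.top_k(question, k) for question, gold in examples)
    return hits / len(examples)


def hard_negatives(bm25, question, gold, n=1):
    """Highest-ranked passages other than the gold one, usable as DPR negatives."""
    ranked = bm25.top_k(question, k=len(bm25.docs))
    return [i for i in ranked if i != gold][:n]


# Toy data, invented for illustration only.
passages = [
    "Seoul is the capital of South Korea",
    "Busan is a port city in South Korea",
    "Paris is the capital of France",
    "BM25 is a ranking function used by search engines",
]
examples = [("capital of South Korea", 0), ("ranking function search", 3)]

bm25 = BM25(passages)
accuracy = top_k_accuracy(bm25, examples, k=1)
negatives = hard_negatives(bm25, "capital of South Korea", gold=0, n=2)
```

Negatives chosen this way share vocabulary with the question but are not the gold passage, which is what makes them harder training examples for a dense retriever than random passages.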
- Reader:
  - klue/roberta-large vs xlm-roberta
- Switching the model in the baseline code to roberta-large improved performance
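
A model swap like this is usually a one-line change of the checkpoint name. The sketch below assumes the baseline loads its reader through Hugging Face `transformers`; `ReaderConfig` and `load_reader` are hypothetical names, and `xlm-roberta-large` is an assumed choice, since the notes only say "xlm-roberta".

```python
from dataclasses import dataclass


@dataclass
class ReaderConfig:
    # Hugging Face Hub checkpoint name; "klue/roberta-large" is the one named in the notes.
    model_name_or_path: str = "klue/roberta-large"


def load_reader(config):
    """Load tokenizer and extractive-QA model for the configured checkpoint."""
    # Imported lazily so the config can be inspected without transformers installed.
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(config.model_name_or_path)
    model = AutoModelForQuestionAnswering.from_pretrained(config.model_name_or_path)
    return tokenizer, model


# The two readers being compared; the second checkpoint size is an assumption.
candidates = [ReaderConfig(), ReaderConfig(model_name_or_path="xlm-roberta-large")]
```

Calling `load_reader(candidates[0])` downloads the checkpoint on first use, so it needs network access and enough memory for a large model.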