2021 10 22 - KR-HappyFace/meetup-logs GitHub Wiki
TMI
- Wooju's birthday!!
- Unicode BERT: emailed Professor Jaegul Choo -> TensorFlow has a unicode tokenizer.
- Google Colab Pro is allocating a new GPU: A100 32GB
Experiments
- Question Generation + Data Augmentation using Pororo
- Passage Retrieval:
  - DPR
  - BM25
  - ElasticSearch
    - About 80% performance at k=3
  - Ensembling also looks promising
    - e.g. use ES when sampling negative passages
- Reader:
  - klue/roberta-large vs xlm-roberta
- Switching the model in the baseline code to roberta-large improved performance
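
For the passage-retrieval notes above, here is a minimal sketch of BM25 ranking, a top-k hit rate, and picking negative passages from the top of the ranking. It is not the team's code: a small in-memory BM25 stands in for ElasticSearch, the whitespace tokenizer, the `k1`/`b` values, the toy English passages, and the helper names (`BM25`, `top_k_accuracy`, `hard_negatives`) are all assumptions, and it assumes the "80% at k=3" figure refers to how often the gold passage appears among the top 3 results.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: whitespace tokenization. Korean text would normally need
    # a morphological or subword tokenizer instead.
    return text.lower().split()


class BM25:
    """Minimal Okapi BM25 scorer over a small in-memory corpus."""

    def __init__(self, passages, k1=1.5, b=0.75):
        self.k1, self.b = k1, b  # assumed defaults, not values from the notes
        self.docs = [tokenize(p) for p in passages]
        self.avg_len = sum(len(d) for d in self.docs) / len(self.docs)
        self.tfs = [Counter(d) for d in self.docs]
        df = Counter(term for d in self.docs for term in set(d))
        n = len(self.docs)
        self.idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}

    def score(self, query, idx):
        tf, length = self.tfs[idx], len(self.docs[idx])
        norm = self.k1 * (1 - self.b + self.b * length / self.avg_len)
        return sum(
            self.idf[t] * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in tokenize(query)
            if t in tf
        )

    def top_k(self, query, k=3):
        # Passage indices sorted by descending BM25 score.
        ranked = sorted(range(len(self.docs)), key=lambda i: self.score(query, i), reverse=True)
        return ranked[:k]


def top_k_accuracy(retriever, queries, gold_ids, k=3):
    # Fraction of queries whose gold passage appears in the top-k results.
    hits = sum(gold in retriever.top_k(q, k) for q, gold in zip(queries, gold_ids))
    return hits / len(queries)


def hard_negatives(retriever, query, gold_id, n=1):
    # Highest-ranked passages that are NOT the gold passage.
    return [i for i in retriever.top_k(query, n + 1) if i != gold_id][:n]


# Toy data (hypothetical, for illustration only).
passages = [
    "seoul is the capital of south korea",
    "bm25 is a ranking function used by search engines",
    "roberta is a transformer language model",
    "the han river flows through seoul",
]
queries = ["capital city seoul", "ranking function for search", "transformer language model"]
gold_ids = [0, 1, 2]

retriever = BM25(passages)
accuracy = top_k_accuracy(retriever, queries, gold_ids, k=3)
negatives = hard_negatives(retriever, queries[0], gold_ids[0], n=1)
```

In this toy setup the negative chosen for the first query is the Han River passage, which shares the word "seoul" with the query but does not answer it. That is the kind of lexically similar negative a sparse retriever tends to surface for training a dense retriever such as DPR.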