NBT v0.01 - SITE5039/nlp_benchmark_tasks GitHub Wiki
This page outlines the requirements that need to be met to roll out NBT version 0.01.
As most of the contributors work on distinct research directions, NBT will evolve breadth-wise, by adding more downstream NLP tasks, rather than depth-wise.
For v0.01, after an initial discussion, ZiqiaoWangGeothe, rzTian, and yottabytt came up with the following set of approaches and requirements to take the project forward.
- Stick to Python 3.6 and PyTorch 1.1 (the stable version at the time of writing).
- Bucketize the downstream tasks according to distinct and widely accepted measures of performance, for example F1, BLEU, and perplexity.
- Pick the tasks breadth-wise and integrate them into NBT.
| Distance metric | F1 | BLEU/ROUGE | Perplexity |
|---|---|---|---|
| Word Similarity | Word Sense Disambiguation | Summarization | Language Modeling |
| Word Analogy | Question Answering | Machine Translation | |
The table above gives a brief idea of how they plan to choose and integrate applications into NBT. The tasks were chosen because current performance on them is relatively lower than on the tasks that were not chosen. By breadth-wise, they mean picking applications row-wise. The table is subject to change in the future (they expect the changes to be mostly new additions).
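The row-wise picking order can be illustrated with a minimal Python sketch. The `BUCKETS` list and the `breadth_wise` helper are hypothetical names introduced here for illustration and are not part of NBT; only the metric and task names come from the table.

```python
from itertools import zip_longest

# Metric buckets copied from the table; the list-of-tuples layout is an
# assumption made for this sketch (it keeps column order on Python 3.6).
BUCKETS = [
    ("Distance metric", ["Word Similarity", "Word Analogy"]),
    ("F1", ["Word Sense Disambiguation", "Question Answering"]),
    ("BLEU/ROUGE", ["Summarization", "Machine Translation"]),
    ("Perplexity", ["Language Modeling"]),
]


def breadth_wise(buckets):
    """Yield (metric, task) pairs row by row, skipping empty cells."""
    metrics = [metric for metric, _ in buckets]
    columns = [tasks for _, tasks in buckets]
    # zip_longest pads shorter columns with None so every row has one
    # slot per metric, mirroring the empty cell in the table.
    for row in zip_longest(*columns):
        for metric, task in zip(metrics, row):
            if task is not None:
                yield metric, task


order = list(breadth_wise(BUCKETS))
for metric, task in order:
    print(metric, "->", task)
```

Under this reading, one task from every metric bucket is integrated before a second task is taken from any bucket, so a new row in the table only extends the tail of the order.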
To identify tasks, datasets, and current state-of-the-art (SOTA) code, they plan to consult NLP-progress.