Questions Homographs - ufal/NPFL095 GitHub Wiki
- What types of homographs are discussed in this paper? Can you provide examples of these homographs in a language other than Hebrew for each type (or at least for some of the types)?
- Figure 1 shows a box-and-whisker plot for each of the 4 methods. A box-and-whisker plot summarizes quartiles (1st, 2nd = median, 3rd) and outliers of a sample. How large is the sample, i.e. how many F1 scores are summarized by each plot?
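As a reminder of what such a plot encodes, here is a minimal Python sketch that computes the quantities a box-and-whisker plot draws. The scores are made up for illustration and are not taken from the paper, the helper name `box_plot_summary` is hypothetical, and the 1.5 × IQR rule for outliers is a common convention (assumed here, since the paper's plotting settings are not stated in this question).

```python
from statistics import quantiles


def box_plot_summary(scores):
    """Return quartiles, whisker ends, and outliers (1.5 * IQR convention)."""
    # quantiles() sorts the data itself; "inclusive" treats the sample
    # as containing its own minimum and maximum.
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme points that are still inside the fences.
    inside = [s for s in scores if low_fence <= s <= high_fence]
    outliers = [s for s in scores if s < low_fence or s > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


# Made-up F1-like scores, NOT from the paper.
toy_scores = [0.62, 0.88, 0.90, 0.91, 0.93, 0.94, 0.95, 0.97]
summary = box_plot_summary(toy_scores)
print(summary)
```

Each box in such a plot is built from one list like `toy_scores`; the question above asks how long that list is for each method in Figure 1.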
- Section 3 mentions "two types of PLMs, contextualized and non-contextualized." How would you define or describe these two types of PLMs? What is the difference between them? Section 3 also mentions "a BiLSTM on top of the word2vec embeddings". Is this contextualized or non-contextualized?
- Why are homographs with a highly skewed distribution challenging for pretrained language models such as AlephBERT? How did the authors try to handle this issue? Do you consider their approach reasonable, i.e. why might it work, or why not?
- (Bonus) Why do we need to disambiguate Hebrew homographs?