sept lesson - SNUDerek/MLsnippets GitHub Wiki
Machine Learning Review / Big Picture:
- data: with features x_1...x_n, labels y
- error (loss) function: must be differentiable
- partial derivatives: how to adjust each 'knob'
- gradient descent: adjusting each knob iteratively
- learning rate: taking smaller steps for stability
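The bullets above can be sketched end-to-end; this is a toy 1-feature linear regression (data made up for illustration), fit by adjusting the two "knobs" `w` and `b` with gradient descent on a differentiable MSE loss:

```python
import numpy as np

# toy data: features x, labels y generated from y = 2x + 1 (no noise)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

w, b = 0.0, 0.0  # the two "knobs" to adjust
lr = 0.05        # learning rate: smaller steps for stability

for _ in range(2000):
    err = (w * x + b) - y              # residual of differentiable MSE loss
    grad_w = 2.0 * np.mean(err * x)    # partial derivative of loss wrt w
    grad_b = 2.0 * np.mean(err)        # partial derivative of loss wrt b
    w -= lr * grad_w                   # step each knob down its gradient
    b -= lr * grad_b
```

After training, `w` and `b` should recover roughly 2 and 1; a larger learning rate would converge faster up to the point where the updates overshoot and diverge.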
cross-entropy
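A minimal sketch of cross-entropy as a classification loss (toy numbers, one-hot true label): it penalizes putting low probability on the correct class, so a confident correct prediction scores lower (better) than an uncertain one.

```python
import numpy as np

def cross_entropy(p_true, p_pred, eps=1e-12):
    # H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)
    p_pred = np.clip(p_pred, eps, 1.0)
    return float(-np.sum(p_true * np.log(p_pred)))

y_true = np.array([0.0, 1.0, 0.0])                       # one-hot: class 1
confident = cross_entropy(y_true, np.array([0.05, 0.90, 0.05]))
uncertain = cross_entropy(y_true, np.array([0.30, 0.40, 0.30]))
```

With a one-hot target the sum collapses to `-log(q_correct)`, which is why this loss pairs naturally with a softmax output.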
back-propagation
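Back-propagation is just the chain rule applied layer by layer; a hand-rolled sketch for a single sigmoid neuron (made-up numbers) makes the bookkeeping visible:

```python
import numpy as np

x, y = 1.5, 1.0   # one input, one target (toy values)
w, b = 0.5, 0.0
lr = 0.1

losses = []
for _ in range(200):
    # forward pass
    z = w * x + b
    a = 1.0 / (1.0 + np.exp(-z))       # sigmoid activation
    losses.append(0.5 * (a - y) ** 2)  # squared-error loss
    # backward pass: chain rule, one local derivative per step
    dloss_da = a - y
    da_dz = a * (1.0 - a)              # sigmoid derivative
    dz_dw, dz_db = x, 1.0
    w -= lr * dloss_da * da_dz * dz_dw
    b -= lr * dloss_da * da_dz * dz_db
```

In a real multi-layer network each layer reuses the gradient flowing back from the layer above it, which is exactly what frameworks automate.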
why neural networks?
- "Universal Approximation Property"
- see Deep Learning Book
NLP applications
resources:
preprocessing:
live example in colab: tokenization, stemming, etc.
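Not the colab notebook itself, but a self-contained sketch of the same preprocessing steps: lowercasing, regex tokenization, and a naive suffix-stripping "stemmer" (a toy stand-in for a real stemmer such as NLTK's PorterStemmer):

```python
import re

def tokenize(text):
    # lowercase, then pull out alphabetic word tokens
    return re.findall(r"[a-z]+", text.lower())

def naive_stem(token):
    # toy suffix stripping; a real stemmer handles many more rules
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

tokens = tokenize("The tokenizers tokenized the sentences.")
stems = [naive_stem(t) for t in tokens]
```

Note how crude stemming produces non-words like `tokeniz`; that is normal and fine for bag-of-words features, which only need consistent collapsing of variants.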
word representations: one-hot vs dense word vectors:
- UAP intuition: learn a function that 'maps' words into a 'meaning space'
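The contrast can be shown with cosine similarity: under one-hot vectors every word pair is equally unrelated, while in a dense "meaning space" related words land near each other. The dense vectors below are hand-made for illustration (word2vec would learn them):

```python
import numpy as np

vocab = ["cat", "dog", "pizza"]

# one-hot: each word gets its own axis; all pairs are orthogonal
one_hot = {w: np.eye(len(vocab))[i] for i, w in enumerate(vocab)}

# dense: hypothetical 2-d embeddings placing cat/dog close together
dense = {
    "cat":   np.array([0.9, 0.1]),
    "dog":   np.array([0.8, 0.2]),
    "pizza": np.array([0.1, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

With one-hot vectors `cosine(cat, dog)` is exactly 0, same as `cosine(cat, pizza)`; the dense vectors recover the intuitive ranking.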
[word2vec intuition](https://towardsdatascience.com/word2vec-skip-gram-model-part-1-intuition-78614e4d6e0b)
language modeling, perplexity
n-gram models, sequence models (MEMM/CRF), RNN
[stanford LM slides](https://web.stanford.edu/class/cs124/lec/languagemodeling.pdf)
[my project: LM classification](https://github.com/SNUDerek/lm_perplexity_bootstrapping)
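A minimal sketch of the language-modeling and perplexity ideas, using the simplest n-gram model (a unigram model) on a toy corpus; perplexity is `exp(-(1/N) * sum_i log p(w_i))`, and lower means the model is less "surprised" by the text:

```python
import math
from collections import Counter

# toy training corpus (unsmoothed unigram counts)
train = "the cat sat on the mat".split()
counts = Counter(train)
total = sum(counts.values())

def unigram_prob(w):
    return counts[w] / total

def perplexity(tokens):
    # exp of the average negative log-probability per token
    log_prob = sum(math.log(unigram_prob(w)) for w in tokens)
    return math.exp(-log_prob / len(tokens))
```

For a unigram model, the perplexity of a single word is simply `1 / p(word)`, so frequent words score lower than rare ones; real n-gram models additionally need smoothing to assign nonzero probability to unseen words.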