Yann LeCun, Facebook, FirstMark talk - fcrimins/fcrimins.github.io GitHub Wiki
https://www.youtube.com/watch?v=AbjVdBKfkO0 Yann LeCun, Facebook // Artificial Intelligence // Data Driven #32 (Hosted by FirstMark Capital)
FirstMark Capital is a NYC venture capital firm. Attendees included a dude from Citrix who identified himself before asking a question.
- there used to be two big rival ML camps: Deep Learning (e.g. Convolutional Neural Nets) vs. SVMs
- "I'm sure those of you who have done data science spend a lot of time selecting/engineering features, but then your classifier is just a standard module--e.g. logistic regression or random forests or whatever your favorite method is. And so if you can automate the process of engineering the features, there's a lot more problems you can apply ML to."
- "DL is a conspiracy: to pick techniques that would be interesting and move away from SVMs...unbelievable success."
- "companies that have the data (Google, FB, IBM--not really IBM) are in a position to take advantage relative to companies that have the technology but not the data"
- DL - automate the process of feature engineering
- e.g. pixels aren't very informative so you have to combine them
- blocks of pixels are correlated => there's a more efficient way to represent blocks
- pixels -> patches/blocks -> motifs -> parts of objects -> objects -> etc.
- a hierarchical structure!
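The first rung of that hierarchy can be sketched as a single convolution: combining neighboring pixels with a filter to get a more informative feature map. This is an illustrative toy (the image and the hand-picked edge filter are made up, not from the talk); in a real CNN the filters are learned, and stacking such layers yields motifs, parts, and objects.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Plain 'valid' 2D cross-correlation, no padding or stride."""
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.zeros((5, 6))
img[:, 3:] = 1.0                       # left half dark, right half bright
edge_filter = np.array([[-1.0, 1.0]])  # responds to left-to-right intensity change
features = conv2d_valid(img, edge_filter)
print(features)  # nonzero only in the column where the edge sits
```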
- "back propagation is a practical application of the chain rule, but it took til the 80s to realize this"
- FWC - back propagation works because the derivative of `f(g(x))` is `f'(g(x)) * g'(x)`
  - i.e. you can take the derivative at the last layer of a NN (`f'`) and use it to compute the derivative w.r.t. the second-to-last layer (via `g'`) ... and so on and so on backwards through the layers
  - specifically we want the partial derivative of the cost function, `J`, w.r.t. the weights/parameters, `W` or `Theta`: `dJ/dW`
  - using the chain rule, this can be decomposed into `dJ/dW = dJ/dyhat * dyhat/dW`, where `yhat = Wa` is the layer's output and `a = h(x)` is the hidden/previous-layer activations
  - `dJ/dyhat = d/dyhat 0.5*sum((Wa - y)^2) = (Wa - y)` (with `y` the target)
  - `dyhat/dW = d/dW Wa = a`
  - so `dJ/dW = (Wa - y) * a`
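That gradient for a single linear layer with squared-error cost can be sanity-checked numerically. A minimal NumPy sketch (the shapes and random values are made up for the demo, not from the talk):

```python
import numpy as np

# Toy single linear layer: yhat = W @ a, cost J = 0.5 * sum((W a - y)^2)
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # weights
a = rng.normal(size=4)        # previous-layer activations
y = rng.normal(size=3)        # targets

yhat = W @ a
# Chain rule: dJ/dW = dJ/dyhat * dyhat/dW = (W a - y) outer a
grad_analytic = np.outer(yhat - y, a)

# Finite-difference check of the same gradient
eps = 1e-6
grad_numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        Jp = 0.5 * np.sum((Wp @ a - y) ** 2)
        Jm = 0.5 * np.sum((Wm @ a - y) ** 2)
        grad_numeric[i, j] = (Jp - Jm) / (2 * eps)

print(np.max(np.abs(grad_analytic - grad_numeric)))  # tiny: the two agree
```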
- size of typical NNs
- hundreds of thousands of inputs
- 1-10 billion multiply-accumulate operations (can't do this on CPUs--need GPUs)
- 100s of millions of internal neurons
- repeat 100s of millions of times to train properly
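A back-of-envelope check of the multiply-accumulate (MAC) claim: a dense layer costs inputs × outputs MACs per forward pass. The layer widths below are hypothetical, chosen only to show the order of magnitude; real nets (especially convolutional ones) differ.

```python
# Assumed layer widths, not from the talk: input, three hidden, output
layers = [200_000, 4096, 4096, 4096, 1000]

# One dense layer of shape (n_in, n_out) costs n_in * n_out MACs
macs = sum(n_in * n_out for n_in, n_out in zip(layers, layers[1:]))
print(f"{macs:,} MACs per forward pass")  # already near a billion for this toy stack
```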
- timeline
- speech recognition handled solely by DL since 2011 (Apple, Google, Microsoft)
- image recognition since 2012
- NLP is next
- embedding methods
- mapping words to vectors
- meaning of word (1 vector) plus syntactic role (e.g. noun, verb, etc.; another vector)
- Word2Vec
- this is a specific technique
- compositional properties, e.g. king - man + woman ≈ queen
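The compositional property (e.g. king - man + woman ≈ queen) can be illustrated with vector arithmetic. The 3-d vectors below are hand-made for the demo; real word2vec embeddings are learned from text and live in hundreds of dimensions.

```python
import numpy as np

# Hand-crafted toy embeddings (illustrative only, not learned)
vecs = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.9, 0.1, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "queen": np.array([0.1, 0.8, 0.9]),
    "apple": np.array([0.5, 0.5, 0.5]),
}

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# king - man + woman lands near queen
target = vecs["king"] - vecs["man"] + vecs["woman"]
best = max((w for w in vecs if w not in ("king", "man", "woman")),
           key=lambda w: cosine(vecs[w], target))
print(best)  # "queen"
```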
- "We cease to be the lunatic fringe. We're now the lunatic core."
- industry picked up on DL faster than academia, owing to academia's resistance to it
- Facebook AI code is mostly open source
- Yann is teaching (or taught) a course at NYU this past spring (?) that is supposedly freely available w/ lectures on the web
- the window is going to close very quickly for startups in DL due to a few reasons
- not easy to get data
- good people already hired by Google/FB
- good companies already sold to Google/FB
- there was a gold rush over 2014, but that window is closing
- Q: as a startup, better to provide a vertical (industry-specific) or horizontal (general) solution?
  - A: vertical, b/c horizontal problems like image recognition are already solved
- Yann is a co-founder of MuseAmi - DL for music related stuff
- medical imaging not very well explored and there are still opportunities for small companies
- the major undiscovered DL principle is unsupervised learning -- e.g. learning how the brain really works
- NNs are very weakly inspired by neuroscience
- akin to how airplanes are inspired by birds
- need yet to discover the underlying principles of intelligence
- just like aerodynamics are the underlying principles of birds/planes/flight
- a bird specialist would talk a lot about feathers, if asked, but that doesn't figure at all into planes
- Geoff Hinton has a Coursera course on ML, but it doesn't cover NLP (a negative according to Yann)
- Yoshua Bengio has a free textbook online, co-authored w/ Aaron Courville and Ian Goodfellow
- Yann has an NVidia Webinar (watch it) and also see these