Classical_Ml - RicoJia/notes GitHub Wiki

========================================================================

Curriculum

========================================================================

  • Projects

    • Build a spam filter
    • Classify images of flowers
    • Predict the price of a house
    • Recommend products to users

  • Online courses:

    • Machine Learning for Absolute Beginners
    • Andrew Ng's Stanford Introduction to Machine Learning course
    • Coursera's Machine Learning course

This is only a suggested curriculum; adjust it to your own interests and goals. Machine learning is a broad field that takes time and effort to learn, so be patient and persistent.

========================================================================

SVM

========================================================================

  1. Linear SVM material

    1. Hard margin

      • goal: find the separating hyperplane that maximizes the margin while classifying every training point correctly (original figures omitted; see the objective sketch after this list)
    2. Soft margin: add a penalty for margin violations (slack variables) to the cost function

      - the new cost function trades off margin width against the total amount of slack (see the objective sketch after this list)

  2. Non-linear SVM material:

    • Motivation

      1. The shape of the decision boundary is determined by the kernel.

      2. Map the input space to a higher-dimensional feature space where the classes are easier to separate linearly.

      3. Equivalently, in the dual objective the data only appear through inner products, so the feature mapping and the inner product can be combined into a single kernel evaluation, which may save a lot of computation.

    • example kernel functions

    • kernel trick (see the kernel sketch after this list)

  3. Multiclass classification, for K classes, explanation (see the scikit-learn sketch after this list)

    1. OVO (one versus one): K(K-1)/2 binary classifiers are trained, then you gather votes from each pairwise comparison.
      • With 4 classes the pairs are (A,B), (A,C), (A,D), (B,C), (B,D), (C,D); if the true class is D, the three classifiers that involve D should all pick D, giving D 3 votes.
      • The best any other class can do is win its remaining pairs, e.g. A beats B and C for 2 votes, so D still wins.
      • Disadvantage: with n samples per class, each binary classifier trains on 2n samples, so across C(K,2) classifiers the total is n*K(K-1) training samples, larger but not by orders of magnitude. The bigger cost is prediction time, since all K(K-1)/2 classifiers must be evaluated.
    2. OVR (one versus rest): K classifiers, so with 4 classes run 4 times: is A? is B? is C? is D?
      • Bias: each binary problem sees many more negatives than positives, and the imbalance grows with the number of classes.
      • In total, K classifiers each trained on all K*n samples, i.e. K^2 * n data points (16n for the 4-class example).
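
A sketch of the hard-margin and soft-margin objectives (and the soft-margin dual) referenced in the Linear SVM item above, in standard notation: w, b are the hyperplane parameters, ξᵢ the slack variables, C the penalty weight, αᵢ the dual variables. The symbols are assumptions, since the original figures are missing.

```latex
% Hard margin (primal): maximize the margin 2/||w|| while classifying every point correctly
\min_{w,b} \; \tfrac{1}{2}\|w\|^2
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1, \; i = 1,\dots,n

% Soft margin (primal): allow margin violations \xi_i, penalized with weight C
\min_{w,b,\xi} \; \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i\,(w^\top x_i + b) \ge 1 - \xi_i, \; \xi_i \ge 0

% Soft-margin dual: the data appear only through inner products x_i^\top x_j
\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_{i=1}^{n} \alpha_i y_i = 0
```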
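
A sketch of the kernel trick: in the dual, every inner product x_i^T x_j is replaced by K(x_i, x_j) = φ(x_i)^T φ(x_j), so the high-dimensional mapping φ never has to be computed explicitly. The example kernels below (linear, polynomial, RBF) are standard textbook choices, not ones named in the original notes.

```latex
% Example kernel functions
K_{\text{linear}}(x,z) = x^\top z, \qquad
K_{\text{poly}}(x,z) = (\gamma\, x^\top z + r)^d, \qquad
K_{\text{RBF}}(x,z) = \exp\!\big(-\gamma \|x - z\|^2\big)

% Kernelized dual and decision function: same as the linear case with x_i^\top x_j \to K(x_i, x_j)
\max_{\alpha} \; \sum_i \alpha_i - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, K(x_i, x_j),
\qquad
f(x) = \operatorname{sign}\!\Big( \sum_i \alpha_i y_i \, K(x_i, x) + b \Big)
```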
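
A minimal OVO-vs-OVR sketch using scikit-learn (an assumed library choice; the notes do not name one). The 4-class synthetic dataset stands in for the A/B/C/D example above and shows the classifier counts: 6 for OVO, 4 for OVR.

```python
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

# Toy 4-class problem (stand-in for classes A, B, C, D above)
X, y = make_classification(n_samples=400, n_features=6, n_informative=4,
                           n_classes=4, random_state=0)

# One-vs-one: K(K-1)/2 = 6 binary SVMs, prediction by majority vote
ovo = OneVsOneClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)

# One-vs-rest: K = 4 binary SVMs ("is class k?" vs. everything else)
ovr = OneVsRestClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)

print(len(ovo.estimators_), len(ovr.estimators_))  # -> 6 4
```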

========================================================================

Recommendation System

========================================================================

  1. Thompson Sampling (see the Beta-Bernoulli sketch at the end of this section)

  2. Collaborative Filtering

    • User-User: users with similar rating histories tend to like the same items
    • How does it work: predict a user's rating of an unseen item from the ratings of similar users (see the collaborative-filtering sketch at the end of this section)
  3. Project Ideas:

    • What if there is an "ingredient dictionary"?
    • Make sure you have a "large" enough base of items/users to recommend from; store it in MySQL.
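
A minimal Thompson Sampling sketch for the "which item to recommend next" problem, modeled as a Bernoulli bandit with a Beta posterior per item. Only numpy is used, and the click-through rates are made-up illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)
true_ctr = np.array([0.05, 0.12, 0.08])   # hypothetical click-through rate per item
successes = np.ones(len(true_ctr))        # Beta prior alpha = 1
failures = np.ones(len(true_ctr))         # Beta prior beta = 1

for _ in range(10_000):
    # Sample a plausible CTR per item from its Beta posterior, recommend the best sample
    samples = rng.beta(successes, failures)
    item = int(np.argmax(samples))
    clicked = rng.random() < true_ctr[item]   # simulated user feedback
    successes[item] += clicked
    failures[item] += 1 - clicked

# The posterior means should concentrate on the best item (index 1 here)
print(np.argmax(successes / (successes + failures)))
```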
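
A minimal user-user collaborative-filtering sketch: predict a user's rating of an unseen item as a cosine-similarity-weighted average of other users' ratings for that item. The tiny ratings matrix is made up for illustration; a real system would pull it from the database (e.g. the MySQL store mentioned above).

```python
import numpy as np

# Rows = users, columns = items; 0 means "not rated yet" (toy data)
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def predict(user, item):
    # Weight every other user's rating of `item` by their similarity to `user`
    num, den = 0.0, 0.0
    for other in range(R.shape[0]):
        if other == user or R[other, item] == 0:
            continue
        sim = cosine_sim(R[user], R[other])
        num += sim * R[other, item]
        den += abs(sim)
    return num / den if den else 0.0

print(round(predict(user=0, item=2), 2))  # user 0's predicted rating for item 2
```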