Recommendation System - utkaln/machine-learning GitHub Wiki
Recommendation Similar to Regression
- A recommendation system can be decomposed into a series of linear regressions, aggregated over many users and many features.
- The major difference is that the feature values `x` of the dataset are missing, and finding them is part of the objective of the recommendation. The system learns from the ratings given by other users, using the linear regression form `(w * x) + b`. In a recommendation problem `x` is unknown, and `w` and `b` are unknown as well.
- The cost function is therefore not just a function of `w` and `b`; it also includes `x`.
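To make the point about the cost depending on `w`, `b`, and `x` concrete, here is a minimal NumPy sketch. The function name `cofi_cost`, the matrix layout (items in rows, users in columns), the mask `R`, and the L2 regularization term are assumptions chosen for illustration; they are not taken from this page.

```python
import numpy as np

def cofi_cost(X, W, b, Y, R, lambda_=0.0):
    """Squared-error cost over rated entries, plus optional L2 regularization.

    X: (num_items, num_features) item feature vectors (learned, not given)
    W: (num_users, num_features) user parameter vectors
    b: (num_users,) user scalar terms
    Y: (num_items, num_users) ratings
    R: (num_items, num_users) 1 where a rating exists, else 0
    """
    pred = X @ W.T + b                 # w * x + b for every (item, user) pair
    err = (pred - Y) * R               # entries without a rating contribute 0
    cost = 0.5 * np.sum(err ** 2)
    # Regularize both W and X, since both are being learned (an assumption here).
    reg = 0.5 * lambda_ * (np.sum(W ** 2) + np.sum(X ** 2))
    return cost + reg

# Tiny hypothetical example: 2 items, 2 users, 2 features.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
W = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.zeros(2)
Y = np.array([[1.0, 5.0], [3.0, 1.0]])
R = np.array([[1.0, 0.0], [1.0, 1.0]])
print(cofi_cost(X, W, b, Y, R))
```

Because `X` appears in the cost alongside `W` and `b`, gradient descent can update all three at once.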
Recommendation Similar to Classification (Binary System)
- A simpler rating process can be summarized as a binary value: 0 or 1
- Examples
1 | 0 | ? |
---|---|---|
Liked | Did not Like | Did not Engage |
Rated | Did not Like | Did not Respond |
Clicked | Did not Click | Was not Presented |
Watched more than 30 seconds | Watched less than 30 seconds | Was not Presented |
- Prediction in this case applies the sigmoid function to `(w * x) + b`, computed for every user
- The cost function is similar to that of logistic regression; the only addition is that it is summed over all users
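The binary case can be sketched as follows. This is an illustrative example only: the names `sigmoid` and `binary_cofi_cost`, the mask `R` (1 where the user engaged, 0 for the `?` column), and the small `eps` guard are assumptions rather than anything defined on this page.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cofi_cost(X, W, b, Y, R):
    """Logistic (cross-entropy) loss summed over all observed item/user pairs.

    Y holds 0/1 labels; R is 1 where a label exists and 0 for '?' entries.
    """
    prob = sigmoid(X @ W.T + b)        # predicted probability of a 1
    eps = 1e-12                        # avoids log(0); an illustrative safeguard
    loss = -(Y * np.log(prob + eps) + (1 - Y) * np.log(1 - prob + eps))
    return np.sum(loss * R)            # '?' entries are masked out

# Hypothetical data: with all-zero parameters every prediction is 0.5.
X = np.zeros((2, 2))
W = np.zeros((2, 2))
b = np.zeros(2)
Y = np.array([[1.0, 0.0], [0.0, 1.0]])
R = np.array([[1.0, 1.0], [1.0, 0.0]])   # last entry was never presented
print(binary_cofi_cost(X, W, b, Y, R))
```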
Collaborative Filtering System
- Identifies similar items by learning features from a large amount of rating data collected from many users
Mean Normalization
- Allows a sensible prediction for new, unknown data (for example a user with no ratings yet), so that the prediction does not default to zero when that user's parameters are zero
- Also increases the speed of gradient descent
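A minimal sketch of mean normalization, assuming ratings are stored with items in rows and users in columns, and that a mask `R` marks which ratings exist. The helper name `normalize_ratings` is hypothetical.

```python
import numpy as np

def normalize_ratings(Y, R):
    """Subtract each item's mean rating, using only the ratings that exist."""
    counts = R.sum(axis=1, keepdims=True)
    # np.maximum guards against division by zero for items nobody has rated.
    Ymean = (Y * R).sum(axis=1, keepdims=True) / np.maximum(counts, 1)
    Ynorm = (Y - Ymean) * R
    return Ynorm, Ymean

# Hypothetical ratings: the third user (last column) has rated nothing.
Y = np.array([[5.0, 4.0, 0.0], [1.0, 0.0, 0.0]])
R = np.array([[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
Ynorm, Ymean = normalize_ratings(Y, R)

# With zero parameters the raw prediction is 0, so adding the mean back
# gives the new user each item's average rating instead of 0.
new_user_prediction = 0.0 + Ymean[:, 0]
print(new_user_prediction)
```

The model is trained on `Ynorm`, and `Ymean` is added back at prediction time.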
Tensorflow for Gradient Descent Calculation
Reference to Gradient Descent Diagram
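- `tf.GradientTape()` records the operations used to compute the cost so that they can be differentiated
- `tape.gradient()` returns the partial derivative of the cost with respect to a parameter, which is used to adjust that parameter and reduce the cost
- `w.assign_add(-alpha * dJdw)` is the update step that is applied repeatedly to optimize `w`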
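The three calls fit together roughly as in the sketch below. It assumes TensorFlow 2.x with eager execution; the single training point (`x = 1.0`, `y = 1.0`), the fixed `b`, the learning rate, and the iteration count are made-up values for illustration.

```python
import tensorflow as tf

w = tf.Variable(3.0)        # parameter to optimize
x, y, b = 1.0, 1.0, 0.0     # one hypothetical training example, b held fixed
alpha = 0.1                 # learning rate (illustrative)

for _ in range(30):
    # Record the operations that compute the cost so they can be differentiated.
    with tf.GradientTape() as tape:
        f_wb = w * x + b
        cost_J = (f_wb - y) ** 2

    # Partial derivative of the cost with respect to w.
    dJdw = tape.gradient(cost_J, w)

    # Gradient descent step: move w against the gradient.
    w.assign_add(-alpha * dJdw)

print(w.numpy())            # should end up close to 1.0, the value that minimizes the cost
```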
Collaborative Filtering Recommender System
- This algorithm predicts the most likely rating of a product (or other data item) based on the ratings users have already given
- It can be represented using the linear regression equation `w * x + b`, where `w` is the vector that represents the user's preferences, `x` is the vector for the data (product), and `b` is the scalar term that accompanies the user vector
- It generates two vectors:
  - A vector for each user, `w`, holding the parameters that represent the ratings of that user
  - A vector for each product (data), `x`, of the same size as `w`
- The dot product of these two vectors, plus the scalar term, represents the likely rating by the user
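As a small worked example of the prediction step, the sketch below computes `w * x + b` for one user and one product. The vectors and the helper name `predict_rating` are hypothetical values chosen for illustration.

```python
import numpy as np

def predict_rating(w, x, b):
    """Likely rating: dot product of user and product vectors plus the scalar term."""
    return float(np.dot(w, x) + b)

w = np.array([0.5, 0.25])   # hypothetical user vector
x = np.array([4.0, 2.0])    # hypothetical product vector, same size as w
b = 0.5                     # hypothetical scalar term for this user

print(predict_rating(w, x, b))   # 0.5*4.0 + 0.25*2.0 + 0.5 = 3.0
```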