Funk-SVD

LensKit provides an implementation of FunkSVD, an SVD-like collaborative filtering algorithm that uses gradient descent to learn a matrix factorization. This code lives in the lenskit-svd module, under the org.grouplens.lenskit.mf.funksvd package.

Quick Start

To use FunkSVD, configure FunkSVDItemScorer as your ItemScorer implementation. There are other parameters you can tune as well. The following configuration trains 25 features for 125 iterations each, using the default learning rate and regularization:

bind ItemScorer to FunkSVDItemScorer
bind BaselinePredictor to ItemUserMeanPredictor
set FeatureCount to 25
set IterationCount to 125

Configuration Points

As with all LensKit algorithms, the FunkSVD implementation is highly configurable, so you can experiment with many variants. This section describes the primary configuration points for customizing the default components that drive it.

The FunkSVD item scorer uses a FunkSVDModel, which in turn is built by FunkSVDModelBuilder. The model builder JavaDoc is the starting point for discovering most of the configuration points for training the model. Both the model and the scorer use a FunkSVDUpdateRule to do training updates; this component cannot be directly replaced at present, but transitively depends on many of the other configuration variables that control FunkSVD.

Here are some of the additional configuration points (‘@’ indicates a parameter to be set with set rather than bind):

  • @BaselineScorer — the FunkSVD algorithm learns to predict residuals from a baseline; the baseline scorer configures what that baseline is. UserMeanItemScorer with a baseline of ItemMeanRatingItemScorer is a good choice, and corresponds to Funk's original design.
  • @FeatureCount — the number of latent features to learn.
  • @InitialFeatureValue — the initial value to use for every user-feature and item-feature value. The default of 0.1 is probably suitable for most applications.
  • @LearningRate — the gradient descent learning rate.
  • @RegularizationTerm — the coefficient on the regularization term used to prefer small user-feature and item-feature values.
  • @UseTrailingEstimate – a boolean flag controlling whether trailing estimates are used when computing a feature. If true (the default), then when training one feature, the initial feature value will be used for all features that have not yet been trained. If false, then those values will be considered to be 0 and ignored.
  • ClampingFunction — a function that will be applied to the prediction after each feature is added. By default, this is the identity function; you can also use RatingRangeClampingFunction.
  • StoppingCondition — the condition used to stop the training loop for each feature. The default stopping condition is IterationCountStoppingCondition, which stops after a fixed number of epochs (controlled by @IterationCount). There are other stopping conditions in org.grouplens.lenskit.iterative.
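To make the roles of these parameters concrete, here is a minimal, self-contained Java sketch of FunkSVD-style training. It is not LensKit's code and does not use the LensKit API: the class name FunkSvdSketch, the constants, the toy data, and the use of a single global mean as the baseline are all assumptions made for illustration. It trains one feature at a time, adds the trailing estimate for untrained features, clamps after each feature, and stops after a fixed number of epochs.

```java
import java.util.Arrays;

/** Hypothetical, self-contained sketch of FunkSVD-style training; not LensKit code. */
class FunkSvdSketch {
    // Illustrative stand-ins for the parameters above; the values are assumptions.
    static final int FEATURE_COUNT = 2;
    static final double INITIAL_FEATURE_VALUE = 0.1;
    static final double LEARNING_RATE = 0.01;
    static final double REGULARIZATION_TERM = 0.015;
    static final int ITERATION_COUNT = 200;
    static final boolean USE_TRAILING_ESTIMATE = true;

    // Toy data: parallel arrays of (user, item, rating) triples.
    static final int[] USERS = {0, 0, 0, 1, 1, 1, 2, 2, 2};
    static final int[] ITEMS = {0, 1, 2, 0, 1, 2, 0, 1, 2};
    static final double[] RATINGS = {5, 4, 1, 4, 5, 2, 1, 2, 5};

    /** Clamps to a 1-5 rating range, the role a rating-range clamping function plays. */
    static double clamp(double x) {
        return Math.max(1.0, Math.min(5.0, x));
    }

    /** Trains one feature at a time; returns {userFeatures, itemFeatures} as [feature][id]. */
    static double[][][] train(int[] users, int[] items, double[] ratings,
                              int nUsers, int nItems, double baseline) {
        double[][] uf = new double[FEATURE_COUNT][nUsers];
        double[][] itf = new double[FEATURE_COUNT][nItems];
        for (double[] row : uf) Arrays.fill(row, INITIAL_FEATURE_VALUE);
        for (double[] row : itf) Arrays.fill(row, INITIAL_FEATURE_VALUE);

        // Running estimate per rating: baseline plus every feature trained so far.
        double[] estimates = new double[ratings.length];
        Arrays.fill(estimates, baseline);

        for (int f = 0; f < FEATURE_COUNT; f++) {
            // Untrained features still hold the initial value on both sides.
            double trailing = USE_TRAILING_ESTIMATE
                    ? (FEATURE_COUNT - f - 1) * INITIAL_FEATURE_VALUE * INITIAL_FEATURE_VALUE
                    : 0.0;
            for (int epoch = 0; epoch < ITERATION_COUNT; epoch++) {
                for (int k = 0; k < ratings.length; k++) {
                    int u = users[k], i = items[k];
                    double pu = uf[f][u], qi = itf[f][i];
                    double pred = clamp(clamp(estimates[k] + pu * qi) + trailing);
                    double err = ratings[k] - pred;
                    // Regularized gradient step on both feature values.
                    uf[f][u] += LEARNING_RATE * (err * qi - REGULARIZATION_TERM * pu);
                    itf[f][i] += LEARNING_RATE * (err * pu - REGULARIZATION_TERM * qi);
                }
            }
            // Fold the finished feature into the running estimates.
            for (int k = 0; k < ratings.length; k++) {
                estimates[k] = clamp(estimates[k] + uf[f][users[k]] * itf[f][items[k]]);
            }
        }
        return new double[][][] {uf, itf};
    }

    /** Scores a user-item pair: baseline plus feature products, clamped after each. */
    static double score(double[][][] model, int user, int item, double baseline) {
        double s = baseline;
        for (int f = 0; f < FEATURE_COUNT; f++) {
            s = clamp(s + model[0][f][user] * model[1][f][item]);
        }
        return s;
    }

    /** Root-mean-square error of the model on the toy data. */
    static double rmse(double[][][] model, double baseline) {
        double sse = 0.0;
        for (int k = 0; k < RATINGS.length; k++) {
            double err = RATINGS[k] - score(model, USERS[k], ITEMS[k], baseline);
            sse += err * err;
        }
        return Math.sqrt(sse / RATINGS.length);
    }

    public static void main(String[] args) {
        double baseline = Arrays.stream(RATINGS).average().orElse(0.0);
        double[][][] untrained = {new double[FEATURE_COUNT][3], new double[FEATURE_COUNT][3]};
        double[][][] model = train(USERS, ITEMS, RATINGS, 3, 3, baseline);
        System.out.println("baseline-only RMSE: " + rmse(untrained, baseline));
        System.out.println("trained RMSE:       " + rmse(model, baseline));
    }
}
```

Running main prints the training-set RMSE with and without the learned features. In this sketch the fixed epoch count plays the part of an iteration-count stopping condition; swapping in a different stopping rule would only change the inner epoch loop.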

Runtime Training

By default, all user-feature values are computed when the model is built and these pre-computed user profiles are used to generate scores. The item scorer does, however, support updating (or computing fresh) the user's feature scores based on their most current profile. To enable this, bind the @RuntimeUpdate-qualified update rule:

bind (RuntimeUpdate, FunkSVDUpdateRule) to FunkSVDUpdateRule

You can also use context-sensitive bindings to customize runtime (score-time, as opposed to model-time) updating:

within (RuntimeUpdate, FunkSVDUpdateRule) {
    bind StoppingCondition to ThresholdStoppingCondition
}
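As a rough illustration of what score-time updating involves, the following self-contained Java sketch recomputes one user's feature values from their current ratings while holding the item features fixed. It is not LensKit's implementation: the class RuntimeUpdateSketch, its constants, and the stopping rule (stop once the epoch-to-epoch change in RMSE falls below a tolerance, with an epoch cap as a safety net) are assumptions chosen to mirror the idea of a threshold-style stopping condition.

```java
import java.util.Arrays;

/** Hypothetical sketch of score-time user-feature updating; not LensKit code. */
class RuntimeUpdateSketch {
    // Illustrative values; all are assumptions of this sketch.
    static final double INITIAL_FEATURE_VALUE = 0.1;
    static final double LEARNING_RATE = 0.05;
    static final double REGULARIZATION_TERM = 0.015;
    static final double THRESHOLD = 1e-6; // stop once RMSE changes by less than this
    static final int MAX_EPOCHS = 1000;   // safety cap

    /** Fits one user's feature values, feature by feature, with item features fixed. */
    static double[] fitUser(double[][] itemFeatures, int[] items,
                            double[] ratings, double baseline) {
        double[] userFeatures = new double[itemFeatures.length];
        if (ratings.length == 0) return userFeatures; // nothing to learn from
        double[] estimates = new double[ratings.length];
        Arrays.fill(estimates, baseline);

        for (int f = 0; f < itemFeatures.length; f++) {
            double pu = INITIAL_FEATURE_VALUE;
            double lastRmse = Double.POSITIVE_INFINITY;
            for (int epoch = 0; epoch < MAX_EPOCHS; epoch++) {
                double sse = 0.0;
                for (int k = 0; k < ratings.length; k++) {
                    double qi = itemFeatures[f][items[k]];
                    double err = ratings[k] - (estimates[k] + pu * qi);
                    sse += err * err;
                    // Only the user side moves; item features stay as trained.
                    pu += LEARNING_RATE * (err * qi - REGULARIZATION_TERM * pu);
                }
                double rmse = Math.sqrt(sse / ratings.length);
                if (Math.abs(lastRmse - rmse) < THRESHOLD) break; // threshold-style stop
                lastRmse = rmse;
            }
            userFeatures[f] = pu;
            for (int k = 0; k < ratings.length; k++) {
                estimates[k] += pu * itemFeatures[f][items[k]];
            }
        }
        return userFeatures;
    }

    /** Baseline plus the dot product of user and item feature values. */
    static double score(double[][] itemFeatures, double[] userFeatures,
                        int item, double baseline) {
        double s = baseline;
        for (int f = 0; f < userFeatures.length; f++) {
            s += userFeatures[f] * itemFeatures[f][item];
        }
        return s;
    }

    public static void main(String[] args) {
        double[][] itemFeatures = {{1.0, -1.0, 0.5}, {0.5, 0.5, -1.0}}; // [feature][item]
        int[] items = {0, 1, 2};
        double[] ratings = {3.75, 1.75, 4.0};
        double[] user = fitUser(itemFeatures, items, ratings, 3.0);
        System.out.println("user features: " + Arrays.toString(user));
    }
}
```

The sketch omits clamping and the trailing estimate to keep the update itself visible; a fuller version would apply them the same way model-time training does.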

Diagram

The FunkSVD algorithm is shown below:

[Figure: FunkSVD Components]