Configuration - shark8me/lenskit GitHub Wiki

Configuring LensKit

One of LensKit's goals is to be highly configurable with regards to the algorithms used, choice of parameters for them, and various algorithmic decisions for each algorithm (e.g. the similarity function used for k-NN collaborative filtering, or the normalization applied to ratings). LenskitRecommenderEngine is the main entry point for configuring a recommender.

Recommender configuration is done by selecting the correct implementation for various ''components'' (typically defined by Java interfaces), and values for ''parameters''. Pretty much every object you can interact with in a LensKit recommender is a component, and many of them use other components behind the scenes to do their work. In the example code in GettingStarted, we find this line:

factory.bind(RatingPredictor.class).to(ItemItemRatingPredictor.class);

If you are familiar with dependency injection, particularly with Guice, this line will look familiar. What it does is tell LensKit that we want to use ItemItemRatingPredictor as the implementation of the RatingPredictor component. When our code then asks the recommender for a rating predictor, it will use an ItemItemRatingPredictor. Likewise, any other components that use a RatingPredictor, such as an ItemItemRecommender, will use the ItemItemRatingPredictor.

When you look at the JavaDoc for a component implementation, such as ItemItemRecommender, you will see that it takes the components it uses (its ''dependencies'') as parameters to its constructor or, occasionally, parameters to setter methods. This is because LensKit is built using the Dependency Injection design pattern. The LenskitRecommenderEngineFactory provides ''automatic dependency injection'', built using the Grapht dependency injection container. It automatically instantiates the various components in accordance with the configuration (bindings) you provide in order to create the recommender you desire. Most components and parameters have default settings, so LensKit will “just work” if you specify the predictor and/or recommender you want to use, but you can always swap out components for ones more suited to your application as necessary.

Contexts

One feature provided by Grapht, and used heavily by LensKit, is ''context-sensitive'' bindings. These are bindings that choose how to configure a component based on where that component is being used. Formerly, these types of configurations were expressed with role annotations; we are moving to heavier use of contexts because they provide a cleaner, more easily discoverable solution in most cases.

To bind in a context, use the within method:

factory.within(SimpleNeighborhoodFinder.class)
       .bind(UserVectorNormalizer.class)
       .to(BaselineSubtractingNormalizer.class);

This uses the baseline-subtracting normalizer as the vector normalizer, but only when building the SimpleNeighborhoodFinder or one of its dependencies. It does not configure the normalizer passed to the rating predictor — if no other bindings are present, then the that is kept at the default.

Context-sensitive bindings override other bindings, so if you have a non-contextual binding of VectorNormalizer, that binding still applies everywhere except where the context is in active — that is, everywhere except in SimpleNeighborhoodFinder or one of its dependencies. You can have multiple context-based bindings, and you can also chain contexts in bindings. The closest, longest matching chain of contexts determines the actual binding to use.

Grapht also provides an at method, in addition to within; if you use at instead of within, the resulting bindings are anchored. Anchored bindings only override direct dependencies of the context they're applied to, whereas unanchored ones (produced by within) override bindings for transitive dependencies as well.

Parameters

LensKit provides many parameters, which are annotated with various annotations (such as NeighborhoodSize. These parameters are set using the set method:

factory.set(NeighborhoodSize.class).to(50);

Type safety is somewhat relaxed for parameters, but they are used for numeric or occasionally string values.

Qualifiers

Parameters are a special case of the more general concept of ''qualifiers'' — annotations which are annotated with @Qualifier from JSR 330, and are used to specify additional distinctions between objects. You can bind one using the two-parameter version of bind:

factory.bind(Qualifier.class, ComponentType.class)
       .to(ComponentImpl.class);

Alternatively, you can use the withQualifier method of Binding:

factory.bind(ComponentType.class)
       .withQualifier(Qualifier.class)
       .to(ComponentImpl.class);

Qualifiers are used in a couple of places:

  • To specify parameters, such as damping terms, that have primitive or string values.
  • To distinguish when a class depends on multiple components of the same type.

Other DI frameworks, such as Guice, encourage much broader use of qualifiers than we use in LensKit. Contexts provide a preferable solution in many cases.

Groovy

In the evaluator, you can use Groovy syntax to write a more fluent, English-style configuration:

bind RatingPredictor to ItemItemRatingPredictor
within(UserVectorNormalizer) {
  bind VectorNormalizer to MeanVarianceNormalizer
}