Recommendation Algorithm - shleen/threadline GitHub Wiki

Recommendation Algorithm Specification

Weather Filtering Stage

Before the recommendation algorithm begins, we collect weather data that serves as context for the item ranking and layering stages of the algorithm. Specifically, we get (1) the current temperature and (2) whether it is currently raining or snowing at the user's current location. To do this, we use OpenWeatherMap's /weather endpoint. See this section of the wiki for more information. This weather information is then passed as an input to the next stage of the recommendation algorithm.
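As a rough illustration of this stage, the sketch below pulls the two values we need out of a /weather-style response. It assumes the payload exposes the temperature under `main.temp` and a coarse condition group under each `weather[].main` entry; the function name `parse_weather` and the returned tuple shape are hypothetical, and the temperature unit depends on how the request was made.

```python
from typing import Any, Mapping, Optional, Tuple


def parse_weather(payload: Mapping[str, Any]) -> Tuple[float, Optional[str]]:
    """Return (temperature, precipitation) from a /weather-style payload.

    precipitation is "rain", "snow", or None. Hypothetical helper, not the
    project's actual code.
    """
    temperature = float(payload["main"]["temp"])
    # Collect the coarse condition groups reported for the location.
    groups = {entry.get("main", "").lower() for entry in payload.get("weather", [])}
    if "snow" in groups:
        precipitation = "snow"
    elif "rain" in groups:
        precipitation = "rain"
    else:
        precipitation = None  # other groups are ignored in this sketch
    return temperature, precipitation


sample = {"main": {"temp": 3.5}, "weather": [{"main": "Rain", "description": "light rain"}]}
print(parse_weather(sample))
```

Keeping the parsing separate from the HTTP request makes it easy to test this stage with canned payloads.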

Item Ranking Stage

This stage uses the output of the weather filtering stage to pull climate-appropriate garments from the database and then ranks each garment by computing a score from its attributes. Scores are computed in isolation for each type of clothing (e.g., Tops are ranked separately from Shoes and do not influence the scores of Shoes). The type of clothing is a broad categorization of a garment, which can be Top, Bottom, Dress, Outerwear, or Shoes. Up to 5 of the highest-scoring garments of each type are sent to the item matching stage. These items are also unioned with a garment of each type that corresponds to any present precipitation, which allows some garments to bypass scoring constraints when they are necessary for the weather (e.g., if it is raining, a raincoat is used to form outfits regardless of its score).
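The selection step described above can be sketched as follows, assuming each garment has already been scored. The dictionary keys (`type`, `precipitation`), the function name `select_candidates`, and the choice of the highest-scoring precipitation garment are all assumptions made for illustration.

```python
from collections import defaultdict


def select_candidates(scored, precipitation=None, top_n=5):
    """Group (garment, score) pairs by type and keep the top_n of each type.

    If precipitation is given ("rain" or "snow"), one garment of each type
    tagged for that precipitation is unioned in regardless of its score.
    """
    by_type = defaultdict(list)
    for garment, score in scored:
        by_type[garment["type"]].append((score, garment))

    selected = {}
    for gtype, entries in by_type.items():
        # Sort by score only, so garment dicts are never compared directly.
        entries.sort(key=lambda entry: entry[0], reverse=True)
        chosen = [garment for _, garment in entries[:top_n]]
        if precipitation is not None:
            for _, garment in entries:
                if garment.get("precipitation") == precipitation:
                    if garment not in chosen:
                        chosen.append(garment)  # bypasses the score cutoff
                    break
        selected[gtype] = chosen
    return selected


scored = [({"id": i, "type": "OUTERWEAR"}, 1.0 - i * 0.1) for i in range(6)]
scored.append(({"id": 99, "type": "OUTERWEAR", "precipitation": "rain"}, 0.01))
result = select_candidates(scored, precipitation="rain")
print([g["id"] for g in result["OUTERWEAR"]])
```

In this example the raincoat (id 99) has the lowest score but is still included alongside the five highest-scoring outerwear items.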

The attributes we use to compute an item's score are its sub-type (e.g., whether a Top is a hoodie, polo, t-shirt, etc.), its fit (loose, fitted, tight), and its intended occasion (formal, casual, etc.). Each of these attributes receives the same weight of 1/3, so that the initial score is normalized to the range 0 to 1. Within each attribute, the ranking stage computes the percentage of times an attribute's value was worn to assess its importance to the user. For example, if 40% of the Tops the user has worn are polos, then every Top that is a polo in the user's wardrobe will map to a value of 0.4 for its sub-type weight. The weights are recomputed each time the algorithm is run, so they respond dynamically to changes in what the user has worn. We repeat this process for fit and occasion, and then add the sub-type weight, the fit weight, and the occasion weight (each divided by 3) together for each garment to compute its initial score.
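A minimal sketch of this computation for a single clothing type is shown below. It assumes the wear history is available as a list with one garment record per wear event; the key names (`subtype`, `fit`, `occasion`) and both function names are hypothetical.

```python
from collections import Counter

ATTRIBUTES = ("subtype", "fit", "occasion")


def attribute_frequencies(wear_log):
    """Share of wear events per attribute value, for one clothing type."""
    total = len(wear_log)
    freqs = {}
    for attr in ATTRIBUTES:
        counts = Counter(garment[attr] for garment in wear_log)
        freqs[attr] = {value: n / total for value, n in counts.items()} if total else {}
    return freqs


def initial_score(garment, freqs):
    """Sum of the three attribute weights, each divided by 3."""
    weights = [freqs[attr].get(garment[attr], 0.0) for attr in ATTRIBUTES]
    return sum(weights) / len(ATTRIBUTES)


# Five wear events for Tops: 2 of 5 are polos, 3 of 5 fitted, 4 of 5 casual.
wears = [
    {"subtype": "polo", "fit": "fitted", "occasion": "casual"},
    {"subtype": "polo", "fit": "fitted", "occasion": "formal"},
    {"subtype": "t-shirt", "fit": "loose", "occasion": "casual"},
    {"subtype": "t-shirt", "fit": "loose", "occasion": "casual"},
    {"subtype": "t-shirt", "fit": "fitted", "occasion": "casual"},
]
freqs = attribute_frequencies(wears)
polo = {"subtype": "polo", "fit": "fitted", "occasion": "casual"}
print(freqs["subtype"]["polo"], initial_score(polo, freqs))
```

Here the polo's sub-type weight is 0.4, matching the example in the text, and its initial score is (0.4 + 0.6 + 0.8) / 3. With an empty wear history every weight is 0, so every garment starts with an initial score of 0.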

The ranking stage also considers how recently the user has worn a garment and deducts from the score accordingly. If the garment was worn within the past 3 days, we multiply its initial score by 0.25 to penalize it. If it was worn between 3 and 10 days ago, we multiply by 0.75. Finally, if the most recent wear was over 10 days ago, there is no penalty, so the score is multiplied by 1. The purpose of the time deductions is to encourage turnover in the wardrobe and to recommend items that have not been worn for some time.
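The time deduction can be sketched as a simple multiplier lookup. The text leaves the exact boundary at 3 days open, so this sketch assumes a wear exactly 3 days ago still counts as "within the past 3 days"; it also assumes a garment that has never been worn receives no penalty. The function name `recency_multiplier` is hypothetical.

```python
from datetime import date
from typing import Optional


def recency_multiplier(last_worn: Optional[date], today: date) -> float:
    """Multiplier applied to the initial score based on the most recent wear."""
    if last_worn is None:
        return 1.0  # assumption: never-worn garments are not penalized
    days_since = (today - last_worn).days
    if days_since <= 3:  # assumption: the 3-day boundary is inclusive
        return 0.25
    if days_since <= 10:
        return 0.75
    return 1.0


today = date(2024, 1, 20)
for worn in (date(2024, 1, 19), date(2024, 1, 14), date(2024, 1, 1), None):
    print(worn, recency_multiplier(worn, today))
```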

Finally, after applying the time deductions, we add a small randomization constant to the score, whose value is randomly generated between 0 and 0.05. The intent of this constant is two-fold. First, when the app is first used, the user may have loaded their wardrobe with clothes but not yet worn anything. In that case the algorithm cannot learn anything about the user's style preferences, so it can do no better than a random shuffle for ranking. The constant accomplishes exactly this, since the other components of the score will be 0. Second, if several garments of the same type have the same score, the randomization can shift which of them are included in the "top five", so that separate invocations of the recommendation algorithm may recommend different clothes that still follow the user's style preferences.
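Putting the pieces of the score together, a minimal sketch might look like the following. The function name `final_score` and the injectable `rng` parameter are assumptions made so the example is repeatable; whether the real implementation draws from a uniform distribution is also an assumption.

```python
import random


def final_score(initial, multiplier, rng=random):
    """Initial score, scaled by the recency multiplier, plus a small random term."""
    return initial * multiplier + rng.uniform(0.0, 0.05)


rng = random.Random(42)  # seeded only to make this example repeatable

# A brand-new wardrobe: nothing has been worn, so every initial score is 0
# and the ordering is decided entirely by the random term.
new_wardrobe = ["hoodie", "polo", "t-shirt", "sweater"]
scores = {name: final_score(0.0, 1.0, rng) for name in new_wardrobe}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the random term is at most 0.05, it only reorders garments whose scores are already close together; clear preferences learned from the wear history still dominate.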

Item Matching Stage (Outfit Formation)

The item ranking stage leaves us with an unordered set of clothes, with a maximum of 5 items of each type (e.g. TOP, SHOES, OUTERWEAR). In this stage, we algorithmically combine these items to form coherent, matching outfits.

To do this, we iterate through the following process until we have created a maximum of 5 outfits.

1. Randomly choose an outfit type. In this context, an outfit type is a pre-defined "schema" of an outfit. It is defined by an "anchor" clothing type and a list of "exclusion" clothing types. For example, one outfit type includes a TOP and must exclude the DRESS type. Additional outfit types can be created or modified without changing the matching logic.
2. An item of the anchor type is popped from the set of clothing output by the item ranking stage. This is the first item in the outfit.
3. From this anchor item, we find a set of "matching" colors, namely its complementary and analogous colors. To do so, we represent the anchor item's primary color in the HCL format and then shift the H (hue) value by +180, +30, and -30 degrees. This follows the color wheel from color theory, and the HCL format is what allows us to find matching colors through simple addition and subtraction.
4. Then, all other clothing types (except exclusion types) are iterated over. For each clothing type, we find the item whose color is closest, by Euclidean distance, to one of the colors in the set of matching colors. That best item is added to the outfit.
5. This process is then repeated until we have 5 outfits or have run out of items to use from the item ranking output.
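The color-matching part of the process above can be sketched as follows. Colors are assumed to be (hue, chroma, luminance) tuples with hue in degrees. The text does not say in which coordinates the Euclidean distance is taken, so this sketch converts hue and chroma to Cartesian coordinates first to avoid the wrap-around at 360 degrees; that conversion, the `color` key, and all function names are assumptions.

```python
import math


def matching_hues(hue):
    """Complementary (+180) and analogous (+30, -30) hues, wrapped to [0, 360)."""
    return [(hue + shift) % 360 for shift in (180, 30, -30)]


def _to_cartesian(hcl):
    """Map (hue, chroma, luminance) to Cartesian coordinates for distance math."""
    hue, chroma, luminance = hcl
    angle = math.radians(hue)
    return (chroma * math.cos(angle), chroma * math.sin(angle), luminance)


def color_distance(a, b):
    return math.dist(_to_cartesian(a), _to_cartesian(b))


def best_match(anchor_color, candidates):
    """Return the candidate closest to any matching color, or None if empty."""
    if not candidates:
        return None
    hue, chroma, luminance = anchor_color
    targets = [(h, chroma, luminance) for h in matching_hues(hue)]
    return min(
        candidates,
        key=lambda item: min(color_distance(item["color"], t) for t in targets),
    )


anchor = (10, 50, 60)
shoes = [
    {"name": "teal sneakers", "color": (200, 50, 60)},
    {"name": "green boots", "color": (100, 50, 60)},
]
print(matching_hues(anchor[0]), best_match(anchor, shoes)["name"])
```

In this example the teal sneakers sit 10 degrees from the complementary hue, while the green boots are 60 degrees from their nearest matching hue, so the sneakers are chosen.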

Outfit Layering Stage

The layering stage builds upon the outfits generated in the item matching stage by adding garments to ensure the outfit is appropriate for the weather conditions. This stage considers the base weather, such as Winter, Spring, or Fall, to determine whether additional layers, like outerwear or extra tops, are necessary. The process begins by reading the base weather attribute of the first garment in the outfit, as all garments in the outfit are expected to share the same weather attribute. If the base weather indicates colder conditions, such as Winter, Spring, or Fall, the algorithm attempts to add a layerable top, such as a sweater or hoodie, from the ranked items. For Winter weather specifically, the algorithm ensures that an outerwear garment, such as a coat or jacket, is included in the outfit. As a result, each outfit not only aligns with the user's style preferences but is also practical for the current weather conditions. The layering logic is implemented in the item_match function, which handles the addition of layerable tops and outerwear.
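The layering rules can be illustrated with a short sketch. This is not the project's item_match function; the function name `add_layers`, the garment keys (`type`, `weather`, `layerable`), and the shape of the ranked items (a mapping from clothing type to a list of garments) are all assumptions.

```python
COLD_WEATHER = {"Winter", "Spring", "Fall"}


def add_layers(outfit, ranked):
    """Return a copy of the outfit with weather-appropriate layers added."""
    if not outfit:
        return []
    layered = list(outfit)
    # All garments in an outfit are expected to share the same base weather.
    base_weather = outfit[0]["weather"]

    if base_weather in COLD_WEATHER:
        extra_top = next(
            (g for g in ranked.get("TOP", []) if g.get("layerable") and g not in layered),
            None,
        )
        if extra_top is not None:
            layered.append(extra_top)

    if base_weather == "Winter" and not any(g["type"] == "OUTERWEAR" for g in layered):
        coat = next(iter(ranked.get("OUTERWEAR", [])), None)
        if coat is not None:
            layered.append(coat)
    return layered


ranked = {
    "TOP": [{"name": "hoodie", "type": "TOP", "weather": "Winter", "layerable": True}],
    "OUTERWEAR": [{"name": "parka", "type": "OUTERWEAR", "weather": "Winter"}],
}
outfit = [
    {"name": "thermal shirt", "type": "TOP", "weather": "Winter"},
    {"name": "jeans", "type": "BOTTOM", "weather": "Winter"},
]
print([g["name"] for g in add_layers(outfit, ranked)])
```

For a Fall or Spring outfit the same function would add only the layerable top, and for any other base weather it would return the outfit unchanged.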