Item-k-nearest-neighbors

ItemKNN directly takes advantage of the work already done on the collaborative filtering approaches in the first part of this project.

Approach

First, ItemKNN builds a matrix with the ratings given by the users to the different items of the catalog.

|       | ItemA | ItemB | ItemC |
|-------|-------|-------|-------|
| User1 | 0     | 4     | 3     |
| User2 | 1     | 5     | 3     |
| User3 | 2     | 4     | 4     |

From there, it deduces a similarity matrix. It is the same matrix described in the collaborative filtering page. For each pair of items bought together, a similarity score is computed.

Then, the k nearest neighbors of each item are kept (the k highest scores for each item). Lastly, the model is confronted with the current shopping cart to obtain the top N recommendations for the user.

The publication presents different models, with variations in how the similarity between two items is computed (only the cosine distance was used in the first part of this project).

Cosine similarity

This model is based on the direction of the vectors representing the two cards (i and j). Each vector contains the rating score given by a user to the item, or 0 if the user has not rated it. The similarity is the cosine of the angle between the two vectors: sim(i, j) = (v_i · v_j) / (||v_i|| ||v_j||).
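As an illustration, here is a minimal NumPy sketch of the item-item cosine computation, assuming the ratings are stored in a dense users x items array (the actual project code may differ):

```python
import numpy as np

def item_cosine_similarity(ratings):
    """Item-item cosine similarity from a users x items rating matrix.

    Unrated items are encoded as 0, as in the matrix above.
    """
    norms = np.linalg.norm(ratings, axis=0)   # one Euclidean norm per item column
    norms[norms == 0] = 1.0                   # guard against items nobody rated
    unit = ratings / norms                    # scale each item vector to unit length
    return unit.T @ unit                      # sim[i, j] = cos(item_i, item_j)

ratings = np.array([[0, 4, 3],
                    [1, 5, 3],
                    [2, 4, 4]], dtype=float)
print(item_cosine_similarity(ratings))
```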

Conditional Probability-Based Similarity

This model is based on the support of the two individual cards (i and j) and on the support of the two cards together (ij). The support of the second card is raised to the power of a coefficient between 0 and 1, used to parameterize the model. In the publication, this coefficient is set to 0.5.

Conditional probability-based similarity: sim(i, j) = Freq(ij) / (Freq(i) × Freq(j)^α), with α the coefficient above.
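A minimal sketch of this computation, assuming a binary users x items purchase matrix; the function name and the zero-division guard are illustrative, not the project's actual code:

```python
import numpy as np

def conditional_probability_similarity(bought, alpha=0.5):
    """sim(i, j) = Freq(ij) / (Freq(i) * Freq(j) ** alpha).

    `bought` is a binary users x items matrix (1 = item in the basket).
    """
    freq = bought.sum(axis=0)              # Freq(i): baskets containing item i
    co_freq = bought.T @ bought            # Freq(ij): baskets containing both items
    denom = np.outer(freq, freq ** alpha)  # Freq(i) * Freq(j) ** alpha
    denom[denom == 0] = 1.0                # avoid dividing by zero
    return co_freq / denom
```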

With these two models, two normalizations of the data are presented: row normalization and similarity score normalization.

Row normalization

In order to reduce the contribution of users with many items relative to the contribution of users with fewer items, the rating scores are normalized for each user. This treatment is performed before computing the similarity scores with the cosine or the conditional probability model.

Thus, the previous matrix becomes (normalized by row):

|       | ItemA | ItemB | ItemC |
|-------|-------|-------|-------|
| User1 | 0     | 4/7   | 3/7   |
| User2 | 1/9   | 5/9   | 3/9   |
| User3 | 2/10  | 4/10  | 4/10  |
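A possible NumPy implementation of this row normalization, reproducing the table above (a sketch, not the project's actual code):

```python
import numpy as np

def normalize_rows(ratings):
    """Divide each user's ratings by their sum, as in the table above."""
    row_sums = ratings.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0            # leave empty rows untouched
    return ratings / row_sums

ratings = np.array([[0, 4, 3],
                    [1, 5, 3],
                    [2, 4, 4]], dtype=float)
print(normalize_rows(ratings))               # each row now sums to 1
```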

Similarity score normalization

This normalization is applied to compensate for the fact that the similarity scores between an item and its k most similar items can be on a different scale from one item to another. This scaling issue comes from the fact that the neighborhood of each item is more or less densely populated.

Here is an example of a similarity score matrix computed with the cosine model:

|       | ItemA | ItemB | ItemC |
|-------|-------|-------|-------|
| ItemA | 1.0   | 0.6   | 0.5   |
| ItemB | 0.6   | 1.0   | 0.7   |
| ItemC | 0.5   | 0.7   | 1.0   |

Thus, the similarity score matrix becomes (normalized by column):

|       | ItemA   | ItemB   | ItemC   |
|-------|---------|---------|---------|
| ItemA | 1.0/2.1 | 0.6/2.3 | 0.5/2.2 |
| ItemB | 0.6/2.1 | 1.0/2.3 | 0.7/2.2 |
| ItemC | 0.5/2.1 | 0.7/2.3 | 1.0/2.2 |
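The column normalization can be sketched the same way (illustrative code, assuming the similarity matrix is dense):

```python
import numpy as np

def normalize_similarity_columns(sim):
    """Divide each column of the similarity matrix by its sum."""
    col_sums = sim.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0
    return sim / col_sums

sim = np.array([[1.0, 0.6, 0.5],
                [0.6, 1.0, 0.7],
                [0.5, 0.7, 1.0]])
print(normalize_similarity_columns(sim))     # each column now sums to 1
```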

Application to Magic the Gathering

This approach is transposed to Magic the Gathering by considering decks and cards instead of users and items. Here, we only consider whether or not a card has been selected by a player for a deck.

Consequently, each deck is considered as an independent user. The rating score is 1 if the deck contains the card and 0 otherwise.

The users x items matrix with the rating scores may be transposed as follows:

|       | CardA | CardB | CardC |
|-------|-------|-------|-------|
| Deck1 | 0     | 1     | 1     |
| Deck2 | 1     | 0     | 0     |
| Deck3 | 1     | 1     | 1     |

The approach is then applied as such: computation of the similarity scores between items, selection of the k nearest neighbors of each item, and recommendation of the top N cards based on the cards already selected by the player to build a new deck.
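Putting the pieces together, a sketch of the kNN pruning and top N scoring steps could look as follows; the function names and the scoring rule (summing the similarities to the already selected cards) are illustrative assumptions, not the project's exact implementation:

```python
import numpy as np

def keep_k_nearest(sim, k):
    """Keep only the k highest similarity scores of each item, zero out the rest."""
    pruned = np.zeros_like(sim)
    for j in range(sim.shape[1]):
        neighbors = np.argsort(sim[:, j])[::-1][:k]  # k most similar items to item j
        pruned[neighbors, j] = sim[neighbors, j]
    return pruned

def recommend_top_n(model, selected, top_n):
    """Score every card by summing its similarity to the already selected cards."""
    scores = model[:, selected].sum(axis=1)
    scores[selected] = -np.inf                       # never recommend a selected card
    return np.argsort(scores)[::-1][:top_n]
```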

Comparison of models

From the publication, the following similarity models have been implemented:

  • Cosine similarity:
    • Vanilla
    • With the normalization of:
      • similarity scores
      • decks and similarity scores
  • Conditional Probability-Based Similarity:
    • Vanilla
    • With the normalization of:
      • similarity scores
      • decks and similarity scores

An experimental model has been added by applying to the cosine model a multiplier coefficient equal to the textual similarity of the two cards, computed with LSA (latent semantic analysis). The objective is to take another piece of information about the cards into consideration, to see if there is a correlation and thus improve the model.
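A possible way to obtain such a textual similarity coefficient, sketched with scikit-learn; the vectorizer, the number of LSA components, and the element-wise product are assumptions, not the project's exact pipeline:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def lsa_text_similarity(card_texts, n_components=50):
    """Card x card textual similarity computed in a reduced LSA space."""
    tfidf = TfidfVectorizer().fit_transform(card_texts)  # bag-of-words weights
    lsa = TruncatedSVD(n_components=n_components).fit_transform(tfidf)
    return cosine_similarity(lsa)                        # card x card matrix

# The experimental model then scales the rating-based similarity element-wise:
# combined_sim = cosine_sim * lsa_text_similarity(card_texts)
```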

The decks are grouped by mode and colors as described in the proof of concept page.

Validation and indicators

To validate and measure the quality of a recommendation model, each deck is separated in two:

  • A card is randomly taken out of the deck and put in a testing set
  • The rest of the deck is used as the training set

The publication defines two indicators to measure the quality of a model.

The first one is the Hit Rate (HR). It measures how many times the model was able to recommend the missing card of the testing set for each deck. A score of 1.0 means that, for all the decks, the missing card was recommended among the N recommendations.

The second one is the Average Reciprocal Hit Rank (ARHR). It measures at which position among the N recommendations the missing card of the testing set was recommended for each deck. A score of 1.0 means all the missing cards were recommended in first position.
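Both indicators can be computed in a few lines; this sketch assumes one ordered top N list and one held-out card per deck:

```python
def hit_rate_and_arhr(recommendations, held_out):
    """Compute HR and ARHR over all decks.

    `recommendations`: one ordered top N list of card ids per deck.
    `held_out`: the card taken out of each deck (same order as above).
    """
    hits, reciprocal_ranks = 0, 0.0
    for top_n, card in zip(recommendations, held_out):
        if card in top_n:
            hits += 1
            reciprocal_ranks += 1.0 / (top_n.index(card) + 1)  # ranks are 1-based
    return hits / len(held_out), reciprocal_ranks / len(held_out)

hr, arhr = hit_rate_and_arhr([[3, 7, 5], [2, 9, 4]], [7, 1])
print(hr, arhr)   # 0.5 0.25 -> one hit out of two decks, at rank 2
```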

Statistics

The values of the HR and ARHR indicators used to compare the different models are means over 20 runs. For each run, different random training and testing sets are created.

To measure, at the 95% level, the significance of the difference in performance between two models, paired t-tests have been applied with a Bonferroni correction.
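For illustration, such a test can be run with SciPy; the per-run score lists and the number of compared model pairs are assumptions:

```python
from scipy import stats

def significantly_different(scores_a, scores_b, n_comparisons, alpha=0.05):
    """Paired t-test on the per-run scores (HR or ARHR) of two models.

    Bonferroni correction: the significance threshold is divided by the
    number of model pairs being compared.
    """
    _, p_value = stats.ttest_rel(scores_a, scores_b)
    return p_value < alpha / n_comparisons
```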

Results

Legend

Deck groups:

  • 1: Legacy Blue Black Red
  • 2: Legacy White Blue Black Red Green
  • 3: Legacy White Blue Black Green
  • 4: Pauper Red
  • 5: Pauper White Blue Black Red Green
  • 6: Commander Red
  • 7: Commander White Blue Black Green
  • 8: Commander Blue Black Red
  • 9: Legacy Blue Black Red Green
  • 10: Vintage White Blue Black Red Green

Models:

  • cos: cosine similarity
  • cos+lsa: cosine similarity plus LSA
  • proba: conditional probability-based similarity
  • D: normalization applied on decks (rows)
  • S: normalization applied on the item x item similarity scores (columns)

Structure of the studied data sets:

| Deck group | Nb of decks | Nb of cards | Nb of cards per deck | Density |
|------------|-------------|-------------|----------------------|---------|
| 1          | 332         | 373         | 29                   | 7.67%   |
| 2          | 329         | 623         | 28                   | 4.56%   |
| 3          | 377         | 518         | 33                   | 6.33%   |
| 4          | 237         | 116         | 18                   | 15.31%  |
| 5          | 410         | 355         | 28                   | 9.29%   |
| 6          | 269         | 661         | 69                   | 10.49%  |
| 7          | 202         | 635         | 98                   | 15.43%  |
| 8          | 338         | 668         | 95                   | 14.15%  |
| 9          | 1326        | 485         | 31                   | 6.45%   |
| 10         | 2816        | 776         | 39                   | 5.01%   |

HR of each model against all studied data sets (in bold, the best scores at the 95% confidence level):

| Deck group | cos   | cos (S) | cos (D,S) | cos+lsa | cos+lsa (S) | cos+lsa (D,S) | proba | proba (D,S) |
|------------|-------|---------|-----------|---------|-------------|---------------|-------|-------------|
| 1          | 0.621 | 0.632   | 0.633     | 0.619   | 0.634       | 0.640         | 0.311 | 0.357       |
| 2          | 0.676 | 0.698   | 0.694     | 0.681   | 0.699       | 0.698         | 0.490 | 0.518       |
| 3          | 0.585 | 0.610   | 0.605     | 0.586   | 0.604       | 0.604         | 0.223 | 0.260       |
| 4          | 0.847 | 0.851   | 0.856     | 0.849   | 0.856       | 0.857         | 0.800 | 0.804       |
| 5          | 0.578 | 0.623   | 0.648     | 0.579   | 0.626       | 0.649         | 0.452 | 0.492       |
| 6          | 0.336 | 0.349   | 0.341     | 0.339   | 0.352       | 0.347         | 0.143 | 0.187       |
| 7          | 0.414 | 0.442   | 0.443     | 0.429   | 0.460       | 0.465         | 0.187 | 0.249       |
| 8          | 0.364 | 0.392   | 0.395     | 0.364   | 0.397       | 0.403         | 0.189 | 0.186       |
| 9          | 0.740 | 0.758   | 0.752     | 0.744   | 0.759       | 0.750         | 0.499 | 0.566       |
| 10         | 0.663 | 0.707   | 0.673     | 0.662   | 0.708       | 0.669         | 0.186 | 0.210       |

ARHR of each model against all studied data sets (in bold, the best scores at the 95% confidence level):

| Deck group | cos   | cos (S) | cos (D,S) | cos+lsa | cos+lsa (S) | cos+lsa (D,S) | proba | proba (D,S) |
|------------|-------|---------|-----------|---------|-------------|---------------|-------|-------------|
| 1          | 0.451 | 0.460   | 0.463     | 0.443   | 0.457       | 0.464         | 0.210 | 0.235       |
| 2          | 0.533 | 0.549   | 0.551     | 0.531   | 0.547       | 0.548         | 0.318 | 0.341       |
| 3          | 0.438 | 0.453   | 0.454     | 0.441   | 0.455       | 0.453         | 0.130 | 0.155       |
| 4          | 0.584 | 0.613   | 0.626     | 0.594   | 0.624       | 0.635         | 0.535 | 0.529       |
| 5          | 0.470 | 0.469   | 0.487     | 0.487   | 0.469       | 0.487         | 0.247 | 0.249       |
| 6          | 0.214 | 0.220   | 0.212     | 0.222   | 0.229       | 0.226         | 0.077 | 0.101       |
| 7          | 0.250 | 0.262   | 0.261     | 0.258   | 0.275       | 0.275         | 0.121 | 0.149       |
| 8          | 0.214 | 0.225   | 0.226     | 0.211   | 0.228       | 0.230         | 0.111 | 0.113       |
| 9          | 0.518 | 0.525   | 0.507     | 0.518   | 0.524       | 0.510         | 0.330 | 0.349       |
| 10         | 0.478 | 0.516   | 0.495     | 0.479   | 0.514       | 0.490         | 0.110 | 0.132       |

The cosine similarity approach models the studied data sets better. The probability-based approach underperforms compared to the results obtained in the publication. This may be due to the fact that card decks have a standardized structure.

Contrary to the similarity score normalization, the row normalization does not add a significant gain. This is because card decks follow particular game rules depending on the mode. Two of these rules are the number of cards in a deck (100 in Commander and usually 60 in the other modes) and how many copies of the same card can be used in a deck (usually at most 4).

Since the density of the neighborhood of two cards may be really different, the normalization of the similarity scores has, as expected, a bigger impact on the indicators.

The experimental coefficient based on the cards' textual information adds a small gain on the HR indicator. The ARHR indicator is slightly below the best score, but the ranking of the recommendations is still consistent. The model takes a bit longer to train due to the computation of the LSA, but this does not impact the processing of the top N recommendations afterwards.

Recommendation results

Here is an example of recommendations obtained with the cosine+lsa model, corrected with the decks and similarity scores normalizations.

For this example, the deck comes from the deckstat.net website (a different source from the MTGDeck decks). It is a Legacy Blue Black Red Green deck. The recommendations were processed with the model trained on the equivalent deck database from MTGDeck.

Two important facts:

  • The cards for this deck are present in the deck database
  • The deck is not present as such in the training database

In yellow, the cards already selected by the player to build his deck. We expect some of the other cards to appear in the top 10 recommendations made by the model.
Example deck

The top 10 recommendations made by the model are as follows (in bold, the recommended cards matching the original deck):

| Ranking | Multiverseid | Card's name     |
|---------|--------------|-----------------|
| 1       | 228255       | Flusterstorm    |
| 2       | 386627       | Polluted Delta  |
| 3       | 190393       | Scalding Tarn   |
| 4       | 886          | Underground Sea |
| 5       | 376559       | Toxic Deluge    |
| 6       | 878          | Badlands        |
| 7       | 887          | Volcanic Island |
| 8       | 884          | Tropical Island |
| 9       | 3915         | Hydroblast      |
| 10      | 879          | Bayou           |

In green, the recommendations matching the original deck: top 10 recommendations

The other recommended cards: Other recommendations

The input is a full deck without the "Land" cards (used to provide the mana needed to play the other cards). At this point of deck building, a Magic player starts searching for the "Land" cards that should go with his deck.

The suggested cards are interesting since most of them are "Lands". This shows that the model takes into account the cards given as input, and their number, when making its recommendations.

The model's recommendations match most of the choices made in the original deck. This can be interpreted as the player who built the example deck having made choices consistent with the Magic meta-game used to train the model.

Conclusion

This top N recommendation model based on collaborative filtering provides results that take into account the cards given as input and their number.

It is necessary to be careful with the data used to train the model, since it needs to reflect the current meta-game to provide relevant recommendations. The recommendations are directly based on the decks of the other players used to train the model.

In a production environment, the recommendation algorithm would need to be improved (lighter data structures, Cython, ...) for models with more cards (above 1500/2000 cards), since processing can take a few seconds.

The surprise came from the performance of the cosine similarity model compared to the probability-based model. This can be partially explained by the binary format of the data (selected / not selected) and by the deck-building standards (number of cards per deck).

Other models exist; it will be interesting to see what data they are able to dig into and process, and how they perform.

Webography

Mukund Deshpande and George Karypis. 2004. Item-based top-N recommendation algorithms. ACM Trans. Inf. Syst. 22, 1 (January 2004), 143-177

Paired t-tests

Python for data analysis
