ItemKNN
ItemKNN directly takes advantage of the work already done on the collaborative filtering approaches in the first part of this project.
First, ItemKNN builds a matrix with the ratings given by the users to the different items of the catalog.
| | ItemA | ItemB | ItemC |
|---|---|---|---|
| User1 | 0 | 4 | 3 |
| User2 | 1 | 5 | 3 |
| User3 | 2 | 4 | 4 |
From there, it deduces a similarity matrix. It is the same matrix described in the collaborative filtering page: for each pair of items bought together, a similarity score is computed.
Then the k nearest neighbors of each item are kept (the k highest scores for each item). Lastly, the model is applied to the current shopping cart to obtain the top N recommendations for the user.
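For illustration, here is a minimal sketch of this pipeline with NumPy (not the project's actual implementation; the function and variable names are illustrative):

```python
import numpy as np

def build_knn_model(ratings, k):
    """Build an item-item cosine similarity matrix and keep only the
    k most similar neighbors of each item."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0                      # avoid division by zero
    sim = (ratings.T @ ratings) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)                   # an item is not its own neighbor
    for i in range(sim.shape[0]):                # keep only the k highest scores per item
        weakest = np.argsort(sim[i])[:-k]
        sim[i, weakest] = 0.0
    return sim

def recommend(sim, cart, n):
    """Score every item against the current shopping cart and return
    the indices of the top N items not already in the cart."""
    scores = sim[cart].sum(axis=0)
    scores[cart] = -np.inf                       # never recommend items already selected
    return np.argsort(scores)[::-1][:n]

ratings = np.array([[0, 4, 3],                   # the users x items matrix above
                    [1, 5, 3],
                    [2, 4, 4]], dtype=float)
model = build_knn_model(ratings, k=2)
print(recommend(model, cart=[1], n=2))           # recommendations for a cart holding ItemB
```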
The publication presents different models, with variations in how the similarity between two items is computed (only the cosine distance was used in the first part of this project).
The cosine model is based on the direction of the vectors representing the two items (i and j). A vector contains 0 if the item has not been rated by a user, or otherwise the rating score given by the user to this item.
The conditional probability model is based on the support of the two individual items (i and j) and the support of the two items together (ij). The support of the second item is raised to the power of a coefficient between 0 and 1 that parameterizes the model. In the publication, this coefficient is set to 0.5.
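For reference, a minimal sketch of these two similarity measures, assuming the items are represented by the columns of the rating matrix above (the function names are illustrative, and the conditional probability formula follows the description above with its 0.5 coefficient):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between the rating vectors of two items."""
    return float(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def conditional_probability_similarity(u, v, alpha=0.5):
    """Support of the pair (i, j) divided by the support of i times the
    support of j raised to the power alpha (0 < alpha < 1)."""
    support_i = np.count_nonzero(u)
    support_j = np.count_nonzero(v)
    support_ij = np.count_nonzero(u * v)         # users having rated both items
    return support_ij / (support_i * support_j ** alpha)

item_b = np.array([4, 5, 4], dtype=float)        # column ItemB of the matrix above
item_c = np.array([3, 3, 4], dtype=float)        # column ItemC
print(cosine_similarity(item_b, item_c))
print(conditional_probability_similarity(item_b, item_c))
```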
With these two models, two normalizations of the data are presented: row normalization and similarity normalization.
In order to reduce the contribution of users with many items compared to users with fewer items, the rating scores are normalized for each user. This treatment is performed before computing the similarity scores with the cosine or the conditional probability model.
Thus, the previous matrix becomes (row normalization):
| | ItemA | ItemB | ItemC |
|---|---|---|---|
| User1 | 0 | 4/7 | 3/7 |
| User2 | 1/9 | 5/9 | 3/9 |
| User3 | 2/10 | 4/10 | 4/10 |
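A minimal NumPy sketch reproducing this row normalization on the example matrix (illustrative only):

```python
import numpy as np

ratings = np.array([[0, 4, 3],
                    [1, 5, 3],
                    [2, 4, 4]], dtype=float)

# Each user's ratings are divided by the sum of his ratings
row_normalized = ratings / ratings.sum(axis=1, keepdims=True)
print(row_normalized)    # [[0, 4/7, 3/7], [1/9, 5/9, 3/9], [2/10, 4/10, 4/10]]
```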
The similarity normalization is applied because the similarity scores between an item and its k most similar items can be scaled differently from one item to another. This scaling issue comes from the fact that the neighborhood of each item is more or less densely populated.
Here is an example of a similarity score matrix computed with the cosine model:
| | ItemA | ItemB | ItemC |
|---|---|---|---|
| ItemA | 1.0 | 0.6 | 0.5 |
| ItemB | 0.6 | 1.0 | 0.7 |
| ItemC | 0.5 | 0.7 | 1.0 |
Thus, the similarity score matrix becomes (normalization by column):
| | ItemA | ItemB | ItemC |
|---|---|---|---|
| ItemA | 1.0/2.1 | 0.6/2.3 | 0.5/2.2 |
| ItemB | 0.6/2.1 | 1.0/2.3 | 0.7/2.2 |
| ItemC | 0.5/2.1 | 0.7/2.3 | 1.0/2.2 |
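The corresponding sketch for the similarity normalization (each column of the similarity matrix is divided by its sum; illustrative only):

```python
import numpy as np

similarities = np.array([[1.0, 0.6, 0.5],
                         [0.6, 1.0, 0.7],
                         [0.5, 0.7, 1.0]])

# Each column is divided by its sum so that every item's neighborhood
# contributes on the same scale
column_normalized = similarities / similarities.sum(axis=0, keepdims=True)
print(column_normalized)  # columns divided by 2.1, 2.3 and 2.2 respectively
```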
This approach is transposed to Magic: The Gathering by considering decks and cards instead of users and items. We consider here only whether a card has been selected or not by a player in a deck.
Consequently, each deck is considered as an independent user. The rating score is 1 if the deck contains the card and 0 otherwise.
The users x items matrix with the rating scores can then be transposed as follows:
| | CardA | CardB | CardC |
|---|---|---|---|
| Deck1 | 0 | 1 | 1 |
| Deck2 | 1 | 0 | 0 |
| Deck3 | 1 | 1 | 1 |
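For illustration, a sketch of how such a binary decks x cards matrix could be built from deck lists (the deck contents and names here are just the toy example above, not the project's data):

```python
import numpy as np

decks = [
    {"CardB", "CardC"},             # Deck1
    {"CardA"},                      # Deck2
    {"CardA", "CardB", "CardC"},    # Deck3
]
catalog = sorted(set().union(*decks))

# 1 if the deck contains the card, 0 otherwise
ratings = np.array([[1 if card in deck else 0 for card in catalog]
                    for deck in decks])
print(catalog)   # ['CardA', 'CardB', 'CardC']
print(ratings)
```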
The approach is then applied as such: computation of the similarity scores between items, selection of the k nearest neighbors of each item, and recommendation of the top N cards based on the cards already selected by the player to build a new deck.
From the publication, the following similarity models have been implemented:
- Cosine similarity:
  - Vanilla
  - With the normalization of:
    - similarity scores
    - decks and similarity scores
- Conditional Probability-Based Similarity:
  - Vanilla
  - With the normalization of:
    - decks and similarity scores
An experimental model has been added by applying to the cosine model a multiplier coefficient equal to the textual similarity of the two cards, computed with LSA. The objective is to take another piece of information about the cards into account, to see if there is a correlation and thus improve the model.
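A hedged sketch of this experimental variant, assuming a precomputed cards x cards matrix `lsa_similarity` holding the LSA textual similarity of each pair of cards (the names are illustrative):

```python
import numpy as np

def cosine_lsa_similarity(ratings, lsa_similarity):
    """Cosine similarity between the cards, multiplied element-wise by the
    LSA textual similarity of each pair of cards."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0
    cosine = (ratings.T @ ratings) / np.outer(norms, norms)
    return cosine * lsa_similarity
```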
The decks are grouped by mode and colors as described in the proof of concept page.
To validate and measure the quality of a recommendation model, each deck is split in two:
- a card is randomly taken out of the deck and put in a testing set
- the rest of the deck is used as the training set
The publication defines two indicators to measure the quality of a model.
The first one is the Hit Rate (HR). It measures how many times the model has been able to recommend the missing card of the testing set for each deck. A score of 1.0 means that for all decks the missing card was recommended among the N recommendations.
The second one is the Average Reciprocal Hit Rank (ARHR). It measures at which position, among the N recommendations, the missing card of the testing set was recommended for each deck. A score of 1.0 means all the missing cards were recommended in the first position.
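A minimal sketch of this evaluation loop (leave one card out per deck, then accumulate HR and ARHR); `recommend` stands for any of the models above and the names are illustrative:

```python
import random

def evaluate(decks, recommend, n=10):
    """Leave one random card out of each deck, recommend the top N cards
    from the rest, and accumulate the Hit Rate and the ARHR."""
    hits, reciprocal_ranks = 0, 0.0
    for deck in decks:
        cards = list(deck)
        missing = random.choice(cards)                 # testing set: one card
        training = [c for c in cards if c != missing]  # training set: the rest
        top_n = recommend(training, n)                 # list of N recommended cards
        if missing in top_n:
            hits += 1
            reciprocal_ranks += 1.0 / (top_n.index(missing) + 1)
    return hits / len(decks), reciprocal_ranks / len(decks)   # HR, ARHR
```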
The HR and ARHR values used to compare the different models are the mean over 20 runs. At each run, different random training and testing sets are created.
To measure whether the difference of performance between two models is significant at 95%, a paired t-test with a Bonferroni correction has been applied.
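For reference, a sketch of this significance test with SciPy, assuming the per-run scores of two models (e.g. their HR over the 20 runs) and the number of model pairs compared are given (names are illustrative):

```python
from scipy import stats

def significantly_different(scores_a, scores_b, n_comparisons, alpha=0.05):
    """Paired t-test on the per-run scores of two models,
    with a Bonferroni correction for the number of comparisons."""
    _, p_value = stats.ttest_rel(scores_a, scores_b)
    return p_value < alpha / n_comparisons
```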
Legend:
- Deck groups: the groups of decks (by mode and colors) described in the table below.
- Models: cos = cosine similarity, cos+lsa = cosine similarity with the LSA coefficient, proba = conditional probability-based similarity. Normalizations: "-" = none, S = similarity scores, D,S = decks and similarity scores.
Structure of the studied data sets:
Deck groups | Nb of decks | Nb of cards | Nb of cards per deck | Density |
---|---|---|---|---|
1 | 332 | 373 | 29 | 7.67% |
2 | 329 | 623 | 28 | 4.56% |
3 | 377 | 518 | 33 | 6.33% |
4 | 237 | 116 | 18 | 15.31% |
5 | 410 | 355 | 28 | 9.29% |
6 | 269 | 661 | 69 | 10.49% |
7 | 202 | 635 | 98 | 15.43% |
8 | 338 | 668 | 95 | 14.15% |
9 | 1326 | 485 | 31 | 6.45% |
10 | 2816 | 776 | 39 | 5.01% |
HR of each model against all studied data sets, at 95% confidence (in bold, the best scores at 95% confidence):
Deck group | cos | cos (S) | cos (D,S) | cos+lsa | cos+lsa (S) | cos+lsa (D,S) | proba | proba (D,S)
---|---|---|---|---|---|---|---|---
1 | 0.621 | 0.632 | 0.633 | 0.619 | 0.634 | 0.64 | 0.311 | 0.357
2 | 0.676 | 0.698 | 0.694 | 0.681 | 0.699 | 0.698 | 0.49 | 0.518
3 | 0.585 | 0.61 | 0.605 | 0.586 | 0.604 | 0.604 | 0.223 | 0.26
4 | 0.847 | 0.851 | 0.856 | 0.849 | 0.856 | 0.857 | 0.8 | 0.804
5 | 0.578 | 0.623 | 0.648 | 0.579 | 0.626 | 0.649 | 0.452 | 0.492
6 | 0.336 | 0.349 | 0.341 | 0.339 | 0.352 | 0.347 | 0.143 | 0.187
7 | 0.414 | 0.442 | 0.443 | 0.429 | 0.46 | 0.465 | 0.187 | 0.249
8 | 0.364 | 0.392 | 0.395 | 0.364 | 0.397 | 0.403 | 0.189 | 0.186
9 | 0.74 | 0.758 | 0.752 | 0.744 | 0.759 | 0.75 | 0.499 | 0.566
10 | 0.663 | 0.707 | 0.673 | 0.662 | 0.708 | 0.669 | 0.186 | 0.21
ARHR of each model against all studied data sets, at 95% confidence (in bold, the best scores at 95% confidence):
Deck group | cos | cos (S) | cos (D,S) | cos+lsa | cos+lsa (S) | cos+lsa (D,S) | proba | proba (D,S)
---|---|---|---|---|---|---|---|---
1 | 0.451 | 0.46 | 0.463 | 0.443 | 0.457 | 0.464 | 0.21 | 0.235
2 | 0.533 | 0.549 | 0.551 | 0.531 | 0.547 | 0.548 | 0.318 | 0.341
3 | 0.438 | 0.453 | 0.454 | 0.441 | 0.455 | 0.453 | 0.13 | 0.155
4 | 0.584 | 0.613 | 0.626 | 0.594 | 0.624 | 0.635 | 0.535 | 0.529
5 | 0.47 | 0.469 | 0.487 | 0.487 | 0.469 | 0.487 | 0.247 | 0.249
6 | 0.214 | 0.22 | 0.212 | 0.222 | 0.229 | 0.226 | 0.077 | 0.101
7 | 0.25 | 0.262 | 0.261 | 0.258 | 0.275 | 0.275 | 0.121 | 0.149
8 | 0.214 | 0.225 | 0.226 | 0.211 | 0.228 | 0.23 | 0.111 | 0.113
9 | 0.518 | 0.525 | 0.507 | 0.518 | 0.524 | 0.51 | 0.33 | 0.349
10 | 0.478 | 0.516 | 0.495 | 0.479 | 0.514 | 0.49 | 0.11 | 0.132
The cosine similarity approach models the studied data sets better. The probability-based approach underperforms compared to the results obtained in the publication. This may be due to the fact that card decks have a standardized structure.
Contrary to the similarity score normalization, the row normalization does not add a significant gain. This is because card decks follow particular game rules depending on the mode. Two of these rules are the number of cards in a deck (100 in Commander, usually 60 cards for the other modes) and how many copies of the same card can be used in a deck (usually a maximum of 4).
Since the density of the neighborhood of two cards can be very different, the normalization of the similarity scores has, as expected, a bigger impact on the indicators.
The experimental coefficient based on the cards' textual information adds a small gain on the HR indicator. The ARHR indicator is slightly below the best score, but the ranking of the recommendations remains consistent. The model takes a bit longer to build due to the LSA computation, but this does not impact the processing of the top N recommendations afterwards.
Here is an example of recommendations obtained with the cos+lsa model with decks and similarity scores normalization.
For this example, the deck comes from the deckstat.net website (a different source from the MTGDeck decks). It is a Legacy Blue Black Red Green deck. The recommendations were produced with the model trained on the equivalent deck database from MTGDeck.
Two important facts:
- The cards for this deck are present in the deck database
- The deck is not present as such in the training database
(Screenshot in the original page: in yellow, the cards already selected by the player to build his deck.)
We expect some of the other cards of the deck to appear among the top 10 recommendations made by the model.
The top 10 recommendations made by the model are as follows (in bold in the original page, the recommended cards matching the original deck):
Ranking | Multiverseid | Card's name |
---|---|---|
1 | 228255 | Flusterstorm |
2 | 386627 | Polluted Delta |
3 | 190393 | Scalding Tarn |
4 | 886 | Underground Sea |
5 | 376559 | Toxic Deluge |
6 | 878 | Badlands |
7 | 887 | Volcanic Island |
8 | 884 | Tropical Island |
9 | 3915 | Hydroblast |
10 | 879 | Bayou |
(Card images in the original page: in green, the recommendations matching the original deck, followed by the other recommended cards.)
The input is a full deck without the "Land" cards (used to provide the Mana needed to cast spells and other effects). At this point of deck building, a Magic player starts searching for which "Land" cards should go with his deck.
The suggested cards are interesting since most of them are "Lands". This shows that the model takes into account the cards given as input, and their number, when making its recommendations.
The model's recommendations match most of the choices made in the original deck. This can be interpreted as the player who built the example deck having made choices consistent with the Magic meta-game used to train the model.
This model for top N recommendations based on collaborative filtering provides results that take into account the cards given as input and their number.
It is necessary to be careful with the data used to train the model, since they need to reflect the current meta-game to provide relevant recommendations. The recommendations are directly based on the other players' decks used to train the model.
In a production environment, the recommendation algorithm would need to be improved (lighter data structures, Cython, ...) for models with more cards (above 1500/2000 cards), since processing can take a few seconds.
The surprise came from the performance of the cosine similarity model compared to the probability-based model. This can be partially explained by the binary format of the data (selected / not selected) and by the deck-building standards (number of cards per deck).
Other models exist; it will be interesting to see what data they are able to exploit and how they perform.