Conclusion and openings - xavierfeltin/mtg_data_mining GitHub Wiki

Conclusion

Thanks to the complexity and depth of the game, its regular new expansions, and its huge community, Magic: The Gathering is a big playground. Deck building and recommendation systems are only a few of the ways to exploit the data available on Magic (card content, existing decks, match results, ...).

Association Rules versus Item-To-Item (Collaborative Filtering)

Both approaches try to extract information from the card selections found in a database of decks.

Here is a summary of my experience with them:

Association rules are an interesting approach, but they are difficult to master and to integrate into a recommendation engine.

Although the item-to-item approach considers only pairs of items, items that are usually played together have a good chance of being recommended together. Moreover, similarity scores are easy to integrate into a recommendation engine.
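As a minimal sketch of how one score per pair of cards can be derived from a deck database, the example below computes a cosine similarity over binary deck membership. Cosine similarity is an assumption here (it is a common choice for item-to-item collaborative filtering, but the measure actually used in the project may differ), and the function name, the input format and the sample decks are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def item_to_item_similarity(decks):
    """Cosine similarity between cards, based on deck co-occurrence.

    `decks` is an iterable of collections of card names (hypothetical format).
    Returns a dict mapping a sorted (card_a, card_b) pair to a score in (0, 1].
    Pairs that never appear in the same deck are simply absent.
    """
    card_count = defaultdict(int)  # number of decks containing each card
    pair_count = defaultdict(int)  # number of decks containing both cards

    for deck in decks:
        cards = sorted(set(deck))  # ignore duplicate copies within a deck
        for card in cards:
            card_count[card] += 1
        for pair in combinations(cards, 2):
            pair_count[pair] += 1

    return {
        (a, b): both / sqrt(card_count[a] * card_count[b])
        for (a, b), both in pair_count.items()
    }


# Tiny illustrative sample, not real deck data.
decks = [
    ["Llanowar Elves", "Giant Growth", "Forest"],
    ["Llanowar Elves", "Giant Growth"],
    ["Llanowar Elves", "Forest"],
    ["Lightning Bolt", "Mountain"],
]
similarity = item_to_item_similarity(decks)
print(similarity[("Forest", "Giant Growth")])  # 0.5
```

Because each deck only increments counters, the counts can be updated as new decks arrive, which is one reason this kind of approach tends to scale well.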

Strong points

Association Rules	Item-To-Item
The concept of a rule is easy to understand	Returns one score per pair of items
Takes several cards into account	Scales as the data grows
A lot of research exists in the field

Weak points

Association Rules	Item-To-Item
Frequent itemsets must be found beforehand	Considers only pairs of items
Hard to predict how many rules will be generated	Requires processing time
Long rules are difficult for users to interpret
Rules are difficult to exploit in a recommendation system
Requires processing time and memory
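To make the association-rule weak points more concrete, here is a rough sketch of the two standard measures, support and confidence, together with a brute-force search for frequent itemsets. This is not the project's implementation: the function names, the `max_size` cap and the sample decks are assumptions, and real miners such as Apriori or FP-Growth prune the search instead of enumerating every combination.

```python
from itertools import combinations


def support(decks, itemset):
    """Fraction of decks that contain every card in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(deck) for deck in decks) / len(decks)


def confidence(decks, antecedent, consequent):
    """Estimate of P(consequent in deck | antecedent in deck).

    Assumes the antecedent appears in at least one deck.
    """
    both = support(decks, set(antecedent) | set(consequent))
    return both / support(decks, antecedent)


def frequent_itemsets(decks, min_support, max_size=3):
    """Brute-force enumeration of itemsets whose support reaches `min_support`."""
    cards = sorted({card for deck in decks for card in deck})
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(cards, size):
            s = support(decks, itemset)
            if s >= min_support:
                result[itemset] = s
    return result


# Tiny illustrative sample, not real deck data.
decks = [
    ["Llanowar Elves", "Giant Growth", "Forest"],
    ["Llanowar Elves", "Giant Growth"],
    ["Llanowar Elves", "Forest"],
    ["Lightning Bolt", "Mountain"],
]
frequent = frequent_itemsets(decks, min_support=0.5)
rule_confidence = confidence(decks, ["Giant Growth"], ["Llanowar Elves"])
print(len(frequent), rule_confidence)  # 5 1.0
```

The number of candidate itemsets grows combinatorially with the number of distinct cards, which is why the frequent itemsets have to be restricted first and why both the processing cost and the number of resulting rules are hard to anticipate.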

Latent Semantic Analysis

LSA is useful for suggesting alternatives to a particular card. However, it takes neither game mode nor mana color into account, so it needs external filters (for example in the user interface) to suggest cards that are meaningful to the player.
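Such an external filter could look like the sketch below, applied after the LSA similarity scores have been computed. The `Card` fields, the mode names, the card names and the scores are all hypothetical placeholders, and treating a card as playable when its colors are a subset of the deck colors is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    name: str
    colors: frozenset  # e.g. frozenset({"G"}); empty for colorless cards
    modes: frozenset   # game modes in which the card may be played


def filter_suggestions(suggestions, deck_colors, mode):
    """Keep suggestions allowed in `mode` whose colors fit within `deck_colors`.

    `suggestions` is a list of (Card, similarity) pairs, e.g. produced by LSA.
    The input order (typically best score first) is preserved.
    """
    deck_colors = frozenset(deck_colors)
    return [
        (card, score)
        for card, score in suggestions
        if mode in card.modes and card.colors <= deck_colors
    ]


# Placeholder cards and made-up scores, for illustration only.
suggestions = [
    (Card("green_spell", frozenset({"G"}), frozenset({"modern"})), 0.91),
    (Card("red_spell", frozenset({"R"}), frozenset({"modern"})), 0.88),
    (Card("green_creature", frozenset({"G"}), frozenset({"standard", "modern"})), 0.75),
]
mono_green_modern = filter_suggestions(suggestions, {"G"}, "modern")
print([card.name for card, _ in mono_green_modern])  # ['green_spell', 'green_creature']
```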

Global results

The results obtained from a sample of around 500 decks for each mode are encouraging:

LSA suggestions are consistent. More tuning of the descriptor generation would probably improve the quality of the results.
Even though this sample is a bit small once it is split by color, the Item-To-Item suggestions are promising and consistent with some of the association rule results.

Item-to-Item can produce several suggestions with the same score. Usually, the player sees only the top N suggestions. When scores are tied, one approach is to add a second score as a tie-breaker.

The power of a deck is linked to the interactions between the cards that constitute it. Cards may add or remove effects on other cards by referring to their names, types, subtypes, special rules (...) in their descriptions. These interactions therefore show up as words shared by different cards.

For these reasons, the content similarity obtained from the LSA approach is a useful metric for deciding between two tied suggestions from the Item-To-Item algorithm. A stronger content similarity score may indicate a stronger interaction between the selected card and the suggested card.
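A minimal sketch of this tie-breaking idea is to sort on a pair of scores, with the item-to-item score first and the content similarity second. The function name, the dictionary-based inputs and the scores below are hypothetical; a card with no content score is assumed to get 0.0.

```python
def rank_suggestions(item_scores, content_scores, n):
    """Top-n cards by item-to-item score, ties broken by content similarity.

    Both arguments map a candidate card name to its score against the
    selected card.
    """
    ranked = sorted(
        item_scores,
        key=lambda card: (item_scores[card], content_scores.get(card, 0.0)),
        reverse=True,
    )
    return ranked[:n]


# Made-up scores: A, B and D are tied on the item-to-item score.
item_scores = {"A": 0.8, "B": 0.8, "C": 0.9, "D": 0.8}
content_scores = {"A": 0.2, "B": 0.7, "D": 0.5}
top_three = rank_suggestions(item_scores, content_scores, 3)
print(top_three)  # ['C', 'B', 'D']
```

Sorting on a tuple keeps the item-to-item score as the primary criterion, so the content similarity only matters when the first scores are exactly equal.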

Openings

Latent Semantic Analysis: this project only scratched the surface of Latent Semantic Analysis. Bag of Words was chosen because it is simple to understand. Other models exist and may give better results. However, the first results obtained in this project are already relevant.
Association rules: this area is huge and the present results are not really satisfactory. Open issues include controlling the number and length of the rules and exploiting the rules in a recommendation system.
Collaborative filtering: it gives interesting single-card recommendations on a sample deck database. The next step would be to try integrating it with production data.
Top N recommendations: obtain recommendations not for a single card but based on all the cards the player has already selected.
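For the last opening, one simple possibility (an assumption, not something implemented in this project) is to sum the pairwise item-to-item scores over all the cards already selected and keep the N best candidates. The function name, the nested-dictionary format and the scores are hypothetical.

```python
from collections import defaultdict


def recommend_for_selection(similarity, selected, n):
    """Top-n cards for a whole selection, by summing pairwise similarity scores.

    `similarity` maps a card to a {other_card: score} dictionary.
    """
    selected = set(selected)
    totals = defaultdict(float)
    for card in selected:
        for other, score in similarity.get(card, {}).items():
            if other not in selected:  # never suggest an already selected card
                totals[other] += score
    # Highest total first; card name as a deterministic secondary key.
    return sorted(totals, key=lambda card: (-totals[card], card))[:n]


# Made-up similarity scores for four placeholder cards.
similarity = {
    "A": {"B": 0.9, "C": 0.4, "D": 0.1},
    "B": {"A": 0.9, "C": 0.5, "D": 0.7},
}
print(recommend_for_selection(similarity, ["A"], 2))       # ['B', 'C']
print(recommend_for_selection(similarity, ["A", "B"], 2))  # ['C', 'D']
```

Other aggregations (maximum, average, weighting by the number of copies played) would be equally plausible; which one gives the best recommendations would have to be evaluated on real decks.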