Statistical tournament analysis

Started by Dr. Opossum, 09-06-2016, 02:17:38 AM

Dr. Opossum

The community's interest in statistical analyses of tournaments, like the ones published regularly under bigger tournaments in the report category, is greater than expected. From time to time players ask me whether other questions that interest them can be answered about these tournaments. For this reason, and because manual analysis is very time-consuming, I created a program with the help of a friend into which deck lists from MTGPulse and Cockatrice can be imported or simply entered by hand. The program then answers certain questions about the registered deck lists, tournaments, or just single players or cards.

Until now I have analyzed the tournaments separately, mainly because certain questions cannot be answered for multiple tournaments simultaneously. For example, the mean placement of a certain key card would not give reasonable results across 2 tournaments with very different numbers of participants, because there are worlds between a mean placement of 9 in a 9-player tournament and 9 in a 50-player tournament. (I've tried to solve this problem by standardizing on the player count / calculating the Z-value.) However, other questions can be answered very well for multiple tournaments, for example "How popular is a card overall?" or "What color combinations are popular at the moment?". Therefore, this thread will address all questions that cannot be assigned to a single tournament or that apply to multiple tournaments. Comparisons also have a place here.
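The post does not say exactly how the standardization was computed. As a minimal sketch, assuming the Z-value is the placement minus the mean placement of the whole field, divided by the population standard deviation of all placements from 1 to N, the two 9th places from the example can be compared like this (`placement_z` is a hypothetical helper, not part of the program described above):

```python
from statistics import fmean, pstdev

def placement_z(place, n_players):
    """Z-value of one placement within a field of n_players (assumed formula)."""
    field = list(range(1, n_players + 1))  # every placement from first to last
    return (place - fmean(field)) / pstdev(field)

# Last place in a 9-player event: positive value, i.e. below average.
z_small = placement_z(9, 9)
# 9th place in a 50-player event: negative value, i.e. above average.
z_large = placement_z(9, 50)
print(round(z_small, 2), round(z_large, 2))
```

Under this assumption the same raw placement of 9 ends up on opposite sides of zero, which is the difference the raw mean placement hides.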

In the Council we have been using statistics for quite some time. However, we are also aware that these results have to be treated with care. Single tournaments in particular give only little information about the strength of a card or the popularity of a deck.
This is especially noticeable when comparing the Berlin meta with the tournaments in Maintal. With an increasing number of samples (and not only sample size) the results obviously become more realistic. That means that with every added tournament, more meaningful insights are possible. But even then, not all factors are considered.

Statistics that depend on many unknown factors, or on factors that cannot be included, are always subject to some uncertainty. A deck's victory can speak for its strength but doesn't have to. Maybe the deck was well positioned in the field, or its pilot was simply more experienced, or the matchups were favorable, or the player got lucky, etc. Maybe the release of a new set had a notable influence on the format, certain decks, some single cards...

We, players and Council members, are therefore encouraged to question these statistics instead of relying on them. Nevertheless, the collected data is interesting enough to give at least a "clue" or a "tendency". This way, problematic cards at least have a chance to become visible to the whole community, to be tested and watched more closely, and finally to be taken out of the format.

If you also have questions, write us a message and I will, if possible, analyze them here.
_______________________________________________________________________


But first, let us take a look at the big tournaments since the last bannings (Season April 2016):

* MKM Series Frankfurt - Highlander Event 14.05.2016
* Spring Weekend Erfurt 28.05./ 29.06.2016
* Metagame Masters Berlin Vol.6 4.6.2016

-> 114 decks


Decktypes:




Archetypes:






Colors:






Color Combinations:



-> Red as a color (on its own and in combination) is very popular!


Keycards:



Count: absolute number
Color %: x% of all [color] decks played this card
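The "Color %" column can be read as a simple ratio. A minimal sketch, assuming each deck is stored as a dict with a set of colors and a set of card names (this data layout and the `color_percent` helper are assumptions for illustration, not the actual program):

```python
def color_percent(decks, card, color):
    """Percentage of decks containing `color` that also play `card`."""
    color_decks = [d for d in decks if color in d["colors"]]
    if not color_decks:
        return 0.0  # no deck of that color was recorded
    count = sum(1 for d in color_decks if card in d["cards"])
    return 100.0 * count / len(color_decks)

# Toy data for illustration only, not taken from the analyzed tournaments.
toy_decks = [
    {"colors": {"R"}, "cards": {"Blood Moon"}},
    {"colors": {"U", "R"}, "cards": {"Splinter Twin"}},
    {"colors": {"U", "R", "W"}, "cards": set()},
    {"colors": {"R", "G"}, "cards": {"Blood Moon"}},
    {"colors": {"U"}, "cards": {"Back to Basics"}},
]
print(color_percent(toy_decks, "Blood Moon", "R"))  # 2 of the 4 red decks
```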

Dr. Opossum

Q: "Is it possible to add "Bazaar of Baghdad" and "Splinter Twin" to the key card list for tournaments? I'm interested in these deck strategies."

A: "Sure. At the same time, follow-up questions like
* How popular are these deck strategies?
* In which deck types are these cards really played?
* And how have these cards performed in their respective tournaments?
can be taken into account."

Z-Value:
- value of 0 = average placement
- value > 0 = placement below average
- value < 0 = placement above average
- statements like "A value of -5 stands for a better performance than a value of -2.5. (But both are above average)" can be made.
- statements like "A value of -5 is twice as good as a value of -2.5." cannot be made.

Bazaar of Baghdad:





- overall only played 4 times -> small sample size
- only played in Reanimator decks
- underperformed in Erfurt (Sunday)
- placed above average in Maintal


Splinter Twin:





- only played 5 times -> small sample size
- always in color combinations that include at least Red AND Blue
- wasn't played in Erfurt at all
- performance above average in the other tournaments

Dr. Opossum

Q: "Hi, I'm interested in Non-Basic-Hate (Blood Moon and Back to Basics) and its influence on multicolored decks. Do decks with Non-Basic-Hate perform better or worse in comparison to decks with at least 3 colors? Virtually a battle between team "Hate" and team "Rainbow"?"

A: "Here, too, I tried to answer the question in a meaningful way and applied the Z-Value."





- 32 % of all recorded decks have 2 or fewer colors
- 68 % of all recorded decks have 3 or more colors
- at the investigated events the average placements of both groups were nearly identical


Back to Basics (BtB) and Blood Moon (BM):





- BtB and/or BM appear almost only in decks with 2 or fewer colors
- Team "Hate" (2 or fewer colors + hate) and Team "Rainbow" (3 or more colors without hate) perform similarly
- 1 list counts toward Team "Rainbow" AND Team "Hate" (3 or more colors AND hate -> Jeskai Aggro-Control)
- Decks with 2 or fewer colors BUT without BtB and BM perform clearly worse (exception: Fabian Mielke, 1st place, Erfurt Saturday)
- Erfurt Sunday: no "others" deck (neither in Team "Rainbow" nor in Team "Hate")
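The grouping above can be sketched as a small classifier. This is only one possible reading of the bullet points: a deck counts as "Rainbow" with 3 or more colors, as "Hate" when it plays Back to Basics or Blood Moon, as both when both apply (the overlapping Jeskai list), and as "others" when neither applies. The `teams` function and its inputs are assumptions for illustration.

```python
HATE_CARDS = {"Back to Basics", "Blood Moon"}

def teams(colors, cards, hate_cards=HATE_CARDS):
    """Return the set of teams a deck belongs to under the assumed rules."""
    result = set()
    if len(colors) >= 3:
        result.add("Rainbow")        # 3 or more colors
    if hate_cards & set(cards):
        result.add("Hate")           # plays at least one non-basic hate card
    return result or {"others"}      # neither Rainbow nor Hate

# A 3-color deck with Blood Moon lands in both teams (the overlap case).
print(teams({"U", "R", "W"}, {"Blood Moon"}))
# A 2-color deck without hate cards is an "others" deck.
print(teams({"U", "B"}, set()))
```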

MacGyver


ChristophO


Thanks for writing tools that enable the analysis of known tournaments and the decklists within those tournaments. Posting data about card counts here is greatly appreciated.

Sadly, the "Z-Value" statistical analysis is rather pointless, because calculating a standard deviation does not make the slightest sense when tournament placements are the measurement metric. The same goes for the standard deviations calculated in prior posts. Placements necessarily spread over the whole range of the field, which results in a mess of calculated data. Calculating a standard deviation the way it has been done only makes sense if one assumes that the analyzed subsets should reach roughly the same placements. That makes sense for analyzing the size of apples fallen from one common tree, but makes no sense at all for tournament results of Magic decks, where performance is ordered on a numerical basis (first to last place). In my opinion, even if a normalization of achieved points (scored xyz% of available points) or a match win rate had been used, the calculated results would be very doubtful.


ChristophO

Quote from: Dr. Opossum on 09-06-2016, 02:28:33 AM

Z-Value:
- value of 0 = average placement
- value > 0 = placement below average
- value < 0 = placement above average
- statements like "A value of -5 stand for a better performance than a value of -2.5. (But both are above average)" can be made.
- statements like "A value of -5 is twice as good as a value of -2.5." cannot be made.


The Z-Value simply gives you a dimensionless distance from the average by measuring the distance in units of "standard deviation".
-2.5 means the compared value was 2.5 times the standard deviation of the total population smaller than the average of the total population.
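Written out as a formula, with x the compared value, μ the average and σ the standard deviation of the total population (the symbols are chosen here for illustration):

```latex
z = \frac{x - \mu}{\sigma}
\qquad\text{so that}\qquad
z = -2.5 \;\Longleftrightarrow\; x = \mu - 2.5\,\sigma
```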

I would rather not know which values were used for that because it would most likely lead to me getting mad ;D

Tabris

I agree with Christoph here. The data (even if handled with care, and with Steffi's note that the conclusions we might draw are fragile) shows, if anything, that it shows nothing, and it might lead to false deductions. Since Steffi provided the SDs, we can see that the mean variation is so huge for the single values that every statistics handbook will tell you the data is simply not good enough to deduce anything.

Maqi

Could just use match win%. Should be easy to calculate/implement.
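A minimal sketch of what that could look like, assuming per-deck match records of wins, losses and draws are available (the thread does not say whether the collected deck lists include them) and assuming draws count as played but not won. `match_win_percent` is a hypothetical helper.

```python
def match_win_percent(records):
    """Pooled match win % over (wins, losses, draws) records."""
    wins = sum(w for w, _, _ in records)
    played = sum(w + l + d for w, l, d in records)
    if played == 0:
        return None  # no matches recorded
    return 100.0 * wins / played

# Toy records for illustration only: (wins, losses, draws) per deck.
print(match_win_percent([(3, 1, 0), (2, 2, 0)]))  # 5 wins in 8 matches
```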

Dr. Opossum

Statistical Analysis for "The One - Summer 2016" (Aug 6th)


Table 1 Decktypes and Average Place:
A high standard deviation indicates a high spread. 2 decks that placed 15th and 16th will have a much lower standard deviation than 2 decks where one placed 2nd and the other 33rd.
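The two cases from this example can be checked directly. The sketch below uses the population standard deviation (`pstdev`); whether the tables use the population or the sample formula is not stated, so that choice is an assumption. The sample formula (`stdev`) gives larger numbers but the same ordering.

```python
from statistics import fmean, pstdev

close_finish = [15, 16]  # two decks placed next to each other
wide_finish = [2, 33]    # one deck near the top, one near the bottom

# Similar mean placement, very different spread.
for places in (close_finish, wide_finish):
    print(places, "mean:", fmean(places), "SD:", pstdev(places))
```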




Table 2 Archetypes and Average Place:




Table 3 Color Distribution:




Table 4 Color Combinations:




Table 5 Team Non-Basic Hate vs. Team Rainbow:
Created by request.
Decks with 3+ colors in comparison to decks with 2 or fewer colors.




Table 6 Team Non-Basic Hate vs. Team Rainbow Part 2:
Created by request.
Decks with 3+ colors in comparison to decks with 2 or fewer colors (with and without Back to Basics, Blood Moon and/or Magus of the Moon).
2 decks played 2 or fewer colors and refrained from playing Blood Moon effects and/or Back to Basics.
The count of 37 is due to overlap -> one deck played 3 or more colors and Blood Moon effects and/or Back to Basics.




Table 7 Key Cards:




Table 8 Top 10:
Top 10 cards from every category (by card types).
Extended rankings for cards with the same points.














Table 9 Top 30:
Created by request.
Top 30 for
a. Spells
b. Spells played at least 3 times (i.e. a sample size of at least 3), ordered by mean place.
c. Lands
d. Lands played at least 3 times (i.e. a sample size of at least 3), ordered by mean place.
e. Utility lands
f. Utility lands played at least 3 times (i.e. a sample size of at least 3), ordered by mean place.

Tables ordered by count mainly show the "popularity (or the need)" of certain cards.
Tables ordered by mean place use a minimum count of 3 to ensure a minimum sample size and mainly show the "play strength".
Fetch lands are not counted as utility lands.
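The "ordered by mean place, minimum count of 3" tables can be sketched roughly as follows. The input format (one `(card, place)` pair per deck that played the card), the tie-break by card name and the `rank_by_mean_place` helper are assumptions for illustration; the card names are placeholders.

```python
from collections import defaultdict
from statistics import fmean

def rank_by_mean_place(results, min_count=3, top=30):
    """results: iterable of (card, place); returns [(card, count, mean place)]."""
    places = defaultdict(list)
    for card, place in results:
        places[card].append(place)
    rows = [(card, len(p), fmean(p))
            for card, p in places.items() if len(p) >= min_count]
    rows.sort(key=lambda row: (row[2], row[0]))  # best mean place first
    return rows[:top]

# Toy data for illustration only.
toy_results = [
    ("Card A", 1), ("Card A", 4), ("Card A", 7),
    ("Card B", 2), ("Card B", 3),                  # only 2 plays: filtered out
    ("Card C", 5), ("Card C", 6), ("Card C", 10), ("Card C", 3),
]
print(rank_by_mean_place(toy_results))
```

"Card B" has the best raw mean place in the toy data but drops out because it was played only twice, which is the effect of the minimum sample size.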

a.




b.




c.




d.




e.




f.




Content 10 A Cat in a Tram:
No request.
Made with Paint (and a lot of talent).


berlinballz

Just when I thought the absolute maximum amount of love and dedication had already been invested ... that cat tops it off. Great job. Thanks.