Yes, I made that calculation, too. Not the one for all 40 dice, though.
I also have not done my summary or a write-up yet, so I am still perfectly open to changing everything.
Anybody who wants to play around with it, here's the raw data as CSV.
Link to my dropbox
You are teaching, aren't you? If you want, use it; I don't reserve any rights on it.
A side that is 2.2 standard deviations away from its expected count should appear in 1.4% of the cases if everything is perfectly normal.
The chance that at least one out of six sides is that far away is then 7.9%, or 3 dice out of 40. But I did see 13 dice beyond that border.
It is the same with any other arbitrarily chosen border.
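For anyone who wants to redo that arithmetic, here is a minimal Python sketch. It assumes the 1.4% is the one-sided normal tail and it treats the six sides as independent, which is only an approximation because the six counts of one die always add up to the number of rolls. The function names are made up for illustration.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal variable (one-sided tail)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def any_side_beyond(z, sides=6):
    """Chance that at least one of `sides` faces lies beyond z.

    Assumption: the faces are treated as independent, which is only
    approximately true for the counts of a single die.
    """
    return 1 - (1 - upper_tail(z)) ** sides

for z in (2.2, 1.8):
    p_side = upper_tail(z)
    p_die = any_side_beyond(z)
    print(f"z={z}: one side {p_side:.1%}, any of six {p_die:.1%}, "
          f"expected dice out of 40: {40 * p_die:.1f}")
```

With this approximation the 2.2 border comes out at roughly 8% of the dice and the 1.8 border at roughly 20%, so the expected counts are about 3 and 8 dice out of 40.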
I was getting so mad that at one point I cross-checked with Excel-generated random numbers, just in case I had calculated something wrong somewhere, but no: the Excel random numbers behave as expected.
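The same kind of cross-check can be sketched in Python instead of Excel. This is not the original spreadsheet, just a small simulation under stated assumptions: 40 fair dice, 150 rolls per die (an assumed figure), and a one-sided border on the high side only. `dice_beyond` is a hypothetical helper name.

```python
import random
from math import sqrt

def dice_beyond(z, n_dice=40, n_rolls=150, faces=6, seed=0):
    """Simulate fair dice and count how many have at least one face
    whose count lies more than z standard deviations above expectation.

    Assumptions: n_rolls per die and the one-sided test are guesses.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    expected = n_rolls / faces
    sd = sqrt(n_rolls * (1 / faces) * (1 - 1 / faces))
    flagged = 0
    for _ in range(n_dice):
        counts = [0] * faces
        for _ in range(n_rolls):
            counts[rng.randrange(faces)] += 1
        if any((c - expected) / sd > z for c in counts):
            flagged += 1
    return flagged

# Repeat with several seeds to see how much the count varies by luck alone.
for seed in range(5):
    print(seed, dice_beyond(2.2, seed=seed))
```

If the simulated fair dice stay at a handful of flagged dice per 40 across many seeds, a count like 13 is hard to explain by luck.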
I guess my rolling method brings out geometrical defects and tends to hide weighting defects.
I used a dice tower lined with felt all the way through, and I always rolled 5 dice together.
That means:
* The surface supports rolling because it is soft
* The dice tower does not support bouncing, as it is small and there is felt inside
* The surface puts the brakes on the speed of a die very quickly
* To have a realistic environment, I always rolled 5 dice together. So they are also colliding with each other, invading each other's personal space... like on a real gaming table. I am only interested in defects that are visible in such a real-life situation.
I did not want to roll a lot more, of course, so instead I separated my notes into subsets of 30 rolls, assuming I would see technical defects all the time and luck only sometimes.
And that analysis, though not mathematically perfect, made a good enough impression to convince me. Do you know how to calculate the probability of the deviation being on the same side of the mean 4 out of 5 times? Can I just work with the probabilities, or do I have to take care of something?
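One possible way to approach that question is a sign test, sketched below in Python. It rests on assumptions that are not in the data: the subsets are independent, and in the simple version a subset falls above or below the mean with probability 0.5 each, which ignores ties and the skew of the distribution. The second function drops the 0.5 assumption and uses the exact binomial distribution for one face of a fair die. Both function names are made up for illustration.

```python
from math import comb

def prob_at_least_same_side(k, n, p_side=0.5):
    """P(at least k of n subsets fall on one *given* side of the mean)."""
    return sum(comb(n, i) * p_side**i * (1 - p_side)**(n - i)
               for i in range(k, n + 1))

def prob_count_at_most_expected(rolls=30, faces=6):
    """Exact P(count of one face <= expected count) for a fair die.

    Uses integer division for the expected count, so `rolls` should be
    a multiple of `faces`.
    """
    p = 1 / faces
    limit = rolls // faces
    return sum(comb(rolls, i) * p**i * (1 - p)**(rolls - i)
               for i in range(limit + 1))

p_given = prob_at_least_same_side(4, 5)  # e.g. "below" for one chosen face
p_either = 2 * p_given                   # above or below, events are exclusive
p_tie_aware = prob_count_at_most_expected() ** 5  # 5 of 5 at or below average

print(f"4+ of 5 on a given side: {p_given:.3f}")
print(f"4+ of 5 on either side:  {p_either:.3f}")
print(f"5 of 5 at or below mean: {p_tie_aware:.3f}")
```

The thing to take care of is the number of faces looked at. These probabilities are for one face picked in advance; if you scan all six faces and report the worst one, the chance of finding such a pattern on at least one of them by luck is considerably higher.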
Here is an example. On the right, you see each number's share of the total rolls in %. 2 and 3 are the numbers that are the most off. But at 3.5%, or 1.8 standard deviations, they are not very far off, so the die gives the impression of being normal: we would expect such a result in roughly 20% of the dice, or 8 out of 40. (I did see 24 like that.)
However, if you divide the rolls into subsets, you see that the three was always below or exactly at the average. I marked above average in blue and below average in red.
And that's where I say: OK, it wasn't a streak changing the entire result, it looks like a technical defect.
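The subset check can be sketched in Python roughly like this. It assumes the rolls of one die are stored in order as a plain list of integers; the column layout of the CSV is not reproduced here, and `subset_signs` is a hypothetical helper name. The demo list is constructed by hand only to show the output format; it is not real data.

```python
from collections import Counter

def subset_signs(rolls, subset_size, face, faces=6):
    """Split `rolls` into consecutive subsets and mark each one with
    '+', '-' or '=' depending on whether `face` appeared more often,
    less often, or exactly as often as expected for a fair die.
    A trailing incomplete subset is ignored.
    """
    signs = []
    for start in range(0, len(rolls) - subset_size + 1, subset_size):
        chunk = rolls[start:start + subset_size]
        count = Counter(chunk)[face]
        # Integer comparison of count/subset_size against 1/faces.
        diff = count * faces - subset_size
        signs.append('+' if diff > 0 else '-' if diff < 0 else '=')
    return signs

# Hand-made demo list (not real data): three subsets of 12 rolls each.
demo = [3] * 2 + [1] * 10 + [1] * 12 + [3] * 12
print(subset_signs(demo, subset_size=12, face=3))
```

A face that shows '-' or '=' in every subset is the pattern described above, as opposed to one big streak in a single subset.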
This one I rolled 180 times because I couldn't decide on "guilty". That means each subset has 36 rolls.
The day I encode all that in Python, I will have one more subset instead. That seems cleaner.