A look at data from Mythbusters 12x5.
In Mythbusters 12x5, they test several myths about activities at which either men or women allegedly excel. For one of the tests, they show all the raw data on screen.
The setup is this: the myth is that men excel at navigation, and they test this by inviting 10 men and 10 women to read a map and navigate between two places on the San Francisco peninsula. As they navigate, they are scored by two mythbusters on how smoothly the navigation went, how many wrong turns and stops they took, how much longer than necessary the route was, and so on. In the end, each test subject was given a score between 0 and 100.
I wrote down the raw data from the screen and have made it available here: https://raw.githubusercontent.com/michiexile/mythbusters-data/master/S12/E05/navigating.csv

As I keep on binging, I will try to extract more data from the episodes. I plan on putting it all up at https://github.com/michiexile/mythbusters-data
So let’s have a look at the data, shall we?
Gender | Score |
---|---|
woman | 25 |
woman | 70 |
woman | 100 |
woman | 60 |
woman | 95 |
woman | 90 |
woman | 100 |
woman | 80 |
woman | 85 |
woman | 65 |
man | 20 |
man | 100 |
man | 60 |
man | 55 |
man | 85 |
man | 95 |
man | 65 |
man | 100 |
man | 70 |
man | 90 |
Raw numbers don’t help me understand what I’m looking at all that easily - let’s get some graphics going.
The densities of the scores don’t really look all that different at first glance. The means are going to be a bit different, of course - but are they going to be sufficiently different?
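As a numeric companion to the density plots, here is a minimal sketch that computes a few descriptive statistics per group. It is written in Python using only the standard library, which is my choice for illustration and not necessarily what produced the figures. The scores are embedded as CSV text copied from the table above, and the column names `Gender` and `Score` are assumed to match the table header.

```python
import csv
import io
import statistics

# Scores transcribed from the table above; column names follow its header.
CSV_TEXT = """Gender,Score
woman,25
woman,70
woman,100
woman,60
woman,95
woman,90
woman,100
woman,80
woman,85
woman,65
man,20
man,100
man,60
man,55
man,85
man,95
man,65
man,100
man,70
man,90
"""


def load_scores(text):
    """Group the Score column by the Gender column."""
    groups = {}
    for row in csv.DictReader(io.StringIO(text)):
        groups.setdefault(row["Gender"], []).append(int(row["Score"]))
    return groups


def summarize(scores):
    """Return a few descriptive statistics for one group of scores."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "sd": statistics.stdev(scores),  # sample standard deviation
        "min": min(scores),
        "max": max(scores),
    }


groups = load_scores(CSV_TEXT)
summaries = {gender: summarize(scores) for gender, scores in groups.items()}
for gender, summary in summaries.items():
    print(gender, summary)
```

Both groups span almost the whole 0 to 100 range, which is worth keeping in mind when judging a difference of a few points between the means.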
Well, we know how to compare means. Let’s break out a classic: the T-test.
Welch Two Sample t-test
data: Score by Gender
t = -0.27709, df = 17.861, p-value = 0.7849
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-25.75919 19.75919
sample estimates:
mean in group man mean in group woman
74 77
While the sample means do differ (74 vs. 77), the test shows that this 3-point difference is small compared to the variability in the data: with p = 0.78 and a 95% confidence interval for the difference running from about -26 to +20, we have no reason to reject our null hypothesis that the means are the same, and no reason to believe that the ranking of men and women here comes down to anything more than chance.
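To make the test statistic less of a black box, here is a rough sketch that recomputes Welch's t and its degrees of freedom by hand from the scores in the table. It uses only the Python standard library; the helper name `welch_t` is my own, and the p-value is left out because the standard library has no t-distribution CDF.

```python
import math
import statistics

# Scores copied from the table above.
men = [20, 100, 60, 55, 85, 95, 65, 100, 70, 90]
women = [25, 70, 100, 60, 95, 90, 100, 80, 85, 65]


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test of mean(a) - mean(b)."""
    # Squared standard error of each group mean: sample variance / n.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, df = welch_t(men, women)
print(f"t = {t_stat:.5f}, df = {df:.3f}")
```

The standard error of the difference comes out at roughly 10.8 points, so a gap of 3 points is well under a third of one standard error, which is why the test is so far from significance.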
But wait – can we rely on the T-test here? Is the data “good enough” for us to use the test?
Strictly speaking, the requirement is that the scores within each group be approximately normally distributed for the T-test to be a valid statistical test; looking at the total dataset as well is a useful extra sanity check when the group means are this close. So let’s take a look at that question too.
The QQ-plots look pretty good – data points line up pretty well in a straight line – indicating that a (mostly) normal distribution is a reasonable judgement call.
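For a number to go with the visual impression, the sketch below builds the points of a normal QQ-plot and measures how close they are to a straight line with a Pearson correlation. This is an illustration in standard-library Python, not the code behind the plots: the plotting positions `(i + 0.5) / n` are one common convention I am assuming here (plotting functions such as R's `qqnorm` use slightly different positions), and the helper names are hypothetical.

```python
import math
import statistics

# Scores copied from the table above.
men = [20, 100, 60, 55, 85, 95, 65, 100, 70, 90]
women = [25, 70, 100, 60, 95, 90, 100, 80, 85, 65]


def qq_points(scores):
    """Pair each sorted score with a standard normal quantile."""
    n = len(scores)
    std_normal = statistics.NormalDist()
    # Assumed plotting positions: (i + 0.5) / n for i = 0, ..., n - 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sorted(scores)))


def pearson_r(points):
    """Pearson correlation between the x and y coordinates of the points."""
    xs, ys = zip(*points)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


samples = {"man": men, "woman": women, "all": men + women}
qq_r = {name: pearson_r(qq_points(scores)) for name, scores in samples.items()}
for name, r in qq_r.items():
    print(f"{name}: r = {r:.3f}")
```

A correlation near 1 means the points lie close to a line. With only 10 observations per group this is a rough check at best, and the lowest score in each group (25 and 20) sits well below the rest, so the judgement call remains a judgement call.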
There we have it. In summary: gender differences in navigational ability are statistically Busted.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/michiexile/rbind-io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".