Maybe normalize it based on the fraction of women who started the race?

jtbayly · on Aug 20, 2019

Go for it. Knock yourself out trying to prove there are no differences between the sexes.

While you're at it, normalize success at giving birth based on the fraction of men who attempt it.

peripitea · on Aug 20, 2019

To be fair, the differences here seem to be smaller than in other sports, suggesting that while testosterone is a factor, it may matter less here than it does in other sports. If 90% of the competitors in these races are male, that would further affect this discrepancy.

Godel_unicode · on Aug 20, 2019

> If 90% of the competitors in these races...

Why would that matter? They're not running as a group; if you're the fastest then you win. This isn't a probabilistic thing. Lots of slow men crowding the starting line isn't going to impact who finishes first.

peripitea · on Aug 24, 2019

I'm not sure I even understand your argument. It's possible we're saying different things? What I'm saying is that if X people try a sport the records they set won't be nearly as good as if 100X people are trying it. Winners are by definition outliers, and the larger your population, the more (and more extreme) outliers you will see. So if there are five times fewer women than men competing in a sport, that will impact comparisons between women and men, even at the top levels, at least if you want to compare innate ability. Almost certainly, the women's records and top female performances would look better if there were five times as many women trying the sport as there currently are. Do you disagree with that?

I pulled up the most recent Ironman race; there were 5 men for every woman in the competition. So I'm using that as a rule of thumb. But the same logic applies to any population imbalance at the top of the funnel.
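To make the outlier point concrete, here is a minimal sketch. It assumes, purely for illustration, that every competitor's speed is an independent draw from the same standard normal distribution; the function name `mean_best` and the field sizes 200 and 1000 (a 5x difference) are made up for the example and are not real race data.

```python
import random
import statistics

def mean_best(field_size, trials=1000, seed=0):
    """Average of the best (highest) speed across many simulated fields."""
    rng = random.Random(seed)
    bests = []
    for _ in range(trials):
        # One simulated field: field_size speeds from the same distribution
        # (standard normal is an assumption, not measured data).
        bests.append(max(rng.gauss(0.0, 1.0) for _ in range(field_size)))
    return statistics.mean(bests)

small = mean_best(200)    # hypothetical smaller pool
large = mean_best(1000)   # hypothetical pool five times larger
print(f"mean best of 200: {small:.2f}, mean best of 1000: {large:.2f}")
```

Under these assumptions the larger pool produces a better top performance on average, even though both pools have identical underlying ability. How big the gap is depends on the shape of the distribution's tail, which this sketch does not try to model.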

IIAOPSW · on Aug 21, 2019

If 90% are men, all else equal, you would expect men to win 90% of the time.

Godel_unicode · on Aug 21, 2019

That's quite the leap of logic hiding in that deceptively small "all else equal", what leads you to believe that everyone in a race has an equal chance of winning it?

IIAOPSW · on Aug 22, 2019

> what leads you to believe that everyone in a race has an equal chance of winning it?

I made no such assumption.

If you pull runner speeds from any distribution, and label 90% of those numbers "male" and 10% "female", 90% of the time the highest speed will be labelled "male".

Even if you are pulling the female runner speeds from a slightly faster distribution, if most of the people running are men then men will still win most of the time.
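A quick simulation of both claims, as a sketch only. The assumptions are mine: speeds are independent draws from a normal distribution, the field is 90 men and 10 women, and the "slightly faster" case shifts the women's mean up by 0.2 standard deviations. The name `male_win_rate` and all of the numbers are hypothetical.

```python
import random

def male_win_rate(n_men, n_women, female_shift=0.0, trials=5000, seed=1):
    """Fraction of simulated races in which the fastest runner is a man."""
    rng = random.Random(seed)
    male_wins = 0
    for _ in range(trials):
        # Higher value = faster runner. Normal speeds are an assumption.
        best_man = max(rng.gauss(0.0, 1.0) for _ in range(n_men))
        best_woman = max(rng.gauss(female_shift, 1.0) for _ in range(n_women))
        if best_man > best_woman:
            male_wins += 1
    return male_wins / trials

same = male_win_rate(90, 10)                        # identical distributions
shifted = male_win_rate(90, 10, female_shift=0.2)   # women slightly faster on average
print(f"same distribution: {same:.3f}, women shifted up: {shifted:.3f}")
```

With identical distributions the male share of wins comes out near 0.9, matching the share of men in the field. With the small shift in the women's favor, men still win well over half of the simulated races.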

Failing to normalize by the population sizes at the start of the race is a blatant mathematical error. Until you fix it, your argument is flawed, and if you don't fix it, you're willfully wrong.

Godel_unicode · on Aug 22, 2019

> I made no such assumption.

You literally wrote "all else equal" in your comment.

> pull runner speeds from any distribution

Not true. If I pick a distribution of elite women and non-athlete men, all of the top finishers will be women. You're assuming speeds are normally distributed; they are not.

Where is this data that you're citing here? It doesn't line up with any data I've seen, nor with my extensive experience in amateur racing. Most races are won by the same small group of elite runners. The size of the field is immaterial as the majority of racers have no chance of winning.

Normalizing for population size might make sense if you actually had to beat everyone independently. Fortunately, you're only racing the person in first so everyone else can be safely ignored.

Put another way, if Michael Phelps is racing he's going to win. You can only win by beating him, the rest of the field doesn't matter.

IIAOPSW · on Aug 22, 2019

>> I made no such assumption.

>> You literally wrote "all else equal" in your comment.

That's different from the assumption that all competitors are equally likely to win.

> You're assuming speeds are normally distributed

No. Any distribution will work.

> Where is this data that you're citing here?

I didn't say anything about data. I said your argument has a blatant mathematical flaw. You said "why would it matter" in response to "if 90% of the competitors [are men]". It absolutely matters. Even if you do turn out to be correct about women being worse at this sport, you are only right in the broken clock sense.

> Put another way, if Michael Phelps is racing he's going to win. You can only win by beating him, the rest of the field doesn't matter.

The people who show up to the race are coming out of some distribution. Michael Phelps isn't showing up to every race. The probability that you win the race comes down to how fast you are vs. the max of n samples from the distribution of runners.
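Written out, under the assumption (mine, for illustration) that the other n competitors' speeds X_1, ..., X_n are independent draws from one continuous distribution with CDF F, and that higher speed wins:

```latex
% Assumption: X_1, ..., X_n are i.i.d. with continuous CDF F; x is your speed.
P(\text{you win} \mid \text{your speed} = x)
  = P\!\left(\max_{1 \le i \le n} X_i < x\right)
  = F(x)^{n}

% Special case: m men and w women all drawn from the same distribution.
% Every runner is equally likely to be the fastest, so
P(\text{fastest runner is male}) = \frac{m}{m + w}
```

The first line shows why field size matters: for any fixed x with F(x) < 1, the chance of winning shrinks as n grows. The second line needs only that all runners come from the same continuous distribution, whatever its shape.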

The list of winners of some annual marathon is a really shitty piece of evidence. Out of all the racers and times taken, it gives us data on exactly one of them. It is especially useless for breaking down running ability by demographic, because it doesn't even tell us how much data we have on each demographic of interest.

If you don't see why just citing the list of marathon winners fails to reject the hypothesis that women and men are about equal at ultra-marathons, then you don't understand what makes for a good data-supported argument.

Godel_unicode · on Aug 22, 2019

This is what happens when you don't look at the data and argue from your gut. We don't just learn about one person; timed races release bib data for everyone in that race.

If you're a data nerd and a runner armed with this knowledge, it will have occurred to you to wonder if distance (in time) from the winner is correlated with gender and field size. It is not. Thus, you're proposing that we "normalize" for something which is shown to not have an effect on who wins a race.