If I charge a man extra for car insurance because men on average get in more wrecks, then I am discriminating based on sex.
If I charge someone extra for car insurance because he shares dozens of innocuous traits with other people who get in wrecks, I am not discriminating based on sex. If some algorithm can accurately predict the sex of that person from the same data, I am still not discriminating based on sex.
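To make that concrete, here's a rough sketch in Python (synthetic data, made-up trait names, scikit-learn's LogisticRegression standing in for whatever model an insurer would actually use): the pricing model is fit only on innocuous traits, yet a second model fit on exactly the same traits recovers sex reasonably well.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000
    sex = rng.integers(0, 2, n)                    # never handed to the pricing model
    # innocuous traits (car type, commute length, ...) that happen to correlate with sex
    traits = rng.normal(size=(n, 20)) + 0.6 * sex[:, None] * (rng.random(20) > 0.5)
    crashes = (traits[:, :5].sum(axis=1) + rng.normal(size=n) > 2).astype(int)

    risk_model = LogisticRegression(max_iter=1000).fit(traits, crashes)  # prices on traits only
    sex_model = LogisticRegression(max_iter=1000).fit(traits, sex)       # same data, different target
    print("crash prediction accuracy:", round(risk_model.score(traits, crashes), 2))
    print("sex recoverable from the same traits:", round(sex_model.score(traits, sex), 2))

The risk model has no sex term anywhere in it; the second print just shows the information is latent in the data either way.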
If I target blacks with predatory loan advertisements because that demographic is on average more susceptible to those ads, then I am discriminating based on race.
If I target an individual because they share dozens of innocuous traits with other people who fall for predatory loan advertisements, I am not discriminating based on race. If some algorithm can accurately predict the race of the individual based on the same data, I am still not discriminating based on race.
The software is less biased than a human because it has no concept of race. Even if the data can be used to accurately predict the race of an individual, it does not matter. The program will not spontaneously recognize the concept of race and then discriminate based on it. If some trait correlates with race, that's because reality is biased, not the math, the algorithm, the company, or the implementer.
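Same sketch for the targeting case (again synthetic, a single made-up input kept deliberately simple): nothing in the model is named or derived from race, yet its scores end up correlated with race because that one innocuous input is.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n = 5000
    race = rng.integers(0, 2, n)                       # never shown to the model
    zip_density = rng.normal(size=n) + 0.8 * race      # innocuous input that happens to correlate with race
    clicked_ad = (0.5 * zip_density + rng.normal(size=n) > 0.5).astype(int)

    model = LogisticRegression().fit(zip_density.reshape(-1, 1), clicked_ad)
    scores = model.predict_proba(zip_density.reshape(-1, 1))[:, 1]

    print("inputs the model knows about: ['zip_density']")   # no race variable exists anywhere in the model
    print("correlation of its scores with race:", round(float(np.corrcoef(scores, race)[0, 1]), 2))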
It is not the responsibility of programmers to make their algorithms less accurate so ideologues can live in a fantasy world.
But that's how sexism and racism and any kind of discrimination works in the human mind. There are innocuous traits, like sex and skin color, but also many other traits, and we use them to make predictions about behavior. This is unfair discrimination; fair discrimination looks directly at the behavior of an individual, not some innocuous proxy for the behavior, even if that proxy is right 80% of the time.
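Quick back-of-the-envelope version of that last point (numbers invented): a proxy that tracks real behavior 80% of the time still flags about a fifth of the people whose actual behavior is fine.

    import random
    random.seed(0)

    people = 10_000
    actually_risky = [random.random() < 0.10 for _ in range(people)]
    # the proxy agrees with actual behavior 80% of the time and flips the rest
    proxy_flags = [b if random.random() < 0.80 else not b for b in actually_risky]

    wrongly_flagged = sum(1 for a, p in zip(actually_risky, proxy_flags) if p and not a)
    clean = sum(1 for a in actually_risky if not a)
    print(f"{wrongly_flagged} of {clean} people with clean behavior get flagged anyway")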
For example, most people who have an account named throwawayXXX on HN are using it temporarily to say something. However, I actually kept one such account for a long time. My username is an innocuous trait, but you're discriminating if you assume my motives and behavior will be like those of most other throwaway accounts, just based on my username.
That said, I believe unfair discrimination is unavoidable in life because we can't always wait around to see what an individual's behavior will be before we discriminate fairly. We couldn't function without stereotypes and assumptions. And so, while still unfair because of the potential for unjustified penalties, it's much better to look at many variables than it is to look at one or two.
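Here's roughly what I mean, as a sketch (synthetic data, scikit-learn): a stereotype built on one trait and a model built on twenty are both unfair to the individuals they misjudge, but the twenty-variable one misjudges far fewer of them.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(2)
    n = 10_000
    X = rng.normal(size=(n, 20))                               # twenty weak, innocuous signals
    y = ((X @ rng.uniform(0.2, 0.5, 20)) + rng.normal(size=n) > 1).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    one_trait = LogisticRegression(max_iter=1000).fit(X_tr[:, :1], y_tr)   # the crude stereotype
    many_traits = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    print("misjudged using 1 trait  :", int((one_trait.predict(X_te[:, :1]) != y_te).sum()))
    print("misjudged using 20 traits:", int((many_traits.predict(X_te) != y_te).sum()))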
Under that argument, using machine learning to make decisions on things like advertising campaigns or mortgage rates is still bad even if the inputs do not correlate with race or sex. That's a very different argument from the one the author is making. It's a decent argument too.
Then you think machine learning is the best solution that exists?
Thanks, I thought your points were good too. I think machine learning in combination with human oversight is the least bad option. Machines can help eliminate human bias, and humans can pick up on things that the machines miss. I wouldn't trust a machine to lend out my money, but I would trust it to give me a lot of data about someone and make recommendations on that basis.
When you search Google, you get a mostly impartial set of results, but then you need to choose the most relevant from the top 10. Without Google, you'd visit far fewer websites based only on your accumulated experience. Without a human to filter through the top 10, you'd have to systematically read webpages in order until you found what you wanted. Maybe the example is a bit stretched, but machines and humans cooperating seems to work well, even right on the HN front page, or in the browser spellchecker as I type this message (apparently HN is not a word).
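If it helps, the division of labor I'm picturing looks something like this (toy code, invented names and scoring, not any real lender's pipeline): the machine digests the data and narrows the field, and a person makes the actual call.

    from dataclasses import dataclass

    @dataclass
    class Applicant:
        name: str
        features: dict                      # the "lot of data about someone"

    def model_score(a: Applicant) -> float:
        # stand-in for a real trained model; here just a toy weighted sum
        return 0.6 * a.features["on_time_payment_rate"] + 0.4 * min(a.features["years_at_job"], 10) / 10

    def recommend(applicants, top_n=3):
        # the machine ranks everyone and hands a shortlist to a human
        return sorted(applicants, key=model_score, reverse=True)[:top_n]

    def human_review(shortlist):
        # the human sees the score *and* the underlying data, and decides
        for a in shortlist:
            print(f"{a.name}: score={model_score(a):.2f}, data={a.features} -> approve? (human decides)")

    human_review(recommend([
        Applicant("applicant_1", {"on_time_payment_rate": 0.95, "years_at_job": 7}),
        Applicant("applicant_2", {"on_time_payment_rate": 0.60, "years_at_job": 2}),
    ]))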