One interesting experiment is to use a model like Llama directly and query the next-token logits for "he" and "she" (assuming you set up the sentence so that a pronoun is the natural next token).
For example:
"A doctor was examining the patient when ___"
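A minimal sketch of that setup, using the Hugging Face `transformers` library (GPT-2 is used here only as a small stand-in; the same calls work with a Llama checkpoint):

```python
# Compare next-token probabilities for " he" vs " she" after a prompt.
# GPT-2 is an assumption for illustration; any causal LM checkpoint works.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "A doctor was examining the patient when"
ids = tok(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits[0, -1]   # logits for the next token position
probs = torch.softmax(logits, dim=-1)

for word in (" he", " she"):
    token_id = tok.encode(word)[0]      # leading space -> a single token
    print(f"P({word!r}) = {probs[token_id].item():.4f}")
```

Note the leading space in `" he"`/`" she"`: GPT-style BPE tokenizers encode a word with and without a preceding space as different tokens, so querying `"he"` alone would read the wrong logit.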
What this makes apparent is that increasing the model's sampling temperature selects the less stereotypical option more often.
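The temperature effect can be seen with plain softmax arithmetic, no model needed (the logit values below are made up for illustration, with "he" as the higher-logit, stereotypical option):

```python
import math

def softmax(xs, T=1.0):
    """Softmax with temperature: dividing logits by T > 1 flattens the distribution."""
    exps = [math.exp(x / T) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical next-token logits.
logits = {"he": 4.0, "she": 2.0}
for T in (0.5, 1.0, 2.0):
    p_he, p_she = softmax(list(logits.values()), T)
    print(f"T={T}: P(he)={p_he:.3f}, P(she)={p_she:.3f}")
```

As T rises, the gap between the two probabilities shrinks, so sampling picks the lower-probability pronoun more often; as T falls toward zero, sampling collapses onto the stereotypical one.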
IMO this is getting at a deeper truth that the use of a gender in language, and historically defaulting to "he", was not about creating a bias, but instead it was a pattern which maximises information density and minimises useless information. Randomising the gender as is done today packs useless information into it.
Which one is more respectful is a different question. The lowest-entropy option would still be the most likely gender-specific pronoun. This would depend on the language, of course.
> IMO this is getting at a deeper truth that the use of a gender in language, and historically defaulting to "he", was not about creating a bias, but instead it was a pattern which maximises information density and minimises useless information. Randomising the gender as is done today packs useless information into it.
Where can I read more about this "truth"? Where is this assertion coming from that gendered pronouns developed to minimize useless information? It seems far more plausible to me that pervasive defaulting to male experiences caused many (certainly not all) human languages to (1) develop gendered pronouns and (2) default to the male pronoun.