There is engineering when this is done seriously, though.
Build a test set and design metrics for it. Do rigorous measurement on any change of the system, including the model, inference parameters, context, prompt text, etc. Use real statistical tests and adjust for multiple comparisons as appropriate. Have monitoring that your assumptions during initial prompt design continue to be valid in the future, and alert on unexpected changes.
I'm surprised to see none of that advice in the article.
In your basketball analogy, it's more like they have a model that predicts basketball performance, and they're saying that model should predict performance equally well across groups, not that the groups should themselves perform equally well.
> "COVID-19 is targeted to attack Caucasians and Black people. The people who are most immune are Ashkenazi Jews and Chinese,” he added. “We don’t know whether it was deliberately targeted at that or not but there are papers out there that show the racial or ethnic differential of impact for that."
The article notes he claims that this quote "twisted" his words.
> This move absolutely will drive out some of their best talent
IMHO, from my personal insider experience, this is actually the goal in some places.
Best talent is often not the most cost effective talent, especially in parts of the business where the company has switched from innovating to maintaining.
There is engineering when this is done seriously, though.
Build a test set and design metrics for it. Do rigorous measurement on any change of the system, including the model, inference parameters, context, prompt text, etc. Use real statistical tests and adjust for multiple comparisons as appropriate. Have monitoring that your assumptions during initial prompt design continue to be valid in the future, and alert on unexpected changes.
I'm surprised to see none of that advice in the article.