A comparison I've seen isn't to roulette but to a slot machine. Anthropic itself encourages its employees to treat using it for refactors like a slot machine. [1]
It seems like an idea worth exploring formally, but I haven't seen that done anywhere. Is this a case of "perception of winning" while one is actually losing? Or is it that the winning is in aggregate, and people who like LLM-based coding are just more tolerant of the volatility it takes to get there?
The only study I've seen testing the actual observable impact on velocity showed a modest decrease in output for experienced engineers who were using LLMs for coding.
That really resonates. I've found myself questioning whether I'm wasting my time writing a piece of code: what if the LLM could do this more quickly? So I try it, almost every time, and sometimes it does, sometimes it doesn't. Am I really saving myself any work in the long run? Honestly, I don't know. I feel like it's just causing me to work more because it feels like a game, and that extra work is, ultimately, where the results are coming from.