I think that study frames the problem wrong for identifying economic returns on AI. We don't need AI to complete tasks perfectly. It just needs to generate a good-enough approximation that is easy to review and correct, so that an employee spends less time fixing the AI's errors than they would producing the entire output from scratch. So it won't be a drop-in replacement for an employee for another 4-10 years, but in the interim it will shift an employee's role from generating a complete solution to primarily reviewing and correcting an LLM-generated one, taking it from that 80-95% level (or whatever the starting point might be prior to 2029) to 100%.
At this point, the vast majority of the work required to make GenAI capable of producing that sufficiently reviewable and correctable content isn't improving model quality but building the harnesses, infrastructure, and workflows around the models. Companies aren't seeing returns yet because too many early adopters have treated AI as a drop-in replacement for employees, or at least as a reason to cut staff immediately, without first building out the supporting systems needed to compensate for the models' inadequacies.