To me, it suggests that these models aren't taking the right approach. We keep finding new types of tasks the models are bad at, and then the next model fixes those issues because those specific tasks get added to the training set. But that approach never produces generalized problem-solving ability, just the ability to solve all the problems we've thought of so far.