> I see no evidence in the paper that it can learn arbitrary tasks on the fly.
Neither can we. It takes years to become an expert in any field; we don't learn on the fly like Neo. And that's when extensive training is available. In research, it takes thousands of experts to move one small step ahead. No one can do it alone, so it would be too much to expect it from a lone zero-shot language model.
On the other hand, the transformer architecture seems to be capable of solving all the AI tasks: it can learn "on the fly" as soon as you provide training data or a simulator. This particular paper trains on over 600 tasks at once, in the same model.