Well, all the models (especially Claude 3.5 Sonnet) seem to perform much better than random, so they are clearly not blind. The only task where Claude 3.5 Sonnet does not beat random is the one where you have to follow many different paths (the cases where the answer from A to C is 3), something that would take me several seconds to solve.
I have the feeling that they first chose the title of the paper and then ran the evaluation of the new Claude 3.5 Sonnet on these abstract images.
>their vision is, at best, like that of a person with myopia seeing fine details as blurry
This also makes no sense, since the images evaluate the abstract capabilities of the models, not their eyesight.