
Well, all the models (especially Claude 3.5 Sonnet) seem to perform much better than random, so they are clearly not blind. The only task where Claude 3.5 Sonnet does not perform better than random is the one where you have to follow many different paths (the cases where the answer from A to C is 3), something that would take me several seconds to solve myself.

I have the feeling that they first chose the title of the paper and then ran the evaluation of the new Claude 3.5 Sonnet on these abstract images.

>their vision is, at best, like that of a person with myopia seeing fine details as blurry

This also makes no sense, since the images evaluate the abstract capabilities of the models, not their eyesight.



OK. They're legally blind.


This really has nothing to do with vision impairment.



