Misunderstanding benchmarks seems to be the first step to claiming human-level intelligence.
Additionally:
> > ARC-AGI is a benchmark that’s designed to be simple for humans but excruciatingly difficult for AI. In other words, when AI crushes this benchmark, it’s able to do what humans do.
This feels like a generalized extension of the classic fallacious response to 'A computer can now play chess.'
Common non-technical chain of thought after learning this: 'Previously, only humans could play chess. Now, computers can play chess. Therefore, computers can now do other things that previously only humans could do.'
The error is assuming that such problems can only be solved by some level of human-style general intelligence.
This is obviously false, as shown by how computers calculate arithmetic, optimize via gradient descent, and countless other examples, yet it does seem to be a common lay misunderstanding.
That's probably why IBM exploited it in their Watson marketing.
In reality, for reliable reasoning about capabilities, the *how* matters very much.
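
To make the gradient descent example concrete, here is a minimal Python sketch (the function, starting point, and step size are arbitrary, illustrative choices): a loop of mechanical arithmetic finds the minimum of a function, and nothing resembling general intelligence is involved.

```python
# A minimal sketch of gradient descent minimizing f(x) = (x - 3)^2.
# The function, starting point, and step size are illustrative choices.

def grad(x):
    # Analytic derivative of f(x) = (x - 3)^2
    return 2 * (x - 3)

x = 0.0   # arbitrary starting point
lr = 0.1  # step size
for _ in range(100):
    x -= lr * grad(x)  # repeated arithmetic update, no "understanding"

print(x)  # converges to ~3.0, the minimizer
```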
> Misunderstanding benchmarks seems to be the first step to claiming human-level intelligence.
It's known as "hallucination" a.k.a. "guessing or making stuff up", and is a major challenge for human intelligence. Attempts to eradicate it have met with limited success. Some say that human intelligence will never reach AGI because of it.
Thankfully nobody is trying to sell humans as a service in an attempt to replace the existing AIs in the workplace (yet).
I’m sure such a product would be met with ridicule considering how often humans hallucinate. Especially since, as we all know, the only use for humans is getting responses given some prompt.
> > ARC-AGI is a benchmark that’s designed to be simple for humans but excruciatingly difficult for AI. In other words, when AI crushes this benchmark, it’s able to do what humans do.
It doesn't even make logical sense: 'this benchmark is easy for humans and hard for AI' does not entail 'an AI that passes it can do what humans do.'