I dunno, some of the questions on things like Humanity's Last Exam sure strike me as "godlike." Yes, I'm happy that I can still crush LLMs on ARC-AGI-2, but I see the writing on the wall there, too. Barely over a year ago LLMs were at what, single-digit percentages on ARC-AGI-1?
I would hope god can do better than 40% on a test. If you selected human experts from the relevant fields, together they would get at least a passing grade (70%). A group of 20 humans is not godlike.