
Why would you consider this a good prompt?


Because both Nano Banana Pro and ChatGPT Images 2.0 have touted strong reasoning capabilities, and this particular prompt has more objective, easy-to-validate criteria as opposed to the subjective nature of images.

I have more subjective prompts to test reasoning, but they're your-mileage-may-vary. (That said, gpt-2-image has surprisingly been doing much better on more objective criteria in my test cases.)


[flagged]


"Quirky and obscure" has the functional benefit of ensuring the source question is not in the training data and falls outside the median user prompt, which makes the model less likely to cheat.

We have enough people complaining about Simon Willison's pelican test.


When you program, do you consider using your prior knowledge of programming cheating?


What would make the prompt a better actual evaluation in your judgement?


Not focusing on pokemon for a start. Maybe use something more people can recognize and evaluate. I have zero knowledge of pokemon, I see it as a niche thing for ultra-nerdy people, and not something everyone is familiar with. Nothing about that test can be evaluated by anyone but a pokemon expert. Sorry, but pokemon isn't as mainstream as some people might think it is.


I think you underestimate how popular Pokemon is.

By most objective measures it's the largest entertainment franchise in all of history.

Would you also object to any other pop-culture reference for the same reason?


>I think you underestimate how popular Pokemon is.

No, I think you are overestimating how popular pokemon is.

>By most objective measures it's the largest entertainment franchise in all of history.

I don't care? Only a small set of pokemon fans would be able to gain anything from this "test".

>Would you also object to any other pop-culture reference for the same reason?

Yes.


still #opentowork huh


Where does one even use that hashtag?


It's a LinkedIn joke.


Ah yes, also known as C++ enjoyers.



