
Any plans to publish the benchmark results?


I plan to publish the problems, but not how well the LLMs perform on them. The standard for publishing benchmarks is very high, and I'm really just posting vibes here. Still, I hope my experiences are useful to some people, as others' experiences have been useful to me.



