
It's a 2-day project at best to create your own bespoke LLM-as-judge e2e eval framework. That's what we did. Works fine. Not great. Still need someone to write the evals though.
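
For anyone wondering what the core of such a thing might look like, here is a rough sketch, not what we actually built. Every name in it (`EvalCase`, `run_evals`, the PASS/FAIL reply format) is made up for illustration, and the model calls are stubbed out with plain functions so it runs without any API. In a real setup `system_under_test` and `judge_model` would wrap whatever you are testing and whatever LLM client you use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    name: str
    prompt: str   # input sent to the system under test
    rubric: str   # what the judge should check for


@dataclass
class Verdict:
    passed: bool
    reason: str


def build_judge_prompt(case: EvalCase, output: str) -> str:
    # Assumed reply format: first line PASS or FAIL, then a short reason.
    return (
        "You are grading a system's output.\n"
        f"Rubric: {case.rubric}\n"
        f"Input: {case.prompt}\n"
        f"Output: {output}\n"
        "Reply with PASS or FAIL on the first line, then a one-line reason."
    )


def parse_verdict(raw: str) -> Verdict:
    first, _, rest = raw.strip().partition("\n")
    # Anything other than an exact PASS counts as a failure.
    return Verdict(passed=first.strip().upper() == "PASS", reason=rest.strip())


def run_evals(
    cases: list[EvalCase],
    system_under_test: Callable[[str], str],
    judge_model: Callable[[str], str],
) -> dict[str, Verdict]:
    results = {}
    for case in cases:
        output = system_under_test(case.prompt)
        raw = judge_model(build_judge_prompt(case, output))
        results[case.name] = parse_verdict(raw)
    return results


# Stand-ins for real model calls (hypothetical, for demonstration only).
def fake_system(prompt: str) -> str:
    return "Hello, world" if "greet" in prompt else "..."


def fake_judge(judge_prompt: str) -> str:
    if "Hello" in judge_prompt:
        return "PASS\ncontains a greeting"
    return "FAIL\nno greeting found"


CASES = [
    EvalCase("greets", "greet the user", "Output must contain a greeting."),
    EvalCase("silent", "say nothing", "Output must contain a greeting."),
]

results = run_evals(CASES, fake_system, fake_judge)
```

The harness really is the small part. The `CASES` list and the rubrics are where the work goes.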

