Hacker News

I saw another benchmark where top LLMs also struggle in real patient diagnostic scenarios, in a way that isn't revealed by testing on e.g. medical exams. I wonder if this applies to law, too...



