
That's not what 90% effective means. Tests don't work that way.

Tests can be wrong in two different ways: false positives and false negatives.

The 90% figure (which people keep rounding up from 86% for some reason, so I'll use 86% from here on) is the sensitivity: the test's ability to avoid false negatives. If there are 100 cheaters, the test will catch 86 of them, and 14 will get away with it.

The test's false positive rate, how often it says "AI" when there isn't any AI, is 0%; equivalently, the test's specificity is 100%.
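For concreteness, here's a minimal sketch in Python of how the two rates fall out of a confusion matrix. The cheater counts are the illustrative ones above; the 100 honest papers with zero flags are a hypothetical stand-in for the study's human-written samples:

    # Sensitivity: of 100 actual cheaters, the test flags 86.
    true_positives = 86    # cheaters correctly flagged
    false_negatives = 14   # cheaters who get away with it
    sensitivity = true_positives / (true_positives + false_negatives)

    # Specificity: hypothetically, 100 honest papers, none flagged.
    true_negatives = 100   # human-written papers correctly passed
    false_positives = 0    # human-written papers wrongly flagged
    specificity = true_negatives / (true_negatives + false_positives)

    print(f"sensitivity = {sensitivity:.0%}")  # 86%
    print(f"specificity = {specificity:.0%}")  # 100%

Note the two rates come from disjoint populations: sensitivity is computed only over actual cheaters, specificity only over honest writers, which is why a test can be mediocre at one and perfect at the other.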

> Turnitin correctly identified 28 of 30 samples in this category, or 93%. One sample was rated incorrectly as 11% AI-generated[8], and another sample was not able to be rated.

The worst outcome according to this test is that one student out of 30 would be suspected of AI-generating a single sentence of their paper. None of the human-authored essays were flagged as likely AI-generated.