
They did not test on the data that they trained on; that's not what he wrote.


They synthetically generated 290k examples and kept 10k of them for testing.

It's worth pointing out that this is technically not testing on the training set, but given how similar the examples in the dataset are, severe overfitting would be unavoidable. That also makes the headline very misleading.
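The problem with a random holdout from synthetic data can be sketched in a few lines. This is a hypothetical illustration, not the actual pipeline from the article: if the 290k examples are generated from a small pool of templates, a random 10k test split shares templates with the training set almost surely, so the "held-out" evaluation measures memorization, not generalization.

```python
import random

# Hypothetical setup: synthetic examples drawn from a small pool of
# templates, mimicking a generator with limited structural variety.
random.seed(0)
templates = [f"Invoice {{n}}, total {{t}}, layout variant {i}" for i in range(50)]

# 290k examples, as in the setup described above; only the filler varies.
examples = [(random.choice(templates), random.randint(0, 10**6))
            for _ in range(290_000)]

# Random split: 10k "held-out" test examples, 280k training examples.
random.shuffle(examples)
test, train = examples[:10_000], examples[10_000:]

# Measure leakage: test examples whose template also appears in training.
train_templates = {tpl for tpl, _ in train}
leaked = sum(tpl in train_templates for tpl, _ in test)
print(f"{leaked / len(test):.1%} of test examples share a template with training data")
```

With 280k training examples covering only 50 templates, effectively 100% of the test set is a near-duplicate of something seen in training, which is why the reported test numbers say little about performance on genuinely new documents.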

The weights may not have been published because using the model for document extraction on even the same format, but with slightly different content or lengths, would show how abysmally this finetune performs outside of the synthetic data.


Thanks, rereading it makes it clear that you are correct.



