Hacker News

GPT-3's training material included "books1" and "books2", and the actual source of "books2" was never disclosed: https://arxiv.org/pdf/2005.14165.pdf

Speculation about these source materials can be traced back as far as 2020: https://twitter.com/theshawwn/status/1320282152689336320

I don't think this issue would have flown under the radar for so long, especially given the implication that Ilya sided with the rest of the board in voting against Sam and Greg.



books2 = libgen imo



