
Sure, but larger models that fit in that 512 GB of memory are going to take a long time to tokenize/detokenize without hardware-accelerated BLAS.


Why would you need BLAS for tokenization/detokenization? Pretty much everyone still uses byte-level BPE (BBPE), which amounts to iteratively applying merges.

(Maybe I'm missing something here.)
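
To make "iteratively applying merges" concrete, here is a minimal sketch of byte-level BPE in Python. The merge table `MERGES` and the function names are made up for illustration; real tokenizers typically also pre-split the text and map tokens to integer IDs, which is omitted here. The point is only that the loop is byte-string lookups and concatenation, with no matrix math anywhere.

```python
def bpe_encode(text, merges):
    """Split text into UTF-8 bytes, then repeatedly apply the best-ranked merge."""
    # One token per byte to start with.
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    # Earlier position in the merge list means higher priority.
    ranks = {pair: i for i, pair in enumerate(merges)}
    while len(tokens) > 1:
        pairs = set(zip(tokens, tokens[1:]))
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # no adjacent pair is in the merge table
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens


def bpe_decode(tokens):
    """Detokenization is just concatenating the byte strings."""
    return b"".join(tokens).decode("utf-8", errors="replace")


# Hypothetical merge table, highest priority first (not from any real model).
MERGES = [(b"l", b"l"), (b"h", b"e"), (b"he", b"ll"), (b"hell", b"o")]

if __name__ == "__main__":
    toks = bpe_encode("hello help", MERGES)
    print(toks)
    print(bpe_decode(toks))
```

Everything here is dictionary lookups and list rebuilding on the CPU, so a BLAS library has nothing to accelerate.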


Tokenization/detokenization does not use BLAS.



