In general you can just use the parameter count to figure that out.

A 70B model at 8 bits per parameter would mean 70GB, 4 bits is 35GB, etc. But that is just for the raw weights; you also need some RAM for the data passing through the model (activations and KV cache) and the OS eats up some, so add about a 10-15% buffer on top of that to make sure you're good.

Also, the quality falls off pretty quickly once you start quantizing below 4-bit, so be careful with that, but at 3-bit a 70B model should run fine on 32GB of RAM. The rough arithmetic is sketched below.
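
Quick back-of-the-envelope in Python, if it helps. The 12% overhead factor is just an illustrative assumption, not a number from any particular inference engine:

    # Rough RAM estimate for running a quantized model.
    # The overhead factor (KV cache / activations / OS) is an assumption.

    def estimate_memory_gb(params_billion: float, bits_per_param: float,
                           overhead: float = 0.12) -> float:
        """Weights plus a buffer for everything else."""
        weight_gb = params_billion * bits_per_param / 8  # 8 bits per byte
        return weight_gb * (1 + overhead)

    for bits in (8, 4, 3):
        print(f"70B @ {bits}-bit: ~{estimate_memory_gb(70, bits):.1f} GB")

That prints roughly 78GB at 8-bit, 39GB at 4-bit, and 29GB at 3-bit, which is why 3-bit just squeaks under 32GB.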



Does 70B mean there are 70 billion weights and biases in the model?



