
An Epyc Genoa CPU and motherboard plus 700GB of DDR5 RAM. The model is a MoE, so you don't need to fit all of it into VRAM: a single 3090/5090 can hold the activated weights while the remaining weights stay in DDR5 RAM. See their deployment guide for reference: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...
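
To make the VRAM/RAM split concrete, here is a rough back-of-the-envelope sketch. The parameter counts and quantization width are made-up illustrative values, not figures for any particular model, and the function names are hypothetical. It only counts weights and ignores KV cache, activations, and runtime overhead, so treat the result as a lower bound.

```python
def weight_footprint_gb(num_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    # billions of params * bits per param / 8 bits per byte = billions of bytes = GB
    return num_params_billion * bits_per_weight / 8


def split_moe_memory(total_params_b: float, active_params_b: float,
                     bits_per_weight: float) -> dict:
    """Estimate how much weight memory lands on the GPU vs. in system RAM.

    Assumption: the GPU holds roughly the per-token activated weights and
    everything else stays in system RAM.
    """
    total_gb = weight_footprint_gb(total_params_b, bits_per_weight)
    gpu_gb = weight_footprint_gb(active_params_b, bits_per_weight)
    return {"gpu_gb": gpu_gb, "ram_gb": total_gb - gpu_gb}


if __name__ == "__main__":
    # Hypothetical MoE: 600B total params, 30B activated per token, 4-bit weights.
    print(split_moe_memory(total_params_b=600, active_params_b=30, bits_per_weight=4))
```

With numbers like these, the GPU share fits on a single 24GB card while the bulk has to sit in system RAM, which is why the RAM capacity dominates the build.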

