
It'll mostly help for debugging and lowering RAM (not VRAM) usage. Otherwise it won't impact ML much.


Pretty universally, I've seen performance improve when complexity is reduced, and this could drop complexity considerably. I wouldn't be surprised to see a double-digit percentage improvement in tokens per second once an optimized PyTorch eventually ships with this. There may even be hidden gains in GPU memory usage as people clean up code and start implementing better tricks because of it.


Yeah, one of the dumbest things about DataLoader workers running in separate processes is that you end up logging into the void.
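
A minimal sketch of the problem, assuming the standard torch.utils.data.DataLoader API (the dataset and logger name here are made up for illustration): records logged inside worker processes can silently disappear because logging was only configured in the main process.

    import logging

    import torch
    from torch.utils.data import DataLoader, Dataset

    log = logging.getLogger("train")


    class NoisyDataset(Dataset):
        def __len__(self):
            return 8

        def __getitem__(self, idx):
            # With num_workers > 0 this runs in a worker process. Under the
            # "spawn" start method the worker never executes basicConfig()
            # below, so the root logger stays at its WARNING default and
            # this INFO record is dropped without a trace.
            log.info("loading sample %d", idx)
            return torch.zeros(1)


    if __name__ == "__main__":
        logging.basicConfig(level=logging.INFO)  # configures the main process only

        # num_workers=0: the messages appear. num_workers=2 under "spawn":
        # they vanish, because no handler or level was set in the workers.
        for _batch in DataLoader(NoisyDataset(), batch_size=4, num_workers=2):
            pass

Whether anything survives depends on the start method (fork inherits the main process's handlers, spawn does not), which is exactly the kind of cross-process guesswork that in-process workers would avoid.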



