
PyTorch is only part of it. There is still a huge amount of CUDA that isn't wrapped by PyTorch and isn't easily portable.


... but not in deep learning or am I missing something important here?


Yes, absolutely in deep learning. Custom fused CUDA kernels everywhere.


Yep. MoE, FlashAttention, and sparse retrieval architectures, for example.
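For anyone unfamiliar with the term: a "fused" kernel collapses several elementwise ops into a single pass over memory, instead of launching one kernel (and one full read/write of the tensor) per op. A minimal, hypothetical sketch of the idea, assuming a fused bias-add + ReLU (names and launch parameters are illustrative, not from PyTorch or any of the projects above):

```
// Hypothetical fused bias-add + ReLU: the input is read once and the
// result written once, rather than materializing an intermediate
// tensor between the two ops.
#include <cuda_runtime.h>

__global__ void fused_bias_relu(const float* __restrict__ x,
                                const float* __restrict__ bias,
                                float* __restrict__ out,
                                int rows, int cols) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < rows * cols) {
        float v = x[i] + bias[i % cols];  // bias broadcast across rows
        out[i] = v > 0.0f ? v : 0.0f;     // ReLU applied in the same pass
    }
}

// Launch with one thread per element, e.g.:
// int n = rows * cols;
// fused_bias_relu<<<(n + 255) / 256, 256>>>(x, bias, out, rows, cols);
```

Kernels like this are easy in isolation; the hard-to-port parts are the ones tuned around shared memory, warp shuffles, and tensor cores (FlashAttention being the canonical example).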



