IMHO there's reason to believe that what was discussed here plays a role in that decision: https://news.ycombinator.com/item?id=39592689 - namely, NVIDIA trying to forbid such APIs.
That has nothing to do with the API. The restriction there is that you cannot take the bytecode nvcc generates, decompile it, and translate it to another platform. In practice this means that if you use cuDNN, you cannot intercept its already-compiled neural-network kernels and translate them to run on AMD.
You can absolutely use the names of the functions and the programming model. Like I said, HIP is literally a copy. llama.cpp switches to HIP with a #define, which works because llama.cpp ships its own set of custom kernels rather than depending on NVIDIA's pre-compiled libraries.
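To illustrate the point: because HIP mirrors the CUDA runtime API name-for-name, a single compile-time switch can retarget the same kernel source to AMD. This is a simplified sketch of that pattern (the macro name USE_HIP and the mappings shown are illustrative, not the exact ones llama.cpp uses):

```cuda
// When building for AMD, alias the CUDA runtime calls to their HIP
// equivalents; the rest of the source compiles unchanged under hipcc.
#ifdef USE_HIP
#include <hip/hip_runtime.h>
#define cudaMalloc            hipMalloc
#define cudaMemcpy            hipMemcpy
#define cudaFree              hipFree
#define cudaDeviceSynchronize hipDeviceSynchronize
#else
#include <cuda_runtime.h>
#endif

// The same kernel definition and <<<...>>> launch syntax work on both
// platforms, since HIP copies CUDA's programming model as well as its API.
__global__ void scale(float *x, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= s;
}
```

This is exactly why the API itself is not the barrier: the hard part is writing and tuning the kernels, not spelling the function names.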
And this is what I've said before: CUDA the API is hardly a moat. The API is well-known and already implemented by AMD. The moat is all the surrounding work: the thousands of custom (really fast!) kernels, the ease-of-use of the SDKs, the pre-built libraries for every use case. You can claim that CUDA should be made open-source for the sake of competition, but all those libraries and supporting SDKs represent real work done by real engineers - not just designing a platform, but making the platform work. I don't see why NVIDIA should be compelled to give those away any more than Microsoft should be compelled to support device driver development on Linux.