This looks similar to Triton; I wonder what it does differently. But in any case...

erwincoumans · on July 14, 2024

Warp writes out its intermediate source files (CUDA for the GPU, C++ for the CPU), which can then be compiled and linked into a binary. Here is an old example of mine that calls Warp kernels from C++: https://github.com/erwincoumans/warp_cpp

meisel · on July 14, 2024

Neat!

xpe · on July 14, 2024

Triton offers broad GPU support for writing high-throughput kernels. Some higher-level ML/AI tools, such as PyTorch, can use Triton internally. I don’t know off the top of my head if any simulation libraries do.

In what sense do you think they are similar?