(I'm the guy who guessed correctly and got early access). Sorry, I've been busy with my main job and haven't been able to contribute much in the past couple of months, but I just took a look at it and it looks really good so far. I think the architectural choices are well considered. One thing I foresee is "calling out to a remote GPU", where the Nx library itself is used by a node that doesn't have a GPU, instead of having the node with a GPU expose an RPC to the GPU-less node. I guess a relevant analogy here would be the difference between GraphQL and REST, possibly also allowing the GPU-less node to gracefully degrade to CPU if the GPU resource isn't available. The Nx API will likely also present an opportunity to make data gravity abstractions cleaner.
I think the Elixir/Erlang way of thinking about "concurrency as a special case of distribution" will apply to Nx, in the form of "GPU as a special case of distribution", which means there won't be a leaky abstraction for a remote GPU. I tried to get the Julia folks to adopt this sort of attitude towards JuliaGPU, but I'm a bit of a nobody so it didn't take.
Very excited! And I'm whipping up a little bit of something interesting related to Nx myself that hopefully I can release around the time of your talk, but I don't want to make any promises yet.
Good point. The relevant analogy is that in GraphQL the client defines the query that gets run, and in REST the server defines the query that gets run. So one way you could architect Nx is to have the Elixir running on the GPU node expose a handful of prebaked functions and allow your client Elixir node to issue requests to those prebaked functions ("REST"). But I think it would be a bit more exciting to have your GPU node run a relatively dumb Nx server and have your GPU-less client run Nx, where its math instructions get serialized through a thin "network device" backend instead of a "GPU backend" and sent over to the GPU node; the GPU node then does the math and comes back with the answer ("GraphQL"). For even longer-haul operations you would want to do things like figure out how to represent datasets with tokens that are cached remotely. I think this is all possible, and relatively easy, with the Nx architecture.
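To make the "GraphQL" flavor a bit more concrete, here is a rough, language-neutral sketch in Python. This is not Nx code and not how Nx actually works. Every name in it (`OPS`, `evaluate`, `gpu_node_handle`, `NetworkBackend`, `unreachable`) is made up for illustration, the "network" is just a function call, and the "GPU" is ordinary CPU arithmetic. The only point is the shape: the client defines the expression, a thin backend serializes it, a dumb server evaluates it, and the client falls back to local evaluation if the remote node can't be reached.

```python
import json
import operator

# Hypothetical set of primitive ops the "dumb" server understands.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(expr):
    """Recursively evaluate a nested ["op", lhs, rhs] expression."""
    if isinstance(expr, (int, float)):
        return expr
    op, lhs, rhs = expr
    return OPS[op](evaluate(lhs), evaluate(rhs))


def gpu_node_handle(payload):
    """Stand-in for the GPU node: decode the request, evaluate, encode the answer."""
    return json.dumps(evaluate(json.loads(payload)))


class NetworkBackend:
    """Client-side backend that ships the expression instead of running it."""

    def __init__(self, transport):
        # transport: callable taking a str and returning a str;
        # it may raise ConnectionError when the remote node is down.
        self.transport = transport

    def run(self, expr):
        try:
            return json.loads(self.transport(json.dumps(expr)))
        except ConnectionError:
            # Graceful degradation: evaluate locally on the CPU instead.
            return evaluate(expr)


def unreachable(_payload):
    """Simulates a GPU node that cannot be reached."""
    raise ConnectionError("gpu node unavailable")


# The client, not the server, decides what gets computed: (2 * 3) + 4.
expr = ["add", ["mul", 2, 3], 4]
remote = NetworkBackend(gpu_node_handle).run(expr)
local = NetworkBackend(unreachable).run(expr)
```

In the "REST" flavor, by contrast, the server would only expose a fixed menu of named functions, and the client could not send an arbitrary expression like `expr` above.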