This actually begs the question: Does anyone know the kind of actual infrastruct...

willyyr · on Dec 11, 2023

Mark Russinovich shares some of it in this recent Ignite session: https://ignite.microsoft.com/en-US/sessions/49347847-9ae4-43... *I work at Microsoft but have nothing to do with datacenter engineering and have no other insight into the details behind it.

zurfer · on Dec 11, 2023

So 14,400 H100s for GPT-4, but that's just a fraction of the new system that Azure is building for OpenAI.

FWIW, I most enjoyed the 29TB machine demo at the end.

Closi · on Dec 11, 2023

While we can't be sure of most of those answers, they have stated it is running in Azure.

Also, we can probably assume the pricing is roughly in proportion to the cost to run (possibly subsidised to gain market share, but they are unlikely to be taking a giant, unsustainable loss per query, particularly as they seem to announce price decreases when they increase model performance).

amir734jj · on Dec 11, 2023

Azure VMSS (uniform orchestration) + 2000 to 3000 GPU-enabled servers. I'm not sure what kind of GPU is on these servers.

dataking · on Dec 11, 2023

> Is the answer computed on a single NVidia GPU?

Most likely, given that one of their open positions for a GPU programmer includes:

> high technical competence for writing custom CUDA kernels and pushing GPUs to their limits.

Edit: only narrows it down to NVidia hardware, IDK if single GPU or not.