We’re spread over 9 servers, but only because of ephemeral port and handle exhaustion issues. Each server is fronted by 4 HAProxy frontends that each handle ~18k connections.
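To put rough numbers on that (back-of-the-envelope; assumes a default ~64k source port space, which your OS may shrink further):

    9 servers x 4 frontends x ~18,000 connections ≈ 650,000 concurrent sockets
    1 (source IP, dest IP, dest port) tuple       ≈  64,000 source ports, max

so a single proxy-to-origin address pair runs out of source ports long before the total connection count does, and the load has to be spread.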
Since Nick’s post we’ve moved from StackExchange.NetGain to the managed websocket implementation in .NET Core 3, using Kestrel and libuv. That sits at around 2.3GB RAM and 0.4% CPU. Memory could be better (it used to be < 1GB with NetGain), but improvements would likely come from tweaks to how we configure the GC under .NET Core, which we haven’t really investigated yet.
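For anyone curious where that investigation would start: the GC knobs live in runtimeconfig.json (or equivalent MSBuild properties). A minimal sketch, assuming the service runs the default Server GC, which trades memory footprint for throughput; the values are illustrative, not what Stack Overflow actually runs:

    {
      "runtimeOptions": {
        "configProperties": {
          "System.GC.Server": false,
          "System.GC.Concurrent": true,
          "System.GC.RetainVM": false,
          "System.GC.HeapHardLimitPercent": 30
        }
      }
    }

Workstation GC ("System.GC.Server": false) shrinks the heap at some throughput cost, and "System.GC.RetainVM": false hands freed segments back to the OS instead of caching them; at 0.4% CPU there's plenty of headroom to trade, but it's the kind of thing you'd want to measure rather than guess.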
We could run on far fewer machines, but we have 9 sitting there serving traffic for the sites themselves, so there's no harm in spreading the load a little!
Ephemeral port exhaustion is easy to handle if you control HAProxy and the origins.
You'll need the source [1] option on your server lines, and you also need to widen the (source IP, source port, dest IP, dest port) tuple space so more connections fit; any one of these will do:

- have the origin server listen on more ports
- add more IPs to the origin server and listen on those too
- add more IPs to the proxy and use those as source addresses

A rough config sketch follows.
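A minimal haproxy.cfg sketch combining the last two options (addresses, ports, and server names are made up; assumes the proxy owns 10.0.0.2 and 10.0.0.3, and the origin at 10.0.0.10 listens on both 8080 and 8081):

    backend websockets
        # each distinct (source IP, dest IP, dest port) combination gets its
        # own ~64k source ports, so four server lines ≈ 4x the headroom
        server ws1 10.0.0.10:8080 source 10.0.0.2
        server ws2 10.0.0.10:8081 source 10.0.0.2
        server ws3 10.0.0.10:8080 source 10.0.0.3
        server ws4 10.0.0.10:8081 source 10.0.0.3

You can also pin the port range explicitly, e.g. source 10.0.0.2:1024-65000, if the OS defaults are conservative.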
I'm not sure what the handle exhaustion refers to, though. I've run into file descriptor limits, but those are usually simple to raise (until you run into a limit enforced in code).
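For reference, the usual places to raise fd limits; all paths and numbers here are illustrative:

    # per-session soft limit (capped by the hard limit)
    ulimit -n 1048576

    # persistent, in /etc/security/limits.conf
    myuser  soft  nofile  1048576
    myuser  hard  nofile  1048576

    # systemd services ignore limits.conf; set it in the unit file instead
    [Service]
    LimitNOFILE=1048576

The code-enforced kind is things like select()'s FD_SETSIZE (typically 1024), which no ulimit will get you past; there the fix is an epoll/kqueue-based event loop.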
Maybe just a couple. If you google the Stack Exchange hardware setup, you'll find they use surprisingly little hardware. Back in 2013 they could have run on just 2 web servers, not accounting for failover redundancy [0]. They have upgraded since, but considering the number of hits they get, they still run on pretty humble hardware [1].
This is one of the arguments I make when people say microservices + cloud-to-the-core is the only way to scale: clearly, a cleverly architected approach can save you a lot of hardware/hosting money.