Making a high reliability queue that can also soak load spikes is non-trivial.

Now that we have hundreds of Gbps of Ethernet and terabytes of memory, the idea has more merit: a single queue can scale absurdly high on mundane systems. Or maybe you shard, which means you now have a load-balancing problem again: picking which work queue to take work from.
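To make the sharding problem concrete, here's a rough sketch of one way a worker could pick a queue: sample two shards at random and pull from whichever has the deeper backlog. This is just one common heuristic, not something prescribed above. `Shard` and `pull_work` are made-up names, the shards are in-memory stand-ins for real queue servers, and it assumes at least two shards.

```python
import random
from collections import deque


class Shard:
    """Hypothetical in-memory stand-in for one work-queue shard."""

    def __init__(self, name):
        self.name = name
        self.items = deque()

    def push(self, item):
        self.items.append(item)

    def pop(self):
        # Returns None when the shard has no work.
        return self.items.popleft() if self.items else None

    def depth(self):
        return len(self.items)


def pull_work(shards, rng=random):
    """Sample two shards and pull from the one with the deeper backlog.

    Assumes len(shards) >= 2. Falls back to scanning every shard when
    both sampled shards turn out to be empty.
    """
    a, b = rng.sample(shards, 2)
    chosen = a if a.depth() >= b.depth() else b
    item = chosen.pop()
    if item is None:
        for shard in shards:
            item = shard.pop()
            if item is not None:
                break
    return item
```

In a real deployment `depth()` would be a network call returning a possibly stale number, which is part of why this is a load-balancing problem and not just a lookup.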

The HA bit is still hard. You have to figure out whether there's a netsplit (some folks can't connect to one server) or whether one server really is gone. Probably you just multicast all the incoming work and all the incoming pulls to each queue. Ideally each queue could also hear all the outgoing traffic. If Ethernet capacity were unidirectional this would be great: box #2 could autonomously detect faults and take over. But Ethernet is bidirectional, and box #2 now needs all of box #1's incoming traffic and its outgoing traffic too. So instead maybe have the clients fail over. We can iterate on the design, but HA is non-trivial.
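For the "have the clients fail over" option, a minimal sketch might look like the following. Everything here is hypothetical: `FailoverClient`, `QueueUnavailable`, and the idea that each endpoint is a callable that either returns an ack or raises. It does not solve the netsplit problem. Two clients can disagree about which box is alive, so the work items would still need to be idempotent or deduplicated downstream.

```python
class QueueUnavailable(Exception):
    """Raised by a (hypothetical) transport when a queue server is unreachable."""


class FailoverClient:
    """Client-side failover across an ordered list of queue endpoints.

    Each endpoint is assumed to be a callable: send(item) -> ack,
    raising QueueUnavailable on timeout or connection failure.
    """

    def __init__(self, endpoints, max_failures=3):
        self.endpoints = list(endpoints)
        self.active = 0            # index of the endpoint currently in use
        self.failures = 0          # consecutive failures on the active endpoint
        self.max_failures = max_failures

    def submit(self, item):
        # Give every endpoint up to max_failures attempts before giving up.
        for _ in range(len(self.endpoints) * self.max_failures):
            try:
                ack = self.endpoints[self.active](item)
                self.failures = 0
                return ack
            except QueueUnavailable:
                self.failures += 1
                if self.failures >= self.max_failures:
                    # Declare the active box dead (from this client's view)
                    # and move on to the next one.
                    self.active = (self.active + 1) % len(self.endpoints)
                    self.failures = 0
        raise QueueUnavailable("no queue server reachable")
```

Requiring several consecutive failures before switching is a crude guard against flapping on a single dropped packet; the threshold of 3 is arbitrary.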