I wonder why they didn't use rendezvous hashing (aka HRW)[0]?
It feels like it would solve all the requirements that they laid out, is fully client side, and doesn't require real-time updates for the host list via discovery.
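For anyone who hasn't seen it, here's a rough sketch of what I mean by HRW. The function names, host names, and the choice of SHA-256 are my own assumptions for illustration, not anything from the article:

```python
import hashlib


def hrw_score(key: str, host: str) -> int:
    # Hash the (host, key) pair; SHA-256 is an arbitrary choice for this sketch.
    digest = hashlib.sha256(f"{host}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_host(key: str, hosts: list[str]) -> str:
    # Rendezvous hashing: every client independently picks the host
    # with the highest score for this key.
    return max(hosts, key=lambda h: hrw_score(key, h))


# Hypothetical host list and keys.
hosts = ["host-a", "host-b", "host-c"]
keys = [f"user-{i}" for i in range(1000)]

before = {k: pick_host(k, hosts) for k in keys}
after = {k: pick_host(k, hosts[:-1]) for k in keys}  # host-c removed

# Keys whose assignment changed when host-c went away.
moved = [k for k in keys if before[k] != after[k]]
```

The nice property is that dropping `host-c` only remaps the keys that were on `host-c`; everything else stays put, and no coordination between clients is needed.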
HRW would cover the simple case, but they needed a lot more: per-request balancing, zone affinity, live health checks, spillover, ramp-ups, etc. Once you need all that dynamic behavior, plain hashing just doesn't cut it IMO. A custom client-side + discovery setup makes more sense.
The problem is that they want to apply a number of stateful/lookaside load balancing strategies, which are harder to do in a fully decentralized system. It's generally easier to asynchronously aggregate information and then either decide routing updates centrally or redistribute that aggregate to inform local decisions.
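To make the "redistribute the aggregate" option concrete, here's a toy sketch. Everything in it is hypothetical (the `Aggregator` class, the inverse-latency weighting, the host names); it's just one way the pattern could look, not how their system works:

```python
import random
from collections import defaultdict


class Aggregator:
    """Hypothetical lookaside component: collects reports, publishes weights."""

    def __init__(self) -> None:
        self._latencies: dict[str, list[float]] = defaultdict(list)

    def report(self, host: str, latency_ms: float) -> None:
        # Clients report observed latencies asynchronously (assumed > 0).
        self._latencies[host].append(latency_ms)

    def weights(self) -> dict[str, float]:
        # Weight each host by the inverse of its mean latency, normalized to 1.
        inverse = {h: len(v) / sum(v) for h, v in self._latencies.items()}
        total = sum(inverse.values())
        return {h: w / total for h, w in inverse.items()}


def pick_host(weights: dict[str, float], rng: random.Random) -> str:
    # Local decision on the client, informed by the redistributed aggregate.
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


agg = Aggregator()
agg.report("host-a", 10.0)
agg.report("host-b", 30.0)
published = agg.weights()  # what each client would receive
choice = pick_host(published, random.Random(0))
```

The point is that the weights depend on state gathered across many clients, which a pure hash of (key, host) has no way to see.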
[0] https://en.wikipedia.org/wiki/Rendezvous_hashing