At scale, and with a systems engineering team sized to match, it's not impossibly hard. A sidecar like Envoy can be configured to emit health stats, which the load balancer can then read to decide whether a given server is unhealthy. Again, this assumes scale, but each team is already responsible for a dashboard with health metrics for their own service, so the load balancer team doesn't have to work out everybody's health metrics, only their own.
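As a rough sketch of the load balancer side of that, assuming each sidecar exposes a request count and a 5xx count per server: the `SidecarStats` fields, the 5% error threshold, and the minimum-traffic cutoff below are all made-up for illustration, not Envoy's actual stat names or defaults.

```python
from dataclasses import dataclass


@dataclass
class SidecarStats:
    """Hypothetical per-server counters scraped from a sidecar."""
    requests: int
    errors_5xx: int


def is_healthy(stats, max_error_rate=0.05, min_requests=20):
    """Decide health from the error rate; thresholds are illustrative."""
    if stats.requests < min_requests:
        # Too little traffic to judge, so leave the server in rotation.
        return True
    return stats.errors_5xx / stats.requests <= max_error_rate


def healthy_backends(stats_by_server, **thresholds):
    """Return the sorted names of servers the balancer should keep routing to."""
    return sorted(
        name
        for name, stats in stats_by_server.items()
        if is_healthy(stats, **thresholds)
    )


if __name__ == "__main__":
    scraped = {
        "a": SidecarStats(requests=1000, errors_5xx=10),
        "b": SidecarStats(requests=1000, errors_5xx=200),
        "c": SidecarStats(requests=5, errors_5xx=5),
    }
    print(healthy_backends(scraped))
```

The point is that the balancer only needs one generic rule over stats every sidecar already emits; anything service-specific stays on the owning team's dashboard.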