Maybe you could bring up a bunch of fake servers that "implement" every RPC call with a long sleep, or by responding with an internal error. Then take them down gradually as the real ones come up.
I think what you really want is to not give every client a full view of all the backends. xDS lets you write a service discovery server that does this: it knows the full state (potentially with health information about upstreams) and it knows which client is connecting, so you can adjust each client's view as you see fit. I've also seen people do AZ or regional aggregation, i.e. a consumer in AZ A gets a list of endpoints like 0.upstream.a, 1.upstream.a, ..., regional-aggregator.B, regional-aggregator.C, etc. It sees all the endpoints in its own AZ, but goes through a proxy to reach other regions/zones. Under non-panic circumstances, you'd want requests to be served from the same node first, then the same zone, and only go to other regions in degraded cases.
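Roughly what I mean by a per-client view, as a sketch only. This is plain Python, not a real xDS server; `Endpoint`, `endpoints_for_client` and the lowercase zone names are things I made up for illustration, and the aggregator hostnames just follow the naming pattern above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    address: str
    zone: str
    healthy: bool = True


def endpoints_for_client(client_zone, upstreams, all_zones):
    """Return the endpoint list one client should see.

    Healthy upstreams in the client's own zone are listed directly;
    every other zone is collapsed into a single aggregator address.
    """
    local = [
        e.address
        for e in upstreams
        if e.zone == client_zone and e.healthy
    ]
    remote = [
        f"regional-aggregator.{zone}"
        for zone in sorted(all_zones)
        if zone != client_zone
    ]
    return local + remote


# Hypothetical fleet: two upstreams in zone a (one unhealthy), one in zone b.
upstreams = [
    Endpoint("0.upstream.a", "a"),
    Endpoint("1.upstream.a", "a", healthy=False),
    Endpoint("0.upstream.b", "b"),
]
view = endpoints_for_client("a", upstreams, {"a", "b", "c"})
# view == ["0.upstream.a", "regional-aggregator.b", "regional-aggregator.c"]
```

A real control plane would also attach priorities or weights so the aggregators are only used when the local endpoints are degraded; the sketch leaves that out.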
I don't know what the state of the art is for tooling to manage this. For some reason, I suspect that the service meshes punt on this, because N * M isn't a problem in the demo environments where these systems spend most of their time. Meanwhile, the big companies that hit the scalability limit of N * M connections across upstream/downstream pairs wrote their own service discovery stuff decades ago.
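To put rough numbers on the N * M point, here is a back-of-the-envelope sketch. The fleet sizes are invented, and it assumes one aggregator per zone and one connection per client/endpoint pair, which is a simplification.

```python
def full_mesh_connections(clients_per_zone, backends_per_zone):
    """Every client connects to every backend: N * M."""
    n = sum(clients_per_zone.values())
    m = sum(backends_per_zone.values())
    return n * m


def aggregated_connections(clients_per_zone, backends_per_zone):
    """Clients connect to local backends plus one aggregator per other zone.

    Each zone's aggregator (assumed: exactly one) also holds a connection
    to every backend in its own zone.
    """
    zones = set(clients_per_zone) | set(backends_per_zone)
    total = 0
    for zone in zones:
        clients = clients_per_zone.get(zone, 0)
        backends = backends_per_zone.get(zone, 0)
        total += clients * (backends + len(zones) - 1)  # client side
        total += backends  # aggregator -> local backends
    return total


# Made-up fleet: 3 zones, 1000 clients and 200 backends in each.
clients = {"a": 1000, "b": 1000, "c": 1000}
backends = {"a": 200, "b": 200, "c": 200}

print(full_mesh_connections(clients, backends))   # 1800000
print(aggregated_connections(clients, backends))  # 606600
```

The saving grows with the number of zones, since each extra zone costs a client one aggregator connection instead of a full set of backends.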