> That's usually too fast for a load balancer that's still sending new requests.
How?
A load balancer can't send a new request on a connection that doesn't exist. (Existing connections are torn down gracefully as the requests on them conclude and as the underlying protocol permits.) If it cannot open a connection to the backend (the backend should stop accepting new connections when the drain starts), then by definition new requests cannot end up at the backend.
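To make that concrete, here is a rough sketch using plain TCP sockets on loopback. The name `demo_drain` and the ping/pong payloads are made up for illustration, and it assumes the backend "starts the drain" simply by closing its listening socket. It only shows the connection-level behaviour: a new connect attempt fails while a connection established before the drain keeps working.

```python
import socket


def demo_drain():
    """Hypothetical demo: closing the listener blocks new connections only."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # let the OS pick a free port
    listener.listen()
    addr = listener.getsockname()

    # A connection the "load balancer" opened before the drain started.
    lb_conn = socket.create_connection(addr, timeout=2)
    backend_conn, _ = listener.accept()
    backend_conn.settimeout(2)

    # Drain starts: the backend stops accepting new connections.
    listener.close()

    # A fresh connection attempt should now fail.
    try:
        socket.create_connection(addr, timeout=2).close()
        new_conn_refused = False
    except OSError:
        new_conn_refused = True

    # The pre-existing connection is unaffected and can still carry traffic.
    lb_conn.sendall(b"ping")
    backend_conn.recv(4)
    backend_conn.sendall(b"pong")
    reply = lb_conn.recv(4)

    lb_conn.close()
    backend_conn.close()
    return new_conn_refused, reply


if __name__ == "__main__":
    print(demo_drain())
```

Note the last part: the already-open connection still accepts whatever the load balancer sends on it.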
An HTTP server is limited in its ability to initiate connection closure. Remember that closing a TCP connection sends a FIN packet, and until that FIN arrives the other end doesn't know the close has happened and might still be sending data. In HTTP, the server can ask the client to stop using a connection and close it by sending the "Connection: close" header. If the server closes the connection abruptly, there could be requests in flight on the network. With HTTP pipelining, the server may even receive requests on the same connection after sending "Connection: close", since the client could have sent them before that header arrived. With pipelining, the client needs to be the one to close the TCP connection to achieve a graceful shutdown.
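A small sketch of that race, again with raw sockets on loopback and everything in one thread so the ordering is explicit. The function names, paths and `Host` value are invented for the example, and the server side is hand-rolled rather than a real HTTP implementation. The client sends a second request before it has read the response carrying "Connection: close", so the server sees that request after it has already asked for the close. The connection then ends cleanly only once the client closes it.

```python
import socket


def read_until_blank_line(sock):
    """Read until the end of an HTTP header block (or EOF)."""
    buf = b""
    while b"\r\n\r\n" not in buf:
        chunk = sock.recv(4096)
        if not chunk:
            break
        buf += chunk
    return buf


def demo_pipelined_race():
    """Hypothetical demo: a request arrives after 'Connection: close' was sent."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    client = socket.create_connection(listener.getsockname(), timeout=2)
    server_side, _ = listener.accept()
    server_side.settimeout(2)
    listener.close()

    # First request arrives and the server answers it, asking for a close.
    client.sendall(b"GET /first HTTP/1.1\r\nHost: backend\r\n\r\n")
    read_until_blank_line(server_side)
    server_side.sendall(
        b"HTTP/1.1 200 OK\r\nConnection: close\r\nContent-Length: 0\r\n\r\n"
    )

    # The client has not read that response yet, so it pipelines another request.
    client.sendall(b"GET /second HTTP/1.1\r\nHost: backend\r\n\r\n")
    late_request = read_until_blank_line(server_side)

    # Only now does the client see the header and close its end.
    read_until_blank_line(client)
    client.close()
    saw_client_close = server_side.recv(1) == b""  # EOF: the client's FIN arrived

    server_side.close()
    return late_request, saw_client_close


if __name__ == "__main__":
    print(demo_pipelined_race())
```

What the server should do with the late request is a separate question that this sketch doesn't try to answer.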