This is pretty cool. But I think an implementation that avoids the mutexes (mutices?) when allocating the backends and uses channels instead would probably perform better.
Two channels are needed: one for available backends and one for broken ones.
On an incoming request, the front end takes an available backend from channel 1. On completion, the backend puts itself back onto channel 1 on success, or onto channel 2 on error.
Channel 2 is periodically drained to test the previously failed backends to see if they're ready to go back onto channel 1.
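Roughly what I have in mind, as a minimal sketch rather than a drop-in replacement. It assumes Go, and every name here (Backend, Pool, Handle, Recheck, the Healthy flag) is made up for illustration; a real health check would probe the server instead of reading a bool.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Backend is a hypothetical stand-in for one upstream server.
type Backend struct {
	Addr    string
	Healthy bool // simulated health; a real check would probe the server
}

// serve pretends to forward a request and fails if the backend is unhealthy.
func (b *Backend) serve(req string) error {
	if !b.Healthy {
		return errors.New(b.Addr + ": backend failed")
	}
	return nil
}

// Pool holds the two channels: available is "channel 1", broken is "channel 2".
type Pool struct {
	available chan *Backend
	broken    chan *Backend
}

// NewPool buffers both channels to the number of backends, so putting a
// backend back on either channel never blocks.
func NewPool(backends ...*Backend) *Pool {
	p := &Pool{
		available: make(chan *Backend, len(backends)),
		broken:    make(chan *Backend, len(backends)),
	}
	for _, b := range backends {
		p.available <- b
	}
	return p
}

// Handle takes a backend from channel 1 (blocking until one is free), runs
// the request, and returns the backend to channel 1 or channel 2.
func (p *Pool) Handle(req string) error {
	b := <-p.available
	if err := b.serve(req); err != nil {
		p.broken <- b
		return err
	}
	p.available <- b
	return nil
}

// Recheck drains channel 2 once without blocking. Recovered backends go back
// onto channel 1; the rest are re-queued on channel 2.
func (p *Pool) Recheck() {
	for n := len(p.broken); n > 0; n-- {
		select {
		case b := <-p.broken:
			if b.Healthy {
				p.available <- b
			} else {
				p.broken <- b
			}
		default:
			return
		}
	}
}

// RecheckEvery runs Recheck on a timer until stop is closed.
func (p *Pool) RecheckEvery(d time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(d)
	defer t.Stop()
	for {
		select {
		case <-t.C:
			p.Recheck()
		case <-stop:
			return
		}
	}
}

func main() {
	a := &Backend{Addr: "10.0.0.1:8080", Healthy: true}
	b := &Backend{Addr: "10.0.0.2:8080", Healthy: false}
	p := NewPool(a, b)

	fmt.Println(p.Handle("r1"), p.Handle("r2"))
	b.Healthy = true // pretend the backend recovered
	p.Recheck()
	fmt.Println(len(p.available), len(p.broken))
}
```

One consequence of this layout: each backend sits in a channel exactly once, so it serves one request at a time, and Handle blocks if every backend is busy or broken. To allow N concurrent requests per backend you could enqueue each backend N times, but then the error path needs more care.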