Failing over is correct because, in the moment, there's no way to discern that the hardware is not at fault. What the designers should have done is build a better response to the second failure, to avoid the knock-on effects.
Retrospective inspection revealed that it wasn't a hardware failure, but the computer didn't know that at the time. Hardware failure can look like anything, so the computer was correct to exercise its only option.