This is cool - time travel debugging is potentially really helpful in these flaky situations. Having tests fail unpredictably basically means you're not controlling the thing they test, which can be scary.
But I find the most annoying failures in CI are the ones I can't reproduce any other way, and that somehow only happen when I'm not trying to make them happen.
Run locally? Fine.
Run on a cloud machine that's identical to the CI system? Fine.
Run multiple instances of the test on the cloud machine to generate more load? Fine.
Run in the overnight tests - blam.
This doesn't always make sense, even once I've found the bug - sometimes the timing just shakes out that way.
Sometimes we record stubborn tests that are acting weird, so we're ready when they next fail.
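
For what it's worth, the "run multiple instances to generate more load" step is usually nothing fancier than a loop like the one below. This is only a rough sketch, not our actual tooling: the `stress` and `run_once` names, the run count and the parallelism are all made up for illustration, and the test command is whatever your flaky test binary happens to be.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd):
    """Run the test command once and return its exit code (output is discarded)."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def stress(cmd, runs=200, parallel=16):
    """Run `cmd` `runs` times, `parallel` at a time, and return how many runs failed.

    `cmd` is an argument list, e.g. ["./my_flaky_test"] (a hypothetical binary).
    The default numbers are arbitrary; tune them to the machine.
    """
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        exit_codes = list(pool.map(run_once, [cmd] * runs))
    # Any non-zero exit code counts as a failing run.
    return sum(1 for code in exit_codes if code != 0)
```

Calling `stress(["./my_flaky_test"], runs=500, parallel=32)` gives you a failure count, which in my experience is depressingly often zero until the overnight run.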