Troubleshooting an intermittent failure in CI tests on ARM64 (konghq.com)
58 points by dndx on Dec 15, 2023 | 6 comments



Another nice demonstration of the power of rr and reversible debugging.

https://rr-project.org/


This is cool - time travel debugging is potentially really helpful in these flaky situations. Having tests fail unpredictably basically means you're not controlling the thing they test, which can be scary.

But I find the most annoying failures in CI are the ones I can't reproduce any other way and somehow always happen when I'm not trying.

Run locally? Fine. Run on a cloud machine that's identical to the CI system. Fine. Run multiple instances of the test on the cloud machine to generate more load. Fine. Run in the overnight tests - blam.

This doesn't always make sense, even once I've found the bug - sometimes the timing just shakes out that way.

Sometimes we record stubborn tests that are acting weird, so we're ready when they next fail.
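A minimal sketch of that "keep running it until it fails" loop, assuming the test can be launched as a single command. The helper name `run_until_failure` and the `./flaky_test` binary in the usage comment are made up for illustration; `rr record` and its `--chaos` flag are real rr options, but nothing here is what the article's authors actually ran.

```python
import subprocess


def run_until_failure(cmd, max_runs=100):
    """Run cmd repeatedly; return the 1-based attempt that failed, or None.

    cmd is an argument list, e.g. a test command prefixed with a recorder.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Stop at the first failure so the most recent recording
            # (if the command was wrapped in a recorder) is the failing one.
            return attempt
    return None


# Hypothetical usage, assuming rr is installed and ./flaky_test exists:
#   run_until_failure(["rr", "record", "--chaos", "./flaky_test"], max_runs=500)
```

Stopping on the first non-zero exit means the latest trace is the one worth replaying with `rr replay`.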


I never get tired of funny misreads of product names. I thought "our own busted framework" was self-deprecation until it showed up in slab serif on the next line.


This is a good cautionary tale for folks jumping on cheap ARM cloud instances. Different architectures mean different bugs, and depending on how your infra is provisioned, they can be even harder to reproduce.


This is a very well-written blog post: concise, lucid, and it does a good job speaking to programmer audiences with varying backgrounds.


Is that the company that fucked up insomnia.rest?



