
Happens to all of us. Once I needed logs from the server. The log file was a few gigs and still in use, so I carefully duplicated it, grepped just the lines I needed into another file, and downloaded the smaller file.

During this operation, the server ran out of memory (presumably because of all the files I'd created), and before I knew it I'd managed to crash three services and corrupt the database, which was also on this host, on my first day. All while everyone else in the company was asleep :)

Over the next few hours, I brought the site back online by piecing commands together from the `.bash_history` file.



Seems unwise to have an employee doing anything with production servers on their first day, let alone while everyone else is asleep.


It does, but that was an exceptional role. The company needed emergency patches to a running product while they hired a whole engineering team, so I was the only one around doing things, and there wasn't any documentation for me to work from.

I actually waited until nightfall, just in case I bumped the server offline, because we had low traffic during those hours.


What's the story behind this company/job? Was it some sort of total dumpster fire?


I wouldn't classify it as that, but they'd had trouble in the past which led to a lot of their team leaving, and were now looking to recover from it.

I was only there for a short time though. Hopefully they figured things out.


Why did the DB get corrupted? Does ACID mean anything these days?


Not the original poster, but up until 2010 the default MySQL storage engine was MyISAM, which doesn't support transactions.


When a server runs out of memory, a lot of strange things can happen.

It can even fail in the middle of a transaction commit.

So transactions won't fix this.


No. That is exactly what a transactional DB is designed to prevent. The journal gets appended with both the old and the new data and physically written to disk, and only then is the primary data representation (data and B-tree blocks) updated in memory; eventually that changed data is written to the DB files on disk. If the app or DB crashes during any stage, it reconstructs the primary data from the journalled, committed changes. DBs shouldn't attempt to allocate memory during the critical phase, and should be able to recover from a failed allocation at any point by simply crashing and letting regular start-up recovery clean up. One problem on Linux, though, is memory overcommitting.

Edit: and another problem is disk drives/controller caches lying, reporting write completion before all the data has actually reached stable storage.
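
In toy form, the ordering is roughly this (a minimal Python sketch of the journal-first idea, not how any real engine lays out its files; every file name and function here is made up for illustration):

  import json, os

  DB_FILE = "data.json"   # stand-in for the primary data representation
  WAL_FILE = "data.wal"   # journal: always written and fsync'd first

  def commit(update):
      # 1. Append the change to the journal and force it to stable storage.
      with open(WAL_FILE, "a") as wal:
          wal.write(json.dumps(update) + "\n")
          wal.flush()
          os.fsync(wal.fileno())   # the change is durable from this point on
      # 2. Only then touch the primary data; a crash anywhere here is recoverable.
      apply_to_db(update)

  def apply_to_db(update):
      db = {}
      if os.path.exists(DB_FILE):
          with open(DB_FILE) as f:
              db = json.load(f)
      db.update(update)
      tmp = DB_FILE + ".tmp"
      with open(tmp, "w") as f:
          json.dump(db, f)
          f.flush()
          os.fsync(f.fileno())
      os.replace(tmp, DB_FILE)     # atomic swap, never a half-written data file

  def recover():
      # On start-up, replay the journal; re-applying an already-applied change
      # is harmless in this toy because each update is idempotent.
      if os.path.exists(WAL_FILE):
          with open(WAL_FILE) as wal:
              for line in wal:
                  apply_to_db(json.loads(line))
          os.remove(WAL_FILE)

Call commit() for each change and recover() on start-up; the only invariant that matters is that the journal record hits disk before the primary data is touched.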


Transactions should fix this. That's what the Write Ahead Log and similar techniques are for.


It was an older MongoDB in my case. :)



