The skills required to maintain old codebases atrophy, and we are barely training new people to do it, so the skills pool shrinks. That means the code doesn't get regular maintenance, and it means disaster response in situations like this is slow and expensive.
Whilst it would be a major upheaval to switch to a clean-room engineered implementation using 2025 best practices, it would at least increase the talent pool that can work on it effectively.
There likely does come a point where it is cost-effective to rebuild it, in terms of both reduced unplanned downtime and reduced maintenance costs.
Eventually, yes. But considering I worked at a place as recently as 2018 that had COBOL systems from the 80s still running in production (and I know it's not unique in this regard; I'm confident those systems are still there too), I think that time horizon can be long.
If someone was on the original team writing this software when they were in their mid 20s then they are now in their 70s. If we don't start rebuilding some of this software soon there won't be anyone alive that understands it intimately in the way only an original author could.
That implies that no engineer who isn't (one of) the original author(s) can learn a system, with which I vehemently disagree. New engineers can be trained up on old systems and languages. The trouble is few want to, because it typically ties their skillset heavily to a particular place of work, which is risky with current business culture.
Yes, they can be trained up, but a mechanic working on a car will never understand all of the undocumented design decisions that went into its production. They understand the what but not necessarily all of the why. Not only that but, like you say, it isn't a popular path, meaning you aren't getting a broad (or deep) talent pool.
Welcome to the world of large legacy systems developed for and used by organizations that don't understand IT, and don't know that they don't understand IT.
The good news is there's always a contractor willing to promise the world and then, 5-10 years later, deliver something that doesn't work. Your internal team, who could have finished the original job in a couple years if they'd been funded, will be the ones that end up making that delivered system actually work. But you'll tell everyone that it was the contractor's high quality output that did the trick, because saying they failed would hurt your career. In 15-30 years the system will get replaced and your successor will hear about how great <contractor> did the first time, and they'll get another shot at failure.
Exactly, we need to fund the people that have the knowhow now, either to document it fully so it can be replaced later or to replace it now, before the knowledge gained from building and operating it for so long is lost forever.