Oof, that sounds scary. I’ve come to view high uptime as dangerous… it’s a sign you haven’t rebooted the thing enough to know what even happens on reboot (will everything come back up? Is the system currently relying on a process that only happens to be running because someone started it manually? Etc)
Servers need to be rebooted regularly in order to know that rebooting won’t break things, IMO.
> Servers need to be rebooted regularly in order to know that rebooting won’t break things, IMO.
the only thing we have to fear is fear itself[0]
Worrying that a manually started critical process will not come back after a reboot is no different from worrying that the same process will crash while the server is up: the risk is the same. Best practice is to use the built-in support described in "Managing Services in FreeBSD"[1] for deployment-specific critical processes.
Now if there is a rogue person who fires up a daemon[2] manually instead of following the above, then there are bigger problems in the organization than what happens if a server is rebooted.
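For anyone who hasn't written one, here is a rough sketch of what an rc.d script for such a process might look like. The service name `myapp` and the path `/usr/local/bin/myapp` are made up for illustration, and I'm assuming the program stays in the foreground, so it is wrapped in daemon(8) with `-r` to restart it if it exits. Treat it as a starting point, not a drop-in script.

```sh
#!/bin/sh
#
# Hypothetical rc.d script, e.g. /usr/local/etc/rc.d/myapp
# "myapp" and its path are placeholders, not a real package.
#
# PROVIDE: myapp
# REQUIRE: NETWORKING
# KEYWORD: shutdown

. /etc/rc.subr

name="myapp"
rcvar="myapp_enable"

# Pidfile of the daemon(8) supervisor, which rc.subr uses for status/stop.
pidfile="/var/run/${name}.pid"

# Run the program under daemon(8):
#   -P  write the supervisor's pid to the pidfile
#   -r  restart the program if it exits
#   -f  redirect stdin/stdout/stderr to /dev/null
command="/usr/sbin/daemon"
command_args="-P ${pidfile} -r -f /usr/local/bin/myapp"

load_rc_config $name

# Disabled unless explicitly enabled in rc.conf.
: ${myapp_enable:="NO"}

run_rc_command "$1"
```

With something like this in place, `sysrc myapp_enable=YES` followed by `service myapp start` should bring it up, and it would then be started at boot like any other service, which covers both the reboot case and the crash case.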
Depends how they are built. There are many embedded/real-time systems that expect this sort of reliability too of course.
I worked on systems that were allowed 8 hours of downtime per year -- but otherwise would have run forever unless a nuclear bomb went off or the power was lost... Tandem. You could pull out CPUs while running.
So if we are talking about garbage Windows servers, sure. It's just a question of what is accepted by the customers/users.
> I worked on systems that were allowed 8 hours of downtime per year -- but otherwise would have run forever unless a nuclear bomb went off or the power was lost... Tandem. You could pull out CPUs while running.
Tandem servers were legendary for their reliability. Years ago I knew h/w support engineers who told me stories like yours about being able to pull components (such as CPUs) without affecting system availability.
Yep. I once did some contracting work for a place that had servers with 1200+ day uptimes. People were afraid to reboot anything. There was also tons of turnover.