Eh I can’t see Linux getting a built-in distributed kv store (etcd) any time soo...

throwaway787544 · on March 30, 2022

FWIW, those features existed in MOSIX (a Linux SSI patch) two decades ago. I feel like we could probably do it again.

In terms of CAP, yeah, it might not have been as reliable technically. But there are different levels of reliability for different applications; we could implement a lot of it in userland and tailor it as needed.

NavinF · on March 30, 2022

I call BS. I can’t find any details about how MOSIX handled storage, but what I did find suggests NFS semantics. That’s totally unusable, which is probably why the project died decades ago. (And apparently you had to recompile every app because they changed the syscall ABI to add a node ID to every inode or something? Guess they were speedrunning obsolescence.)

> we could implement a lot of it in userland

Yeah, that’s k8s, etcd, Ceph, and the distributed database of the week.

throwaway787544 · on April 4, 2022

You didn't need to recompile programs; that was the whole idea: distribute any app's compute over many nodes. But shared memory and threading were very hard to distribute, and I/O was not distributed except through MFS (a distributed layer on NFS), which did work fine. Obviously NFS is not suitable for all applications, in which case you could use any other form of distributed I/O.

It worked great for forking apps. Trouble was, hell would freeze over before the patches got merged, and most people thought it wouldn't be widely adopted without shared memory and threads.

But the point is, it did run arbitrary apps across distributed nodes: you could see any node's processes and instrument them, and you could see the filesystem of any node. This isn't some advanced mystic sorcery; it was there two decades ago. Clearly we could implement these features again in some new way - not as an SSI, but at least allowing an assortment of system-level RPC and some sort of distributed pluggable VFS.

And also my point is: sure, we have all these 3rd party userland solutions, and that is bad. It means nothing is supported until it's been "integrated". It means we have miles and miles of plumbing that schmucks like me are paid to set up before a JavaScript developer can run their piddly web app across 3 nodes. It should just be baked into the OS, batteries included. A lot less annoying bullshit, a lot more standardization, and the ability to get more shit done with less effort. That is the entire point of operating systems, to make it easier to run programs. Not to make it necessary to add 15 million new abstractions before you can run your programs.

zozbot234 · on March 30, 2022

> requires distributed and consistent state

Distributed yes, but not necessarily consistent. You can use CRDTs to manage "partial, flexible" consistency requirements. This might mean, e.g., sometimes having more than 5 instances running, but it should come with increased flexibility overall.
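
To make the "more than 5 instances" case concrete, here is a minimal sketch of one possible CRDT, a state-based PN-counter, tracking how many instances are running. Everything in it is an assumption for illustration: the `PNCounter` class, the `reconcile` helper, the node names "a" and "b", and the target of 5 are hypothetical, not how any real scheduler does it.

```python
from dataclasses import dataclass, field


@dataclass
class PNCounter:
    """State-based PN-counter: per-node tallies of starts and stops."""
    incs: dict = field(default_factory=dict)  # node -> instances started
    decs: dict = field(default_factory=dict)  # node -> instances stopped

    def increment(self, node, n=1):
        self.incs[node] = self.incs.get(node, 0) + n

    def decrement(self, node, n=1):
        self.decs[node] = self.decs.get(node, 0) + n

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        """Element-wise max per node: commutative, associative, idempotent."""
        merged = PNCounter()
        for node in self.incs.keys() | other.incs.keys():
            merged.incs[node] = max(self.incs.get(node, 0), other.incs.get(node, 0))
        for node in self.decs.keys() | other.decs.keys():
            merged.decs[node] = max(self.decs.get(node, 0), other.decs.get(node, 0))
        return merged


TARGET = 5  # hypothetical desired instance count


def reconcile(counter, node):
    """Start one instance if this replica's local view is below TARGET."""
    if counter.value() < TARGET:
        counter.increment(node)


# Both replicas begin with the same view: 4 instances running.
base = PNCounter(incs={"a": 2, "b": 2})
replica_a = base.merge(PNCounter())  # independent copy
replica_b = base.merge(PNCounter())  # independent copy

# During a partition each node sees 4 < 5 and starts one instance on its own.
reconcile(replica_a, "a")
reconcile(replica_b, "b")

# Once the partition heals, the replicas merge without conflict,
# but the converged count overshoots the target.
converged = replica_a.merge(replica_b)
print(replica_a.value(), replica_b.value(), converged.value())
```

Each replica believes it reached 5, and the merged state shows 6. Nothing was lost or rejected; the cost of skipping consensus is the temporary overshoot, which a later reconcile pass could correct by decrementing.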