Hey!

At the moment the distributed features are entirely missing from the repo. Sorry to disappoint on that front, given that they're mentioned in the readme. It's hectic!

But to answer your question: I plan to sidestep most of the difficulties of the distributed story by offloading them to the usual distributed tools: horizontally scalable DBs, distributed locks, queues etc. This has the added upside that you can use your favourite tools, for which you already have dashboards, alerting etc.

There will be no fancy resource management (another comment touched on "reinventing kubernetes", which is definitely not a goal), but there will be some simple optimizations around which prompt to run where, depending on whether the model is already running on a node or not.

So there will be some knowledge of the nodes' capacity (at least available VRAM etc), but the picking of prompts from the queue will be a simple, unit-testable algorithm.
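
To make that a bit more concrete, here is a rough sketch of what such a picker could look like. Nothing like this exists in the repo yet: `Node`, `Prompt`, `pick_prompt` and all of their fields are made-up names for illustration, and the rule itself (prefer a prompt whose model is already loaded, otherwise take the oldest one that fits in free VRAM) is just one plausible reading of the plan above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Hypothetical view of a worker node; field names are assumptions.
    name: str
    free_vram_gb: float
    loaded_models: set[str] = field(default_factory=set)


@dataclass
class Prompt:
    # Hypothetical queue item; vram_needed_gb is the VRAM needed to load
    # the model if it is not already running on the node.
    id: str
    model: str
    vram_needed_gb: float


def pick_prompt(node: Node, queue: list[Prompt]) -> Optional[Prompt]:
    """Return the prompt this node should run next, or None.

    The queue is assumed to be ordered oldest first and is not modified.
    """
    # First choice: the oldest prompt whose model is already loaded here,
    # so no model load is needed.
    for prompt in queue:
        if prompt.model in node.loaded_models:
            return prompt
    # Fallback: the oldest prompt whose model fits into the free VRAM.
    for prompt in queue:
        if prompt.vram_needed_gb <= node.free_vram_gb:
            return prompt
    # Nothing in the queue can run on this node right now.
    return None
```

Because it is a pure function over plain data, it can be unit tested without any DB, queue or GPU. For example, a node that already has a given model loaded would pick a prompt for that model even if an older prompt for a different model is sitting ahead of it in the queue.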

At least this is the plan, but it's only half implemented in my mind.