Hacker News — friendly_chap's comments

Hah! Nice idea! I built something with a similar mindset but instead of calling cloud AI providers my aim is to provide a self-hostable complete AI platform: https://github.com/singulatron/singulatron

I know that might sound like putting the server back into serverless. But I'd say it's being your own serverless provider - once you have the platform installed on your servers, you can build frontend-only AI apps on top.

Hope you don't mind the self-plug. Your approach definitely has a ton of advantages when starting out (no infra to manage, etc.).


Hey! At the moment we don't release much about the inner workings but that stance might change soon as we have a habit of going from closed -> open pretty quickly ;).


Sorry if this sounds harsh. I'm sure you've all done a lot of great work. I'm only explaining why some kind of operational knowledge is important to me. I don't buy into a product without some mental model of how it works, because otherwise it's hard to tell if it's worthwhile. And if the entire value is based on something that's easily replicated, I may as well wait for the competitors that compete on execution, ease of use, etc. rather than a promise.


> Sorry if this sounds harsh

I braced myself for actual harsh critique there but that was rather mild! Phew.

> Or if the entire value is based on something that's easily replicated

Not THAT easily replicated, but given how well funded our competitors are while we are bootstrapped, I would prefer to wait until most startups participating in this wave burn through their funds before publishing our relatively novel but not overly expensive-to-replicate twists.

We'll come up with better explanations of how the thing works without hinting at the technical difficulties we faced and their solutions.

The rightful point was made, thank you. You were not harsh at all.


Shameless self-plug but your questions (especially combined with the product posted) are the raison d'etre for our whole company: https://singulatron.com/


We are running smaller models with software we wrote (self-plug alert: https://github.com/singulatron/singulatron) with great success. These models sometimes make obvious mistakes (such as the one in our repo image - haha), but they can also be surprisingly versatile in areas you don't expect, like coding.

Our demo site runs on two NVIDIA GeForce RTX 3090s, and our whole team hammers it all day. The only problem is occasionally high GPU temperatures.

I don't think the picture is as bleak as you paint it. I actually expect Moore's Law and better AI architectures to bring on a self-hosted AI revolution in the next few years.


The main focus is the daemon which we plan to make distributed. We will also support multiple database backends and other distributed primitives.


Thanks gpaslari,

We are running known models like Llama, Stable Diffusion, etc. So pretty much the same :)

We are not planning on building our own models yet.


Hey folks,

OP here. I'm trying to build a distributed AI platform - it's a lot of work and fairly early-stage, but I think it's already useful. Hope you'll like it.

Thanks!


Are you distributing the LLM generation tasks at a low-level (server 1,2,3,4 each have a GPU and each get 25% of the model + work somehow)? Or is this just a way of interfacing with an existing LLM using a standardized interface on front and back ends?


Hey!

At the moment the distributed features are entirely missing from the repo - sorry to disappoint on that front, despite their being part of the readme. It's hectic!

But to answer your question: I plan to sidestep most difficulties of the distributed story by offloading them to the usual distributed tools - horizontally scalable DBs, distributed locks, queues, etc. This has the added upside of letting you use your favourite tools, for which you already have dashboards, alerting, etc.

There will be no fancy resource management (another comment touched on "reinventing Kubernetes", which is definitely not a goal at all), but there will be some simple optimizations around where to run which prompt, depending on whether the model is already running on a node, etc.

So there will be some knowledge of the nodes' capacity (at least available VRAM, etc.), but the picking of prompts from the queue will be a simple, unit-testable algorithm.
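To make that concrete, here is a minimal sketch of what such a picking algorithm could look like. Everything here (the `Node`/`Prompt` types, `free_vram_gb`, `loaded_models`) is a hypothetical illustration, not the actual Singulatron implementation: prefer a "warm" node that already has the model loaded, otherwise fall back to the candidate with the most free VRAM.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    free_vram_gb: float
    loaded_models: set[str] = field(default_factory=set)


@dataclass
class Prompt:
    model: str
    vram_needed_gb: float


def pick_node(prompt: Prompt, nodes: list[Node]) -> Node | None:
    """Pick a node for a queued prompt.

    Prefer nodes that already have the model loaded; otherwise fall
    back to the candidate with the most free VRAM. Return None when no
    node can fit the model, leaving the prompt on the queue.
    """
    candidates = [n for n in nodes if n.free_vram_gb >= prompt.vram_needed_gb]
    if not candidates:
        return None
    warm = [n for n in candidates if prompt.model in n.loaded_models]
    pool = warm or candidates
    return max(pool, key=lambda n: n.free_vram_gb)
```

Because the function is pure (no I/O, no cluster state), it is trivially unit testable, which is the point of keeping the scheduling logic this simple.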

At least this is the plan but it's only half implemented in my mind.


Why reinvent Kubernetes?


I'm not planning to :)). Singulatron is not about scheduling, etc.; it operates at a much higher level of abstraction. To put it in less fancy terms: it's an app, albeit one designed to run on many servers simultaneously.


Why reinvent half of Kubernetes?


So any web app that runs on multiple nodes reinvents k8s? I'm not sure about that.


> So any web app that runs on multiple nodes reinvent k8s

Most of them only reinvent 1/10th-1/100th of Kubernetes.


I actually open sourced one recently which supports queues - it's both a desktop app and a daemon: https://github.com/singulatron/singulatron


So many interesting projects, I just wish AI hardware was readily available.


We are working on getting the idea across better. It's just a search engine with a bunch of extra filtering capabilities - at the moment those filters are all based on metadata about the companies that own the websites (industry, size, etc.).
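To illustrate the idea, here is a purely hypothetical sketch of query-plus-metadata filtering - the field names, data, and API are invented for illustration and are not the actual product. The industry filter is matched case-insensitively.

```python
from __future__ import annotations

# Invented sample index; the real one would come from a crawled dataset.
COMPANIES = [
    {"name": "Acme Corp", "industry": "Manufacturing", "size": 500, "site": "acme.example"},
    {"name": "Bitwidget", "industry": "Software", "size": 50, "site": "bitwidget.example"},
]


def search(query: str, industry: str | None = None, max_size: int | None = None) -> list[str]:
    """Return sites whose company matches the text query and metadata filters."""
    def matches(c: dict) -> bool:
        return (
            query.lower() in c["name"].lower()
            and (industry is None or c["industry"].lower() == industry.lower())
            and (max_size is None or c["size"] <= max_size)
        )

    return [c["site"] for c in COMPANIES if matches(c)]
```

For example, `search("", industry="software")` would match Bitwidget regardless of how the industry is capitalized in the index.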


Thank you for your kind words!

> industry filter should be case insensitive

Will fix!

> Quite limited when trying to find the website of a specific company

Yeah, the index is not entirely complete and/or correct. The goal for now is not to be perfect but to produce different results from your usual search avenues, to broaden your horizons.

