Like others, I think the categories are a bit iffy. But tbh it's really hard to delineate this stuff correctly. Terms like 'IDE', 'workspace', 'pipeline', 'workflow' and 'platform' all end up semantically merging. ComfyUI, for example, could sit in a bunch of these categories because it's basically become an ecosystem unto itself. It's useful to surface singular libraries and SaaS tools, but lots of the other things are plural in nature. It might be prudent to make these categories more continuous by using a tagging approach instead? PS: I love overview dashboards like this and I'm happy to see it's community-editable.
It makes me think of Thomas Kuhn’s Structure of Scientific Revolutions. To be fair I never finished the book but one thing really stuck out.
He describes how scientific revolutions aren't really incremental progress on previous theories. Think of Galileo's theory that the earth revolved around the sun: it destroyed the conventional understanding of astronomy (and maybe even physics). After these revolutions there is a flood of "normal" science that fills in the gaps based on the new big axioms.
Now, neither blockchain nor AI is a scientific revolution in Kuhn's meaning of the term. They are certainly "normal" science, operating within the established domains of statistics, computer science, cryptography, etc. But I think they are analogous: major breakthroughs were made that gave us AI and blockchain, and afterwards comes a flood of (probably increasingly small) extensions and applications of them.
I guess my point is that I don't feel like anything is broken or that there are bad incentives. That's just how innovation works. You have massive progress followed by decreasingly useful work. And certainly in the case of AI there is more large-scale progress to be made.
Except in one of those we get steam engines, electricity, space travel, and whatnot, and in the other we get biz bros getting rich quick by scamming people with cryptocoin Ponzi schemes, deepfake porn, and election-fraud robocalls. Plenty of bad incentives there; I'm not at all convinced it should all count as "innovation" even if it is new.
People are overselling and hyping it in similar ways, and entire companies springing up that are essentially just thin wrappers around ChatGPT doesn't help, especially when a lot of the same people pimping that stuff were on the crypto/web3 train barely over a year ago.
However, while it's definitely overhyped and its capabilities far exaggerated, its applications in several fields have already been significant. I think the "gen AI" thing is pointless, but I got an eerie feeling the other day watching 4 fairly senior engineers huddled around a ChatGPT terminal, asking it questions.
To me it felt like some profound moment, maybe like how the guys who went around manually lighting street lamps felt when they first saw electric lights.
Anyway, the answer usually falls somewhere in the middle between hype and doomsaying. It's improved my workflow a bit. I'm not too worried that it'll replace me, but I do believe there will be less work because of it, and to management that usually means someone's getting fired.
> and entire companies springing up that are essentially just thin wrappers around chatGPT doesn’t help
Instant frat-guy funding... they have networks of insta-money. There is a cult in SF connected to Seattle...
Meanwhile, thousands upon thousands of competent coders are looking for a gig... the price for skilled coders is dropping in almost all categories.
It does have the same feel, but there's a big difference. Back then we knew crypto was all fiat; dollar value was based on perception that could be, and was (and still is), manipulated by whales.
AI, OTOH, can provide meaningful value in most industries. Lump them together at your own peril.
As someone who has built large-scale AI, it's funny to see that the only 2 brand-name tools my team used are missing: Airflow (ETL and embedding processing) and BigQuery (product analytics).
There are new tools here I'm excited to try, especially the open-source ones.
One of the things that is still TBD about this wave of AI companies: what are the pain points in building products that people actually want to use?
I guess my warning would be that, much like integrating an analytics platform a few years back, it's easy to spend a lot of time and money on someone else's idea of how to improve your business.
I feel like I've had to learn this lesson at every level of the stack. There's no ORM that will save you from thinking about database schemas. There's no analytics platform that will save you from thinking about what the most important thing in your product is. And I suspect there's no training or evaluation platform that will save you from trial and error and prompt-and-check techniques.
**
Prompt: Is this classification correct: LLM Providers are { Hugging Face, Open AI, Anthropic }. Think step by step. Be very precise and return with YES or NO only for each Provider in the list.
Response: For each provider in the list:
Hugging Face: No, Hugging Face is not a large language model provider. They are a technology company that provides a platform for NLP and AI, offering a wide range of pre-trained models, including large language models, through their transformers library.
OpenAI: Yes, OpenAI is a large language model provider, as they have developed several large language models, like GPT-2 and GPT-3.
Anthropic: Yes, Anthropic is a large language model provider, as they have developed large language models, like Claude.
In summary, the classification "LLM Providers are { Hugging Face, Open AI, Anthropic }" is not entirely accurate. A more precise classification would be "NLP organizations/companies { Hugging Face }" and "LLM Providers { OpenAI, Anthropic }".
**
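For what it's worth, the prompt-and-check step is easy enough to script yourself. A minimal sketch, assuming the openai Python client (1.x); the case list, expected labels, and prompt wording are made up for illustration:

```python
# Hypothetical prompt-and-check loop: ask the model a yes/no question per
# candidate and compare against a hand-written expected label.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CASES = [
    # (candidate, expected verdict for "is this an LLM provider?")
    ("Hugging Face", "NO"),   # platform/model hub rather than a provider
    ("OpenAI", "YES"),
    ("Anthropic", "YES"),
]

def check(candidate: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Is {candidate} an LLM provider? "
                       "Think step by step, then answer YES or NO on the last line.",
        }],
    )
    # Take the last line as the verdict; everything before it is the reasoning.
    return resp.choices[0].message.content.strip().splitlines()[-1].upper()

for candidate, expected in CASES:
    verdict = check(candidate)
    status = "ok" if expected in verdict else "MISMATCH"
    print(f"{candidate}: got {verdict}, expected {expected} -> {status}")
```

You still end up eyeballing the mismatches by hand, which is the trial-and-error part.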
Excuse my ignorance in this space but I want to check my understanding:
About a year ago I toyed with writing a web app that was essentially a front-end to a diffusion image generator (very original, I know). The site used socket-io -> flask -> redis queue -> distributed pytorch processes.
Am I correct that several of these services are selling some equivalent of the '-> redis queue -> model' component? Is part of the value proposition here that you don't need to staff people with pytorch/equivalent familiarity?
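To be concrete, the '-> redis queue -> model' piece I mean was roughly this. A minimal sketch using redis-py; generate() is a placeholder for the distributed PyTorch diffusion call, and the key names are made up:

```python
# Minimal worker loop for the "-> redis queue -> model" component.
# Assumes redis-py; generate() stands in for the actual PyTorch pipeline.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def generate(prompt: str) -> bytes:
    """Placeholder for the distributed PyTorch diffusion call."""
    raise NotImplementedError

while True:
    # Block until the Flask side pushes a job onto the queue.
    _key, raw = r.blpop("image_jobs")
    job = json.loads(raw)
    image_bytes = generate(job["prompt"])
    # Put the result somewhere the socket-io layer can pick it up.
    r.set(f"result:{job['id']}", image_bytes)
    r.publish("results", job["id"])
```

That worker loop, plus the scaling and retries around it, is the bit I'm asking whether these services replace.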
This looks like the CNCF Landscape (https://landscape.cncf.io/), but will any of the listed AI infra projects be able to reach the maturity of the CNCF graduated projects? Also, how will the projects/infra/solutions get added and categorized? Just listing 1000 libraries/projects/tools in one single place without a dedicated team managing their quality wouldn't feel good to me.
Yeah, the landscape is using the CNCF landscape generator: https://github.com/cncf/landscape2. I also want to introduce maturity levels similar to CNCF's, but I'm not sure how to adopt them.
It seems like there are zero startups here, other than OpenAI and some other well-funded, deep-war-chested companies, that would last for 3 years or make any money.
At a quick glance, they seem significantly different.
https://ai-infra.fun/ contains things like OpenAI & Anthropic under the LLM section.
I don't see that on https://landscape.lfai.foundation/.
Also, I don't see a space for Code or IDE tooling on the Linux Foundation page.
ai-infra.fun contains TabNine and Tabby.
Would things like TabNine and Tabby make sense on the Linux Foundation one? Would love to collaborate on this!
I don't know about you guys, but I seriously have no clue what many of these are really doing, even after a minute of reading their landing pages. Are these already profitable products, or even products being used?
I think a lot of these are niche applications, intended for other startups working in this space. The services they offer are overkill for the majority of casual users trying to get a handle on AI by dabbling with llama.cpp or Stable Diffusion on a desktop.
These startups are offering tools to maximize workflow efficiency for high-scalability deployments, but most of the world is still trying to understand WTF a transformer is.
I got my Mac Studio with M1 Ultra and 128GB of RAM specifically for this reason, in case I needed it. I'm definitely not using its full capacity for any other purpose, though it's nice to be able to run 10 instances of a big data-grinding program at once, overnight.
I'm still waiting on 'consumer' AI training facilities I can practically use: it's a bit arcane and I'm not up to speed on that. I can generate text up to Llama 70B if I'm patient, and generate images and sort-of-video quite easily.
The MacBook Pro is this, if you don't mind its relatively slow speed. It's costly, but a 92GB MacBook Pro is still by far the cheapest way to get that much VRAM.
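For anyone wondering what running this locally actually looks like, here's a minimal sketch, assuming llama-cpp-python built with Metal support and a quantized GGUF file you've already downloaded (the model path is a placeholder):

```python
# Local inference on Apple Silicon via llama-cpp-python.
# The model path is a placeholder; any quantized GGUF works the same way.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-2-70b-chat.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=-1,  # offload all layers to the GPU / unified memory
    n_ctx=4096,
)

out = llm("Explain what a transformer is in two sentences.", max_tokens=128)
print(out["choices"][0]["text"])
```

A 70B model quantized to 4 bits needs on the order of 40GB of memory, which is why the unified-memory Macs are attractive here despite the speed.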
A device like the Rabbit r1 looks promising. You don't technically need to run inference on the same device. We're still a few years away from making this efficient and small enough to run locally, but there is a lot of potential for the next generation of AI assistants to revolutionize how computing is done, and have mass market appeal.
I'm surprised that Apple hasn't jumped on this yet. All the building blocks are in place to make a revolutionary device possible. It just needs to be very polished, and have great UX, which is what Apple typically excels at.
Good point. And when you talk about being a few years away, it made me think about the ENIAC. A modern household typically consumes far less electricity than the ENIAC did.
Now our smartphones are 1000 times more powerful than the ENIAC and use less power.
Do you think Apple likes to jump on things? Apple usually tries not to be first, but it definitely likes to polish.
Really!
I always imagine a system composed of 5 specialized computers: one for each category (audio, text, video, image), plus one analyzer to coordinate everything. This would be my API, which I could access from mobile devices.
Here's a scenario:
I could talk to my phone about ideas, and in the background it would create app prototypes, create posters, make music based on something I whistle, and teach me as I ask questions about a topic.
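Very roughly, the analyzer/coordinator part of what I imagine; every endpoint, hostname, and routing rule here is a hypothetical placeholder, just to show the shape of it:

```python
# Hypothetical coordinator: route each request to one of the specialized boxes.
import requests

WORKERS = {
    "audio": "http://audio-box.local:8000/generate",
    "text":  "http://text-box.local:8000/generate",
    "video": "http://video-box.local:8000/generate",
    "image": "http://image-box.local:8000/generate",
}

def classify(request_text: str) -> str:
    """Decide which specialized box should handle the request.
    A real analyzer might use a small model; this is just a stub."""
    if "whistle" in request_text or "music" in request_text:
        return "audio"
    if "poster" in request_text:
        return "image"
    return "text"

def handle(request_text: str) -> bytes:
    target = classify(request_text)
    resp = requests.post(WORKERS[target], json={"prompt": request_text})
    return resp.content

# The phone would just call handle() through a thin API in front of it.
```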
For the currently used architectures, it doesn't make sense to have 5 specialized, dedicated computers: "AI" text processing, "AI" video processing, and the rest use very similar architectures, so there's no benefit from specialization. The "video-specialized" hardware would be just as good at processing text, and vice versa.
A processor has different cores; computers may have several hard drives and 4 sticks of RAM.
Each component can run in parallel. For example, if a long video-processing task is underway and the text generation component is idle, it could assist.
Should the audio component fail, only that specific part would be affected.
“There’s no point in a personal computing consumer device; the compute can just be on a mainframe.”
Apologies for the allergic reaction to running something on somebody else's computer. As much as I appreciate our connected world, I prefer my data and compute local and private.
There are few things in this world that infuriate me on a daily basis more than a slow or lost internet connection.
If you are serious enough about gen AI to want to run it locally, you can just get a good video card. Otherwise, you'll save time and money by just using a service/Colab; I promise you there will be ones that offer privacy and uncontrolled generation.
The guide section of the site needs at least a little work.
from the guide:
Category 1 - Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
I'm confused. Is this supposed to be a startup landscape or a tool landscape? If the latter, then I'd include cloud providers, BigQuery, and compute providers.