Their goal is always to drive enterprise business toward consumption.
With AI, they desperately need to steer the narrative away from API-based services (OpenAI).
By training LLMs, they build sales artifacts (stories, references, even accelerators built with the LLMs themselves) that paint the picture needed to convince their enterprise customers that Databricks is the platform for enterprise AI. Their blog details how the entire end-to-end process was done on the platform.
In other words, Databricks spent millions to help convince their customers to do the same (on Databricks).
Thanks! Why don't they focus on hosting other open models, then? I suspect other models will soon catch up to DBRX's advantages in inference speed and benchmark results. That said, maybe the real advantage is aligned interests: they want customers to use their platform, so they can afford to keep their models open. In contrast, Mistral dropped its commitment to open source once it found a potential path to profitability.
If Databricks makes their money off model serving and doesn't care whose model you use, they are incentivized to help the open models be competitive with the closed models they can't serve.
> Demonstrating you can do it yourself shows a level of investment and commitment to AI in your platform that integrating LLAMA does not.
I buy this argument. It looks like that's not what AWS does, though, yet they have no problem attracting LLM users. Maybe AWS already has enough of a reputation?
It's easier because 70% of the market already has an AWS account and a sizeable budget allocated to it. The technical team is literally one click away from any AWS service.
I may be misunderstanding, but doesn't Amazon have its own models in the form of Amazon Titan[0]? I know they aren't competitive in terms of output quality, but surely in terms of cost there are some use cases for them.
Mistral did what many startups are doing now: leveraging open source to get traction and then doing a rug-pull. Hell, I've seen many startups go open-source, get contributions, get free press, get into YC, and before you know it, the repo is gone.
It's an image-enhancement measure, if you will. Databricks' customers mostly use it as an ETL tool, but it benefits them to be perceived as more than that.
A lot of enterprise orgs are convinced of two things:
1. They need to train their own LLMs
2. They must fine-tune an LLM to make use of this tech
Now number (1) is almost entirely false, but there are willing buyers, and DB offers some minimal tools to let them live their lies. DBRX proves that it's possible to train an LLM on the DB stack.
Number (2) is often true, although I would say that most orgs skip the absolutely essential first step: prompting a powerful foundation model to get a first version of the product working (and using evals from that prompting to seed evals for fine-tuning). It's here where DBRX is much more relevant, because it is by all accounts an extremely capable model for fine-tuning. And since it's entirely built by DB, they can offer better support for their customers than they can with Llama or Mistral variants.
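As an aside, here's what that "prompt first, seed your evals" step can look like in practice. This is a minimal sketch, assuming the openai Python client (>= 1.0); the model name, questions, and output file are placeholders, and nothing here is Databricks-specific:

```python
# Sketch: seed a fine-tuning eval set from foundation-model prompting.
# Assumes the openai Python client (>=1.0); model name and file paths
# are placeholders, not anything DBRX- or Databricks-specific.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = "You are a support assistant for our product."
test_questions = [
    "How do I rotate my API key?",
    "What does error E1042 mean?",
]

records = []
for q in test_questions:
    resp = client.chat.completions.create(
        model="gpt-4",  # the "powerful foundation model" step
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": q},
        ],
    )
    # Each prompt/response pair becomes an eval case that a
    # fine-tuned model can later be scored against.
    records.append({"input": q, "reference": resp.choices[0].message.content})

with open("evals_seed.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")
```

The resulting evals_seed.jsonl gives you reference outputs to score a fine-tuned model against before switching over to it.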
More broadly, the strategic play is to be the "enterprise AI company". OpenAI, Anthropic, and Meta are all competing at the consumer level, but nobody has really stuck out as the dominant player in the enterprise space. Arguably OpenAI is the most successful, but that's less about an enterprise focus than about being wildly successful generally, and they're also still trying to figure out whether they want to focus on consumer tech, AGI woo woo stuff, research work, or enterprise stuff. DB also knows that to be an AI company you have to be a data company, and they are a data company. So it's a natural strategic move for them.
Databricks is trying to go all-in on convincing organizations they need to use in-house models, and therefore pay them to provide LLMOps.
They're so far into this that their CTO co-authored a borderline dishonest study, which got a ton of traction last summer, trying to discredit GPT-4: https://arxiv.org/pdf/2307.09009.pdf
I can see a business model for in-house LLMs: training a model on knowledge about a company's products and then somehow getting that knowledge into a generally available LLM platform.
I recently asked Google to explain how to delete a sender-recorded voice message I had created in WhatsApp. I got totally erroneous results back. Maybe that's because it's a rather new feature in WhatsApp.
It would be in WhatsApp's interest to get accurate answers about this into Google's LLM. So Google might make a deal in which WhatsApp pays Google to regularly feed it up-to-date information about WhatsApp features. WhatsApp's owner, Meta, is of course a competitor to Google, so Google may not care much about providing up-to-date info about WhatsApp in their LLM. But they might if Meta paid them.