Their goal is always to drive enterprise business toward consumption.
With AI, they desperately need to steer the narrative away from API-based services (OpenAI).
By training LLMs, they build sales artifacts (stories, references, even accelerators built with the LLMs themselves) that paint the picture needed to convince their enterprise customers that Databricks is the platform for enterprise AI. Their blog details how the entire end-to-end process was done on the platform.
In other words, Databricks spent millions to help convince their customers to do the same (on Databricks).
Thanks! Why don't they focus on hosting other open models, then? I suspect other models will soon catch up to DBRX's advantages in inference speed and benchmark results. That said, maybe the real advantage is aligned interests: they want customers to use their platform, so they can afford to keep their models open. In contrast, Mistral dropped its commitment to open source once it found a potential path to profitability.
If Databricks makes their money off model serving and doesn't care whose model you use, they are incentivized to help the open models be competitive with the closed models they can't serve.
> Demonstrating you can do it yourself shows a level of investment and commitment to AI in your platform that integrating LLAMA does not.
I buy this argument. It looks like that's not what AWS does, though, yet they have no problem attracting LLM users. Maybe AWS already has enough of a reputation?
It's easier because 70% of the market already has an AWS account and a sizeable budget allocated to it. The technical team is literally one click away from any AWS service.
I may be misunderstanding, but doesn't Amazon have its own models in the form of Amazon Titan[0]? I know they aren't competitive in terms of output quality, but surely in terms of cost there are some use cases for them.
Mistral did what many startups are doing now: leveraging open source to get traction and then doing a rug-pull. Hell, I've seen many startups go open-source, get contributions, get free press, get into YC, and before you know it, the repo is gone.
It's an image-enhancement measure, if you will. Databricks' customers mostly use it as an ETL tool, but it benefits them to be perceived as more than that.
A lot of enterprise orgs are convinced of two things:
1. They need to train their own LLMs
2. They must fine-tune an LLM to make use of this tech
Now number (1) is almost entirely false, but there are willing buyers, and DB offers some minimal tools to let them live their lies. DBRX proves that it's possible to train an LLM on the DB stack.
Number (2) is often true, although I would say that most orgs skip the absolutely essential first step: prompting a powerful foundation model to get a first version of the product working (and using evals from that prompting to seed evals for fine-tuning). It's here where DBRX is much more relevant, because it is by all accounts an extremely capable model for fine-tuning. And since it's entirely built by DB, they can offer better support for their customers than they can with Llama or Mistral variants.
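As an aside, here's what that "prompt first, seed your evals" step can look like in practice. This is a minimal sketch, assuming the openai Python client (>= 1.0); the model name, questions, and output file are placeholders, and nothing here is Databricks-specific:

```python
# Sketch: seed a fine-tuning eval set from foundation-model prompting.
# Assumes the openai Python client (>=1.0); model name and file paths
# are placeholders, not anything DBRX- or Databricks-specific.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM = "You are a support assistant for our product."
test_questions = [
    "How do I rotate my API key?",
    "What does error E1042 mean?",
]

records = []
for q in test_questions:
    resp = client.chat.completions.create(
        model="gpt-4",  # the "powerful foundation model" step
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": q},
        ],
    )
    # Each prompt/response pair becomes an eval case that a
    # fine-tuned model can later be scored against.
    records.append({"input": q, "reference": resp.choices[0].message.content})

with open("evals_seed.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")
```

The resulting evals_seed.jsonl gives you reference outputs to score a fine-tuned model against before switching over to it.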
More broadly, the strategic play is to be the "enterprise AI company". OpenAI, Anthropic, and Meta are all competing at the consumer level, but nobody has really stuck out as the dominant player in the enterprise space. Arguably OpenAI is the most successful, but that's less about an enterprise focus than about being wildly successful generally, and they're also still trying to figure out whether they want to focus on consumer tech, AGI woo woo stuff, research work, or enterprise stuff. DB also knows that to be an AI company you have to be a data company, and they are a data company. So it's a natural strategic move for them.
Databricks is trying to go all-in on convincing organizations they need to use in-house models, and therefore pay them to provide LLMOps.
They're so far into this that their CTO co-authored a borderline dishonest study, which got a ton of traction last summer, trying to discredit GPT-4: https://arxiv.org/pdf/2307.09009.pdf
I can see a business model for in-house LLMs: training a model on knowledge about a company's products and then somehow getting that knowledge into a generally available LLM platform.
I recently asked Google to explain how to delete a sender-recorded voice message I had created in WhatsApp. I got totally erroneous results back. Maybe that's because it's a rather new feature in WhatsApp.
It would be in WhatsApp's interest to get accurate answers about this into Google's LLM. So Google might make a deal in which WhatsApp pays Google to regularly feed it up-to-date information about WhatsApp features. WhatsApp's owner, Meta, is of course a competitor to Google, so Google may not care much about providing up-to-date info about WhatsApp in their LLM. But they might if Meta paid them.