
Just curious, what business benefit will Databricks get by spending potentially millions of dollars on an open LLM?


Their goal is to always drive enterprise business towards consumption.

With AI, they desperately need to steer the narrative away from API-based services (OpenAI).

By training LLMs, they build sales artifacts (stories, references, even accelerators built with the LLMs themselves) to paint the picture needed to convince their enterprise customer market that Databricks is the platform for enterprise AI. Their blog details how the entire end-to-end process was done on the platform.

In other words, Databricks spent millions as an aid in influencing their customers to do the same (on Databricks).


Thanks! Why do they not focus on hosting other open models, then? I suspect other models will soon catch up with DBRX's advantages in inference speed and benchmark results. That said, maybe the advantage is aligned interests: they want customers to use their platform, so they can keep their models open. In contrast, Mistral dropped its commitment to open source once it found a potential path to profitability.


Commoditize your complements:

https://gwern.net/complement

If Databricks makes their money off model serving and doesn't care whose model you use, they are incentivized to help the open models be competitive with the closed models they can't serve.


At this point it's a cliché to share this article, as much as I love gwern lol.


There is always the lucky 10k.


Today I was one


For that reference in particular, feels like you should really share the link as well:

https://xkcd.com/1053/


Demonstrating you can do it yourself shows a level of investment and commitment to AI in your platform that integrating Llama does not.

And from a corporate perspective, it means that you have in-house capability to work at the cutting-edge of AI to be prepared for whatever comes next.


> Demonstrating you can do it yourself shows a level of investment and commitment to AI in your platform that integrating Llama does not.

I buy this argument. It looks like that's not what AWS does, though, yet they have no problem attracting LLM users. Maybe AWS already has enough of a reputation?


It's easier because 70% of the market already has an AWS account and a sizeable budget allocated to it. The technical team is literally one click away from any AWS service.


I may be misunderstanding, but doesn't Amazon have its own models in the form of Amazon Titan[0]? I know they aren't competitive in terms of output quality, but surely in terms of cost there are some use cases for them.

[0] https://aws.amazon.com/bedrock/titan/


Mistral did what many startups are doing now, leveraging open-source to get traction and then doing a rug-pull. Hell, I've seen many startups be open-source, get contributions, get free press, get into YC and before you know it, the repo is gone.


Well, Databricks is a big company with real cash flow, and Mistral is a startup, so there is a kinda big difference here.


They do have a solid focus on doing so, it’s just not exclusive.

https://www.databricks.com/product/machine-learning/large-la...


> Why do they not focus on hosting other open models then?

They do host other open models as well (pay-per-token).



Do they use Spark for the training?


Mosaic AI Training (https://www.databricks.com/product/machine-learning/mosaic-a...), as mentioned in the announcement blog (https://www.databricks.com/blog/announcing-dbrx-new-standard... - it's a bit less technical).


Thanks. Is this open source - i.e., can it be used on my own cluster outside of Databricks?


It's a brand-image play, if you will. Databricks' customers mostly use it as an ETL tool, but it benefits them to be perceived as more than that.


You can improve your brand for a lot less. I just don't understand why they would throw all their chips into a losing race.

Azure already runs on-premises if I'm not mistaken, Claude 3 is out... but DBRX already falls so far behind.

I just don't get it.


A lot of enterprise orgs are convinced of two things:

1. They need to train their own LLMs

2. They must fine-tune an LLM to make use of this tech

Now number (1) is almost entirely false, but there are willing buyers, and DB offers some minimal tools to let them live their lies. DBRX proves that it's possible to train an LLM on the DB stack.

Number (2) is often true, although I would say that most orgs skip the essential first step of prompting a powerful foundation model to get an initial version of the product working (and using evals from that prompting to seed evals for fine-tuning). It's here where DBRX is much more relevant, because it is by all accounts an extremely capable model for fine-tuning. And since it's entirely built by DB, they can offer better support for their customers than they can with Llama or Mistral variants.
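
A rough sketch of that "prompt first, seed evals" loop, in case it's not obvious (Python; generate() is a stub standing in for whatever foundation-model API you'd actually call, and the prompts are made up):

    # Grade outputs from a prompted baseline, then keep the graded
    # examples as an eval set that a later fine-tune has to beat.
    import json

    def generate(prompt: str) -> str:
        # Placeholder; swap in a real foundation-model call.
        return "stub completion"

    prompts = [
        "Summarize this support ticket: ...",
        "Extract the product names mentioned in: ...",
    ]

    records = []
    for p in prompts:
        out = generate(p)
        ok = input(f"PROMPT: {p}\nOUTPUT: {out}\nAcceptable? [y/n] ").strip() == "y"
        records.append({"prompt": p, "output": out, "accepted": ok})

    # These graded examples double as the fine-tuning eval set later.
    with open("evals.jsonl", "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")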

More broadly, the strategic play is to be the "enterprise AI company". OpenAI, Anthropic, and Meta are all competing at the consumer level, but nobody has really stood out as the dominant player in the enterprise space. Arguably OpenAI is the most successful, but that's less about an enterprise focus and more about being wildly successful generally, and they're also still trying to figure out whether they want to focus on consumer tech, AGI woo woo stuff, research work, or enterprise stuff. DB also knows that to be an AI company, you also have to be a data company, and they are a data company. So it's a natural strategic move for them.


An increased valuation at IPO later this year.


Instead of spending x * 10^7 dollars, Databricks could buy databricks.ai; it's for sale.

But really, I prefer to have as many players as possible in the field of _open_ models available.


Databricks is trying to go all-in on convincing organizations they need to use in-house models, and therefore pay them to provide LLMOps.

They're so far into this that their CTO co-authored a borderline dishonest study which got a ton of traction last summer trying to discredit GPT-4: https://arxiv.org/pdf/2307.09009.pdf


I can see a business model for in-house LLMs: training a model on knowledge about your products and then somehow getting that knowledge into a generally available LLM platform.

I recently asked Google to explain how to delete a sender-recorded voice message I had created in WhatsApp. I got totally erroneous results back. Maybe that's because it is a rather new feature in WhatsApp.

It would be in WhatsApp's interest to get accurate answers about it into Google's LLM. So Google might make a deal where WhatsApp pays Google for regular updates on current WhatsApp features. WhatsApp's owner, Meta, is of course a competitor to Google, so Google may not care much about providing up-to-date info on WhatsApp in their LLM. But they might if Meta paid them.


Pretraining on internal knowledge will be incredibly inefficient for most companies.

Fine-tuning makes sense for things like embeddings (improving RAG with domain-specific embeddings) but doesn't do anything useful for teaching facts.
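
For the embeddings case, a minimal sketch of what domain-specific fine-tuning can look like (this uses the sentence-transformers library; the base model choice and the query/passage pairs are just illustrative):

    # Fine-tune an off-the-shelf embedding model on (query, passage)
    # pairs mined from internal docs, so retrieval learns domain jargon.
    from torch.utils.data import DataLoader
    from sentence_transformers import SentenceTransformer, InputExample, losses

    model = SentenceTransformer("all-MiniLM-L6-v2")

    pairs = [
        ("how do I rotate an API key", "Keys are rotated from the admin console under..."),
        ("cluster autoscaling limits", "Autoscaling is capped at 100 worker nodes..."),
    ]
    train_data = [InputExample(texts=[q, p]) for q, p in pairs]
    loader = DataLoader(train_data, shuffle=True, batch_size=2)

    # In-batch negatives: each query's positive passage serves as a
    # negative example for every other query in the batch.
    loss = losses.MultipleNegativesRankingLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
    model.save("domain-embedder")

The point is that this changes how documents are retrieved, not what the generator "knows", which is why it helps RAG but doesn't inject facts.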


Businesses are already using Azure GPT-4 on-premises, I believe, with good feedback.

DBRX does not compete with GPT-4 or even Claude 3.


What does borderline dishonest mean? I only read the abstract, and it seems like such an obvious point that I don't see how it's contentious.


The regression came from poorly parsing the results. I came to the same conclusion independently, but here's another, more detailed takedown: https://www.reddit.com/r/ChatGPT/comments/153xee8/has_chatgp...

Given the conflict of interest and background of Zaharia, it's hard to imagine such an immediately obvious source of error wasn't caught.
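
For the curious, the code-generation "regression" reduces to roughly this, as I understand the takedown (a hypothetical minimal repro, not the paper's actual harness):

    # The eval checked whether completions were directly executable.
    # When the model started wrapping answers in markdown fences, the
    # same correct code suddenly "failed".
    plain = "def add(a, b):\n    return a + b"
    fenced = "```python\ndef add(a, b):\n    return a + b\n```"

    def directly_executable(completion: str) -> bool:
        try:
            compile(completion, "<llm>", "exec")
            return True
        except SyntaxError:
            return False

    print(directly_executable(plain))   # True
    print(directly_executable(fenced))  # False -- scored as a regression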


Nothing, but they will brag about it to get more money from investors.



