More

yujian · on Nov 21, 2023

I'm not sure if I'm missing something from the paper, but are multi-billion parameter models getting called "small" language models now? And when did this paradigm shift happen?

hmottestad · on Nov 21, 2023

All the llama models, including the 70B one can run on consumer hardware. You might be able to fit GPT-3 (175B) at Q4 or Q3 on a Mac Studio, but that's probably the limit for consumer hardware. At 4-bit a 7B model requires some 4GB of ram, so that should probably be possible to run on a phone, just not very fast.

sa-code · on Nov 21, 2023

Gpt 3.5 turbo is 20B

kristianp · on Nov 21, 2023

I doubt that. What's your source?

sa-code · on Nov 21, 2023

There was a paper published by Microsoft that seemed to leak this detail. I'm on mobile right now and don't have a link but it should be searchable

nl · on Nov 22, 2023

The paper was https://arxiv.org/abs/2310.17680

It has been withdrawn with this note:

> Contains inappropriately sourced conjecture of OpenAI's ChatGPT parameter count from this http URL, a citation which was omitted. The authors do not have direct knowledge or verification of this information, and relied solely on this article, which may lead to public confusion

(the noted URL is a just a Forbes blogger with no special qualifications that would make what he claimed particularly credible).

Chabsff · on Nov 21, 2023

Nowadays, small essentially means realistically useable on prosumer hardware.

moffkalast · on Nov 21, 2023

When 175B, 300B, 1.8T models are considered large, 7B is considered small.

nathanfig · on Nov 21, 2023

Relative term. In the world of LLMs, 7b is small.

yujian · on Oct 1, 2023

i'm also on dnd all day, i just don't wanna be disturbed

yujian · on Sept 14, 2023

Anyscale consistently posts great projects. Very cool to see the cost comparison and quality comparison. Not surprising to see that OSS is less expensive, but also rated as slightly lower quality than gpt-3.5-turbo.

I do wonder, is there some bias in quality measures? Using GPT 4 to evaluate GPT 4's output? https://www.linkedin.com/feed/update/urn:li:activity:7103398...

yujian · on Sept 5, 2023

yujian · on Sept 5, 2023

Who's focusing on top 10 recall? I recently saw someone ask if we (I work at Zilliz) can update Milvus' recall to 10 million nearest neighbors lol

yujian · on Aug 1, 2023

Zilliz | Developer Advocate, Solutions Architect | ONSITE | Full Time

Zilliz is a fast-growing startup developing the industry’s leading vector database company for enterprise-grade AI. Founded by the engineers behind Milvus, the world’s most popular open-source vector database, the company builds next-generation database technologies to help organizations quickly create AI applications. On a mission to democratize AI, Zilliz is committed to simplifying data management for AI applications and making vector databases accessible to every organization.

Developer Advocate - https://app.careerpuck.com/job-board/zilliz/job/4031246005

Solutions Architect - https://app.careerpuck.com/job-board/zilliz/job/4256138005

yujian · on July 31, 2023

Interesting graphic, bland and unvoiced conclusion

You're also missing a lot of details. For example, Milvus and Zilliz are actually a little different, check this out for more details: https://github.com/zilliztech/VectorDBBench (of course run it on your own stuff, don't blindly trust companies just because their product is open source)

Also if you want to throw some more comparisons in their checkout elastic search

yujian · on July 31, 2023

I work on Milvus at Zilliz and we encounter people working on LLM companies or frameworks often, I don't ask this question a lot a lot, but it looks like at the moment many companies don't have a real moat, they are just building as fast as they can and using talent/execution/funding as their moat

I've also heard some companies that build the LLMs say that those LLMs are their moat, the time, money, and research that goes into them is high

yujian · on July 11, 2023

Great conceptual explanation of the different primitives in LangChain. Some possible inspiration for your upcoming piece involving vector stores with LangChain - https://milvus.io/blog/conversational-memory-in-langchain.md

yujian · on June 15, 2023

Nice, this is a cool version of ANN search. I like that at the end there is commentary on what's needed for production as well - things like parallelization, RAM considerations, and the consideration for balancing the trees. It's really the production level considerations that would steer you toward a vector database like Milvus/Zilliz