Cohere: The world’s most powerful NLP toolkit (cohere.ai)
41 points by davidbarker on April 8, 2022 | 33 comments



After clicking through endless marketing speak on multiple pages, I still wasn't able to find anything remotely substantiating the claim "the world's most powerful".

Since it runs on cloud machines as a SaaS business, that might theoretically just mean the most available compute cores among comparable offerings.

Or any other BS marketing can come up with. At the time of writing there isn't even a footnote or anything attached to that claim. Companies that do this are on my instant blacklist. Why should I trust them when they don't show sources, or anything to back up their claim, in an easy-to-discover way?

Edit before potential answers: Maybe there is something to the claim. Maybe the offering is actually great. I wouldn't know. I wasn't able to even roughly verify the big claim and therefore did not go deeper into the offering. Unsubstantiated marketing speak just rubs me the wrong way.


Thanks, sdoering. Cohere engineer here. Happy to provide some context.

You can find performance benchmarks here: https://txt.cohere.ai/launch-larger-embed-models#model-compa...

Cohere provides an API to access and finetune large language models (generative models like GPT and text representation/embedding models like BERT). These types of language models empower the majority of the latest developments in natural language understanding and generation.

Your feedback is well taken; we'll work to make these more reachable from the homepage.


I don't know if they have the world's most powerful NLP toolkit (what does that even mean?), but they do have a strong team. I think it's worth a second look.


I don't question their team. It at least sounds mightily impressive (just checked). My problem was that this form of marketing claim put me off.

Might be a cultural thing.

I wasn't even interested in spending more time than looking at around 5 pages to see if the claim might be substantiated somewhere. It wasn't obvious so the heuristic is that it is not substantiated.


> I don't question their team. It at least sounds mightily impressive (just checked). My problem was that this form of marketing claim put me off.

I'm totally with you, and...

> Might be a cultural thing.

...for me it might actually be a "cultural" thing. I'm from Norway, and in Norway it's illegal for companies to _claim_ that their products are the best, cheapest etc. So to me, whenever someone claims this - like in this case - my BS-instincts immediately kick in, and I become very skeptical.

Sorry for the digression. :)


I’ve worked with some truly excellent people who came from Theranos.


Their combined total bench press weight is truly outstanding.


thank you


I just want to note the replies to this thread are excessively dismissive and toxic. You may not agree with the wording of their advertising ("world's most powerful NLP toolkit" is marketing speak, sure) but going from that to implying the technical side is "only Min-GPT" is tremendously weird. As someone who works in machine learning and specifically language models this is a team I'm keeping an eye on.

For anyone who wanted more technical discussion re: ML / LM (though the author notes this work "[does] not reflect the architectures or latencies of my employer's models" i.e. it's an exploratory technical breakdown of general model characteristics) I've appreciated the technical write-ups from @kipperrii (ML ops @ Cohere) recently:

- Transformer Inference Arithmetic: https://carolchen.me/blog/transformer-inference-arithmetic/

- Breakdown of H100s for Transformer Inferencing: https://carolchen.me/blog/h100-inferencing/


They should put that content on their website. I also thought that the comments were a bit harsh but then visited the site and was immediately put off myself. They have a really great team and could do a great job of conveying that through content.


I think this post was like a trick question on an exam. The sentiment in this thread is negative (at the point in time I am posting). You can say whatever you want, but they ended up raising a lot of capital from well-known VC firms [1] and have a few of the smartest guys working there. Does that mean the org is legit? I don't know. Does that mean their claims are valid? I don't know. Jumping to a negative conclusion without doing proper due diligence is something I have been seeing a lot lately.

Criticism is different from downright pessimism. I know founders and entrepreneurs should have thick skin, but few people who could be potential customers or investors take these threads and the sentiment on Hacker News with a grain of salt.

[1] https://www.crunchbase.com/organization/cohere-82b8


Hey all, I'm Aidan! I cofounded Cohere.

Really appreciate the folks hyping up our team heheh

Also the feedback on the marketing copy being shit really resonates. We're going to launch a refresh pretty soon, which I'll make sure reads better.

It would be super helpful to me if some of you here would be willing to take an advance look and just tear it up with some critical feedback!

Also, anytime you have feedback about the product, like model quality or the experience of onboarding and using the product, please shoot me an email!!

I'm at aidan.gomez@cohere.ai

Thanks again all!


BS marketing aside, there is a real market for Cohere, GPT-3 and other large language model APIs. Tonnes of companies want to generate embeddings from text or perform other NLP tasks, without having to hire machine learning experts. Why handle the infrastructure, training data, and model training yourself? Let someone else handle that. POST your text and get an embedding vector back.

NLP is going the same way as the rest of "cloud", towards managed services. Plenty of companies will roll their own ML models. If NLP is a core part of your business it might make sense to hire a machine learning team. If NLP is just one small feature that might be scrapped in 3 months after an A/B test shows negative results, using a managed service is a no brainer.
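
As a rough illustration of the "POST your text and get an embedding vector back" workflow, here is a minimal Python sketch. The endpoint URL, the `texts` request field, and the `embeddings` response field are all hypothetical placeholders, not the API of Cohere or any other real provider; check the provider's documentation for the actual shapes.

```python
import json
import urllib.request

# Hypothetical endpoint and payload shape, for illustration only.
EMBED_URL = "https://api.example.com/v1/embed"


def build_embed_request(texts, api_key):
    """Build (but do not send) a POST request carrying the texts to embed."""
    payload = json.dumps({"texts": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        EMBED_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def parse_embed_response(body):
    """Extract one vector per input text from an assumed JSON response body."""
    return [[float(x) for x in vec] for vec in json.loads(body)["embeddings"]]


request = build_embed_request(["hello world"], api_key="YOUR_KEY")
# Sending would be: urllib.request.urlopen(request).read()
# Here a canned response stands in for the server.
vectors = parse_embed_response(b'{"embeddings": [[0.1, -0.2, 0.3]]}')
```

The point of the sketch is how little the caller has to own: no training data, no model weights, no GPU serving, just a request and a list of floats.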


There is a market, I agree, but claiming "most powerful" in relation to embeddings and text generation is hard to believe. Just last week Google published PaLM, a 540B-parameter LM which seems to top GPT-3.


I went to university with these folks, they worked with Hinton and wrote the transformers paper (T in GPT) - just wanted to mention that there is real tech behind all of the marketing speak


World's most powerful data scientist here. I disagree.


The problem with Cohere is that it’s an “all-in” platform. They want to control everything from data collection to training to ops. It’s the most lock-in thing you can do from a data science perspective, and it is intended to compete with Sagemaker.


Cohere (language understanding and generation with large language models) is very different from Sagemaker (general ML platform).

Cohere abstracts training and deploying language models for developers and companies that don't have an army of MLEs to collect billions of training tokens and figure out TPU/GPU training/serving of massive models.

Consider that BERT was published in 2018 and then put into mass production to power Google Search in 2019 [1]. For companies and devs other than big tech, the cost and required knowhow to put these models into production is staggering. Even deploying open source models (which we love) requires overhead in compute and knowhow. Services like Cohere lower the barrier for those who need access to this tech in a managed way.

Generation use cases often don't even involve user data beyond an input prompt. In embedding use cases, the user only sends the text they want embedded and gets their vectors in return.

[1] https://blog.google/products/search/search-language-understa...

Edit: Cohere engineer here.


Do you work for Cohere?


Yeah, mentioned it earlier, but added a note to this thread too.


OK, I just don't trust "the world's most".


I wonder, why are you calling yourself the most powerful?

Any comparison to solutions from Hugging Face or John Snow Labs?

I know these guys have been claiming SOTA for a few years and don't see how Cohere would be any more powerful. Do you provide training?


This is not to be mistaken with the company https://cohere.io/ that lets you monitor your web users in real-time.


For the discussion forum spam detection use case, is there any service recommendation (or an open library model) to detect whether a user-submitted comment/post is spam/advertisement?


Content moderation extends to plenty of sub-problems other than spam. A lot of use cases need detection of different types of online harm (bullying or hate speech, for example). A lot of these cases can be improved by training classifiers based on language models that better understand the context and complexities of language.
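
To make "classifiers based on language models" a bit more concrete, here is a minimal sketch of one simple approach: a nearest-centroid classifier over embedding vectors. This is a deliberately simple stand-in, not how any particular service does moderation, and the 3-dimensional vectors are made up for illustration; real embeddings would come from a language model and have hundreds or thousands of dimensions.

```python
import math


def centroid(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train(labeled):
    """labeled: dict mapping label -> list of embedding vectors."""
    return {label: centroid(vecs) for label, vecs in labeled.items()}


def classify(model, vector):
    """Return the label whose centroid is most similar to the vector."""
    return max(model, key=lambda label: cosine(model[label], vector))


# Toy 3-dimensional vectors standing in for real embeddings (made up).
model = train({
    "spam": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "ok": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
})
label = classify(model, [0.85, 0.15, 0.05])
```

With a handful of labeled examples per category (spam, bullying, hate speech, and so on), the same structure extends to more labels; a trained linear classifier on top of the embeddings would usually be the next step up.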


This is probably an API gateway fronting Spark NLP and a mini GPT. They have 25 engineers, what do they all do?


They’ve built their own large language model.


This is a SaaS.


S**t-as-a-Service?

  Language models trained on such data encode the hegemonic viewpoint; Jo and Gebru, 2021 detail issues and solutions around this topic in-depth. Enhancing the diversity of our training data is a top priority as we continue to iterate our data collection process.
(Source: https://docs.cohere.ai/data-statement#source-demographics)

Hegemonic viewpoint?? Maybe I'm late for my mutual criticism session.


Gebru.


How modest of them. Lately, state of the art lasts only one week in NLP.


Well, Jay Alammar (one of the greats of our field) is working there so I automatically believe in them. Too bad Jay won't connect to me on LinkedIn, and likely played a part in my resume being rejected before the initial phone screen!

My own heroes think I am nothing!


Sorry if I've missed your connection request. Please email me at alammar at gmail if I can be of service.



