
It seems to be the way with these releases, sticking with Claude, at least for the 'hard' tasks. In my agent platform I have LLMs assigned to easy/medium/hard categorised tasks, an approach partly inspired by the Claude 3 release with Haiku/Sonnet/Opus. GPT-4o mini has bumped Haiku for the easy category for now, and Sonnet 3.5 bumped Opus for the hard category, so I could possibly downgrade the medium tasks from Sonnet 3.5 to Mistral Large 2 if the price is right, since it's only 123B params compared to 405B. I was surprised how much Llama 3 405B was on together.ai: $5/M tokens for input/output! I'll stick to Sonnet 3.5. Then I was also surprised how much cheaper Fireworks was at $3/M.
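The routing scheme above can be sketched in a few lines. This is a minimal illustration, not my actual platform code, and the model identifiers are just stand-ins for whichever models currently hold each tier:

```python
# Difficulty-tiered model routing: each task category maps to the model
# currently assigned to it. Model ids here are illustrative placeholders.
MODEL_BY_DIFFICULTY = {
    "easy": "gpt-4o-mini",        # bumped Haiku for easy tasks
    "medium": "claude-3-5-sonnet",  # candidate to swap for Mistral Large 2
    "hard": "claude-3-5-sonnet",    # bumped Opus for hard tasks
}

def pick_model(difficulty: str) -> str:
    """Return the model id assigned to a task's difficulty tier."""
    try:
        return MODEL_BY_DIFFICULTY[difficulty]
    except KeyError:
        raise ValueError(f"unknown difficulty: {difficulty!r}")
```

Swapping a tier is then a one-line config change rather than touching call sites, which is what makes it cheap to re-evaluate the assignments whenever a new model drops.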

Gemini has two aces up its sleeve now: the long context and the context caching for 75% reduced input token cost. I was looking at the "Improving Factuality and Reasoning in Language Models through Multiagent Debate" paper the other day, and thought Gemini would have a big cost advantage implementing this technique with the context caching, since every debate round re-sends the same long prompt. If only Google could get their model up to the level of Anthropic's.
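For anyone unfamiliar with the paper, the debate loop itself is simple: several agents answer independently, then each revises after seeing the others' answers, and you aggregate at the end. A toy sketch, with a hypothetical `ask` callable standing in for a real LLM call (the caching win comes from the shared question/context being re-sent on every call of every round):

```python
# Toy multi-agent debate loop in the style of the Du et al. paper.
# `ask(question, context)` is a hypothetical stand-in for an LLM call;
# `context` carries the other agents' previous answers for revision.
from collections import Counter

def debate(ask, question, n_agents=3, rounds=2):
    """Run a debate and return the majority answer from the final round."""
    # Round 1: independent answers.
    answers = [ask(question, context=None) for _ in range(n_agents)]
    # Later rounds: each agent revises given the others' answers.
    for _ in range(rounds - 1):
        answers = [
            ask(question, context=[a for j, a in enumerate(answers) if j != i])
            for i in range(n_agents)
        ]
    # Aggregate by majority vote.
    return Counter(answers).most_common(1)[0][0]
```

With n_agents × rounds calls all repeating the same long input, a 75% discount on cached input tokens compounds quickly, which is why the technique looks much cheaper on a caching-capable API.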


