Ten hours and counting! That's made a lovely mess of my day. Fell back to the Mistral API for some things, but it's (a) much slower and (b) not as good. Reminder to self: have fail-over in place already, rather than having to whip it up on the fly.
Same here; it broke our product. The issue is that there's an (albeit small) subset of tasks where Claude is best in class and can't be replaced by any other model without a drop in quality. Even so, I'm taking away the same lesson, and it's unfortunately leading me to consider defaulting to other providers for new tasks, despite generally preferring Claude.
I’m not going to be defaulting to other providers for new tasks - just putting a fail-over in place.
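For anyone curious what that looks like, here's a minimal sketch of the fail-over pattern: try the primary provider, fall back to the next on any error. The provider functions and behaviour here are hypothetical stand-ins, not my actual setup:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm-failover")

def call_claude(prompt: str) -> str:
    # Stand-in for a real Anthropic API call; here it simulates an outage.
    raise TimeoutError("primary provider unavailable")

def call_mistral(prompt: str) -> str:
    # Stand-in for a real Mistral API call.
    return f"(mistral) answer to: {prompt}"

# Providers in priority order: primary first, fallback(s) after.
PROVIDERS = [("claude", call_claude), ("mistral", call_mistral)]

def complete(prompt: str) -> str:
    """Try each provider in order; return the first successful response."""
    last_err: Exception | None = None
    for name, fn in PROVIDERS:
        try:
            return fn(prompt)
        except Exception as err:  # fail over on any provider error
            log.warning("provider %s failed: %s; failing over", name, err)
            last_err = err
    raise RuntimeError("all providers failed") from last_err

print(complete("Summarise this support ticket."))
```

The point isn't the ten lines of code, it's having the fallback path wired up and tested before the outage, so switching is a config change rather than a scramble.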
Out of interest, what small set of tasks do you find Claude to be best for? I find it significantly better for most things. The only thing I've found it isn't better at, for my use cases, is identifying specific pieces of (somewhat specialist) machinery and equipment from images, where I'm still getting stronger results via OpenAI.
We mostly do multimodal tasks (vision + text), and there the differences between flagship models are still much bigger than on pure text. For us, the benchmarks showing them all being close are pretty meaningless; it really depends on the specific task when vision is involved.
Our pure text tasks are generally quite simple, so for price and speed reasons those don't use Sonnet but instead Llama 3.0, a very old version of GPT-3.5 Turbo (the newer versions are awful), or GPT-4o-mini.
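Roughly, the routing ends up looking like this. A toy sketch with placeholder model identifiers, not our real config:

```python
from dataclasses import dataclass

@dataclass
class Task:
    kind: str        # "vision" or "text"
    complexity: str  # "simple" or "hard"

def pick_model(task: Task) -> str:
    # Vision: quality gaps between flagships are large, so the model is
    # pinned per task after evaluation rather than using a single default.
    if task.kind == "vision":
        return "flagship-vision-model"  # placeholder, chosen per task
    # Simple text: optimise for price and speed with a cheap model.
    if task.complexity == "simple":
        return "cheap-fast-text-model"  # placeholder for a small/turbo model
    # Everything else goes to the strongest general model.
    return "claude-sonnet"  # placeholder for the default strong model

print(pick_model(Task(kind="text", complexity="simple")))
print(pick_model(Task(kind="vision", complexity="hard")))
```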
Sorry for the hassle, Sam. We're seeing decreasing error rates on the API now. This was painful -- appreciate your patience while we work through the issue with one of our infra providers.
I bet your day was even more filled with hassle! But I genuinely appreciate the response. One thing this has really highlighted to me is that you're miles ahead of the alternative APIs available. It's also shown quite how much I've come to rely on not just the API but the other little workflows I've made in Claude itself. I felt quite bereft for most of the morning.