I've currently done this, and I'm seriously considering *undoing* it in favor of...

bamboozled · on Aug 9, 2024

Isn’t the problem here that your code is crashing and you’re relying on the wrong tool to help you solve that?

JoshTriplett · on Aug 9, 2024

Logging solves this problem. If OTel and observability are attempting to position themselves as a better alternative to logging, they need to solve the problems that logging already solves. I'm not going to use completely separate tools for logging and observability.

Also, "crash" here doesn't necessarily mean "segfault" or equivalent. It can also mean "hang and not finish (and thus not end the span)", or "have a network issue that breaks the ability to submit observability data" (but after an event occurred, which could have been submitted if OTel didn't wait for spans to end first). There are any number of reasons why a span might start but not finish, most of which are bugs, and OTel and tools built upon it provide zero help when debugging those.
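To make the "span starts but never finishes" failure mode concrete, here is a minimal stdlib-only sketch of the alternative being asked for: ship a record when the span starts instead of waiting for it to end. Everything here (`eager_span`, `emit`, `EMITTED`, `unfinished_spans`) is a hypothetical name for illustration, not an OTel API, and the in-memory list stands in for whatever log or event exporter you actually use.

```python
import json
import time
import uuid
from contextlib import contextmanager

# Hypothetical in-memory sink standing in for a real log/event exporter.
EMITTED = []


def emit(record):
    """Ship one record immediately (here: append a JSON line to a list)."""
    EMITTED.append(json.dumps(record))


@contextmanager
def eager_span(name):
    """Emit a start record on entry and an end record on exit.

    If the body hangs or the process dies, the start record has already
    been shipped, so the unfinished span is still visible downstream.
    """
    span_id = uuid.uuid4().hex[:16]
    emit({"event": "span_start", "span_id": span_id, "name": name, "ts": time.time()})
    try:
        yield span_id
    finally:
        emit({"event": "span_end", "span_id": span_id, "name": name, "ts": time.time()})


def unfinished_spans(lines):
    """Return the names of spans that have a start record but no end record."""
    open_spans = {}
    for line in lines:
        rec = json.loads(line)
        if rec["event"] == "span_start":
            open_spans[rec["span_id"]] = rec["name"]
        elif rec["event"] == "span_end":
            open_spans.pop(rec["span_id"], None)
    return sorted(open_spans.values())
```

Running `unfinished_spans` over the shipped records is then a cheap way to list the operations that started and never completed, which is exactly the case where an end-of-span-only export leaves nothing to look at.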

phillipcarter · on Aug 10, 2024

OTel logs are just your existing logs, though. If you have a way to say "whoopsie it hung" then this doesn't need to be tied to a trace at all. The only tying to a trace that occurs is when there's an active span/trace in context, at which point the SDK or agent you use will wrap the log body in that span/trace ID. Export of logs is independent of trace export and will be in separate batches.

Edit: I see you're a major Rust user! That perhaps changes things. Most users of OTel are in Java, .NET, Node, Python, and Go. OTel is nowhere near as developed in Rust as it is for these languages. So I don't doubt you've run into issues with OTel for your purposes.
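The correlation behavior described above (a log record picks up span/trace IDs only when a span is active, and is otherwise an ordinary log line) can be sketched with nothing but the standard library. This is an illustration of the idea, not the OTel SDK: `current_span`, `SpanContextFilter`, and `make_logger` are hypothetical names, and the `ContextVar` stands in for the SDK's real context propagation.

```python
import contextvars
import logging

# Hypothetical stand-in for the SDK's "currently active span" context.
current_span = contextvars.ContextVar("current_span", default=None)


class SpanContextFilter(logging.Filter):
    """Stamp each record with the active trace/span ID, or "-" if none."""

    def filter(self, record):
        ctx = current_span.get()
        record.trace_id = ctx["trace_id"] if ctx else "-"
        record.span_id = ctx["span_id"] if ctx else "-"
        return True  # never drop the record, only annotate it


def make_logger(stream):
    """Build a logger that writes correlated lines to the given stream."""
    logger = logging.getLogger("correlation-sketch")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(levelname)s trace=%(trace_id)s span=%(span_id)s %(message)s")
    )
    handler.addFilter(SpanContextFilter())
    logger.addHandler(handler)
    return logger
```

A log call made with no span set goes out immediately with `trace=- span=-`; the same call made after `current_span.set({...})` carries the IDs. Either way the line is written when it is logged, independent of when (or whether) the span ends.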

growse · on Aug 9, 2024

Can you give an example of an event that's not part of a span / trace?

Spivak · on Aug 9, 2024

Unhandled exceptions are a pretty normal one. You get kicked out to your app's topmost level and you've lost your span. My wishlist to solve this (and I actually wrote an implementation in Python which leans heavily on reflection) is to be able to attach arbitrary data to stack frames and exceptions and, when an exception occurs, merge all the data top-down and send it up to your handler.

Signal handlers are another one and are a whole other beast simply because they're completely devoid of context.
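For the exception half of that wishlist, a rough sketch of the idea is below. It is not the reflection-based implementation mentioned above; it swaps in a plain context manager that each frame opts into, and `annotate` and the `context_data` attribute are hypothetical names. As the exception unwinds, every enclosing `annotate` block merges its data into the exception, so the top-level handler sees the combined context.

```python
from contextlib import contextmanager


@contextmanager
def annotate(**data):
    """Attach key/value context to any exception passing through this block."""
    try:
        yield
    except BaseException as exc:
        # Unwinding visits the innermost block first. setdefault keeps the
        # innermost value when keys collide, which matches merging the
        # data top-down with inner frames overriding outer ones.
        ctx = getattr(exc, "context_data", None)
        if ctx is None:
            ctx = {}
            exc.context_data = ctx
        for key, value in data.items():
            ctx.setdefault(key, value)
        raise  # re-raise the same exception, now annotated
```

A top-level handler can then read `getattr(exc, "context_data", {})` and attach it to whatever it reports. This does nothing for signal handlers, which still run without any of this context.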

growse · on Aug 10, 2024

Two good examples - thank you.

They're icky (as language design / practices) to me precisely because you end up executing context-free code. But I'd probably also just start a new trace in my signal handler / exception handler tagged with "shrug"...
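A sketch of the "just start a new trace in the handler" approach, assuming a stdlib-only stand-in for the tracer: `report_uncaught`, `install`, and `REPORTS` are hypothetical names, and the dict is a placeholder for a real span. The `exception.type` and `exception.message` keys are meant to mirror OTel's exception attribute naming; treat that mapping as an assumption.

```python
import sys
import time
import uuid

# Hypothetical sink standing in for a span exporter.
REPORTS = []


def report_uncaught(exc_type, exc, tb):
    """Record a standalone, single-span trace for an uncaught exception."""
    now = time.time()
    REPORTS.append({
        "trace_id": uuid.uuid4().hex,       # fresh trace: the original context is gone
        "span_id": uuid.uuid4().hex[:16],
        "name": "uncaught_exception",
        "start": now,
        "end": now,                         # zero-length span, closed immediately
        "attributes": {
            "exception.type": exc_type.__name__,
            "exception.message": str(exc),
            "context": "unknown",           # the "shrug" tag
        },
    })


def install():
    """Chain the reporter in front of the existing sys.excepthook."""
    previous = sys.excepthook

    def hook(exc_type, exc, tb):
        report_uncaught(exc_type, exc, tb)
        previous(exc_type, exc, tb)  # keep the default traceback printing

    sys.excepthook = hook
```

Because the span is opened and closed inside the handler, it does not depend on the original span ever ending, but it also cannot be linked back to the trace that was in flight unless that ID was stashed somewhere the handler can reach.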

jononor · on Aug 9, 2024

Can't you close the span on an exception?

JoshTriplett · on Aug 9, 2024

See https://news.ycombinator.com/item?id=41205665 for more details.

And even in the case of an actual crash, that doesn't necessarily mean the application is in a state to successfully submit additional OTel data.