I wonder if anyone has applied trace capabilities like head- and tail-based sampling _for logging_. You could potentially reuse the same logic to determine that a log doesn't need to be generated at all, because no other logs for that transaction exist ahead of it or before it.
These are useful capabilities in tracing today that could translate well to logs too, so long as everything is stitched together, which is what these tracing libraries are responsible for.
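Roughly what I have in mind, as a toy Python sketch. The context object and function names are made up for illustration, not any particular library's API:

    # Reuse the trace's head-based sampling decision to gate log
    # emission. Hypothetical names, not a real tracing library's API.
    import random

    SAMPLE_RATE = 0.1  # keep ~10% of traces, and the logs under them

    class TraceContext:
        def __init__(self, trace_id):
            self.trace_id = trace_id
            # Head-based: decide once, up front, at the root of the trace.
            self.sampled = random.random() < SAMPLE_RATE

    def log(ctx, message):
        # Logs inherit the trace's decision, so a kept trace keeps all
        # of its logs and a dropped trace never generates any.
        if ctx.sampled:
            print(f"trace={ctx.trace_id} {message}")

    ctx = TraceContext("abc123")
    log(ctx, "charge card")   # emitted only if the trace was sampled
    log(ctx, "send receipt")  # same decision, so the logs stay coherent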
I'm not aware of anyone doing tail-based sampling for logs (i.e., gathering a batch of logs and sampling the group coherently), but some of the sampling techniques mentioned in the post are absolutely used in structured logging systems.
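To illustrate what "sampling the group coherently" would mean here, a toy sketch of tail-based sampling applied to logs. The buffering scheme, keying, and keep/drop rule are all assumptions for illustration, and the buffering is the hard part in a real system:

    # Buffer a request's logs, then keep or drop the whole group once
    # the request's outcome is known (the "tail" of the request).
    import random

    def flush(buffered_logs):
        # buffered_logs: {request_id: [(level, message), ...]}
        kept = []
        for request_id, group in buffered_logs.items():
            # Keep every group containing an error, plus 1% of the rest,
            # so each surviving request's logs are complete.
            has_error = any(level == "error" for level, _ in group)
            if has_error or random.random() < 0.01:
                kept.extend((request_id, level, msg) for level, msg in group)
        return kept

    logs = {
        "req-1": [("info", "start"), ("error", "db timeout")],
        "req-2": [("info", "start"), ("info", "ok")],
    }
    print(flush(logs))  # req-1 always survives intact; req-2 rarely does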
At least with Honeycomb, for logs unassociated with traces, you can use dynamic sampling[0] to significantly cut down on total event volume while baking in good representativeness. And when you have logs correlated with traces, like how OTel does it, you can sample all the logs correlated to a trace when that trace is also sampled[1]. We've also been doing some thinking about what it would look like to batch up and coherently sample "likely related but not correlated by ID" groups of logs. It's definitely in the realm of doable, and TBH I find it a little strange that people haven't sought out these kinds of solutions at large yet. I think most folks still assume that the only way to reduce volume is to give up some notion of representativeness, and that's just not true.
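For a sense of how the dynamic sampling idea works, here's a toy sketch. The rate formula and windowing are simplified stand-ins, not Honeycomb's actual algorithm:

    # Downsample chatty keys, keep rare ones (e.g., errors) at full
    # rate, and record each kept event's rate so counts can be
    # re-weighted at query time to stay representative.
    import random
    from collections import Counter

    class DynamicSampler:
        def __init__(self, target_per_key=10):
            self.target = target_per_key
            self.counts = Counter()  # in practice, reset per time window

        def sample(self, key):
            self.counts[key] += 1
            # Aim for roughly `target` kept events per key: a key seen
            # 1000 times gets rate 100, a key seen 5 times gets rate 1.
            rate = max(1, self.counts[key] // self.target)
            if random.randrange(rate) == 0:
                return rate  # keep; attach rate as the event's weight
            return None      # drop

    sampler = DynamicSampler()
    for i in range(1000):
        key = "GET /health 200" if i % 50 else "POST /charge 500"
        rate = sampler.sample(key)
        if rate is not None:
            pass  # emit the log event with sample_rate=rate attached

The query side then multiplies each kept event by its recorded rate, which is how you cut volume without giving up representativeness.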