Sure, but the scope is quite broad. For example, we might need to explain logs from a relatively rare app or framework. In such cases, a large model may happen to know about it from its training data.
Take a look at Coroot [0], which stores logs in ClickHouse with configurable TTL. Its agent can discover container logs and extract repeated patterns from logs [1].
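To make the "repeated patterns" idea concrete, here is a minimal Python sketch. It is not Coroot's actual algorithm, just one common approach: mask the variable parts of each line (IPs, hex values, numbers) and count the resulting templates. The masking rules and the function names `to_pattern` and `top_patterns` are made up for illustration.

```python
import re
from collections import Counter

# Hypothetical masking rules; order matters (most specific first).
_MASKS = [
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b0x[0-9a-fA-F]+\b"), "<hex>"),
    (re.compile(r"\b\d+\b"), "<num>"),
]


def to_pattern(line: str) -> str:
    """Replace variable tokens with placeholders so similar lines collapse."""
    for regex, placeholder in _MASKS:
        line = regex.sub(placeholder, line)
    return line


def top_patterns(lines, n=3):
    """Return the n most frequent patterns with their counts."""
    return Counter(to_pattern(line) for line in lines).most_common(n)


logs = [
    "connection from 10.0.0.1 closed after 120 ms",
    "connection from 10.0.0.2 closed after 98 ms",
    "failed to fetch user 42",
    "failed to fetch user 7",
    "connection from 10.0.0.3 closed after 101 ms",
]

for pattern, count in top_patterns(logs):
    print(count, pattern)
```

Real-world pattern extraction usually needs smarter tokenization than three regexes, but the collapse-then-count shape stays the same.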
Nik and Anton here - we're building an open-source observability tool that transforms telemetry data into actionable insights.
While there are many excellent observability tools on the market, pinpointing the root cause of an incident often requires manual searching through a sea of metrics, logs, and traces.
Coroot acts as a virtual assistant, conducting system audits just like an experienced engineer would:
- It uses telemetry data collected through eBPF to construct a model of the distributed system and understand its topology.
- It traverses the dependency graph and audits every relevant service to identify the root cause.
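The two steps above can be sketched roughly as follows. This is a toy illustration in Python, not Coroot's implementation: the service map, the `health` results, and the function `audit_dependencies` are all hypothetical, and a plain breadth-first walk stands in for whatever traversal the real tool uses.

```python
from collections import deque


def audit_dependencies(graph, health, start):
    """Walk the dependency graph breadth-first from `start` and
    return the services whose (hypothetical) audit failed, in visit order."""
    seen = {start}
    queue = deque([start])
    suspects = []
    while queue:
        service = queue.popleft()
        # Services with no recorded audit result are treated as healthy.
        if not health.get(service, True):
            suspects.append(service)
        for dep in graph.get(service, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return suspects


# Hypothetical topology: service -> services it depends on.
graph = {
    "frontend": ["cart", "catalog"],
    "cart": ["redis"],
    "catalog": ["postgres"],
    "redis": [],
    "postgres": [],
}
# Hypothetical audit results: True means the service looks healthy.
health = {
    "frontend": False,
    "cart": True,
    "catalog": False,
    "redis": True,
    "postgres": False,
}

print(audit_dependencies(graph, health, "frontend"))
```

In this made-up example the failing chain is frontend, catalog, postgres, so the deepest unhealthy dependency (postgres) is the most likely root cause candidate.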
Our journey has been long, but we've made great strides in learning how to build better models of distributed systems and in improving our agents to gather the right metrics.
Coroot is an open-source product (Apache 2.0), so you can self-host it for free. We charge for a cloud version with AI-based Root Cause Analysis, RBAC, and premium support.
We also plan to convert structured logs into OpenTelemetry attributes [2].
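As a rough idea of what such a conversion could look like (a sketch only, since the feature is still planned and its design may differ), here is one way to flatten a JSON log line into dotted attribute keys in Python. The function names `flatten` and `log_to_attributes` and the dotted-key convention are assumptions for illustration.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into a single dict with dotted keys."""
    attrs = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            attrs.update(flatten(value, name))
        else:
            attrs[name] = value
    return attrs


def log_to_attributes(raw_line):
    """Parse a JSON log line into flat attributes; return {} if it is not a JSON object."""
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return {}
    return flatten(record) if isinstance(record, dict) else {}


line = '{"level": "error", "http": {"status": 500, "path": "/cart"}}'
print(log_to_attributes(line))
```

A real converter would also have to handle value types that OpenTelemetry attributes don't accept as-is (for example nulls or lists of mixed types); this sketch passes them through unchanged.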
[1] https://demo.coroot.com/p/tbuzvelk/applications/default:Depl...
[2] https://github.com/coroot/coroot/issues/490