Hacker News | pranay01's comments

open source observability platform - https://github.com/signoz/signoz


SigNoz | US Remote | Forward Deployed Engineer, DevRel and Growth Marketing | Full-Time | $120K-$200K

SigNoz is an open source observability platform based natively on OpenTelemetry. We help developers monitor their applications & infrastructure, and troubleshoot problems quickly. https://github.com/SigNoz/signoz

We have crossed 23000+ GitHub stars, 6000+ members in the Slack community and 150+ contributors.

We are expanding our US team & hiring for the following roles:

Forward Deployed Engineer - https://jobs.ashbyhq.com/SigNoz/8f0a2404-ae99-4e27-9127-3bd6...

DevRel Engineer - https://jobs.ashbyhq.com/SigNoz/8447522c-1163-48d0-8f55-fac2...

Growth Marketing - https://jobs.ashbyhq.com/SigNoz/a3644d27-3dfc-44cb-962d-5f27...

If interested, reach out directly to us at hiring@signoz.io. Mention that you saw this post on HN.


like that it is based on OTel. can you share the project if it is public?


I guess agents are making workflows much smarter: the LLM can decide which tools to call and make a decision, rather than following a condition-based workflow.

Agents are not that different from what a lot of us are already doing. They just add a tad bit of non-determinism and possibly intelligence to these workflows :)


looks like everyone is just BS-ing like this CTO person. AI seems to have attracted the most toxic ppl.


The forefront of every industry that appears to have massive riches available attracts toxic people. It doesn't even need to be tech; resource rushes like the Gold Rush saw the same behavior.


One thing we have observed is that Phoenix doesn't completely stick to OTel conventions.

More specifically, one issue I observed is how it handles span kinds. If you send spans via OTel, the span kinds are classified as unknown.

e.g. the Phoenix screenshot here - https://signoz.io/blog/llm-observability-opentelemetry/#the-...


Phoenix ingests any opentelemetry compliant spans into the platform, but the UI is geared towards displaying spans whose attributes adhere to “openinference” naming conventions.

There are numerous open community standards for where to put llm information within otel spans but openinference predates most of em.


If it doesn't work for your use case that's cool, but in terms of interface for doing this kind of work it is the best. Tradeoffs.


I’ve found phoenix to be a clunky experience and have been far happier with tools like langfuse.

I don’t know how you can confidently say one is “the best”.


Curious what you prefer from langfuse over Phoenix!


Sorry for the delayed response!

The main thing was wrestling with the instrumentation vs the out of the box langfuse python decorator that works pretty well for basic use cases.

It’s been a while but I also recall that prompt management and other features in Phoenix weren’t really built out (probably not a goal for them, but I like having that functionality under the same umbrella).


Spans labeled as 'unknown' when I definitely labeled them in the code is probably the most annoying part of Phoenix right now.


Yes, it is happening because OpenInference assumes these span kind values https://github.com/Arize-ai/openinference/blob/b827f3dd659fc...

Anything that doesn't match one of those span kinds is classified as `unknown`

For reference, these are span kinds which opentelemetry emits - https://github.com/open-telemetry/opentelemetry-python/blob/...
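A rough sketch of the mismatch, for anyone following along. Assumptions: OpenInference reads its kind from an `openinference.span.kind` attribute and recognises a fixed set of values; the set below and the `classify` helper are illustrative and not copied from either library. The enum only mirrors the five OpenTelemetry span kinds.

```python
from enum import Enum


class OTelSpanKind(Enum):
    """Mirror of the five span kinds OpenTelemetry defines."""
    INTERNAL = 0
    SERVER = 1
    CLIENT = 2
    PRODUCER = 3
    CONSUMER = 4


# Assumed set of OpenInference span kind values (illustrative, may be incomplete).
OPENINFERENCE_KINDS = {
    "LLM", "CHAIN", "TOOL", "RETRIEVER", "EMBEDDING",
    "AGENT", "RERANKER", "GUARDRAIL", "EVALUATOR",
}


def classify(attributes):
    """Hypothetical helper: pick the OpenInference kind from span attributes.

    The OTel span kind (CLIENT, SERVER, ...) plays no part in the lookup,
    so a plain OTel span without the attribute ends up as UNKNOWN.
    """
    kind = str(attributes.get("openinference.span.kind", "")).upper()
    return kind if kind in OPENINFERENCE_KINDS else "UNKNOWN"


# A span that only carries the OTel kind has nothing the lookup can match.
plain_otel_span = {"otel.kind": OTelSpanKind.CLIENT.name}
annotated_span = {"openinference.span.kind": "LLM"}
```

Under these assumptions, `classify(plain_otel_span)` gives `UNKNOWN` and `classify(annotated_span)` gives `LLM`, which matches the behaviour described above.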


> leveraged for agent platforms & orchestration

can you share more on what you mean by this?


Claude Code Agents can be integrated into existing platforms such as github. I can envision agents automatically handling issues with certain tags, or doing pull request reviews, or other such similar trigger based behaviour.

In that kind of orchestration this observability would be invaluable.


interesting. So you mean, say, if an agent is automatically doing a PR review: how many such calls to agents are failing, how much time they are taking, etc?

You can do a lot of this today with traces that capture AI-specific calls.
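A minimal sketch of that kind of rollup, assuming spans have already been exported as plain dicts. The field names (`name`, `status`, `duration_ms`) and the `summarize` helper are hypothetical, not any particular backend's schema.

```python
from collections import defaultdict


def summarize(spans):
    """Per-agent call count, error rate and average duration from span records."""
    totals = defaultdict(lambda: {"calls": 0, "errors": 0, "duration_ms": 0.0})
    for span in spans:
        t = totals[span["name"]]
        t["calls"] += 1
        t["errors"] += 1 if span["status"] == "ERROR" else 0
        t["duration_ms"] += span["duration_ms"]
    return {
        name: {
            "calls": t["calls"],
            "error_rate": t["errors"] / t["calls"],
            "avg_duration_ms": t["duration_ms"] / t["calls"],
        }
        for name, t in totals.items()
    }


# Made-up sample data for illustration only.
example_spans = [
    {"name": "pr-review-agent", "status": "OK", "duration_ms": 100.0},
    {"name": "pr-review-agent", "status": "ERROR", "duration_ms": 200.0},
    {"name": "pr-review-agent", "status": "OK", "duration_ms": 300.0},
    {"name": "issue-triage-agent", "status": "OK", "duration_ms": 50.0},
]

report = summarize(example_spans)
```

In practice a tracing backend would run this aggregation as a query over stored spans; the loop just shows what is being computed.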


I don't think the primary goal here is "surveillance" but a better understanding of where in the team tools like Claude Code are getting adopted, which models are being used, and whether there are best practices to learn around token usage that could make it more efficient.


great to hear. yes, it can help understand how developers are using Claude Code and also optimise token usage etc.


SigNoz maintainer here. Curious, when did you try SigNoz (which version/which timeframe) and do you have any specific feedback on what you don't like about its tracing UI? Would be helpful for us to understand areas to improve on


Sigh. I have _plenty_ of gripes that would be easy to fix. My first "sniff test" for observability platforms is a tool to quickly jump to a given trace/span by ID. You don't have it. Uptrace has it.

Another issue is the complexity of switching between filtered views. A very useful primitive that you and Uptrace are missing: "show this event within the surrounding context". CloudWatch has it.

The other main overarching issue is ease of navigation and switching between contexts. You are actually somewhat better than Uptrace because I can actually cut&paste URLs on most of the pages and send them to my colleague over Slack.

But you make up for that by having bad search in traces (e.g. I can't just search all the traces with the word "UploadDoc" somewhere in them). Here's how Uptrace works: https://imgur.com/a/UWSdIEt

Your "Trace View" is ridiculous: I can't resize columns, I can't drag them to change the order, I can't even _show_ additional columns even though I can sort by them: https://www.loom.com/share/d5fa401384d94959978c0bb2be9010a5?...

Then you also are freaking annoying with the UI. I don't even care about everything getting extra-bloated. It's just par for the course for the modern UI vibe-based design.

But I get almost physically sick from these ridiculous popups: https://www.loom.com/share/21f5efdae8b84b12ba09c45cd2fa0855?...

Honestly, I think that most observability stacks (very much including SigNoz) are focusing on looking hip with cool dashboards. They totally suck when I need to dig deep into logs to find what happened.


thanks for the detailed note

> My first "sniff test" for observability platforms is a tool to quickly jump to a given trace/span by ID.

You should be able to do this in SigNoz https://www.loom.com/share/71a2a95b76584b3983d9eeebb60ac420?...

> "show this event within the surrounding context"

we have this in the context logs. Does this solve your use case, or do you mean something else? https://www.loom.com/share/9039afd5c4bf45e7b357a22c9943bb32?...

>But you make up for that by having bad search in traces

Did you mean for this to search across all attributes in spans, or for when you know which attribute you want to search in? If the latter, then you can do this through our query builder even today.

Your feedback on "Trace View" is fair. We are planning some improvements on that
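To make the two kinds of trace search mentioned above concrete, here is a small sketch over spans held as plain dicts. The span shape and both helper names are hypothetical; this is not how the SigNoz query builder is implemented.

```python
def search_any_attribute(spans, needle):
    """Match spans where the text appears in the name or in any attribute value."""
    needle = needle.lower()
    return [
        s for s in spans
        if needle in s["name"].lower()
        or any(needle in str(v).lower() for v in s["attributes"].values())
    ]


def search_attribute(spans, key, needle):
    """Match spans where the text appears in one known attribute."""
    needle = needle.lower()
    return [
        s for s in spans
        if needle in str(s["attributes"].get(key, "")).lower()
    ]


# Made-up sample spans for illustration only.
sample_spans = [
    {"name": "POST /docs", "attributes": {"rpc.method": "UploadDoc"}},
    {"name": "UploadDoc worker", "attributes": {"queue": "default"}},
    {"name": "GET /health", "attributes": {"http.route": "/health"}},
]

anywhere = search_any_attribute(sample_spans, "UploadDoc")
by_key = search_attribute(sample_spans, "rpc.method", "UploadDoc")
```

The first search has to scan every value on every span, while the second only touches one known key, which is why the two are usually exposed and indexed differently.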


you might want to take a look at SigNoz - https://github.com/SigNoz/signoz logs, metrics & traces in a single pane and you can create advanced alerts and dashboards as well

PS: I am one of the maintainers


I was able to set up SigNoz in about five minutes to view traces in my Dagger builds locally just by exporting the right env vars. It was nice to not have to run and orchestrate three+ tools together


Having to deploy and manage ClickHouse put me off a little - OpenObserve uses its own binary format, so there are zero external dependencies, making it super easy to set up. Maybe ClickHouse is nothing to worry about, but it's something I've never used before.

Also, I wasn't sure if Zookeeper was mandatory even for a single-server SigNoz install?

SigNoz UI certainly looks more polished tho!


ClickHouse doesn't use ZooKeeper anymore, and if you're just using a single server you don't need to worry about coordination :)

ClickStack/HyperDX is a polished OOTB stack that has an all in one image you can deploy to get started, so you don't need to worry about the ClickHouse side until you need to really scale (which is where ClickHouse really shines).


Interesting! Does HyperDX handle logs as well as metrics and traces?


Yeah, handles all the OTel signals


Yeah metrics, logs, traces and can do session replay

