ZeroMQ: High-Performance Concurrency Framework (zeromq.org)
95 points by klaussilveira on July 15, 2024 | 55 comments


ZeroMQ was followed by nanomsg (https://nanomsg.org/) and nng (https://nng.nanomsg.org/)

Some of Martin's rationale here:

https://250bpm.com/blog:23/index.html

ZeroMQ is still widely used and popular, but I am not sure if it is still actively developed.


I've used zeromq, nanomsg, and nng. The differences are subtle and focused on native library support, background threading model, and other systems level things.

All of them are based on specs that are widely published. I've had zero problems implementing real robotic systems in nng, zmq, etc.

And it is so damn easy to use it's amazing to me the whole world doesn't use it.


> it's amazing to me the whole world doesn't use it.

I Googled nng and apparently it's this? [1]

It's written in C so if you click the issues tab the second and third issues are "IPC - Use After Free" and "Setting TLS config option via nng_socket_set_ptr causes access violation if you free config."

Why would you want the world to build other software on this?

[1] https://github.com/nanomsg/nng


Well good to know, but I've never used TLS, since the apps were P2P over a secure overlay, so encrypting payloads was sufficient.


I've had some bad experiences with zeromq in robotic systems contexts: it's very assert-happy, and therefore tends to bring down the whole process in corner cases, and it's quite difficult to debug. It caused me quite a lot of headache and I'm no longer particularly enamored of the approach (the internal architecture is one which makes error propagation very difficult, so even if the individual bugs were fixed there's not a good overall handling strategy).


> And it is so damn easy to use it's amazing to me the whole world doesn't use it.

Perhaps many of them are locked into “the cloud” and “serverless”, by default choosing the proprietary solutions offered by these providers on their platforms?


I'm pretty sure the world doesn't use it more because it doesn't have a flashy marketing campaign and trendy developer tools/libraries don't have a plugin for it. If a whole bunch of developers aren't writing blog posts about it, does it even exist? Plus, it's old, so it's bad.


zmq was pretty dang trendy for a while. It had a well styled website and very wide language support. I think it just failed to deliver on its promise for most users.


I've been looking at these recently for a project.

nng looks promising, but the guide from zmq seemed like a killer feature. It describes all sorts of high level patterns, gotchas, etc.

For nng I mostly found API documentation, which made me a bit more cautious (though to be fair, I've not tried it yet).


I have used ZeroMQ with C, Dlang, and Python -- mostly for learning.

However, I have used NetMQ, a C# implementation of ZeroMQ, in live software, and the results were very positive!

I used a Pub-Sub pattern in one program to keep users informed of the progress of a task, which could take hours to complete. They had a GUI program which spat out updates. It worked really well.

I was also tasked with updating some till software whose integration of orders with the central system was extremely slow. I used NetMQ, which was looking extremely promising, but the work was put on hold due to the IT manager not understanding software developers -- if something starts taking more than a week (and I had stated I needed a month), they get itchy and move on to something else. Sadly, that never got completed.

Now, I have played with NNG, and there are some interesting articles (or hidden pages, from memory) comparing NNG to ZeroMQ - it seems the "patterns" are simplified.

I am currently in the process of creating bindings for NNG. It seems pretty good so far. I plan to move away from NetMQ (C#) in favour of this language moving forward.

Whether you use ZeroMQ or NNG, I don't think you can go wrong. It is all about the process more than anything - ensuring you do not lose data.


In 2014, I was tasked with rebuilding an event processing engine to increase throughput and performance. Used ZeroMQ with C# and also had a very positive experience.

It was very easy to build a multi-node, distributed event processing engine (think Apache Flink) that could scale by simply adding more nodes or threads. ZMQ makes coordination and management of messages easy and low-fanfare.

In our use case, it was stable and it was the least problematic part of a relatively complex platform.


nice!


I looked at 0mq et al, and what I couldn't understand is why sockets have a single exclusive type. Like say I want a process that generally sends out streaming broadcast updates, but is also controlled through request-reply. It would need two separate sockets. Then I'd have to reinvent some sort of ordering protocol across the two (as well as program logic to handle the partially-connected state), which would defeat much of the point of using 0mq?


Without knowing any details, it sounds like a hard problem whether you use 0mq or not.


The problem generally needs to be solved by the transport protocol. Having two sockets communicating in parallel means the problem needs to be solved again.


    news = <set up broadcast (PUB) socket>
    orders = <set up req/rep (REP) socket>

    while 1:
        order = orders.read(with timeout)
        if order:
            orders.send(ack)      # only reply when there actually was a request
            do stuff differently
        news.send(status)
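
Roughly, in pyzmq (endpoints and payloads are made up), using a poller so the REP side only acks an actual request:

    import zmq

    ctx = zmq.Context.instance()

    news = ctx.socket(zmq.PUB)        # broadcast stream
    news.bind("tcp://*:5556")

    orders = ctx.socket(zmq.REP)      # request/reply control channel
    orders.bind("tcp://*:5555")

    poller = zmq.Poller()
    poller.register(orders, zmq.POLLIN)

    mode = b"normal"
    while True:
        if dict(poller.poll(100)):    # an order is waiting (100 ms timeout)
            mode = orders.recv()      # "do stuff differently"
            orders.send(b"ack")       # reply only to an actual request
        news.send(b"status: " + mode)

A client connects a REQ socket to the orders port for commands and a SUB socket to the news port for the status stream.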


And then how does a client know whether an `order` was processed before or after a specific `status`? If you start talking about adding sequence numbers or duplicating application data between `status` and `order` (and its reply), that's the creation of an ad-hoc ordering protocol I'm talking about.


that doesn't sound like a socket level problem, it's more like an application level problem? how long does it take to obey-orders()? do you want to keep spitting out logs while doing it? maybe give orders a number, and the logs can include the order number and the progress?


It's a problem that the transport level would normally solve and then provide a single coherent stream to the higher layer. But with 0mq providing multiple parallel streams between each pair of endpoints means that events from each of those streams aren't necessarily received in relative order. And yes depending how long obey-orders() takes, there are actually two instances of this ambiguity per command - when in the log stream did the call start happening, and when in the log stream did it finish. Assigning serial numbers to commands on the request-reply channel and then having the log channel publish those serials is exactly the type of having to invent an ad-hoc ordering protocol that I was talking about. Of course this can be done, but at the expense of adding accidental complexity and unnecessary resource usage.

There's a similar problem with MQTT where the spec explicitly disclaims the ordering of messages across topics. Lots of people just ignore this and write software that works perfectly fine in the real world where messages generally get transported in order anyway. But it's still technically incorrect and strikes me as a breeding ground for Heisenbugs.
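
Concretely, the ad-hoc protocol looks something like this - a rough pyzmq sketch where the field names and the apply_order/current_status helpers are made up for illustration:

    import zmq

    ctx = zmq.Context.instance()
    news = ctx.socket(zmq.PUB);   news.bind("tcp://*:5556")
    orders = ctx.socket(zmq.REP); orders.bind("tcp://*:5555")

    poller = zmq.Poller()
    poller.register(orders, zmq.POLLIN)

    serial = 0                                    # last order applied
    while True:
        if dict(poller.poll(100)):
            order = orders.recv_json()
            serial += 1
            apply_order(order)                    # hypothetical handler
            orders.send_json({"ack": serial})     # client learns its order's serial
        # every broadcast says which orders it reflects
        news.send_json({"after_order": serial, "status": current_status()})  # hypothetical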


> It's a problem that the transport level would normally solve and then provide a single coherent stream to the higher layer. But with 0mq providing multiple parallel streams between each pair of endpoints means that events from each of those streams aren't necessarily received in relative order

zeromq isn't forcing you to use multiple streams. you can just use one. reqrep, send your command, wait for the response after it's done.


And then how does that client receive the broadcast updates? (`status`)


I would rather use sockets. Not getting errors when a client times out is bad API design to me. I've used zeromq and only kept it around for IPC.

I may have been doing it wrong, but I personally want to know when clients disconnect/reconnect/etc. The API seems to hide all that from you, and your send or recv just blocks.


Imo, ZMQ is more of an abstraction with which to design protocols, rather than a message queue ready to use, Kafka-style. I found unexpectedly that I needed to really read the entire manual and work through the worked examples and some of my own to start getting it, rather than the usual incremental read.

So, you can set up a protocol using ZMQ such that you become aware when a client times out, and you can set behavior regarding the high water mark, and other things, but you have to actually do it explicitly - it's not required that you do it, because you can choose to maximize throughput or minimize latency instead.
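
For example, bounding the send queue and surfacing connection events has to be done explicitly - a small pyzmq sketch (the option names are real socket options; the socket type and values here are arbitrary):

    import zmq
    from zmq.utils.monitor import recv_monitor_message

    ctx = zmq.Context.instance()
    sock = ctx.socket(zmq.PUSH)            # socket type is arbitrary for this example

    sock.setsockopt(zmq.SNDHWM, 1000)      # queue at most 1000 outgoing messages
    sock.setsockopt(zmq.SNDTIMEO, 50)      # a blocked send raises zmq.Again after 50 ms
    sock.setsockopt(zmq.LINGER, 0)         # don't hang on close with unsent messages

    monitor = sock.get_monitor_socket()    # connect/disconnect events show up here
    sock.bind("tcp://*:5557")

    if monitor.poll(1000):
        ev = recv_monitor_message(monitor) # dict with 'event', 'value', 'endpoint'
        print(ev["event"], ev["endpoint"])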

But, whatever protocol behavior / performance you want, you can pretty much build it with ZMQ. In Python, ZMQ was the only "feasible for my not-a-network-engineer self" way to get a system with 100μs latency with sufficient throughput and guarantees (when used for IPC, although not using the IPC transport type). gRPC was a lot less performant for me; granted, it would've been more convenient, but the low latency was a hard need.

Although, networks are one of my noobier areas, so I might be blind in many ways here.


I don't see what ZMQ abstracts that you don't get in TCP already.

Is this all just about having a common cross-language API for TCP? Wasn't "BSD sockets" supposed to be that?


For me, the benefit was a lot of convenience.

And, the component was just "a component" and not "the purpose" of what I was building, so went with ZMQ.

I'm also highly inexperienced in that area, so ZMQ having a singular, well-written, linear manual was a huge benefit.


Pretty much this.

I always found it crazy how zmq gained any traction at all.

"Oh, I have a req/resp workload" - one of the sides restarts, goes out of rhythm with the state of the connection (whether its req or resp), unrecoverable errors.

Every system I've seen use zmq usually uses it without these fancy patterns (just yeet messages in any order), and usually has some sort of "Is anybody there on the other side?" message to combat the fact that there is no way to introspect connection state (otherwise your writes just block at the high water mark), at which point it would have been easier to just use tcp.
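
The usual shape of that workaround is the zguide's "Lazy Pirate" pattern: time out the reply and, because the REQ socket is now wedged mid-conversation, throw it away and reconnect. A rough pyzmq sketch (endpoint and retry count are made up):

    import zmq

    ENDPOINT = "tcp://localhost:5555"      # made up
    TIMEOUT_MS = 2500
    RETRIES = 3

    ctx = zmq.Context.instance()

    def request(payload: bytes):
        client = ctx.socket(zmq.REQ)
        client.connect(ENDPOINT)
        for _ in range(RETRIES):
            client.send(payload)
            if client.poll(TIMEOUT_MS):    # reply arrived in time
                reply = client.recv()
                client.close()
                return reply
            # No reply: the REQ socket can't send again until it receives,
            # so the only recovery is to close it and start over.
            client.setsockopt(zmq.LINGER, 0)
            client.close()
            client = ctx.socket(zmq.REQ)
            client.connect(ENDPOINT)
        client.close()
        return None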

The whole thing to me reeks of mongo. It's great if you are completely incapable/incompetent of solving the problem properly.


> at which point it would have been easier to just use tcp

Funny, I always used ZeroMQ as a message-oriented TCP-like protocol that reconnects automatically.

> It's great if you are completely incapable/incompetent of solving the problem properly.

Oof. I think it's great if you don't want to mess with low-level socket details and just want to write an actor-like messaging protocol.


We've used ZMQ rather successfully, but step 1 of ZMQ is don't use REQ/REP at all, ever. It has advantages primarily around throughput and abstracting details across different mediums (which is sometimes actually a detriment).
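
For reference, the usual substitution is a DEALER socket on the client side, which has no strict send/recv lockstep. A minimal pyzmq sketch (endpoint made up):

    import zmq

    ctx = zmq.Context.instance()
    client = ctx.socket(zmq.DEALER)
    client.connect("tcp://localhost:5555")   # made-up endpoint

    # Talking to a REP server, a DEALER has to add the empty delimiter
    # frame that REQ would normally insert for you.
    client.send_multipart([b"", b"do-something"])

    if client.poll(1000):                    # wait up to 1 s, then move on
        _empty, reply = client.recv_multipart()
        print(reply)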


As a new dev in 2011, I was unsuccessfully trying to make an HTTP worker thing in Python with ZeroMQ, and I was distracted by their docs, which encouraged replacing callback architectures with message passing over inproc queues.


Is this criticism still relevant? 2011 is 13 years (as in: a teenager's life) ago.


FWIW most discussion about inproc is from around 2010, per $SEARCHENGINE. Guess that architecture didn't catch on.


Appreciate the rosettacode-like (chrestomathy) approach to documentation:

https://zguide.zeromq.org/


NATS is the ZeroMQ of today.


They aren't really the same thing. NATS is a message queue/MOM, ZeroMQ is a smarter abstraction over sockets.


People are replying that NATS isn't the same as ZeroMQ. But, I think the missing piece here is that most people who are looking at ZeroMQ really just want something like NATS. There is a decent amount of effort required to make ZeroMQ work. Whereas, NATS just does the thing you wanted in the first place without all that effort.


They're not even remotely the same thing. You could build NATS on ZeroMQ but not the other way around. ZeroMQ is more like a network/message-passing abstraction library while NATS is a service.


I'd 90% agree with this, except that NATS does have an embedded mode.[0] That plus the clustering, gateway, and leaf-node features gets it very close to what ZeroMQ is doing, though at a much higher complexity level that's maybe unnecessary depending on what you're doing. Embedding is exclusive to Go programs, so that's very limited compared to ZeroMQ.

[0]: https://www.youtube.com/watch?v=cdTrl8UfcBo


Even in embedded mode NATS is doing way more than ZeroMQ is, it's a full-blown message queue system while ZeroMQ at best gives you the tools to build a message queue.


I think https://zenoh.io is more of a successor or replacement.


NATS looks a lot like crossbar.io; what are the differences?


Anything new with ZeroMQ? It's been around for a long time.


The licensing was changed about nine months ago. LGPL-3.0+ is out and MPL 2.0 is in. Very thankful for that.


Why are you thankful for the MPL instead of the LGPL? Is there any advantage to the MPL other than being easier to incorporate MPL code into proprietary software?


Making the software easier to use from a legal point of view was indeed the reason. They explained why they did this here: https://github.com/zeromq/libzmq/issues/2376

Bottom line is that their licensing with a static linking exception was kind of weird and creating a lot of issues combining zeromq code even with other open source licenses (like Apache 2.0).

Interesting to see how they gathered permission to do this from the developer community. License changes like this are usually hard to realize unless you insist on copyright transfers. But in this case they managed to do it without that. So it was a collective decision. Hard to argue with that.


Yes, GPL compatibility -- in particular GPL v2 isn't compatible with LGPL v3, but it is compatible with MPL.

Several projects, including some I work on, only found out how much of a mess (L)GPL v2 vs v3 is once important developers had passed away, meaning it's very hard to get out of the resulting mess.


MPL doesn't have an anti-TiVoization clause. The company I'm working for has a complete ban on (L)GPL3.0 source code.


Is the ban because your company does TiVoization? If so, then that sounds like the (L)GPL3.0 is working as intended.


Yes, and yes.

My project is an IoT node that requires secured and auditable software up and down the chain. We can't allow user replacement. We acknowledge that the spirit of LGPL3 is for a reason and it works for a lot of parties, just not us.


How does TiVoization make a product any more auditable?


I always get confused when I see MPL... part of me panics thinking ZeroMQ went with the "Microsoft Public License"

:-)


Similar here... The "look but don't touch" license they used for some things was kind of irksome in practice.


Ah ok. Was wondering why this was on HN front page today. :)


This is probably not the reason.


I've never used ZeroMQ. How does it compare to MPI? I assume it is easier to use?


If an app needs a DB anyway, does ZeroMQ then have advantages over a DB-based MQ, like PostgreSQL LISTEN/NOTIFY or SQLite with update_hook?

Did anybody compare throughput/latency for these approaches? Edit: ... for the basic zmq patterns PUB/SUB, REQ/REP, Client/Server



