It seems to me that this submission is getting a lot of blowback in the comments for 1) the style and 2) the implication that wiring up Python services with HTTP is bad engineering. I don’t think this is productive.
On the first point, yeah Rachel’s posts are kinda snarky sometimes, but some of us find that entertaining particularly when they are highly detailed and thoroughly researched. I’ve worked with Rachel and she’s among the best “deep-dive” userspace-to-network driver problem solvers around. She knows her shit and we’re lucky she takes the time to put hard-earned lessons on the net for others to benefit from.
As for whether “microservices written in Python trading a bunch of sloppy JSON around via HTTP” is bad engineering: it is. Sometimes the flavor of the month is rancid (CORBA, multiple implementation inheritance, XSLT, I could go on). Introducing network boundaries where function calls would work is a bad idea, as anyone who’s dealt seriously with distributed systems for a living knows. JSON-over-HTTP for RPC is lazy, inefficient in both machine time and engineering effort, and trivially obsolete in a world where Protocol Buffers/gRPC, Thrift, and their ilk are so mature.
Now none of this is to say you should rewrite your system if it’s built that way, legacy stuff is a thing. But Rachel wrote a detailed piece on why you are asking for trouble if you build new stuff like this and people are, in my humble opinion, shooting the messenger.
> JSON-over-HTTP for RPC is lazy, inefficient in machine time and engineering effort, and trivially obsolete in a world where Protocol Buffers/gRPC or Thrift and their ilk are so mature.
Laziness is a virtue in our profession :) Huge "citation required" on it being more efficient in engineering effort, though. Most importantly, JSON is ubiquitous and interoperable with any language, without having to rely on gRPC or Thrift implementations and tooling for that language/platform.
Re machine time efficiency: to the extent I understood what she was actually complaining about, none of those issues are attributable to JSON-over-HTTP being used.
On the digression of efficiency... treating efficiency as something you solve by throwing more machine resources at the problem is causing recognizable problems today.
And the calculus of cost doesn't even always work out the same.
That's a prime example of survivorship bias, projects that spent their runway optimizing for machine resources are not around to have their problems recognized.
I had more than one project where people were cheaper than machines, and going tight on budgets for machine resources led to the project surviving instead of dying.
Hell, a big example of that is Stack Overflow, which runs a very busy site on much less hardware than they would have needed otherwise, just by tackling high-level optimization questions up front.
I can spot, however, kernel-mode HTTP servers (including customized ones) and heavy use of a pretty advanced stack with good optimization capabilities. The choice of stack does make a big impact, something they have mentioned several times, with the summary being "paying Microsoft licenses paid back very well compared to using popular open-source stacks".
Remember, performance is also a feature: both a non-functional one (to reduce your costs) and a functional one (to have happier users).
> we’re lucky she takes the time to put hard-earned lessons on the net for others to benefit from.
I genuinely don't see much of a lesson to learn from this particular blog post, and it appears neither did many others on HN. If there is one, beyond "don't use x", it's hard to find it.
I get the impression that this particular post is being upvoted to the top of HN because of who the author is, not necessarily because this post itself has value. This results in a whole bunch of others reading it, wondering why they're wasting their time with such a rambling post.
2. Remember that green threads tend to have problems with fairness of scheduling.
3. JSON decoding gobbles CPU.
4. Scheduling fairness problems increase response time variance.
4½. Green threads also increase it.
5. Don't forget to take retries of timed-out requests into account in protocol design; idempotence is the simplest solution when you can use it.
6. Wake-one semantics to avoid the thundering herd are important for performance when you have multiple threads, and Gunicorn has that thundering herd problem, so you probably don't want to be running it this way on a 64-core box with hyperthreading. (The problem is of course less severe than it was for Apache because the green threads don't thunder.)
7. Gevent uses epoll, not select, poll, or RT signals.
8. EAGAIN and SIGPIPE if you didn't know about those. (Somebody is in today's lucky ten thousand.)
9. What kinds of mechanisms “tend to show up given time in a battle-tested [network server] system.”
10. Your systems don't have to be fragile pieces of shit.
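Point 8 can be seen in a few lines: on a nonblocking socket, a read with no data available fails immediately with EAGAIN/EWOULDBLOCK (surfaced in Python as BlockingIOError) instead of parking the thread. A minimal sketch using a Unix socketpair (the waiting step is the job select/poll/epoll do inside an event loop):

```python
import errno
import select
import socket

# A connected pair of sockets; make one end nonblocking.
a, b = socket.socketpair()
a.setblocking(False)

# Nothing has been sent yet, so the read can't succeed. A blocking
# socket would sit here until data arrived; a nonblocking one fails fast.
try:
    a.recv(4096)
    got_eagain = False
except BlockingIOError as e:
    got_eagain = e.errno in (errno.EAGAIN, errno.EWOULDBLOCK)

# Send some data, wait until the kernel reports the socket readable,
# then the same recv succeeds.
b.send(b"ping")
select.select([a], [], [], 1.0)
data = a.recv(4096)

a.close()
b.close()
print(got_eagain, data)
```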
I'm not sure whether the person I was replying to is The One to whom all these things are too obvious to be worth mentioning, or if these were too implicit for them to notice, or a combination. Either way, no, thank you for writing it.
The takeaway should be: don't use green threads/event loops for anything that involves non-trivial processing, or better yet, don't use them unless you really need to (and “better performance” is not a valid reason).
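The fairness problem is easy to demonstrate without gevent, using asyncio as a stand-in cooperative scheduler: a task that does CPU-bound work without awaiting starves every other task until it finishes. (The loop count below is arbitrary; this is a sketch, not a benchmark.)

```python
import asyncio
import time

async def ticker(intervals):
    # Wants to wake every 50 ms; records how long each wakeup actually took.
    prev = time.monotonic()
    for _ in range(3):
        await asyncio.sleep(0.05)
        now = time.monotonic()
        intervals.append(now - prev)
        prev = now

async def cpu_hog():
    # "Non-trivial processing" that never awaits: a cooperative
    # scheduler has no way to preempt it, so everything else waits.
    total = 0
    for i in range(5_000_000):
        total += i
    return total

async def main():
    intervals = []
    await asyncio.gather(ticker(intervals), cpu_hog())
    return intervals

intervals = asyncio.run(main())
# The first wakeup is delayed until cpu_hog yields the loop back,
# so at least one interval blows well past the requested 50 ms.
print(max(intervals))
```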
One of the people who designed Protobuf has criticized it (Edit: to clarify, one of the authors of v2, see below), so that doesn't really inspire much confidence in it for me.[1] Your general point is correct though, there's much more to a well designed RPC system than what HTTP based systems can do, but protobuf/gRPC is very much lacking in ideas that are decades old at this point, like promise pipelining, etc.
Also, I feel like she (intentionally maybe) is conflating concurrency and parallelism. These "green thread" systems provide concurrency, but not parallelism. That should be something people are aware of when they use them.
> One of the people who designed Protobuf has said it's awful
Hmm, you seem to be citing me. To clarify:
1. I didn't design Protobuf. I just rewrote the implementation (created version 2) and open sourced it.
2. I don't think it's awful. In fact, I think it's best-of-breed for what it is, and certainly a much better choice than JSON-over-HTTP. Yes, in Cap'n Proto I've added a bunch of features which Protobuf doesn't have, like zero-copy and promise pipelining, and I obviously think these features make it better tech. But, to be fair, these ideas make Cap'n Proto more complicated, and whether that complexity is worth it is still very much unproven at the scale that Google uses Protobuf and gRPC/Stubby.
> These "green thread" systems provide concurrency, but not parallelism.
I'm not sure if these specific definitions of "concurrency" and "parallelism" are universal. I wasn't aware of them, at least.
Hi kentonv! I have a tangential question for you, if you don’t mind. I brought up your capnproto with a few friends at work, while chatting about the profiling data of our services (mostly CPU-bound, mostly on protobuf de-/encoding). After convincing ourselves that language-agnostic “zero-cost” requests weren’t completely magical, and that the whole promise thing is very useful, we got to wondering...
Do you think it’s possible that gRPC/proto could evolve, in a non-total-rewrite way, to gain the benefits offered by capnproto? I figure you’d be best positioned to answer that kind of question, having worked so intimately on both! :)
We have enjoyed getting to know and using gRPC and proto, I also want to thank you for your work! capnproto is an inspiring solution to a prima facie unsolvable problem, I hope to see it succeed more universally, or at least inspire a proto4. :) Thank you again!
I think Protobuf fundamentally can't achieve zero-copy parsing without changing the underlying encoding. That said, zero-copy parsing only provides a significant real-world benefit in certain use cases. For the use case of RPC over a network -- especially over the internet -- zero-copy parsing has minimal benefit. The places where zero-copy parsing can be a big win are when it means you can mmap() a very large file, or for IPC in shared memory.
On the other hand, Promise Pipelining -- and, more generally, object-capability RPC -- could definitely be added to gRPC. In fact, the very first iteration of Cap'n Proto was a Protobuf-based RPC system that used the same service definition syntax that gRPC now uses. "Cap'n Proto" at the time meant "capabilities and protobuf". (That version of the project was short-lived and shares no code at all with the current Cap'n Proto.)
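To make the idea concrete, here is a toy sketch of promise pipelining. The names and classes are invented for illustration and look nothing like Cap'n Proto's real API: method calls on an unresolved result are merely recorded, so a dependent chain of calls can be shipped to the "server" and resolved in a single round trip.

```python
class Promise:
    """Records a chain of method calls instead of executing them."""
    def __init__(self, ops=()):
        self.ops = list(ops)
    def call(self, method, *args):
        return Promise(self.ops + [(method, args)])

class Server:
    """Pretend remote side: resolves a whole pipelined chain at once."""
    def __init__(self, root):
        self.root = root
        self.round_trips = 0
    def resolve(self, promise):
        self.round_trips += 1          # the entire chain costs one round trip
        value = self.root
        for method, args in promise.ops:
            value = getattr(value, method)(*args)
        return value

class File:
    def __init__(self, data): self.data = data
    def read(self): return self.data

class Directory:
    def __init__(self, files): self.files = files
    def open(self, name): return self.files[name]

server = Server(Directory({"notes.txt": File(b"hello")}))

# Without pipelining: open() -> wait -> read() would be 2 round trips.
# With pipelining, we chain the dependent call before resolving:
p = Promise().call("open", "notes.txt").call("read")
result = server.resolve(p)
print(result, server.round_trips)
```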
However, I don't think gRPC is likely to add ocaps unless and until the model proves itself by gaining wide popularity elsewhere. It doesn't make sense for gRPC to take the risk of adding a big, new, experimental feature which they'll be forced to support forever when the demand hasn't been proven yet. Ocaps are gaining a lot of popularity lately (with major new tech like Fuchsia and WASI being capability-based) but I think there's still further to go before it would make sense for gRPC to adopt it.
I'm not entirely clear on whether this GitHub issue would bring feature parity with Cap'n Proto, but apparently Google already has a zero-copy API for Protocol Buffers internally: https://github.com/protocolbuffers/protobuf/issues/1896
I originally wrote the "zero copy" support in proto2, long before I created Cap'n Proto. What Protobuf means by "zero copy" is much more limited than what Cap'n Proto means. Protobuf's "zero copy" applies only to individual string (or "bytes") fields. The effect is that when you call the getter for that field, you get a pointer to the string inside the original message buffer, rather than a copy allocated on the heap. The overall message structure still needs to be parsed upfront and converted into an object tree on the heap (which I count as a copy).
Cap'n Proto is very different. Every Cap'n Proto object is actually a pointer into the original message buffer. Accessing one element of a large array is O(1) -- the previous (and subsequent) elements don't need to be examined at all. Similarly with structs, each field is located at a known fixed offset from the start of the struct, so can be accessed without examining other fields. Protobuf inherently cannot do this; there is no way to know where a field is located without first parsing all previous fields in the same message.
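The difference can be sketched in a few lines (simplified encodings, not the real protobuf or Cap'n Proto wire formats): reading the third field of a tag/varint-encoded message requires decoding everything before it, while a fixed-layout message allows a single O(1) read at a known offset.

```python
import struct

# Hypothetical record with three small integer fields.

def read_varint(buf, pos):
    """Decode one protobuf-style varint starting at pos."""
    result = shift = 0
    while True:
        b = buf[pos]; pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7

def pb_get_field3(buf):
    # Tag/value pairs: locating field 3 means decoding fields 1 and 2 first.
    pos = 0
    for _ in range(2):
        _tag, pos = read_varint(buf, pos)   # skip tag
        _val, pos = read_varint(buf, pos)   # skip value
    _tag, pos = read_varint(buf, pos)
    val, _ = read_varint(buf, pos)
    return val

def capnp_get_field3(buf):
    # Fixed layout: field 3 is a single read at a known offset,
    # with no parsing of the preceding fields.
    return struct.unpack_from("<I", buf, 8)[0]

pb_msg = bytes([0x08, 10, 0x10, 20, 0x18, 30])   # tag/varint pairs
capnp_msg = struct.pack("<III", 10, 20, 30)      # three fixed-width fields
print(pb_get_field3(pb_msg), capnp_get_field3(capnp_msg))
```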
Thanks for the response, I was being a bit tongue in cheek. I don't think you actually think it's awful :) And thanks for clarifying that.
About concurrency vs. parallelism, I think it is fairly standard to think of them as two different concepts that overlap somewhat.
You can have concurrency with parallelism (e.g. pthreads, or M:N threading where you map "green threads" on to processes that can run in parallel). You can also have concurrency without parallelism. The difference between the two is that parallelism can be deterministic, whereas concurrency is always going to be non-deterministic.
> > These "green thread" systems provide concurrency, but not parallelism.
> I'm not sure if these specific definitions of "concurrency" and "parallelism" are universal. I wasn't aware of them, at least.
To be clear, since GP didn't define them: concurrency simulates parallelism through context switching. Context switching itself encompasses both cooperative multitasking (gevent does this) and preemptive multitasking (modern operating system threads when they're sharing a CPU).
AFAIK it is universal, but the distinction is close enough not to matter in most cases, so people get lazy with their words.
Those definitions have been coming into fashion in the last couple of decades. I think it's useful to have the distinction but I wish we had new words that didn't previously mean both things.
I think the main issue is that it seems really one-sided, and the intent was to be snarky rather than educational. I posted a comment here detailing some ways to work around some of the pitfalls. If she had devoted more time in the article to solutions rather than complaining, her points would have come across more productively.