I like the author's articles most of the time. While this article contains some truths, I don't think it argues very persuasively for its conclusion. Okay, these parts of the Python ecosystem don't work well together, and it's a bad, unpolished experience. Fair, as with other criticisms of Python.
The question, however, is why one would use gevent at this point in Python's evolution. There's async/await now, and things like FastAPI. If you want to use, say, the Django ecosystem, use Nginx and uWSGI and be done with it. Maybe you need to spend some more resources to deploy your Python. Okay. Is that a problem? Why are you using Python? Is it because it's quick to use and helps you solve problems faster with its gigantic, mature ecosystem that lets you focus on your business logic? Then this, while admittedly not great, is going to be a rounding error. Is it because you began using it in the aforementioned case and now you're boxed into an expensive corner and you need to figure out how to scale parts of your presumably useful production architecture serving a Very Useful Application?
Maybe you need to start splitting up your architecture into separate services, so that you can use Python for the things that it does well and use some other technology for the parts that aren't I/O bound and could benefit from that. But that's not what this article is about. This article is about someone making the wrong choices when better choices existed and then making a categorical decision against using Python for a service. I'd say that's what "we have to talk about" if you ask me.
I've been working on a legacy internal python system that suffers from most of the complaints here (and in the excellent COST paper Rachel links at the bottom).
The problems alluded to are, yes, solvable in python. But they also seem endemic in python systems.
When everyone who uses the tool uses it wrong, maybe it's not the user's fault.
(That said, I generally do think there's a time and place for python systems or web apps. That time is generally when speed and maintainability are significantly less important than flexibility)
> The problems alluded to are, yes, solvable in python. But they also seem endemic in python systems.
>
> When everyone who uses the tool uses it wrong, maybe it's not the user's fault.
Yes, though that doesn't mean it is necessarily the code's fault.
Honestly, I was very confused by this article, because I thought everyone understood what was going on, the trade-offs involved, and how that ought to impact your design decisions.
It's not that Gevent'd Gunicorn is intrinsically a bad thing. You're going for cooperative multi-tasking/concurrency, so no preemptive multi-tasking support. This creates potential challenges with fair scheduling if you have real-time constraints like timeouts... so you design accordingly.
One of the advantages of this model is you do indeed need less memory (and often a little less CPU) to handle high load levels. It's not like you are intrinsically better off if you use Python in a forking model. You can still end up so CPU bound that you time out handling requests... the only difference is you'll get fairer splitting of the CPU's time across tasks. It can actually get worse if you get lost in an infinite series of context switches (yes, there are ways to mitigate this problem... although they can create fair scheduling problems... it's a natural tension), or worse still, start swapping.
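Roughly, the starvation scenario looks like this (a minimal sketch, not from the article; exact scheduling order can vary, but the point is that the CPU-bound greenlet never yields):

    import time
    import gevent

    def latency_sensitive():
        start = time.monotonic()
        gevent.sleep(0)  # yield and ask to be rescheduled soon
        return time.monotonic() - start

    def cpu_bound():
        # Pure computation: no I/O, so no cooperative yield points at all.
        return sum(i * i for i in range(10_000_000))

    fast = gevent.spawn(latency_sensitive)
    busy = gevent.spawn(cpu_bound)
    gevent.joinall([fast, busy])
    # The "fast" greenlet typically waits for the whole CPU-bound run before it
    # gets the CPU back, so any timeout shorter than that run would have fired.
    print(f"latency-sensitive greenlet waited {fast.value:.2f}s")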
If the notion that running out of CPU might mean you have timeouts hasn't occurred to you...
> When everyone who uses the tool uses it wrong, maybe it's not the user's fault.
I'm not the GP, but I guess that a tool that is
> quick to use and helps you solve problems faster with its gigantic, mature ecosystem that lets you focus on your business logic
can never cover all bases perfectly, and is generally great when starting out, but ultimately isn't built to be very forgiving once it has grown too much.
> now you're boxed into an expensive corner and you need to figure out how to scale
When you get to this point, and the requirements start to be more focused on performance, then it's time to start switching Python out. That does not devalue Python in the earlier stages of development and operation.
The point being that Python is the right tool for getting stuff working quickly, not for getting stuff executing quickly.
Agreed on both (a) I usually like the author’s articles and (b) think she’s missing the point on this one.
gevent and gunicorn were good attempts to remedy a bad situation. async/await is the solution that the Python community is coalescing around. Even with Django, there are active efforts to support ASGI. [1]
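For context, the interface those efforts target is tiny. A minimal ASGI app, sketched from the ASGI spec rather than Django's code, is just an async callable:

    # Minimal ASGI application: an async callable taking scope/receive/send.
    async def app(scope, receive, send):
        assert scope["type"] == "http"
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        })
        await send({"type": "http.response.body", "body": b"hello from asgi\n"})

    # Serve with any ASGI server, e.g. `uvicorn module_name:app` (uvicorn assumed installed).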
Gevent was doing it right, and async syntax was a huge mistake that fractured community-contributed libraries into two incompatible camps, with lots of unnecessary duplication happening at the present moment.
In high-level languages with virtual machines and/or garbage collectors, the runtime system should be solely responsible for scheduling green threads around I/O entry points, all without special syntactic markers. GHC has it right (https://www.aosabook.org/en/posa/warp.html), and Gevent was the right development: on-par async performance metrics (https://gist.github.com/rfyiamcool/41d4004b7fd46516d0b4f34f6...) with a standard synchronous coding style. It could be adopted into the core language and improved further without splitting the community.
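To make the "standard synchronous coding style" concrete, here is a minimal gevent sketch (the URL is just a placeholder). Once monkey-patched, ordinary blocking calls become implicit yield points, with no markers in user code:

    from gevent import monkey
    monkey.patch_all()  # patch sockets, time.sleep, etc. before other imports

    import gevent
    from urllib.request import urlopen

    def fetch(url):
        # Looks like plain blocking code; the runtime schedules around the I/O.
        return urlopen(url).read()[:80]

    jobs = [gevent.spawn(fetch, "https://example.com") for _ in range(10)]
    gevent.joinall(jobs, timeout=10)
    print([len(j.value or b"") for j in jobs])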
I have run Python in its traditional synchronous form, using gevent, and with the more recent async/await syntax. I don’t hold this opinion strongly, but do lean towards async/await syntax for the sake of explicit is better than implicit [1]. Node.js, which was asynchronous from the start, also separates async from sync explicitly with, for example, distinct fs.readFile() and fs.readFileSync() functions [2].
(Edit: Commenting only on clarity of syntax. Those performance metrics are interesting and I’ve admittedly never hit a scale where the difference has a practical impact.)
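For illustration, the two styles side by side in Python (a sketch only: the URL is a placeholder, aiohttp is an assumed third-party dependency, and you wouldn't mix the two models in one process):

    # Implicit style (gevent, assuming the usual monkey-patching elsewhere):
    # the suspension point is invisible at the call site.
    from urllib.request import urlopen

    def handler_gevent():
        body = urlopen("https://example.com").read()   # may switch greenlets here, silently
        return len(body)

    # Explicit style (asyncio + aiohttp): every suspension point is spelled out.
    import asyncio
    import aiohttp

    async def handler_asyncio():
        async with aiohttp.ClientSession() as session:
            async with session.get("https://example.com") as resp:
                body = await resp.read()                # suspension point is visible
        return len(body)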
> I don’t hold this opinion strongly, but do lean towards async/await syntax for the sake of explicit is better than implicit
I guess it's a question of where the line that defines "too implicit" should be drawn. I'm totally fine with implicit gevent yields, yet sometimes when I need to do heavy Python meta-programming, I wish things were more explicit around language semantics, namely everything around inheritance handling inside meta-classes (for instance, see the current implementation of enum.Enum).
What is the basis for that assertion? The "high-level VM with green threads" approach has been tried for a long time - most prominently, Java - and it just doesn't seem to stick.
For Python especially, it is problematic because it is a glue language more often than not, and VM-specific green threads are not good for cross-language interop. When you have promises and async/await around them, at ABI level it can all be mapped to a simple callback, which any language that has C FFI can handle. When you have green threads, every language in the picture has to be aware of them - and god forbid you have two different VMs with different notions of green threads interacting.
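A minimal sketch of that mapping, with a worker thread standing in for the native library and its completion callback:

    import asyncio
    import threading

    def c_library_do_work(on_done):
        # Stand-in for a C function that takes a plain completion callback.
        def worker():
            on_done(42)  # pretend this value came from native code
        threading.Thread(target=worker).start()

    async def call_native():
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        # The callback crossing the "FFI boundary" only has to resolve the future,
        # marshalled back onto the event loop's thread.
        c_library_do_work(lambda value: loop.call_soon_threadsafe(fut.set_result, value))
        return await fut

    print(asyncio.run(call_native()))  # -> 42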
The fact that it's implemented in a runtime system I use nowadays.
> For Python especially, it is problematic
Shall we say it's a complex task instead of a problematic case?
> The "high-level VM with green threads" approach has been tried for a long time - most prominently, Java - and it just doesn't seem to stick.
afaik, it didn't stick because JNI code interacting with green threads needed to scale on SMP while the runtime ran them on a single thread, so the decision was made to move to native threads. That doesn't necessarily indicate any inherent issue with VM-managed green threads (and CPython specifically cannot utilise SMP with its threads anyway). At least, this was mentioned in https://www.microsoft.com/en-us/research/publication/extendi... (Section 7).
> When you have promises and async/await around them, at ABI level it can all be mapped to a simple callback, which any language that has C FFI can handle.
Why wouldn't a VM be able to register those callbacks and bind them to a concrete OS thread when it knows that FFI interop is going to happen? I don't see where explicit async/await is needed for that. It may require thread-safety markers (and that's what GHC's FFI interface has - https://wiki.haskell.org/GHC/Using_the_FFI#Improving_efficie...), but that's a far cry from the invasive async syntax we have in contemporary Python.
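That is roughly what gevent's runtime already does today. A minimal sketch, where `blocking_c_call` is a hypothetical stand-in for a real foreign function and the ctypes shortcut is POSIX-only:

    import ctypes
    import gevent
    from gevent.hub import get_hub

    libc = ctypes.CDLL(None)  # POSIX-only handle to the C library, purely for illustration

    def blocking_c_call(seconds):
        # ctypes releases the GIL around the foreign call, so it can run on a real OS thread.
        libc.sleep(seconds)
        return seconds

    def call_via_runtime():
        # The greenlet parks cooperatively; gevent's hub runs the call on a worker OS thread.
        return get_hub().threadpool.apply(blocking_c_call, (1,))

    others = [gevent.spawn(gevent.sleep, 0.2) for _ in range(5)]
    ffi_job = gevent.spawn(call_via_runtime)
    gevent.joinall(others + [ffi_job])  # the short sleeps finish while the foreign call runs
    print("foreign call returned", ffi_job.value)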
It works nicely in Golang and Haskell. The main issue with Java and Python is that the core runtime developers, reasonably, do not wish to spend a lot of time developing equivalent systems.
I can't speak for Haskell, but inadequate performance of C FFI in Go is routinely mentioned as the reason why the community is so reluctant to wrap existing C libraries, rather than reimplementing them from scratch in Go.
To be completely honest, I don't know much about C interfaces or systems programming in general. Looking at benchmarks Go's FFI does indeed seem to perform pretty poorly. However, as a web dev, I find it works well for the concurrent programming tasks I find myself dealing with.
The ASGI spec came from the Django project as part of their Django Channels work. "There are active efforts to support ASGI in Django" is selling them a bit short, methinks.
I still use gevent, even for brand new projects. I work much faster with it than with async/await, and the performance appears to be comparable. I've tried getting used to async/await, but I find gevent much simpler to work with, in spite of the arguments made in places like https://glyph.twistedmatrix.com/2014/02/unyielding.html.
I wasn't aware of this particular inefficiency, but gevent is still fulfilling its purpose for me very well, and I see no reason to change. I like lightweight threads and thinking in terms of background jobs and dividing up work instead of remembering what things to annotate and when. I use locks if I need predictability. I like Python because I can develop quickly with it, and I can do so even faster with gevent while still getting more than enough performance.
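A trivial sketch of that kind of workflow (the sleep stands in for real I/O-bound work):

    import gevent
    from gevent.lock import BoundedSemaphore

    results = []
    results_lock = BoundedSemaphore(1)  # explicit lock where predictability matters

    def background_job(n):
        gevent.sleep(0.1)  # stand-in for I/O-bound work
        with results_lock:
            results.append(n * n)

    jobs = [gevent.spawn(background_job, n) for n in range(20)]
    gevent.joinall(jobs, timeout=5)
    print(sorted(results))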
Just because you don't understand the difference between gevent and asyncio, please don't post garbage laundry lists of your flavor of the month stack choices.
It's an amazing library and a very unique way to write cooperatively-scheduled applications. Best of all it works with existing libraries and doesn't require special "asyncio" implementations from top to bottom. It's not a silver bullet, but don't fool yourself that asyncio is because it's been blessed.
I think I understand the difference between gevent and asyncio pretty well. Moreover, it sounds like you understand what the community has adopted and what it hasn't, but you're fighting against that adoption with your own opinion of what counts as a "garbage laundry list" -- okay. You can say that. But there's a reason the gevent approach is not what the community settled on.
What you call "special" asyncio implementations others would merely call obviously explicit code. Async/await is a powerful syntactic construct. I would never go back to gevent hell after using it.