Removing GIL only increases complexity.

bunderbunder · 2024-10-07T16:07:38 1728317258

But they've worked very hard at shielding most users from that complexity. And the end result - making multithreading a truly viable alternative to multiprocessing for typical use cases - will open up many opportunities for Python users to simplify their software designs.

I suppose only time will tell if that effort succeeds. But the intent is promising.

dangom · 2024-10-07T18:49:28 1728326968

Do you have any references or examples that describe how this simplification would come about? Would love to learn more about it.

bunderbunder · 2024-10-07T21:21:35 1728336095

See the motivation section of the PEP for a summary with citations for further reading: https://peps.python.org/pep-0703/

dagmx · 2024-10-07T15:38:13 1728315493

For the runtime, but not the language

pkkm · 2024-10-07T16:50:24 1728319824

It definitely does, but don't you think that it could be worth it if it makes multithreading usable for CPU-heavy tasks?

tightbookkeeper · 2024-10-07T16:55:55 1728320155

No. Python is orders of magnitude slower than even C# or Java. It’s doing hash table lookups per variable access. I would write a separate program to do the number crunching.

Everyone must now pay the mental cost of multithreading for the chance that you might want to optimize something.

zbentley · 2024-10-07T17:26:56 1728322016

> It’s doing hash table lookups per variable access.

That hasn't been true for many variable accesses for a very long time. LOAD_FAST, LOAD_CONST, and (sometimes) LOAD_DEREF provide references to variables via pointer offset + chasing, often with caches in front to reduce struct instantiations as well. No hashing is performed. Those access mechanisms account for the vast majority (in my experience; feel free to check by "dis"ing code yourself) of Python code that isn't using locals()/globals()/eval()/exec() tricks. The remaining small minority I've seen is doing weird rebinding/shadowing stuff with e.g. closures and prebound exception captures.

https://github.com/python/cpython/blob/10094a533a947b72d01ed...

So too for object field accesses; slotted classes significantly improve field lookup cost, though unlike LOAD_FAST users have to explicitly opt into slotting.

Don't get me wrong, there are some pretty regrettably ordinary behaviors that Python makes much slower than they need to be (per-binding method refcounting comes to mind, though I hear that's going to be improved). But the old saw of "everything is a dict in python, even variable lookups use hashing!" has been incorrect for years.

tightbookkeeper · 2024-10-07T17:36:37 1728322597

Thanks for the correction and technical detail. I’m not saying this is bad, it’s just the nature of this kind of dynamic language. Productivity over performance.

pkkm · 2024-10-07T21:07:55 1728335275

> Everyone must now pay the mental cost of multithreading for the chance that you might want to optimize something.

I'm assuming that by "everyone" you mean everyone who works on the Python implementation's C code? Because I don't see how that makes sense if you mean Python programmers in general. As far as I know, things will stay the same if your program is single-threaded or uses multiprocessing/asyncio. The changes only affect programs that start threads, in which case you need to take care of synchronization anyway.

int_19h · 2024-10-07T17:25:33 1728321933

Python doesn't do hash table lookups for local variable access. This only applies to globals and attributes of Python classes that don't use __slots__.

The mental cost of multithreading is there regardless because GIL is usually at the wrong granularity for data consistency. That is, it ensures that e.g. adding or deleting a single element to a dict happens atomically, but more often than not, you have a sequence of operations like that which need to be locked. In practice, in any scenario where your data is shared across threads, the only sane thing is to use explicit locks already.

kstrauser · 2024-10-07T17:48:16 1728323296

> No. Python is orders of magnitude slower than even C# or Java.

That sounds like a fantastic reason to make it run faster on the multi-core CPUs we're commonly running it on today.

tightbookkeeper · 2024-10-07T18:15:25 1728324925

The cost to write and debug multithreaded code is high and not limited to the area you use it. And for all that you get a 2-8x speed up.

So if you care about performance why are you writing that part in python?

> multi-core CPUs we're commonly running it on today.

If you spawn processes to do work you get multi core for free. Think of the whole system, not just your program.

kstrauser · 2024-10-07T19:06:13 1728327973

Pretend for a second that I'm in a setting where:

1. The whole system is dedicated to running my one program, 2. I want to use multi threading to share large amounts of state between workers because that's appropriate to my specific use case, and 3. A 2-8x speedup without having to re-write parts of the code in another language would be fan-freaking-tastic.

In other worse, I know what I'm doing, I've been doing this since the 90s, and I can imagine this improvement unlocking a whole lot of use cases that've been previously unviable.

tightbookkeeper · 2024-10-07T20:06:16 1728331576

You can imagine that situation. But all python code is now impacted to support that case.

Having a “python only” ecosystem makes about as much sense as a “bash only” ecosystem. Your tech stack includes much more.

> In other worse, I know what I'm doing, I've been doing this since the 90s

ditto. So that’s not relevant.

kstrauser · 2024-10-07T20:51:48 1728334308

Sounds like a lot of speculation on your end because we don't have lots of evidence about how much this will affect anything, because until just now it's not been possible to get that information.

> ditto. So that’s not relevant.

Then I'm genuinely surprised you've never once stumbled across one of the many, many use cases where multithreaded CPU-intensive code would be a nice, obvious solution to a problem. You seem to think these are hypothetical and my experience has been that these are very real.

tightbookkeeper · 2024-10-07T21:55:57 1728338157

> Sounds like a lot of speculation on your end

This issue is discussed extensively in “the art of Unix programming” if we want to play the authority and experience game.

> multithreaded CPU-intensive code would be a nice, obvious solution to a problem

Processes are well supported in python. But if you’re maxing your CPU core with the right algorithm then python was probably the wrong tool.

> my experience has been that these are very real.

When you’re used to working one way it may seem impossible to frame the problem differently. Just to remind you this is a NEW feature in python. JavaScript, perl, and bash, also do not support multi threading for similar reasons.

One school of design says if you can think of a use case, add that feature. Another tries to maintain invariants of a system.

EasyMark · 2024-10-08T14:29:03 1728397743

If you’re in a primarily python coding house, your argument won’t mean anything when you bring up you’ll have to rewrite millions of lines of code in C# or Java, you might as well ask them to liquidate the company and start fresh.

EasyMark · 2024-10-08T14:27:06 1728397626

“Make things as simple as possible, but no simpler”, I for one am glad they’ll be letting us use modern CPUs much more easily instead of it being designed around 1998 cpus