My body is ready. I love Python because of how easy it is to write and reason about. Hopefully the more complicated free-threaded approach is comprehensive enough that we can keep writing Python the way we traditionally do. I'm not saying it is or isn't; I just haven't dived deep enough into Python multithreading, because it is hard to put those demons back once you pull them out.
The semantic changes are negligible for authors of Python code. All the complexity falls on the maintainers of the CPython interpreter and on authors of native extension modules.
The GIL does not prevent race conditions in your Python code. It only prevents race conditions in internal data structures inside the interpreter and in atomic operations, i.e., operations that take a single Python bytecode. But many things that appear atomic in Python code take more than one Python bytecode. The GIL gives you no protection if you do such operations in multiple threads on the same object.
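To make the "more than one bytecode" point concrete, here is a minimal sketch using the standard `dis` module. The names `counter` and `increment` are made up for illustration, and the exact instruction names differ between CPython versions, so treat the printed list as indicative only.

```python
import dis

counter = 0

def increment():
    """Looks like one step, but compiles to several bytecode instructions."""
    global counter
    counter += 1

# Collect the instruction names; the exact names vary between CPython versions.
opnames = [ins.opname for ins in dis.get_instructions(increment)]
print(opnames)
```

The load of `counter` and the store back to it are separate instructions, so a thread switch can land between them and an update from another thread can be lost.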
I think what many will experience is that they want to switch to multithreading without the GIL, but then learn that their code will have race conditions. It doesn't have them today because it runs as multiple processes, not threads.
Take our web server, for instance. It uses multiple processes. Each request can modify some global variable, use it as a cache or whatever, and only after the process is completely done handling that request will it serve a new one. But when people see the GIL is gone, they will probably want to start using threads: you can handle more requests without spawning lots of processes or needing a beefy machine with lots of RAM.
And then one might discover new race conditions one never really had before.
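A rough sketch of that hazard, with entirely hypothetical handler names (`handle_request_global`, `handle_request_local`) and no real web framework involved. The first handler keeps per-request data in a module-level dict, which is only safe when a process serves one request at a time. The second uses `threading.local`, one possible way to keep per-request scratch data private to each worker thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Module-level scratch space: fine with one request per process at a time,
# but shared by every thread once requests are served from a thread pool.
_scratch = {}

def handle_request_global(user):
    _scratch["user"] = user                # another thread may overwrite this
    return f"hello {_scratch['user']}"     # before it is read back here

# Per-thread storage: each worker thread sees only its own attributes.
_local = threading.local()

def handle_request_local(user):
    _local.user = user
    return f"hello {_local.user}"

users = [f"user{i}" for i in range(100)]
with ThreadPoolExecutor(max_workers=8) as pool:
    replies = list(pool.map(handle_request_local, users))
```

Whether the global version actually misbehaves on a given run depends on thread scheduling, which is exactly why this kind of bug tends to go unnoticed.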
If you are writing Python code, the GIL can already be dropped at pretty much any point and there isn't much way of controlling when. Iirc, this includes in the middle of things like +=. There are some operations that Python defines as atomic, but, as I recall, there aren't all that many.
In what way is the GIL preventing races for your use case?
It is not about your code, it is about the C extensions you are relying on. Without the GIL, you can't even be sure that refcounting works reliably. Bugs in C extensions are always possible. No GIL makes them more likely. Even if you are not the author of the C extension, you have to debug the consequences.
I mean that, if the GIL didn't prevent races, it would be trivially removable. Races that are already there in people's Python code have probably been debugged (or at least they are tolerated), so there are some races that will happen when the GIL is removed, and they will be a surprise.
The GIL prevents the corruption of Python's internal structures. It's hard to remove because:
1. Lots of extensions, which can control when they release the GIL unlike regular Python code, depend on it.
2. Removing the GIL requires some sort of other mechanism to protect internal Python stuff.
3. But for a long time, such a mechanism was resisted by the Python team because all attempts to remove the GIL either made single-threaded code slower or were considered too complicated.
But, as far as I understand, the GIL does somewhere between nothing and very little to prevent races in pure Python code. And my rough understanding is that removing the GIL isn't expected to really impact pure Python code.
My understanding is that many extensions will release the GIL when doing anything expensive. So, if you are doing CPU or IO bound operations in an extension _and_ you are calling that operation in multiple threads, even with the GIL you can potentially fully utilize all of the CPUs in your machine.
> removing the GIL isn't expected to really impact pure Python code.
If your Python code assumes it's just going to run in a single thread now, and it is run in a single thread without the GIL, yes, removing the GIL will make no difference.
> If your Python code assumes it's just going to run in a single thread now, and it is run in a single thread without the GIL, yes, removing the GIL will make no difference.
I'm not sure I understand your point.
Yes, single-threaded code will run the same with or without the GIL.
My understanding was that multi-threaded pure-Python code would also run more or less the same without the GIL. That is, removing the GIL won't introduce races into pure-Python code that is already race free with the GIL. (And, relatedly, pure-Python code that suffers from races without the GIL also already suffers from them with the GIL.)
Are you saying that you expect that pure-Python code will be significantly impacted by the removal of the GIL? If so, I'd love to learn more.
> removing the GIL won't introduce races into pure-Python code that is already race free with the GIL.
What do you mean by "race free"? Do you mean the code expects to be run in multiple threads and uses the tools provided by Python, such as locks, mutexes, and semaphores, to ensure thread safety, and has been tested to ensure that it is race free when run multi-threaded? If that is what you mean, then yes, of course such code will still be race free without the GIL, because it was never depending on the GIL to protect it in the first place.
But there is a lot of pure Python code out there that is not written that way. Removal of the GIL would allow such code to be naively run in multiple threads using, for example, Python's support for thread pools. Anyone under the impression that removing the GIL was intended to allow this sort of thing without any further checking of the code is mistaken. That is the kind of thing my comment was intended to exclude.
> But there is a lot of pure Python code out there that is not written that way. Removal of the GIL would allow such code to be naively run in multiple threads using, for example, Python's support for thread pools.
I guess this is what I don't understand. This code could already be run in multiple threads today, with a GIL. And it would be broken - in all the same ways it would be broken without a GIL, correct?
> Anyone under the impression that removing the GIL was intended to allow this sort of thing without any further checking of the code is mistaken. That is the kind of thing my comment was intended to exclude.
Ah, so, is your point that removing the GIL will cause people to take non-multithread code and run it in multiple threads without realizing that it is broken in that context? That it's not so much a technical change as a change of perception that will lead to issues?
> This code could already be run in multiple threads today, with a GIL.
Yes.
> And it would be broken - in all the same ways it would be broken without a GIL, correct?
Yes, but the absence of the GIL would make race conditions more likely to happen.
> is your point that removing the GIL will cause people to take non-multithread code and run it in multiple threads without realizing that it is broken in that context?
Yes. They could run it in multiple threads with the GIL today, but as above, race conditions might not show up as often, so it might not be realized that the code is broken. But also, with the GIL there is the common perception that Python doesn't do multithreading well anyway, so it's less likely to be used for that. With the GIL removed, I suspect many people will want to use multithreading a lot more in Python to parallelize code, without fully realizing the implications.
> Yes, but the absence of the GIL would make race conditions more likely to happen.
Does it though? I'm not saying it doesn't, I'm quite curious. Switching between threads with the GIL is already fairly unpredictable from the perspective of pure-Python code. Does it get significantly more troublesome without the GIL?
> Yes. They could run it in multiple threads with the GIL today, but as above, race conditions might not show up as often, so it might not be realized that the code is broken. But also, with the GIL there is the common perception that Python doesn't do multithreading well anyway, so it's less likely to be used for that. With the GIL removed, I suspect many people will want to use multithreading a lot more in Python to parallelize code, without fully realizing the implications.
> Switching between threads with the GIL is already fairly unpredictable from the perspective of pure-Python code.
But it still prevents multiple threads from running Python bytecode at the same time: in other words, at any given time, only one Python bytecode can be executing in the entire interpreter.
Without the GIL that is no longer true; an arbitrary number of threads can all be executing a Python bytecode at the same time. So even Python-level operations that only take a single bytecode now must be protected to be thread-safe--where under the GIL, they didn't have to be. That is a significant increase in the "attack surface", so to speak, for race conditions in the absence of thread safety protections.
(Note that this does mean that even multi-threaded code that was race-free with the GIL due to using explicit locks, mutexes, semaphores, etc., might not be without the GIL if those protections were only used for multi-bytecode operations. In practice, whether or not a particular Python operation takes a single bytecode or multiple bytecodes is not something you can just read off from the Python code--you have to either have intimate knowledge of the interpreter's internals or you have to explicitly disassemble each piece of code and look at the bytecode that is generated. Of course the vast majority of programmers don't do that, they just use thread safety protections for every data mutation, which will work without the GIL as well as with it.)
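For reference, "use thread safety protections for every data mutation" can be as small as the sketch below. `SafeCounter` and `hammer` are made-up names; the only real API used is `threading.Lock` and `threading.Thread`. The same code should behave identically with or without the GIL, since it never relies on the interpreter for atomicity.

```python
import threading

class SafeCounter:
    """A counter whose read-modify-write is guarded by an explicit lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The lock turns the load, add, and store into one critical section.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

counter = SafeCounter()
threads = [threading.Thread(target=hammer, args=(counter, 10_000)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = counter.value
```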
> if the GIL didn't prevent races, it would be trivially removable
Nobody is saying the GIL doesn't prevent races at all. We are saying that the GIL does not prevent races in your Python code. It's not "trivially removable" because it does prevent races in the interpreter's internal data structures and in operations that are done in a single Python bytecode, and there are a lot of possible races in those places.
Also, perhaps you haven't considered the fact that Python provides tools such as mutexes, locks, and semaphores to help you prevent races in your Python code. Python programmers who do write multi-threaded Python code (for example, code where threads spend most of their time waiting on I/O, which releases the GIL and allows other threads to run) do have to use these tools. Why? Because the GIL by itself does not prevent races in your Python code. You have to do it, just as you do with multi-threaded code in any language.
> Races that are already there in people's Python code have probably been debugged
Um, no, they haven't, because they've never been exposed to multi-threading. Most people's Python code is not written to be thread-safe, so it can't safely be parallelized as it is, GIL or no GIL.
The article states the goal is to eventually (after some years of working out the major kinks and performance regressions) promote Free-Threaded Python to be the default CPython distribution.
What are the common use cases for threading in Python? I feel like that's a lower level tool than most Python projects would want, compared to asyncio or multiprocessing.Pool. JS is the most comparable thing to Python, and it got pretty darn far without threads.
Working with asyncio sucks when all you want is to be able to do some things in the background, possibly concurrently. You have to rewrite the worker code using those stupid async await keywords. It's an obnoxious constraint that completely breaks down when you want to use unaware libraries. The thread model is just a million times easier to use because you don't have to change the code.
Asyncio is designed for things like webservers or UIs where some framework is probably already handling the main event loop. What are you doing where you just want to run something else in the background, and IPC isn't good enough?
Non-blocking HTTP requests are an extremely common need, for instance. Why the hell did we need to reinvent special asyncio-aware request libraries for them? It's absolute madness. Thread pools are much easier to work with.
> where some framework is probably already handling the main event loop
This is both not really true and also irrelevant. When you need a flask (or whatever) request handler to do parallel work, asyncio is still pretty bullshit to use vs threads.
Non-blocking HTTP request is the bread and butter use case for asyncio. Most JS projects are doing something like this, and they don't need to manage threads for it. You want to manage your own thread pool for this, or are you going to spawn and kill a thread every time you make a request?
> Non-blocking HTTP request is the bread and butter use case for asyncio
And the amount of contorting that has to be done for it in Python would be hilarious if it weren't so sad.
> Most JS projects
I don't know what JavaScript does, but I do know that Python is not JavaScript.
> You want to manage your own thread pool for this...
In Python, concurrent.futures' ThreadPoolExecutor is actually nice to use and doesn't require rewriting existing worker code. It's already done, has a clean interface, and was part of the standard library before asyncio was.
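As a small illustration of "doesn't require rewriting existing worker code", here is a sketch where `fetch` is a hypothetical blocking function (with `time.sleep` standing in for a real HTTP call) and the URLs are placeholders. Nothing about `fetch` knows it is being run concurrently.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch(url):
    """Hypothetical blocking worker; time.sleep stands in for network I/O."""
    time.sleep(0.05)
    return f"response from {url}"

urls = [f"https://example.invalid/{i}" for i in range(5)]

with ThreadPoolExecutor(max_workers=5) as pool:
    # submit() returns a Future immediately; the work runs in a pool thread.
    futures = {pool.submit(fetch, url): url for url in urls}
    # as_completed() yields each Future as soon as its result is ready.
    results = {futures[f]: f.result() for f in as_completed(futures)}
```

Calling `.result()` on a future blocks the calling thread until that piece of work is done, and re-raises any exception the worker raised.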
ThreadPoolExecutor is the most similar thing to asyncio: It hands out promises, and when you call .result(), it's the same as await. JS even made its own promises implicitly compatible with async/await. I'm mentioning what JS does because you're describing a very common JS use case, and Python isn't all that different.
If you have async stuff happening all over the place, what do you use, a global ThreadPoolExecutor? It's not bad, but a bit more cumbersome and probably less efficient. You're running multiple OS threads that are locking, vs a single-threaded event loop. Gets worse the more long-running blocking calls there are.
Also, I was originally asking about free threads. GIL isn't a problem if you're just waiting on I/O. If you want to compute on multiple cores at once, there's multiprocessing, or more likely you're using stuff like numpy that uses C threads anyway.
Again, Python's implementation of asyncio does not allow you to background worker code without explicitly altering that worker code to be aware of asyncio. Threads do. They just don't occupy the same space.
> Also, I was originally asking about free threads...there's multiprocessing
Eh, the obvious reason to not want to use separate processes is a desire for some kind of shared state without the cost or burden of IPC. The fact that you suggested multiprocessing.Pool instead of concurrent.futures.ProcessPoolExecutor and asked about manual pool management feels like it tells me a little bit about where your head is at here wrt Python.
Basically true in JS too. You're not supposed to do blocking calls in async code. You also can't "await" an async call inside a non-async func, though you could fire-and-forget it.
Right, but how often does a Python program have complex shared state across threads, rather than some simple fan-out-fan-in, and also need to take advantage of multiple cores?
The primary thing that tripped me up about async/await, specifically only in Python, is that the called function does not begin running until you await it. Before that moment, it's just an unstarted coroutine object, much like an unstarted generator.
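A quick sketch of that behaviour, using made-up names (`job`, `log`). Calling an `async def` function only creates a coroutine object; the body runs once it is awaited, or once it has been wrapped in a task with `asyncio.create_task` and the event loop gets a turn.

```python
import asyncio

log = []

async def job(name):
    log.append(f"{name} started")
    await asyncio.sleep(0)
    return name

async def main():
    coro = job("lazy")                 # creates a coroutine object; nothing runs yet
    started_before_await = list(log)
    await coro                         # the body runs only now

    task = asyncio.create_task(job("eager"))  # scheduled on the event loop
    await asyncio.sleep(0)             # yield once so the loop can start the task
    started_before_task_await = list(log)
    await task
    return started_before_await, started_before_task_await

before_await, before_task_await = asyncio.run(main())
```

So `create_task` is roughly the "start it now" spelling, assuming the default task factory, while a bare call is the "start it later" one.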
To make background jobs, I've used the class-based version to start a thread, then the magic method that's called on await simply joins the thread, which is a lot of boilerplate to get a little closer to how async works in (at least) JS and C#.
Rust's version of async/await is the same in that respect, where futures don't do anything until you poll them (e.g., by awaiting them): if you want something to just start right away, you have to call out to the executor you're using, and get it to spawn a new task for it.
Though to be fair, people complain about this in Rust as well. I can't comment much on it myself, since I haven't had any need for concurrent workloads that Rayon (a basic thread-pool library with work stealing) can't handle.
That is a common split in language design decisions. I think the argument for the Python style, where you have to drive it to begin, is that it is more useful: you can always just start it immediately, but it also lets you delay computation or pass it around, similar to a Haskell thunk.
I feel you. I know asyncio is "the future", but I usually just want to write a background task, and really hate all the gymnastics I have to do with the color of my functions.
I feel like "asyncio is the future" was invented by the same people who think it's totally normal to switch to a new JavaScript web framework every 6 months.
JS had an event loop since the start. It's an old concept that Python seems to have lifted, as did Rust. I used Python for a decade and never really liked the way it did threads.
Python's reactor pattern, or event loop as you call it, started with the "Twisted" framework or library. And that was first published in 2003. That's a full 6 years before Node.js was released which I assume was the first time anything event-loopy started happening in the JS world.
I forgot to mention that it came into prominence in the Python world through the Tornado HTTP server library, which did the same thing. Slowly over time, more and more language features were added to give native or first-class-citizen support to what a lot of people were doing behind the scenes (in sometimes very contrived abuses of generator functions).
I am a big advocate for ThreadPoolExecutor. I'm saying it's superior to asyncio. The person I'm responding to was asking why use threads when you can use asyncio instead.
So, in Rust they had threading since forever and they are now hyped with this new toy called async/await (and all the new problems it brings), while in Python they've had async/await and are now excited to see the possibilities of this new toy called threads (and all its problems). That's funny!
Yeah, I've never liked the async stuff. I've used the existing threading library and it's been fine for those programs that are blocked on I/O most of the time. The GIL hasn't been a problem. Those programs often ran on single-core machines anyway. We would have been better off without the GIL in the first place, but we may be in for headaches by removing it now.
It’s hard to say because we’ve come up with a lot of ways to work around the fact that threaded Python has always sucked. Why? Because there’d been no demand to improve it. Why? Because no one used it. Why? Because it sucked.
I’m looking forward to seeing how people use a Python that can be meaningfully threaded. While it may take a bit to build momentum, I suspect that in a few years there’ll be obvious use cases that are widely deployed that no one today has even really considered.
Maybe a place to look for obvious use cases is in other languages. JS doesn't have threads, but Swift does. The reason I can't think of one is, free threads are most useful for full parallelism that isn't "embarrassingly parallel," otherwise IPC does fine.
So far, I've rarely seen that. Best example I deal with was a networking project with lots of communication across threads, and that one was too performance-sensitive to even use C++, let alone Py. Other things I can think of are OS programming which again has to be C or Rust.
That's the kind of thing I stumble across all the time. Indexing all the symbols in a codebase:
    results = Counter()
    for file in here.glob('*.py'):
        symbols = parse(file)
        results.update(symbols)
Scanning image metadata:
    for image in here.glob('*.png'):
        headers = png.Reader(image)
        ...
Now that I think about it, most of my use cases involve doing expensive things to all the files in a directory, but in ways where it'd be really sweet if I could do it all in the same process space instead of using a multiprocessing pool (which is otherwise an excellent way to skin that cat).
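Here is one way the symbol-indexing loop might look in a single process with a thread pool. It is a sketch under assumptions: `parse` is a hypothetical stand-in that just pulls identifier-like tokens out with a regex, and `index_symbols` is a name invented for this example. The workers only return values and the shared `Counter` is updated from one thread, so no lock is needed.

```python
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

IDENTIFIER = re.compile(r"[A-Za-z_]\w*")

def parse(path):
    """Hypothetical stand-in for a real parser: returns identifier-like tokens."""
    return IDENTIFIER.findall(Path(path).read_text(encoding="utf-8"))

def index_symbols(directory, max_workers=4):
    """Parse every *.py file in a thread pool and merge counts in the caller."""
    files = sorted(Path(directory).glob("*.py"))
    results = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Workers only return values; the Counter is updated by this thread alone.
        for symbols in pool.map(parse, files):
            results.update(symbols)
    return results
```

With a pure-Python `parse`, this would presumably only spread across cores on a free-threaded build; under the GIL the threads would mostly take turns.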
I've never let that stop me from getting the job done. There's always a way, and if we can't use tool A, then we'll make tool B work. It'll still be nice if it pans out that decent threading is at least an option.
These are "embarrassingly parallel" examples that multiprocessing is ok for, though. There was always the small caveat that you can't pickle a file handle, but it wasn't a real problem. Threads are more useful if you have lots of shared state and mutexes.
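For anyone who hasn't run into the pickling caveat: arguments sent to a multiprocessing worker are pickled, and an open file object can't be. A minimal sketch using a throwaway temp file (the variable names are just for illustration):

```python
import os
import pickle
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "rb") as handle:
        try:
            pickle.dumps(handle)       # what a process pool would need to do
            pickled = True
        except TypeError as exc:
            pickled = False
            error = str(exc)
finally:
    os.remove(path)
```

The usual workaround is to pass the path and open the file inside the worker, which is why it rarely ends up being a real problem.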
I think these examples would also perform well with GIL'd threads, since the actual Python part is just waiting on blocking calls that do the expensive work. But maybe not.
> Threads are more useful if you have lots of shared state and mutexes.
That's what always kicks me in such things. If the processes are truly completely separable, awesome! It never seems like they are as much as I wish they were.