I'll give some more detail. concurrent.futures is designed to be a new, consistent API wrapper around the functionality in the multiprocessing and threading libraries. One example of an improvement is the API for the map function: in multiprocessing, Pool.map only passes a single argument to the function you're calling, so you have to either do partial application or use starmap. In concurrent.futures, executor.map accepts any number of iterables and passes one argument from each.
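To make that concrete, here's a minimal sketch of the two map styles side by side (the `add` function and values are made up; I use thread-backed pools so it runs anywhere):

```python
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API

def add(x, y):
    return x + y

# multiprocessing-style map() passes each item as a single argument,
# so multi-argument functions need starmap() with tuples of arguments
with Pool() as pool:
    print(pool.starmap(add, [(1, 2), (3, 4)]))       # [3, 7]

# concurrent.futures map() takes one iterable per argument
with ThreadPoolExecutor() as executor:
    print(list(executor.map(add, [1, 3], [2, 4])))   # [3, 7]
```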
The API was designed to be a standard that could be used by other libraries. Before, if you started with threads and then realised you were GIL-limited, switching from the threading module to the multiprocessing module was a complete change. With concurrent.futures, the only thing that needs to change is:
    from concurrent.futures import ThreadPoolExecutor

    with ThreadPoolExecutor() as executor:
        executor.map(...)
to
    from concurrent.futures import ProcessPoolExecutor

    with ProcessPoolExecutor() as executor:
        executor.map(...)
The API has been adopted by other third-party modules too, so you can do Dask distributed computing with:
    import distributed

    with distributed.Client().get_executor() as executor:
        executor.map(...)
or MPI with
    from mpi4py.futures import MPIPoolExecutor

    with MPIPoolExecutor() as executor:
        executor.map(...)
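Here's a self-contained sketch of that swap with the two standard-library executors (the `square` function and inputs are made up for illustration) — the calling code is identical whichever executor backs it:

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def square(x):
    return x * x

def run(executor_class):
    # Same calling code regardless of which executor class is passed in
    with executor_class() as executor:
        return list(executor.map(square, range(5)))

if __name__ == "__main__":
    print(run(ThreadPoolExecutor))   # [0, 1, 4, 9, 16]
    print(run(ProcessPoolExecutor))  # same result, using processes instead of threads
```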
> Before, if you started with threads and then realised you were GIL-limited, switching from the threading module to the multiprocessing module was a complete change
Is this true?
I've been switching back and forth between multiprocessing.Pool and multiprocessing.dummy.Pool for a very long time. Super easy, barely an inconvenience.
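For anyone who hasn't seen it, that swap is indeed a one-line import change — a minimal sketch (the `double` function is made up):

```python
from multiprocessing.dummy import Pool   # thread-backed
# from multiprocessing import Pool       # process-backed: swap the import, nothing else

def double(x):
    return 2 * x

if __name__ == "__main__":
    with Pool(4) as pool:
        print(pool.map(double, range(4)))  # [0, 2, 4, 6]
```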
> The API was designed to be a standard that could be used by other libraries. [...] The API has been adopted by other third-party modules too, so you can do Dask distributed computing [...] or MPI [...] and nothing else need change.

This is why I chose to use it to teach my Parallel Python course (https://milliams.com/courses/parallel_python/).