Concurrency in Python with Different Functions

I love efficiency! I love the feeling of looking at code that is super optimized, even if the benefits will never be felt because, realistically, even the inefficient version I started with would finish in under a tenth of a second on a Raspberry Pi 2 (ha). Yes, there are sayings about over-optimizing, premature optimization, great vs. good enough, etc. It’s still a delight to see code that is as efficient as a language allows. Maybe it balances the feelings I get driving on the freeway…

But there are times in my various jobs where huge gains can be had by running tasks concurrently. Sometimes I need to run a SQL query across 150 databases. I could save a lot of time if I could run multiple sets of tests in parallel instead of in order. One project I work on has some APIs that need to do a few slow tasks (send an email, log to a 3rd-party service, process a file) that are not part of the API response** – the user would definitely prefer that these tasks run concurrently rather than having to wait around for them.

** This project will never have any kind of asynchronous queue, as the architect really dislikes them, despite use cases like this.

The Usual…

There are a number of really great articles on concurrency from some of my favorite sites, like “Speed Up Your Python Program With Concurrency” on realpython.com or “How To Use ThreadPoolExecutor in Python 3” on digitalocean.com. Almost everything I see, though, walks through running a single function concurrently. The most common use case is “getting websites”. An over-simplified version might look like:

import requests
from concurrent.futures import ThreadPoolExecutor


def get_website_content(url: str):
    # Fetch one URL; the timeout keeps a slow site from tying up a worker.
    return requests.get(url=url, timeout=10)

urls = [
    "https://www.example.com",
    "https://www.other_example.com",
    "https://www.third_example.com",
]

# executor.map() calls the same function once per URL on the worker threads.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = executor.map(get_website_content, urls)
    for item in results:
        print(item)

The articles linked above have good examples of this “run a single function concurrently” pattern, where one function, like get_website_content() here, is applied to many inputs.

The Less Usual…

But that’s not what I need – I need to run different functions concurrently!

So here’s a super-simple example of running different functions concurrently.

import time
from concurrent.futures import ThreadPoolExecutor, wait, as_completed


def first(n):
    time.sleep(n)
    return "first() is done"

def second(n):
    time.sleep(n)
    return "second() is done"

def third(n):
    time.sleep(n)
    return "third() is done"


with ThreadPoolExecutor(max_workers=4) as executor:
    futures = ( 
        executor.submit(first, 2),
        executor.submit(second, 1),
        executor.submit(third, 1),
    )

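    # wait() blocks until every future has finished (its default behavior).
    results = wait(futures)
    print("Done:")
    for item in results.done:
        print(item.result())
    print("\n\n")
    print("Not Done:")
    # Empty here, because wait() was called without a timeout.
    for item in results.not_done:
        print(item)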

    # Or...
    # for done in as_completed(futures):
    #     print(f"The outcome is {done.result()}")
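Once the functions differ, it helps to know which result (or exception) came from which function. Here is a minimal sketch of one way to do that, assuming hypothetical stand-ins named after the slow tasks mentioned earlier (send_email, log_event, process_file); their bodies are simulated with time.sleep, and one of them fails on purpose to show error handling.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


# Hypothetical stand-ins for slow tasks; the sleeps simulate real work.
def send_email(recipient):
    time.sleep(0.2)
    return f"email sent to {recipient}"


def log_event(event):
    time.sleep(0.1)
    return f"logged {event}"


def process_file(path):
    time.sleep(0.1)
    # Fails on purpose so the error path is exercised.
    raise ValueError(f"cannot parse {path}")


def run_all():
    outcomes = {}
    with ThreadPoolExecutor(max_workers=3) as executor:
        # Map each future back to a label so results can be attributed.
        future_to_name = {
            executor.submit(send_email, "user@example.com"): "send_email",
            executor.submit(log_event, "signup"): "log_event",
            executor.submit(process_file, "upload.csv"): "process_file",
        }
        # as_completed() yields each future as soon as it finishes.
        for future in as_completed(future_to_name):
            name = future_to_name[future]
            try:
                outcomes[name] = future.result()
            except Exception as exc:
                # result() re-raises whatever the function raised.
                outcomes[name] = f"failed: {exc}"
    return outcomes


if __name__ == "__main__":
    for name, outcome in run_all().items():
        print(f"{name}: {outcome}")
```

Catching the exception per future means one failing task does not hide the results of the others.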

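Noticeable differences from the single-function example:

  1. Each task is handed to executor.submit() with its own function and arguments, instead of one function being mapped over a list with executor.map().
  2. submit() returns a Future right away, so the results are collected afterwards with wait() or as_completed() and read with .result().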

What about the other approaches to concurrency in Python?

There is one primary reason I like using concurrent.futures: I can simply swap ThreadPoolExecutor for ProcessPoolExecutor and go from concurrent threads to concurrent processes by changing a few letters. That is relevant because:

  1. As all of the documentation will tell you, if the function(s) being run are CPU intensive (which is really never my use case), then use concurrent processes. So if I do actually need to run CPU-heavy functions, I just write my functions and change the “with ... as executor” line.
  2. In my experience, the debuggers in the IDEs I use (e.g. PyCharm by JetBrains) do not stop at breakpoints inside functions run by ThreadPoolExecutor, but they do with ProcessPoolExecutor.

    Fun tip for when I want to use concurrent threads but need to debug one of the functions: swap out ThreadPoolExecutor for ProcessPoolExecutor, put a breakpoint inside the function(s) I want to debug, run the code with the debugger enabled, then swap the executors back.
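To make the swap concrete, here is a small sketch under a few assumptions: the task functions (sum_of_squares, count_vowels), the EXECUTORS lookup, and run_tasks() are hypothetical names made up for illustration. The executor class is chosen by a single string, so moving between threads and processes is a one-word change.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


# Hypothetical tasks; defined at module level so a process pool can pickle them.
def sum_of_squares(n):
    return sum(i * i for i in range(n))


def count_vowels(text):
    return sum(1 for ch in text.lower() if ch in "aeiou")


# Both executors share the same interface, so either can be dropped in.
EXECUTORS = {
    "threads": ThreadPoolExecutor,
    "processes": ProcessPoolExecutor,
}


def run_tasks(kind="threads"):
    executor_cls = EXECUTORS[kind]
    with executor_cls(max_workers=2) as executor:
        futures = [
            executor.submit(sum_of_squares, 10),
            executor.submit(count_vowels, "Concurrency"),
        ]
        # Read results in submission order.
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Change "threads" to "processes" for CPU-bound work.
    print(run_tasks("threads"))
```

When using ProcessPoolExecutor, the submitted functions and their arguments have to be picklable, and the call should sit under an `if __name__ == "__main__":` guard on platforms that start worker processes by spawning a fresh interpreter.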
