Parallel HTTP requests in Python

Edit: thankfully this is now outdated! See Parallel asynchronous GET requests with asyncio instead
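
For reference, here's a minimal sketch of that asyncio-style approach, assuming aiohttp is installed (the linked page may structure it differently, and the URLs are just placeholders):

    import asyncio
    from aiohttp import ClientSession

    async def fetch(session, url):
        # one GET request; returns (url, body text)
        async with session.get(url) as response:
            return url, await response.text()

    async def multi_get_async(urls):
        # launch all requests concurrently; gather preserves input order
        async with ClientSession() as session:
            return await asyncio.gather(*(fetch(session, url) for url in urls))

    results = asyncio.run(multi_get_async(["https://example.com", "https://example.org"]))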

  • I've come across this problem in Go, where it looked hairy, but the solution in Python seems simple (and even old news!)

  • Will Larson's 2008 blog post shares the following (which I've Py3-ified, swapping urllib for requests):

    from threading import Thread
    from requests import get
    from time import sleep
    
    UPDATE_INTERVAL = 0.01  # polling granularity (seconds) for the timeout loop
    
    class URLThread(Thread):
        def __init__(self, url):
            super().__init__()
            self.url = url
            self.response = None
    
        def run(self):
            self.response = get(self.url)
    
    def multi_get(uris, timeout=2.0):
        def alive_count(lst):
            # count the threads whose request hasn't finished yet
            return sum(1 for x in lst if x.is_alive())
        threads = [URLThread(uri) for uri in uris]
        for thread in threads:
            thread.start()
        # poll until every request has completed or the timeout budget is spent
        while alive_count(threads) > 0 and timeout > 0.0:
            timeout = timeout - UPDATE_INTERVAL
            sleep(UPDATE_INTERVAL)
        return [(x.url, x.response) for x in threads]
    
  • call multi_get with a list of URLs and it returns a list of (url, response) tuples in the same order as the input list once the last request completes (or the timeout expires, in which case any unfinished entries have a response of None); a quick usage sketch follows below
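
A quick usage sketch of multi_get as defined above (the URLs here are just placeholders):

    sites = [
        "https://example.com",
        "https://example.org",
        "https://example.net",
    ]
    for url, response in multi_get(sites, timeout=5.0):
        if response is None:
            print(f"{url}: timed out")
        else:
            print(f"{url}: {response.status_code} ({len(response.content)} bytes)")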