tl;dr; I have a lot of code that does response = requests.get(...)
in various Python projects. This is nice and simple but the problem is that networks are unreliable. So it's a good idea to wrap these network calls with retries. Here's one such implementation.
The First Hack
import time
import requests


# DON'T ACTUALLY DO THIS.
# THERE ARE BETTER WAYS. HANG ON!

def get(url):
    try:
        return requests.get(url)
    except Exception:
        # sleep for a bit in case that helps
        time.sleep(1)
        # try again
        return get(url)
This, above, is a terrible solution. It might fail for sooo many reasons. For example, SSL errors due to missing Python libraries. Or the URL might have a typo in it, like get('http:/www.example.com').
Also, perhaps it did work but the response is a 500 error from the server and you know that if you just tried again, the problem would go away.
# ALSO A TERRIBLE SOLUTION

while True:
    response = get('http://www.example.com')
    if response.status_code != 500:
        break
    else:
        # Hope it won't 500 a little later
        time.sleep(1)
What we need is a solution that does this right. Both for 500 errors and for various network errors.
The Solution
Here's what I propose:
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry


def requests_retry_session(
    retries=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    session=None,
):
    session = session or requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
Usage example...
response = requests_retry_session().get('https://www.peterbe.com/')
print(response.status_code)
s = requests.Session()
s.auth = ('user', 'pass')
s.headers.update({'x-test': 'true'})
response = requests_retry_session(session=s).get(
    'https://www.peterbe.com'
)
It's an opinionated solution, but it demonstrates how the pieces fit together so you can copy it and modify it to your own needs.
Testing The Solution
Suppose you try to connect to a URL that will definitely never work, like this:
t0 = time.time()
try:
    response = requests_retry_session().get(
        'http://localhost:9999',
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)
else:
    print('It eventually worked', response.status_code)
finally:
    t1 = time.time()
    print('Took', t1 - t0, 'seconds')
There is no server running on :9999 here on localhost. So the outcome of this is...
It failed :( ConnectionError
Took 1.8215010166168213 seconds
Where...
1.8 = 0 + 0.6 + 1.2
The algorithm for that backoff is documented here and it says:
A backoff factor to apply between attempts after the second try (most errors are resolved immediately by a second try without a delay). urllib3 will sleep for:
{backoff factor} * (2 ^ ({number of total retries} - 1))
seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.0s, 0.2s, 0.4s, ...] between retries. It will never be longer than Retry.BACKOFF_MAX. By default, backoff is disabled (set to 0).
It does 3 retry attempts after the first failure, with a backoff sleep escalation of 0.6s, then 1.2s. So if the server never responds at all, after a total of ~1.8 seconds it will raise an error. In this example, the measurement matches the expectation (1.82 seconds) because my laptop's DNS lookup for localhost is near instant. If it had to do a real DNS lookup, the first failure would potentially take slightly longer.
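To make that arithmetic concrete, here's a tiny sketch that reproduces those sleep times from the formula above (note that the exact delay before the very first retry varies slightly between urllib3 versions):

backoff_factor = 0.3
retries = 3

sleeps = []
for attempt in range(1, retries + 1):
    if attempt == 1:
        sleeps.append(0)  # the first retry happens immediately
    else:
        sleeps.append(backoff_factor * (2 ** (attempt - 1)))

print(sleeps)       # [0, 0.6, 1.2]
print(sum(sleeps))  # 1.8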
Works In Conjunction With timeout
Timeout configuration is not something you set up in the session. It's done on a per-request basis. httpbin makes this easy to test: with a server-side delay of 10 seconds and a timeout of 5 seconds the request can never succeed, but this time the timeout is actually applied. Same code as above but with a 5 second timeout:
t0 = time.time()
try:
    response = requests_retry_session().get(
        'http://httpbin.org/delay/10',
        timeout=5
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)
else:
    print('It eventually worked', response.status_code)
finally:
    t1 = time.time()
    print('Took', t1 - t0, 'seconds')
And the output of this is:
It failed :( ConnectionError
Took 21.829053163528442 seconds
That makes sense. Same backoff algorithm as before but now with 5 seconds for each attempt:
21.8 = 5 + 0 + 5 + 0.6 + 5 + 1.2 + 5
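In other words (rough back-of-the-envelope arithmetic, ignoring DNS and network latency):

timeout = 5
attempts = 4                    # 1 original request + 3 retries
backoff_sleeps = [0, 0.6, 1.2]

print(attempts * timeout + sum(backoff_sleeps))  # 21.8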
Works For 500ish Errors Too
This time, let's run into a 500 error:
t0 = time.time()
try:
    response = requests_retry_session().get(
        'http://httpbin.org/status/500',
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)
else:
    print('It eventually worked', response.status_code)
finally:
    t1 = time.time()
    print('Took', t1 - t0, 'seconds')
The output becomes:
It failed :( RetryError
Took 2.353440046310425 seconds
Here, the reason the total time is 2.35 seconds and not the expected 1.8 is that there's network latency between my laptop and httpbin.org. I tested with a local Flask server doing the same thing and then it took a total of 1.8 seconds.
Discussion
Yes, this suggested implementation is very opinionated. But when you've understood how it works, understood your choices and have the documentation at hand you can easily implement your own solution.
Personally, I'm trying to replace all my requests.get(...) with requests_retry_session().get(...), and when I'm making this change I make sure I set a timeout on the .get() too.
The choice to consider 500, 502 and 504 errors "retry'able" is actually very arbitrary. It totally depends on what kind of service you're reaching. Some services only return 500'ish errors if something really is broken and is likely to stay like that for a long time. But in this day and age, with load balancers protecting a cluster of web heads, a lot of 500 errors are just temporary. Obviously, if you're trying to do something very specific like requests_retry_session().post(...) with very specific parameters, you probably don't want to retry on 5xx errors.
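If you do want tight control over which methods and which status codes get retried, the Retry class takes that too. Here's a minimal sketch, assuming a reasonably recent urllib3 (the keyword argument is called allowed_methods there; in older releases it was method_whitelist, and POST is not in the default list):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Sketch: only retry GETs, and only on 502 and 504 responses.
retry = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=(502, 504),
    allowed_methods=frozenset(['GET']),  # method_whitelist= in older urllib3
)
session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)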
Comments
Your first (ostensibly "horrible") solution works the best for me, the rest is too verbose.
"robust code is too verbose"
yikes
"Obivously, if you're trying to do something very specific like requests_retry_session().post(...) with very specific parameters you probably don't want to retry on 5xx errors."
Actually, this wouldn't work with the current solution: retry is not applied for POST by default - it needs to be specifically whitelisted if that's wanted (bit my ass ;) )
Otherwise, thanks for the great article!
I had to make more than 8,000 requests. My script had been stumbling after several hundred requests. Your solution—requests_retry_session()—saved my day. Thanks!
really cool thx !
nice, thank you!
cool!
This is awesome! Thank you!
Networks are unreliable, but TCP is fault-tolerant. The problem is that application servers are unreliable.
How do I also handle 404 errors with this?
Exactly, this hangs when hitting 404 errors...
The author has said: figure out what errors are retry-able and retry those. Is 404 retry-able?!
Why would you want to retry on 404?
There is no need to set connect retries or read retries; total retries takes precedence over the rest, so set it once there and it works for read, redirect, connect and status retries.
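For example, a trimmed-down variant of the session factory from the article, setting only total (just a sketch; simple_retry_session is a made-up name):

```
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def simple_retry_session(retries=3, backoff_factor=0.3):
    # total covers connect, read, redirect and status retries
    retry = Retry(
        total=retries,
        backoff_factor=backoff_factor,
        status_forcelist=(500, 502, 504),
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
```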
Awesome! This resolved my issue!
Pretty cool! Thank you!!!
You set status_forcelist, but status kwarg is set to None as default (according to urllib3.util.retry.Retry docs), so retries on bad-statuses-reason will never be made.
Should we specify connect=retries or I have misunderstanding?
P.S. sorry for my english
I came across this post because I'm getting this problem. How do I solve it?
I tried this, but I'm not sure if it's right because I get an error:
Add:
status=3,
method_whitelist=frozenset(['POST'])
err: requests.exceptions.RetryError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: /status/504 (Caused by ResponseError('too many 504 error responses',))
Did you ever sort this?
Good job!!! Thank you!!!
Googled this and it worked like a charm! Thank You.
how to use proxy?
I use this:
# sample proxy, does not work, set your own proxy
proxies = {"https": "https://381.177.84.291:9040"}
# create session
session = self.requests_retry_session()
# get request
response = session.get(url, proxies=proxies)
Thank you, this option also works:
resp = requests_retry_session().post(
    'http://httpbin.org/post',
    proxies={"http": "http://381.177.84.291:9040"}
)
Love it!
However, is there a way to print/log all responses?
E.g. When it retries 3 times, print the status code of all three requests?
I doubt it but requests uses logging. You just need to configure your logging to turn it up so you can see these kinds of things happening.
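For instance, something along these lines should make urllib3's retry messages visible (a sketch, assuming the standard library logging module; urllib3 logs on its own "urllib3" logger):

```
import logging

# Turn up logging so urllib3's connection and retry messages show up
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)

try:
    requests_retry_session().get('http://httpbin.org/status/500')
except Exception as x:
    print('gave up:', x.__class__.__name__)
```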
Thanks! made my code much more reliable. Thanks for posting this for everyone to use.
there is a typo - "sesssion"
Trying the timeout example I get
NameError: global name 'time' is not defined
You need to inject ‘import time’ first.
How do you propose dealing with this situation?
https://stackoverflow.com/questions/56482980/python-requests-not-throwing-an-exception-when-using-session-with-httpadapter
I can't seem to get anyone to respond, and my script is totally broken at the moment.
Did you ever get an answer, i'm on the same boat.
Thank you for your code snippet. Works great
Excellent solution. Thank you for posting this article/solution! I had no idea that the HTTPAdapters existed. You just saved me a few hours of my life.
Excellent. Thanks for that
Does not work if requests fails to read a chunked response :(
The following will setup an HTTP server to repro (set the sleep to be greater than your read timeout):
import ssl
from time import sleep
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

PORT = 8001

class CustomHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        print "SLEEP"
        self.send_response(200)
        self.send_header('Transfer-Encoding', 'chunked')
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        self.wfile.write('Hello')
        sleep(5)
        self.wfile.write(', world!')
        print "WAKE"

httpd = HTTPServer(("", PORT), CustomHandler)
httpd.socket = ssl.wrap_socket(httpd.socket, keyfile='/home/local/ANT/schdavid/tmp/key.pem', certfile='/home/local/ANT/schdavid/tmp/cert.pem')

try:
    httpd.serve_forever()
except KeyboardInterrupt:
    print
    print 'Goodbye'
    httpd.socket.close()
Thanks... it worked well for me. Good Article...
It was simple but elegant. It covered almost everything. Keep up the good work!
Very good idea. I'll try this on my scripts. Thanks!
How to do mock unit testing on request_retry_session?
Do you have to? Also, doesn't that depend greatly on how you mock `requests`?
How could you apply this to the Spotipy library?
Use https://pypi.org/project/redo/ instead and watch for certain HTTPErrors
Thanks man. I will look into it; so far I'm having luck with catching http_status from exceptions in preliminary testing. I'll see if redo is easier to implement.
I think you just saved my bachelor thesis
Thank you for this! Networks are unreliable systems, so it is strange that this is not the default behaviour.
I took this + session timeouts to make a mini package: https://pypi.org/project/retry-requests/
Thank you for the blog post, it is very helpful. What is the license of the code in the blog post?
No license. Help yourself.
This mean we can use it for commercial purpose without any restriction?
Yes.
I hope you would consider giving back too, if you don't do it already :-)
5xx error responses might include a retry-after header, which you should honor. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
Good point! But it belongs to requests.packages.urllib3.util.retry.Retry
Do you know if it supports it already?
It shouldn't. retry-after could give you a date next week, that lib shouldn't hang your script until then.
Great solution! How can it be adapted so that if the request fails for a SSL certificate issue it retries but this time with verify=false. I've also asked on Stackoverflow but receive no reply: https://stackoverflow.com/questions/62258005/requests-retry-with-verify-false-if-sslerror
Thanks for your help.
Something like this?
```
from requests.exceptions import SSLError

session = requests_retry_session()
try:
    r = session.get(url)
except SSLError:
    r = session.get(url, verify=False)
```
Thank you so much Peter. I'll give it a go!
Thank you! I am implementing something like this, but how can I have a 2 minute interval between retries? Should I use timeout = 120?
Hi,
I tried your code with the following:
```
def get_session():
    result = requests.Session()
    retries = Retry(
        total=3, connect=3, read=3,
        redirect=3, status=3, backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
        method_whitelist=[
            "HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"],
    )
    adapter = HTTPAdapter(max_retries=retries)
    result.mount("http://", adapter)
    result.mount("https://", adapter)
    return result
```
In my unit test, I did the following:
```
@responses.activate
def test_get_session_500_retry(self):
    responses.add(
        responses.POST,
        self.url,
        status=500,
        json={'something': 'nothing'}
    )
    session = get_session()
    session.hooks["response"] = [logging_hook]
    print(f"url({self.url})")
    wait_time = datetime.now() + timedelta(seconds=10)
    r = session.post(self.url, timeout=10)
    waited_time = wait_time - datetime.now()
    self.assertGreaterEqual(waited_time, timedelta(seconds=0))
    self.assertEqual(r.status_code, 500)
    assert responses.assert_call_count(self.url, 1) is True
```
The strange part to me is that the assert_call_count is 1 instead of 3 which I set in the config
I think all bets are off when you use one of those request/response mocking libs.
How do you test reliably that the retries in the call work as expected?
use something like mockoon (https://mockoon.com) and set up HTTP routes for 200 OK and a couple statuses in your status_forcelist. turn on random responses and you'll see it working in the logs.
Very nice piece of code :-)
And thank you for sharing.
POST not working
I like your first solution.
great article! works great inside a custom api wrapper.
If the server response is a 503 (Service Unavailable), typically because of an update or maintenance, I would check for the "Retry-After" header and, if it's there, retry after that many seconds. Hope this helps anyone :)
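A minimal sketch of that idea (assuming the header carries a number of seconds rather than an HTTP date):

```
import time
import requests

response = requests.get('https://www.example.com/')
if response.status_code == 503:
    retry_after = response.headers.get('Retry-After')
    if retry_after and retry_after.isdigit():
        # Wait as long as the server asked, then try once more
        time.sleep(int(retry_after))
        response = requests.get('https://www.example.com/')
```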