tl;dr; I have a lot of code that does response = requests.get(...)
in various Python projects. This is nice and simple but the problem is that networks are unreliable. So it's a good idea to wrap these network calls with retries. Here's one such implementation.
The First Hack
import time
import requests


# DON'T ACTUALLY DO THIS.
# THERE ARE BETTER WAYS. HANG ON!

def get(url):
    try:
        return requests.get(url)
    except Exception:
        # sleep for a bit in case that helps
        time.sleep(1)
        # try again
        return get(url)
This, above, is a terrible solution. It might fail for sooo many reasons. For example, SSL errors due to missing Python libraries. Or the URL might have a typo in it, like get('http:/www.example.com').
Also, perhaps it did work but the response is a 500 error from the server and you know that if you just tried again, the problem would go away.
# ALSO A TERRIBLE SOLUTION

while True:
    response = get('http://www.example.com')
    if response.status_code != 500:
        break
    else:
        # Hope it won't 500 a little later
        time.sleep(1)
What we need is a solution that does this right. Both for 500 errors and for various network errors.
The Solution
Here's what I propose:
import requests
from requests.adapters import HTTPAdapter
from requests.packages.urllib3.util.retry import Retry


def requests_retry_session(
    retries=3,
    backoff_factor=0.3,
    status_forcelist=(500, 502, 504),
    session=None,
):
    session = session or requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=status_forcelist,
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
Usage example...
response = requests_retry_session().get('https://www.peterbe.com/')
print(response.status_code)
s = requests.Session()
s.auth = ('user', 'pass')
s.headers.update({'x-test': 'true'})
response = requests_retry_session(session=s).get(
    'https://www.peterbe.com'
)
It's an opinionated solution, but it demonstrates how the pieces fit together so you can copy it and modify it to your own needs.
Testing The Solution
Suppose you try to connect to a URL that will definitely never work, like this:
t0 = time.time()
try:
    response = requests_retry_session().get(
        'http://localhost:9999',
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)
else:
    print('It eventually worked', response.status_code)
finally:
    t1 = time.time()
    print('Took', t1 - t0, 'seconds')
There is no server running on :9999 here on localhost. So the outcome of this is...
It failed :( ConnectionError
Took 1.8215010166168213 seconds
Where...
1.8 = 0 + 0.6 + 1.2
The algorithm for that backoff is documented here and it says:
A backoff factor to apply between attempts after the second try (most errors are resolved immediately by a second try without a delay). urllib3 will sleep for:
{backoff factor} * (2 ^ ({number of total retries} - 1))
seconds. If the backoff_factor is 0.1, then sleep() will sleep for [0.0s, 0.2s, 0.4s, ...] between retries. It will never be longer than Retry.BACKOFF_MAX. By default, backoff is disabled (set to 0).
It does 3 retry attempts after the first failure, with a backoff sleep escalation of 0.6s, then 1.2s. So if the server never responds at all, after a total of ~1.8 seconds it will raise an error. In this example, the measurement matches the expectation (1.82 seconds) because my laptop's DNS lookup for localhost is near instant. If it had to do a real DNS lookup, the first failure would potentially take slightly longer.
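To make that arithmetic concrete, here's a tiny sketch that reproduces those sleep times from the formula above (note that the exact delay before the very first retry varies slightly between urllib3 versions):

backoff_factor = 0.3
retries = 3

sleeps = []
for attempt in range(1, retries + 1):
    if attempt == 1:
        sleeps.append(0)  # the first retry happens immediately
    else:
        sleeps.append(backoff_factor * (2 ** (attempt - 1)))

print(sleeps)       # [0, 0.6, 1.2]
print(sum(sleeps))  # 1.8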
Works In Conjunction With timeout
Timeout configuration is not something you set up in the session. It's done on a per-request basis. httpbin makes this easy to test: with a server-side delay of 10 seconds and a timeout of 5 seconds the request can never succeed, but this time the timeout is actually applied. Same code as above but with a 5 second timeout:
t0 = time.time()
try:
    response = requests_retry_session().get(
        'http://httpbin.org/delay/10',
        timeout=5
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)
else:
    print('It eventually worked', response.status_code)
finally:
    t1 = time.time()
    print('Took', t1 - t0, 'seconds')
And the output of this is:
It failed :( ConnectionError
Took 21.829053163528442 seconds
That makes sense. Same backoff algorithm as before but now with 5 seconds for each attempt:
21.8 = 5 + 0 + 5 + 0.6 + 5 + 1.2 + 5
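In other words (rough back-of-the-envelope arithmetic, ignoring DNS and network latency):

timeout = 5
attempts = 4                    # 1 original request + 3 retries
backoff_sleeps = [0, 0.6, 1.2]

print(attempts * timeout + sum(backoff_sleeps))  # 21.8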
Works For 500ish Errors Too
This time, let's run into a 500 error:
t0 = time.time()
try:
    response = requests_retry_session().get(
        'http://httpbin.org/status/500',
    )
except Exception as x:
    print('It failed :(', x.__class__.__name__)
else:
    print('It eventually worked', response.status_code)
finally:
    t1 = time.time()
    print('Took', t1 - t0, 'seconds')
The output becomes:
It failed :( RetryError
Took 2.353440046310425 seconds
Here, the reason the total time is 2.35 seconds and not the expected 1.8 is that there's network latency between my laptop and httpbin.org. I tested with a local Flask server doing the same thing and then it took a total of 1.8 seconds.
Discussion
Yes, this suggested implementation is very opinionated. But when you've understood how it works, understood your choices and have the documentation at hand you can easily implement your own solution.
Personally, I'm trying to replace all my requests.get(...) with requests_retry_session().get(...), and when I'm making this change I make sure I set a timeout on the .get() too.
The choice to consider 500, 502 and 504 errors "retry'able" is actually very arbitrary. It totally depends on what kind of service you're reaching. Some services only return 500'ish errors if something really is broken and is likely to stay like that for a long time. But in this day and age, with load balancers protecting a cluster of web heads, a lot of 500 errors are just temporary. Obviously, if you're trying to do something very specific like requests_retry_session().post(...) with very specific parameters, you probably don't want to retry on 5xx errors.
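If you do want tight control over which methods and which status codes get retried, the Retry class takes that too. Here's a minimal sketch, assuming a reasonably recent urllib3 (the keyword argument is called allowed_methods there; in older releases it was method_whitelist, and POST is not in the default list):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Sketch: only retry GETs, and only on 502 and 504 responses.
retry = Retry(
    total=3,
    backoff_factor=0.3,
    status_forcelist=(502, 504),
    allowed_methods=frozenset(['GET']),  # method_whitelist= in older urllib3
)
session = requests.Session()
adapter = HTTPAdapter(max_retries=retry)
session.mount('http://', adapter)
session.mount('https://', adapter)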
Comments
Your first (ostensibly "horrible") solution works the best for me, the rest is too verbose.
"robust code is too verbose"
yikes
"Obivously, if you're trying to do something very specific like requests_retry_session().post(...) with very specific parameters you probably don't want to retry on 5xx errors."
Actually, this wouldn't work with the current solution: retry is not applied for POST by default - it needs to be specifically whitelisted if that's wanted (bit my ass ;) )
Otherwise, thanks for the great article!
I had to make more than 8,000 requests. My script had been stumbling after several hundred requests. Your solution—requests_retry_session()—saved my day. Thanks!
really cool thx !
nice, thank you!
cool!
This is awesome! Thank you!
Networks are unreliable, but TCP is fault-tolerant. The problem is that application servers are unreliable.
How do I also handle 404 errors with this?
Exactly, this hangs when hitting 404 errors...
The author has said: figure out what errors are retry-able and retry those. Is 404 retry-able?!
Why would you want to retry on 404?
There is no need to set connect retries or read retries; total retries takes precedence over the rest, so set it once there and it works for read, redirect, connect and status retries.
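For example, a trimmed-down variant of the session factory from the article, setting only total (just a sketch; simple_retry_session is a made-up name):

```
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def simple_retry_session(retries=3, backoff_factor=0.3):
    # total covers connect, read, redirect and status retries
    retry = Retry(
        total=retries,
        backoff_factor=backoff_factor,
        status_forcelist=(500, 502, 504),
    )
    session = requests.Session()
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session
```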
Awesome! This resolved my issue!
Pretty cool! Thank you!!!
You set status_forcelist, but status kwarg is set to None as default (according to urllib3.util.retry.Retry docs), so retries on bad-statuses-reason will never be made.
Should we specify connect=retries or I have misunderstanding?
P.S. sorry for my english
I came across this post because I'm getting this problem. How do I solve it?
I tried this, but I'm not sure if it's right because I get an error:
Add:
status=3,
method_whitelist=frozenset(['POST'])
err: requests.exceptions.RetryError: HTTPConnectionPool(host='httpbin.org', port=80): Max retries exceeded with url: /status/504 (Caused by ResponseError('too many 504 error responses',))
Did you ever sort this?
Good job!!! Thank you!!!
Googled this and it worked like a charm! Thank You.
how to use proxy?
I use this:
# sample proxy, does not work, set your own proxy
proxies = {"https": "https://381.177.84.291:9040"}
# create session
session = self.requests_retry_session()
# get request
response = session.get(url, proxies=proxies)
Thank you, this option also works:
resp = requests_retry_session().post(
    'http://httpbin.org/post',
    proxies={"http": "http://381.177.84.291:9040"}
)
Love it!
However, is there a way to print/log all responses?
E.g. When it retries 3 times, print the status code of all three requests?
I doubt it but requests uses logging. You just need to configure your logging to turn it up so you can see these kinds of things happening.
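For instance, something along these lines should make urllib3's retry messages visible (a sketch, assuming the standard library logging module; urllib3 logs on its own "urllib3" logger):

```
import logging

# Turn up logging so urllib3's connection and retry messages show up
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("urllib3").setLevel(logging.DEBUG)

try:
    requests_retry_session().get('http://httpbin.org/status/500')
except Exception as x:
    print('gave up:', x.__class__.__name__)
```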
Thanks! made my code much more reliable. Thanks for posting this for everyone to use.
there is a typo - "sesssion"
Trying the timeout example I get
NameError: global name 'time' is not defined
You need to inject ‘import time’ first.
How do you propose dealing with this situation?
https://stackoverflow.com/questions/56482980/python-requests-not-throwing-an-exception-when-using-session-with-httpadapter
I can't seem to get anyone to respond, and my script is totally broken at the moment.
Did you ever get an answer, i'm on the same boat.
Thank you for your code snippet. Works great
Excellent solution. Thank you for posting this article/solution! I had no idea that the HTTPAdapters existed. You just saved me a few hours of my life.
Excellent. Thanks for that
Does not work if requests fails to read a chunked response :(
The following will setup an HTTP server to repro (set the sleep to be greater than your read timeout):
import ssl
from time import sleep
from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer

PORT = 8001

class CustomHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        print "SLEEP"
        self.send_response(200)
        self.send_header('Transfer-Encoding', 'chunked')
        self.send_header('Content-type', 'text/plain')
        self.end_headers()
        self.wfile.write('Hello')
        sleep(5)
        self.wfile.write(', world!')
        print "WAKE"

httpd = HTTPServer(("", PORT), CustomHandler)
httpd.socket = ssl.wrap_socket(httpd.socket, keyfile='/home/local/ANT/schdavid/tmp/key.pem', certfile='/home/local/ANT/schdavid/tmp/cert.pem')

try:
    httpd.serve_forever()
except KeyboardInterrupt:
    print
    print 'Goodbye'
    httpd.socket.close()
Thanks... it worked well for me. Good Article...
It was simple but elegant. It covered almost everything. Keep up the good work!
Very good idea. I'll try this on my scripts. Thanks!
How to do mock unit testing on request_retry_session?
Do you have to? Also, doesn't that depend greatly on how you mock `requests`?
How could you apply this to the Spotipy library?
Use https://pypi.org/project/redo/ instead and watch for certain HTTPErrors
Thanks man. I will look into it; so far I'm having luck with catching http_status from exceptions in preliminary testing. I'll see if redo is easier to implement.
I think you just saved my bachelor thesis
Thank you for this! Networks are unreliable systems, so it is strange that this is not the default behaviour.
I took this + session timeouts to make a mini package: https://pypi.org/project/retry-requests/
Thank you for the blog post, it is very helpful. What is the license of the code in the blog post?
No license. Help yourself.
This mean we can use it for commercial purpose without any restriction?
Yes.
I hope you would consider giving back too, if you don't do it already :-)
5xx error responses might include a retry-after header, which you should honor. See https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Retry-After
Good point! But it belongs to requests.packages.urllib3.util.retry.Retry
Do you know if it supports it already?
It shouldn't. retry-after could give you a date next week, that lib shouldn't hang your script until then.
Great solution! How can it be adapted so that if the request fails for a SSL certificate issue it retries but this time with verify=false. I've also asked on Stackoverflow but receive no reply: https://stackoverflow.com/questions/62258005/requests-retry-with-verify-false-if-sslerror
Thanks for your help.
Something like this?
```
from requests.exceptions import SSLError

session = requests_retry_session()
try:
    r = session.get(url)
except SSLError:
    r = session.get(url, verify=False)
```
Thank you so much Peter. I'll give it a go!
Thank you! I am implementing something like this, but how can I have a 2 minute interval between retries? Should I use timeout = 120?
Hi,
I tried your code with the following:
```
def get_session():
    result = requests.Session()
    retries = Retry(
        total=3, connect=3, read=3,
        redirect=3, status=3, backoff_factor=1,
        status_forcelist=[429, 500, 502, 503, 504],
        method_whitelist=[
            "HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE"],
    )
    adapter = HTTPAdapter(max_retries=retries)
    result.mount("http://", adapter)
    result.mount("https://", adapter)
    return result
```
In my unit test, I did the following:
```
@responses.activate
def test_get_session_500_retry(self):
    responses.add(
        responses.POST,
        self.url,
        status=500,
        json={'something': 'nothing'}
    )
    session = get_session()
    session.hooks["response"] = [logging_hook]
    print(f"url({self.url})")
    wait_time = datetime.now() + timedelta(seconds=10)
    r = session.post(self.url, timeout=10)
    waited_time = wait_time - datetime.now()
    self.assertGreaterEqual(waited_time, timedelta(seconds=0))
    self.assertEqual(r.status_code, 500)
    assert responses.assert_call_count(self.url, 1) is True
```
The strange part to me is that the assert_call_count is 1 instead of 3 which I set in the config
I think all bets are off when you use one of those request/response mocking libs.
How do you test reliably that the retries in the call work as expected?
use something like mockoon (https://mockoon.com) and set up HTTP routes for 200 OK and a couple statuses in your status_forcelist. turn on random responses and you'll see it working in the logs.
Very nice piece of code :-)
And thank you for sharing.
POST not working
I like your first solution.
great article! works great inside a custom api wrapper.
If the server response is a 503 (Service Unavailable), typically because of an update or maintenance, I would check for the "Retry-After" header and, if it's there, retry after that many seconds. Hope this helps anyone :)
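A minimal sketch of that idea (assuming the header carries a number of seconds rather than an HTTP date):

```
import time
import requests

response = requests.get('https://www.example.com/')
if response.status_code == 503:
    retry_after = response.headers.get('Retry-After')
    if retry_after and retry_after.isdigit():
        # Wait as long as the server asked, then try once more
        time.sleep(int(retry_after))
        response = requests.get('https://www.example.com/')
```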