tl;dr: Use except GeneratorExit if your Python generator needs to know that the consumer broke out.
The Problem
Suppose you have a generator that yields things. After each yield you want to execute some code that does something like logging or cleaning up. Here is one such trivialized example:
def pump():
    numbers = [1, 2, 3, 4]
    for number in numbers:
        yield number
        print("Have sent", number)
    print("Last number was sent")

for number in pump():
    print("Got", number)
print("All done")
The output is, as expected:
Got 1
Have sent 1
Got 2
Have sent 2
Got 3
Have sent 3
Got 4
Have sent 4
Last number was sent
All done
In this scenario, the consumer of the generator (the for number in pump() loop in this example) gets everything the generator generates, so after the last yield the generator is free to do any last-minute activities, which might be important, such as closing a socket or updating a database.
Suppose the consumer is getting a bit "impatient" and breaks out as soon as it has what it needed.
def pump():
    numbers = [1, 2, 3, 4]
    for number in numbers:
        yield number
        print("Have sent", number)
    print("Last number was sent")

for number in pump():
    print("Got", number)
    # THESE TWO NEW LINES
    if number >= 2:
        break
print("All done")
What do you think the output is now? I'll tell you:
Got 1
Have sent 1
Got 2
All done
In other words, the potentially important print("Have sent", number) for the last number and print("Last number was sent") never get executed! The author of the generator could tell its consumers (through documentation) "Don't break! If you don't want me any more, raise a StopIteration". But that's not a feasible requirement.
The Solution
But! There is a better solution and that's to catch GeneratorExit exceptions.
def pump():
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Exception!")
    print("Last number was sent")

for number in pump():
    print("Got", number)
    if number == 2:
        break
print("All done")
Now you get what you might want:
Got 1
Have sent 1
Got 2
Exception!
Last number was sent
All done
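In CPython, the break ends up closing the generator because the loop held the only reference to it, and closing is what raises GeneratorExit at the paused yield. To make that mechanism visible without relying on the loop, here is a minimal sketch that calls close() by hand. The pump_logged name and the events list are illustrative additions, not part of the example above; the list just stands in for the print calls.

```python
def pump_logged(events):
    """Like pump(), but records what happens in `events` instead of printing."""
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            events.append(f"Have sent {number}")
    except GeneratorExit:
        events.append("Exception!")
    events.append("Last number was sent")


events = []
gen = pump_logged(events)
first = next(gen)    # runs the generator up to the first yield
second = next(gen)   # resumes it, records "Have sent 1", pauses at the second yield
gen.close()          # raises GeneratorExit inside the generator at the paused yield
print(events)
```

Because the except block swallows GeneratorExit and the function then returns normally, close() returns without complaint.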
Next Level Stuff
Note that in the last example's output, it never prints Have sent 2 even though the generator really did send that number. If that's an important piece of information, you can get at it inside the except GeneratorExit block. Like this, for example:
def pump():
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Have sent*", number)
    print("Last number was sent")

for number in pump():
    print("Got", number)
    if number == 2:
        break
print("All done")
And the output is:
Got 1
Have sent 1
Got 2
Have sent* 2
Last number was sent
All done
The * is just in case we wanted to distinguish between a break happening or not. Depends on your application.
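If the application needs more than a marker character, one option is to let the structure of the try statement carry the distinction. The following is only a sketch under my own assumptions: pump_tracked, its parameters, and the log messages are hypothetical names, and the log list replaces printing so the two outcomes can be compared.

```python
def pump_tracked(numbers, log):
    """Hypothetical variant of pump() that records how the iteration ended."""
    try:
        for number in numbers:
            yield number
            log.append(f"sent {number}")
    except GeneratorExit:
        # The consumer stopped early; `number` is the last value handed out.
        log.append(f"sent* {number}")
        raise  # re-raise so close() sees the generator exit as usual
    else:
        # Reached only when the loop above ran to completion.
        log.append("consumer took everything")
    finally:
        # Runs in both cases.
        log.append("cleanup")


# Case 1: the consumer takes every value.
full_log = []
for _ in pump_tracked([1, 2], full_log):
    pass

# Case 2: the consumer breaks out early.
early_log = []
gen = pump_tracked([1, 2], early_log)
for number in gen:
    if number == 1:
        break
gen.close()  # explicit, so the result does not depend on garbage collection

print(full_log)
print(early_log)
```

The else branch marks a complete run, the except branch marks an early exit, and finally holds whatever must happen either way.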
Comments
This got me thinking (after playing recently with try/finally, inspired by golang's defer): would try/finally work here? It seems it does.
```
def tf():
    try:
        for i in range(10):
            try:
                yield i
            finally:
                print("Sent", i)
    finally:
        print("Done")

for i in tf():
    print("Got", i)
    if i == 3:
        break
```
results in
Got 0
Sent 0
Got 1
Sent 1
Got 2
Sent 2
Got 3
Sent 3
Done
Actually, I kinda like your solution better.
However, today was the first time I've ever needed or used GeneratorExit and I guess it gives you slightly more control.
They both work, but I actually prefer the except GeneratorExit approach as being more explicit about what is going on, rather than a double finally somehow magically quashing the exception and not re-raising it. That one is my head scratcher for the night.
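On that head scratcher: as far as I can tell, the finally blocks don't quash anything. They run while GeneratorExit passes through, the exception then leaves the generator, and it is close() that absorbs it. Here is a small probe that suggests as much. The tf_probe name and the log list are made up for this illustration.

```python
def tf_probe(log):
    try:
        try:
            yield 0
        finally:
            log.append("inner finally")
    except GeneratorExit:
        # Reached only if the exception survived the inner finally.
        log.append("GeneratorExit still propagating")
        raise
    finally:
        log.append("outer finally")


log = []
gen = tf_probe(log)
next(gen)             # advance to the yield
result = gen.close()  # raises GeneratorExit at the yield, then returns quietly
print(log)
```

The except clause fires after the inner finally has run, so the exception was still in flight at that point, and close() still returns without raising anything to the caller.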
The GeneratorExit seems to get lost when using a factory approach. I changed the pump definition in "The Solution" to accept a dummy var and then reference the generator as follows. I'm running Python 3.7.4. I'm wondering if this is expected behavior and I am confused, and/or maybe my approach is incorrect.
def pump(dummy):
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Exception!")
    print("Last number was sent")

pump4 = pump(4)
for number in pump4:
    print("Got", number)
    if number == 2:
        break
print("All done")
Got 1
Have sent 1
Got 2
All done
>>>
# BUT THIS WORKS!
for number in pump(4):
    print("Got", number)
    if number == 2:
        break
print("All done")
Got 1
Have sent 1
Got 2
Exception!
Last number was sent
All done
>>>
This is because you hold a reference (i.e., pump4) to the generator, so it cannot be garbage collected. GeneratorExit is raised when the generator is about to be destroyed. At least, that's my assumption.
Yes, this makes perfect sense. If I recreate pump4 later, say by running the example again, I do see the exception from the previous generator as it is destroyed. This is consistent with your comment. I think it's interesting to note that the exception is triggered not by the loop break per se, but by the subsequent destruction of the last reference through some other action, such as garbage collection or reassignment of the variable. Thank you Yao! I learned something important.
If the generator referenced by pump4 is iterated again, the result is:
for number in pump4:
    print("Got", number)
    if number == 2:
        break
print("All done")
Have sent 2
Got 3
Have sent 3
Got 4
Have sent 4
Last number was sent
All done
>>>
The pump4 generator resumed where it left off.
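If you want to keep a named reference like pump4 and still have the cleanup run at a predictable point, one option is to close the generator explicitly. Here is a minimal sketch using contextlib.closing from the standard library, which calls close() on its argument when the with block ends. The extra log parameter is my own addition so the order of events can be inspected; it is not part of the pump(dummy) shown in the comments above.

```python
from contextlib import closing


def pump(dummy, log):
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            log.append(f"Have sent {number}")
    except GeneratorExit:
        log.append("Exception!")
    log.append("Last number was sent")


log = []
with closing(pump(4, log)) as pump4:
    for number in pump4:
        if number == 2:
            break
    # The generator is still suspended here: pump4 keeps it alive.
    log.append("left the loop")
# Leaving the with block has called pump4.close().

leftover = list(pump4)  # a closed generator yields nothing more
print(log)
print(leftover)
```

Once closed, the generator cannot resume where it left off, so this only fits when the remaining values really are unwanted.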