tl;dr; Use except GeneratorExit if your Python generator needs to know the consumer broke out.

Suppose you have a generator that yields things. After each yield you want to execute some code that does something like logging or cleaning up. Here is one such trivialized example:

The Problem


def pump():
    numbers = [1, 2, 3, 4]
    for number in numbers:
        yield number
        print("Have sent", number)
    print("Last number was sent")


for number in pump():
    print("Got", number)

print("All done")

The output is, as expected:

Got 1
Have sent 1
Got 2
Have sent 2
Got 3
Have sent 3
Got 4
Have sent 4
Last number was sent
All done

In this scenario, the consumer of the generator (the for number in pump() loop in this example) gets everything the generator generates. So after the last yield, the generator is free to do any last-minute activities that might be important, such as closing a socket or updating a database.

Suppose the consumer is getting a bit "impatient" and breaks out as soon as it has what it needed.


def pump():
    numbers = [1, 2, 3, 4]
    for number in numbers:
        yield number
        print("Have sent", number)
    print("Last number was sent")


for number in pump():
    print("Got", number)
    # THESE TWO NEW LINES
    if number >= 2:
        break

print("All done")

What do you think the output is now? I'll tell you:

Got 1
Have sent 1
Got 2
All done

In other words, the potentially important lines print("Have sent", number) (for the last number sent) and print("Last number was sent") never get executed! The generator could tell its consumer (through documentation) "Don't break! If you don't want me any more, raise a StopIteration". But that's not a feasible requirement.

The Solution

But! There is a better solution and that's to catch GeneratorExit exceptions.


def pump():
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Exception!")
    print("Last number was sent")


for number in pump():
    print("Got", number)
    if number == 2:
        break
print("All done")

Now you get what you might want:

Got 1
Have sent 1
Got 2
Exception!
Last number was sent
All done
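
It may help to see where the exception comes from. As far as I understand it, the break itself doesn't raise anything. Python calls the generator's close() method when the generator object is finalized, and close() raises GeneratorExit at the paused yield. Here is a minimal sketch that calls close() by hand. The events list is only there for illustration and is not part of the example above:

```python
def pump(events):
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            events.append(("sent", number))
    except GeneratorExit:
        # Raised at the paused `yield` when close() is called.
        events.append("exception")
    events.append("last")


events = []
gen = pump(events)
first = next(gen)   # runs up to the first yield
second = next(gen)  # records ("sent", 1), then pauses at the second yield
gen.close()         # raises GeneratorExit inside the generator
print(events)
```

Calling close() on a generator that has already finished does nothing, so it is safe to call it more than once.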

Next Level Stuff

Note that in the last example's output, it never prints Have sent 2 even though the generator really did send that number. If that's an important piece of information, you can get at it inside the except GeneratorExit block, because number still holds the last value that was yielded. Like this, for example:


def pump():
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Have sent*", number)
    print("Last number was sent")


for number in pump():
    print("Got", number)
    if number == 2:
        break
print("All done")

And the output is:

Got 1
Have sent 1
Got 2
Have sent* 2
Last number was sent
All done

The * is just in case we wanted to distinguish between a break happening or not. Depends on your application.
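
One caveat worth knowing if you go down this route: once GeneratorExit has been raised, the generator is not allowed to yield another value. Doing some cleanup and then finishing (like the examples above) is fine, but a yield inside the except block makes close() raise a RuntimeError. A small sketch, with a made-up bad_pump generator:

```python
def bad_pump():
    try:
        yield 1
    except GeneratorExit:
        # Yielding again after GeneratorExit is not allowed.
        yield "one more thing"


gen = bad_pump()
next(gen)
try:
    gen.close()
except RuntimeError as exception:
    message = str(exception)
    print(message)
```

So keep the except GeneratorExit block to logging and cleanup only.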

Comments

Ivo van der WIjk

This got me thinking (after playing recently with try/finally, inspired by golang's defer), would try/finally work here? It seems it does

def tf():
    try:
        for i in range(10):
            try:
                yield i
            finally:
                print("Sent", i)
    finally:
        print("Done")

for i in tf():
    print("Got", i)
    if i == 3:
        break

results in

Got 0
Sent 0
Got 1
Sent 1
Got 2
Sent 2
Got 3
Sent 3
Done

Peter Bengtsson

Actually, I kinda like your solution better.

However, today was the first time I've ever needed or used GeneratorExit and I guess it gives you slightly more control.

Just_Googling

They both work, but I actually prefer the except GeneratorExit approach as being more explicit about what is going on, rather than the double finally somehow magically quashing the exception and not re-raising it. That one is my head scratcher for the night.
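
For what it's worth, the finally blocks don't seem to quash anything. As far as I can tell, GeneratorExit passes through both of them and leaves the generator, and it is close() that absorbs it. A sketch that tries to show this by throwing the exception in by hand with throw(), which does not absorb it. The log parameter is an addition for illustration only:

```python
def tf(log):
    try:
        for i in range(10):
            try:
                yield i
            finally:
                log.append(("sent", i))
    finally:
        log.append("done")


# throw() raises the exception at the paused yield and lets it propagate.
log = []
gen = tf(log)
next(gen)
try:
    gen.throw(GeneratorExit())
    propagated = False
except GeneratorExit:
    propagated = True

# close() does the same thing, but swallows the GeneratorExit that comes out.
log_closed = []
gen_closed = tf(log_closed)
next(gen_closed)
result = gen_closed.close()

print(propagated, log, log_closed)
```

Both finally blocks run in both cases. The only difference is who gets to see the exception afterwards.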

Don Heidrich

The GeneratorExit seems to get lost when using a factory approach. I changed the pump definition in "The Solution" to accept a dummy var and then reference the generator as follows. I'm running Python 3.7.4. I'm wondering if this is expected behavior and I am confused, and/or maybe my approach is incorrect.

def pump(dummy):
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Exception!")
    print("Last number was sent")


pump4 = pump(4)
for number in pump4:
    print("Got", number)
    if number == 2:
        break
print("All done")

Got 1
Have sent 1
Got 2
All done
>>>


# BUT THIS WORKS!

for number in pump(4):
    print("Got", number)
    if number == 2:
        break
print("All done")

Got 1
Have sent 1
Got 2
Exception!
Last number was sent
All done
>>>

Yao

This is because you hold a reference (i.e., pump4) to the generator, so it cannot be garbage collected. GeneratorExit is raised when the generator is about to be destroyed. That's what I assume, anyway.

Don Heidrich

Yes, this makes perfect sense. If I recreate pump4 later by, say, running the example again, I do see the exception from the previous generator as it is destroyed. This is consistent with your comment. I think it's interesting to note that the exception is triggered not by the loop break per se, but by the subsequent destruction of the reference through another action such as garbage collection or reassignment of the var. Thank you Yao! I learned something important.
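
If you would rather not depend on when the reference goes away, one option is to close the generator explicitly, for example with contextlib.closing from the standard library. This is a suggestion, not something from the post. A minimal sketch, where the events list is a hypothetical stand-in for the print calls:

```python
from contextlib import closing


def pump(dummy, events):
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            events.append(("sent", number))
    except GeneratorExit:
        events.append(("closed early", number))
    events.append("last")


events = []
# closing() calls pump4.close() when the with block exits,
# even though the name pump4 still refers to the generator.
with closing(pump(4, events)) as pump4:
    for number in pump4:
        events.append(("got", number))
        if number == 2:
            break

print(events)
```

With this, the cleanup runs at the end of the with block rather than whenever the generator happens to be destroyed.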

Angel Salazar

If the generator referenced by pump4 is iterated again, the result is:

for number in pump4:
    print("Got", number)
    if number == 2:
        break
print("All done")

Have sent 2
Got 3
Have sent 3
Got 4
Have sent 4
Last number was sent
All done
>>>

The pump4 generator resumed where it left off.
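
You can check this with inspect.getgeneratorstate from the standard library: after the break, the generator is only suspended, not closed. A small sketch reusing the pump(dummy) definition from above. The variable names for the states are made up for illustration:

```python
import inspect


def pump(dummy):
    numbers = [1, 2, 3, 4]
    try:
        for number in numbers:
            yield number
            print("Have sent", number)
    except GeneratorExit:
        print("Exception!")
    print("Last number was sent")


pump4 = pump(4)
for number in pump4:
    if number == 2:
        break

# The generator is paused at the yield that produced 2.
state_after_break = inspect.getgeneratorstate(pump4)

# Iterating again resumes from that yield and collects the rest.
rest = list(pump4)
state_after_exhaustion = inspect.getgeneratorstate(pump4)

print(state_after_break, rest, state_after_exhaustion)
```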
