20-Generator

Introduction

Generators are a simple and powerful way to create iterators. On the surface they look like functions, but there is both a syntactic and a semantic difference. Instead of return statements, the body of a generator contains one or more yield statements.

Another important feature of generators is that the local variables and the execution state are saved automatically between calls. This is necessary because, unlike an ordinary function, successive calls to the iterator created by a generator don't start execution at the beginning of the body. Instead, each new call resumes execution right after the yield statement where the last call exited. In other words: when the Python interpreter reaches a yield statement inside an iterator generated by a generator, it records the position of this statement and the local variables, and returns from the iterator. The next time this iterator is called, it resumes execution at the statement following the previous yield. There may be more than one yield statement in the code of a generator, and a yield statement may be inside the body of a loop. If a return statement is executed in the code of a generator, the iteration ends and a StopIteration exception is raised.
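The saving of state between calls can be made visible with a small sketch. The generator name running_total and its values are made up for illustration, and the code assumes Python 3, where the built-in next(it) takes the place of the it.next() method used elsewhere in this chapter.

```python
def running_total():
    """Yield 1, 3, 6: the local variable 'total' survives between calls."""
    total = 0
    for step in (1, 2, 3):
        total += step   # 'total' still holds the value from the previous call
        yield total     # execution pauses here until the next call


it = running_total()
first = next(it)    # runs the body up to the first yield
second = next(it)   # resumes after the yield, with total == 1 preserved
third = next(it)    # resumes again, with total == 3 preserved

print(first, second, third)
```

If the body restarted from the top on every call, each call would return 1; the growing values show that both the position and the local variables were kept.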

Everything that can be done with a generator can also be implemented with a class-based iterator. The crucial advantage of generators is that the methods __iter__() and next() are created automatically (in Python 3 the latter is named __next__()).
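To make the comparison concrete, here is a rough sketch of the same iteration written twice, once as a class-based iterator and once as a generator. The names CityIterator and iter_cities are hypothetical and chosen only for this illustration; the code assumes Python 3, so the method is spelled __next__().

```python
class CityIterator:
    """Hand-written iterator, shown only for comparison with a generator."""

    def __init__(self, cities):
        self._cities = list(cities)
        self._index = 0              # the position has to be stored by hand

    def __iter__(self):
        return self                  # an iterator returns itself

    def __next__(self):              # spelled next() in Python 2
        if self._index >= len(self._cities):
            raise StopIteration      # exhaustion has to be signalled by hand
        city = self._cities[self._index]
        self._index += 1
        return city


def iter_cities(cities):
    """Generator doing the same job; the bookkeeping is implicit."""
    for city in cities:
        yield city


names = ["Konstanz", "Zurich"]
from_class = list(CityIterator(names))
from_generator = list(iter_cities(names))
print(from_class == from_generator)
```

Both versions produce the same values; the generator simply leaves the position tracking and the StopIteration signal to Python.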

Generators provide a very neat way of producing data which is huge or infinite.

The following is a simple example of a generator, which is capable of producing four city names:

    def city_generator():
        yield "Konstanz"
        yield "Zurich"
        yield "Schaffhausen"
        yield "Stuttgart"

With this generator we can create an iterator that produces the four cities Konstanz, Zurich, Schaffhausen and Stuttgart one after the other.

    >>> from city_generator import city_generator
    >>> x = city_generator()
    >>> print x.next()
    Konstanz
    >>> print x.next()
    Zurich
    >>> print x.next()
    Schaffhausen
    >>> print x.next()
    Stuttgart
    >>> print x.next()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    StopIteration
    >>>

As we can see, we have generated an iterator x in the interactive shell. Every call of the method next() returns another city. After the last city, i.e. Stuttgart, has been created, another call of x.next() raises an error, saying that the iteration has stopped, i.e. "StopIteration".

A frequently asked question is whether an iterator can be reset so that it starts the iteration all over again. There is no reset, but it's possible to create another iterator from the generator, e.g. by executing the statement "x = city_generator()" again.
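The following sketch illustrates the point with a shortened, two-city version of the generator. The variable names are chosen for illustration only.

```python
def city_generator():
    yield "Konstanz"
    yield "Zurich"


x = city_generator()
first_pass = list(x)    # consumes the iterator completely
leftover = list(x)      # the same iterator is exhausted: nothing is left

x = city_generator()    # "reset" by creating a fresh iterator
second_pass = list(x)

print(first_pass, leftover, second_pass)
```

The exhausted iterator stays exhausted; only a new call of the generator starts the sequence again.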

Though at first sight the yield statement looks like the return statement of a function, we can see in this example that there is a big difference. If we had a return statement instead of a yield in the previous example, it would be an ordinary function. But this function would always return "Konstanz" and never any of the other cities, i.e. Zurich, Schaffhausen or Stuttgart.
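A shortened side-by-side sketch of this difference; the name city_function is hypothetical and used only for the comparison.

```python
def city_function():
    return "Konstanz"
    return "Zurich"     # never reached: the first return ends the call


def city_generator():
    yield "Konstanz"
    yield "Zurich"


# Every call of the function starts from the top and returns the same value.
function_results = [city_function(), city_function()]

# The iterator created by the generator moves on from one yield to the next.
generator_results = list(city_generator())

print(function_results)
print(generator_results)
```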

Method of Operation

As we have explained in the introduction of this chapter, generators offer a convenient way to generate iterators, and that's why they are called generators.

A generator works as follows:

    • A generator is called like a function. Its return value is an iterator object. The code of the generator is not executed at this stage.

    • The iterator can be used by calling its next method. The first time, execution starts as in a function, i.e. at the first line of code within the body of the generator. The code is executed until a yield statement is reached.

    • yield returns the value of the expression that follows the keyword yield. This is like a function return, but Python keeps track of the position of this yield and stores the state of the local variables for the next call. At the next call, execution continues with the statement following the yield statement, and the variables have the same values as they had in the previous call.

    • The iterator is finished when the generator body has been completely worked through or when the program flow encounters a return statement without a value.
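The steps above can be traced with a small sketch. The generator traced and the list log are hypothetical helpers that record the order of events; the code assumes Python 3 and therefore uses the built-in next(it).

```python
log = []  # records the order in which things happen


def traced():
    log.append("body started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("body finished")


it = traced()                    # calling the generator runs none of its body
log.append("iterator created")
a = next(it)                     # runs the body up to 'yield 1'
b = next(it)                     # resumes and runs up to 'yield 2'
try:
    next(it)                     # the body ends, so StopIteration is raised
except StopIteration:
    log.append("StopIteration")

for entry in log:
    print(entry)
```

The entry "iterator created" appears before "body started", which shows that the body only begins to run with the first call of next.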

We will illustrate this behaviour in the following example of a generator which generates the Fibonacci numbers.

The Fibonacci sequence is named after Leonardo of Pisa, who was known as Fibonacci (a contraction of filius Bonacci, "son of Bonaccio"). In his textbook Liber Abaci, which appeared in the year 1202, he posed an exercise about rabbits and their breeding: It starts with a newly-born pair of rabbits, i.e. a male and a female animal. It takes one month until they can mate. At the end of the second month the female gives birth to a new pair of rabbits. Now let's suppose that every female rabbit will bring forth another pair of rabbits every month after the end of the first month. We have to mention that Fibonacci's rabbits never die. The question is how large the population will be after a certain period of time.

This produces a sequence of numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...

This sequence can be defined in mathematical terms like this:

F_n = F_{n-1} + F_{n-2}

with the seed values:

F_0 = 0 and F_1 = 1

    def fibonacci(n):
        """Fibonacci numbers generator, first n"""
        a, b, counter = 0, 1, 0
        while True:
            if counter > n:
                return
            yield a
            a, b = b, a + b
            counter += 1

    f = fibonacci(5)
    for x in f:
        print x,
    print

The generator above can be used to create the first n Fibonacci numbers, or more precisely n+1 numbers, because the 0th number is also included.
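The off-by-one can be checked directly. In the sketch below, fibonacci is the generator from the text, while fibonacci_exact is a hypothetical variant, not part of the original example, that yields exactly n values.

```python
def fibonacci(n):
    """Generator from the text: stops once counter exceeds n."""
    a, b, counter = 0, 1, 0
    while True:
        if counter > n:
            return
        yield a
        a, b = b, a + b
        counter += 1


def fibonacci_exact(n):
    """Hypothetical variant: yields exactly n numbers."""
    a, b = 0, 1
    for _ in range(n):
        yield a
        a, b = b, a + b


with_zeroth = list(fibonacci(5))        # six values, F_0 up to F_5
exactly_five = list(fibonacci_exact(5)) # five values, F_0 up to F_4

print(with_zeroth)
print(exactly_five)
```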

In the next example we show you a version which returns an endless iterator. When we use this iterator, we have to make sure that a termination criterion is applied:

    def fibonacci():
        """Fibonacci numbers generator"""
        a, b = 0, 1
        while True:
            yield a
            a, b = b, a + b

    f = fibonacci()

    counter = 0
    for x in f:
        print x,
        counter += 1
        if counter > 10:
            break
    print
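A termination criterion does not have to count elements; it can also depend on the values themselves. One possible sketch uses itertools.takewhile from the standard library, which stops at the first element that fails a test. The limit of 100 is an arbitrary choice for illustration, and the code assumes Python 3.

```python
import itertools


def fibonacci():
    """Endless Fibonacci generator, as in the text."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Take values only while they stay below the (arbitrary) limit of 100.
below_100 = list(itertools.takewhile(lambda value: value < 100, fibonacci()))
print(below_100)
```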

Recursive Generators

Like functions, generators can be programmed recursively. The following example is a generator that creates all the permutations of a given list of items.

For those who don't know what permutations are, we have a short introduction:

Formal Definition:

A permutation is a rearrangement of the elements of an ordered list. In other words: Every arrangement of n elements is called a permutation.

In the following lines we show you all the permutations of the letters a, b and c:

a b c

a c b

b a c

b c a

c a b

c b a

The number of permutations on a set of n elements is given by n!

n! = n * (n-1) * (n-2) * ... * 2 * 1

The permutation generator can be called with an arbitrary list of objects. The iterator returned by this generator generates all the possible permutations:

    def permutations(items):
        n = len(items)
        if n == 0:
            yield []
        else:
            for i in range(len(items)):
                for cc in permutations(items[:i] + items[i+1:]):
                    yield [items[i]] + cc

    for p in permutations(['r', 'e', 'd']):
        print ''.join(p)
    for p in permutations(list("game")):
        print ''.join(p)
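As a sanity check, the result of the recursive generator can be compared with itertools.permutations from the standard library and with the count n! given above. This is only a sketch, assuming Python 3; the variable names are made up for the illustration.

```python
import itertools
import math


def permutations(items):
    """Recursive generator from the text."""
    if len(items) == 0:
        yield []
    else:
        for i in range(len(items)):
            # Fix items[i] in front, then permute the remaining items.
            for rest in permutations(items[:i] + items[i + 1:]):
                yield [items[i]] + rest


ours = ["".join(p) for p in permutations(list("red"))]
reference = ["".join(p) for p in itertools.permutations("red")]

print(ours)
print(sorted(ours) == sorted(reference))   # same set of permutations
print(len(ours) == math.factorial(3))      # 3! = 6 permutations
```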

Generator of Generators

The second generator of our Fibonacci sequence example generates an iterator, which can theoretically produce all the Fibonacci numbers, i.e. an infinite number. But you shouldn't try to produce all these numbers, as we would do in the following example:

list(fibonacci())

This will very quickly show you the limits of your computer.

In most practical applications, we only need the first n elements of an "endless" iterator. We can use another generator, in our example firstn, to create the first n elements of a generator g:

    def firstn(g, n):
        for i in range(n):
            yield g.next()

The following script returns the first 10 elements of the Fibonacci sequence:

    #!/usr/bin/env python

    def fibonacci():
        """A Fibonacci number generator"""
        a, b = 0, 1
        while True:
            yield a
            a, b = b, a + b

    def firstn(g, n):
        for i in range(n):
            yield g.next()

    print list(firstn(fibonacci(), 10))
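The standard library offers a ready-made counterpart to a helper like firstn: itertools.islice. The sketch below is an alternative to the hand-written version, not a replacement prescribed by this chapter, and it assumes Python 3.

```python
import itertools


def fibonacci():
    """Endless Fibonacci generator, as in the script above."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice(iterable, n) yields at most the first n elements of the iterable.
first_ten = list(itertools.islice(fibonacci(), 10))
print(first_ten)
```

islice also stops cleanly when the underlying iterator delivers fewer than n elements, so it works for finite generators as well.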
