Why should we use generators? [Python]

Utpal Kumar

Generators don’t hold the entire result in memory; they yield one result at a time.

Before we get into the idea of generators, we need to understand the difference between “iterables” and “iterators”.

Iterables and Iterators

Let me start by saying that lists, tuples, strings, dictionaries, etc. are iterables. Let us see an example:

testList = [1, 2, 3]
for val in testList:
    print(val)

This will output (as we can guess):

1
2
3

For an object to be iterable, it generally needs to have an __iter__() method. We can check whether our list has this method using the built-in dir function:

print(dir(testList))

This outputs a list:

['__add__', '__class__', '__contains__', '__delattr__', '__delitem__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__gt__', '__hash__', '__iadd__', '__imul__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__reversed__', '__rmul__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', 'append', 'clear', 'copy', 'count', 'extend', 'index', 'insert', 'pop', 'remove', 'reverse', 'sort']

As we can see, it has an __iter__ method. An iterator, unlike a plain iterable, has state: it knows where it is during an iteration and how to get the next value.
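
The same distinction can be checked programmatically instead of reading the dir output by eye. The following is a minimal sketch that uses the Iterable and Iterator abstract base classes from the standard collections.abc module; the helper name describe is hypothetical and only serves the illustration.

```python
from collections.abc import Iterable, Iterator

def describe(obj):
    """Return (is_iterable, is_iterator) for obj."""
    return isinstance(obj, Iterable), isinstance(obj, Iterator)

testList = [1, 2, 3]
testIter = iter(testList)

print(describe(testList))          # (True, False): iterable, but not an iterator
print(describe(testIter))          # (True, True): an iterator is also iterable
print(iter(testIter) is testIter)  # True: an iterator returns itself from iter()
```

A list passes the Iterable check but not the Iterator check, because it has __iter__ but no __next__.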

The list object in the above example is not an iterator, so if we ask it for its next value, it cannot answer:

print(next(testList))

This throws an error:

Traceback (most recent call last):
  File "iterables.py", line 7, in <module>
    print(next(testList))
TypeError: 'list' object is not an iterator

If we create an iterator from the list using its __iter__ method, then we can call the next function on it:

testIter = testList.__iter__()
print(testIter)

or

testIter = iter(testList)
print(testIter)

This will print:

<list_iterator object at 0x7fd6d6f2de90>

If we apply the next function now, it will work:

print(next(testIter))
print(next(testIter))
print(next(testIter))

This prints:

1
2
3

But if we call next a fourth time, it raises a StopIteration exception:

print(next(testIter))
print(next(testIter))
print(next(testIter))
print(next(testIter))

This prints:

1
2
3
Traceback (most recent call last):
  File "iterables.py", line 15, in <module>
    print(next(testIter))
StopIteration

This means that the iterator knows when it has run out of values.

If we loop over a fresh iterator with a for loop, Python automatically figures out where to stop:

for val in testIter:
    print(val)

This prints:

1
2
3

To understand this, let us perform the same operation using the while loop.

while True:
    try:
        item = next(testIter)
        print(item)
    except StopIteration:
        break
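
The while loop above is essentially what a for loop does behind the scenes. As a rough sketch, the steps can be wrapped in a small helper; the name manual_for and its action parameter are hypothetical and exist only for this illustration.

```python
def manual_for(iterable, action):
    """Call action(item) for each item, mimicking what a for loop does."""
    iterator = iter(iterable)      # step 1: ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)  # step 2: fetch the next value
        except StopIteration:      # step 3: stop quietly once it is exhausted
            break
        action(item)

collected = []
manual_for([1, 2, 3], collected.append)
print(collected)  # [1, 2, 3]
```

Because the helper calls iter first, it accepts any iterable (a list, a string, a tuple), not only an object that is already an iterator.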

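Python has several built-in objects that produce their values lazily, such as the range function we most often use (strictly speaking, range returns an iterable rather than an iterator, since it can be looped over repeatedly). We can also create our own iterator class. Let us create one similar to range: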

class rangeNew:
    def __init__(self, start, end, step=1):
        self.value = start
        self.end = end
        self.step = step

    def __iter__(self):
        return self

    def __next__(self):
        if self.value >= self.end:
            raise StopIteration

        current = self.value
        self.value += self.step
        return current

rangeIter = rangeNew(1, 10, 2)

for val in rangeIter:
    print(val)

1
3
5
7
9
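
Because __iter__ returns self, a rangeNew object is its own iterator and can be walked only once. The built-in range behaves differently in this respect: it hands out a fresh iterator every time it is looped over. The sketch below repeats the class from above so that it runs on its own; the variable name builtinRange is only for illustration.

```python
class rangeNew:
    def __init__(self, start, end, step=1):
        self.value = start
        self.end = end
        self.step = step

    def __iter__(self):
        return self  # the object is its own iterator

    def __next__(self):
        if self.value >= self.end:
            raise StopIteration
        current = self.value
        self.value += self.step
        return current

rangeIter = rangeNew(1, 10, 2)
print(list(rangeIter))  # [1, 3, 5, 7, 9]
print(list(rangeIter))  # []: self.value is already past the end

builtinRange = range(1, 10, 2)
print(list(builtinRange))  # [1, 3, 5, 7, 9]
print(list(builtinRange))  # [1, 3, 5, 7, 9]: a new iterator each time
```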

Generators

  • Generators don’t hold the entire result in memory; they yield one result at a time.
  • Ways of creating generators:

    1. Using a function

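      # Generator version: yields one square at a time
      def squares_gen(num):
          for i in num:
              yield i**2
      
      # List version: builds and returns the full list of squares
      def squares(num):
          results = []
          for i in num:
              results.append(i**2)
          return results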
      
      • Elapsed time for list: 7.360722 Seconds

      • Elapsed time for generators: 5.999999999950489e-06 Seconds

      • Difference in time taken for the list and generators: 7.360716 Seconds for num = np.arange(1,10000000)

      • Note: the generator figure is so small because calling squares_gen only creates the generator object; no squares are computed until the values are requested.

    2. Like a list comprehension

      resl = [i**2 for i in num]
      
      resg = (i**2 for i in num)
      
      • Elapsed time for list: 7.663468000000001 Seconds

      • Elapsed time for generators: 9.999999999621423e-06 Seconds

      • Difference in time taken: 7.663458000000001 Seconds for num = np.arange(1,10000000)

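  • Getting the results from the generator function:
    1. Using next:
      resg = squares_gen(num)
      print('res of generators: ', next(resg))
      print('res of generators: ', next(resg))
      print('res of generators: ', next(resg))
      
    2. Using a loop (this continues from wherever the generator currently is, so after the three next calls above it starts at the fourth value):
      for n in resg:
          print(n)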
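
To make the two ways of consuming a generator concrete, here is a small self-contained sketch. It assumes a short Python list as input instead of the large NumPy array used for the timings, and the names first, second and rest are only for illustration.

```python
def squares_gen(num):
    for i in num:
        yield i**2

resg = squares_gen([1, 2, 3, 4, 5])

first = next(resg)   # runs the function body up to the first yield
second = next(resg)  # resumes from where it paused

# A loop picks up after the values that were already taken.
rest = [n for n in resg]

print(first, second)  # 1 4
print(rest)           # [9, 16, 25]
```

Once the loop finishes, the generator is exhausted, and a further next(resg) would raise StopIteration just like the list iterator earlier.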
      

Advantages of using generators:

  1. Generator code is often more readable.
  2. Generators are much faster to create and use little memory, because values are computed only when they are requested.
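
To illustrate the readability point, the rangeNew iterator class from earlier can be written as a generator function in a few lines. This is only a sketch: range_gen is a hypothetical name, and like rangeNew it assumes a positive step.

```python
def range_gen(start, end, step=1):
    """Yield start, start + step, ... for as long as the value is below end."""
    value = start
    while value < end:
        yield value
        value += step

print(list(range_gen(1, 10, 2)))  # [1, 3, 5, 7, 9]
```

The generator function needs no __iter__ or __next__ methods and no explicit StopIteration; Python supplies those when the function body finishes.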

Results:

  1. In these timings, the function-based versions were slightly faster than the comprehension-based versions, for both lists and generators.
  2. The difference between lists and generators is more pronounced when using a comprehension (though generators are still much faster to create in both cases).
  3. When we need the whole result at once, the time (and memory) taken to create a list or list(generator) is almost the same.

[Figures: how to use generators; memory usage]

Overall, generators give a performance boost not only in execution time but also in memory usage.
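
The memory side can be checked with a rough sketch such as the one below. It makes two assumptions that differ from the timings above: it uses the built-in range instead of np.arange so that it runs without NumPy, and it uses a smaller input. Note also that sys.getsizeof reports only the size of the container object itself, not of the integers a list refers to, so it understates the list's real footprint.

```python
import sys

num = range(1, 100001)

resl = [i**2 for i in num]  # every square is stored in the list
resg = (i**2 for i in num)  # squares are produced only on demand

list_bytes = sys.getsizeof(resl)  # grows with the number of elements
gen_bytes = sys.getsizeof(resg)   # small and independent of len(num)

print(list_bytes, gen_bytes)
print(list_bytes > gen_bytes)  # True
```

The exact byte counts depend on the Python version and platform, which is why they are not quoted here.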

Appendix

How I calculated the time taken by the process

  • Calculate sum of the system and user CPU time of the current process.
    • time.process_time provides the system and user CPU time of the current process in seconds.
    • Use time.process_time_ns to get the result in nanoseconds
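
The original timing script is not shown here, so the following is only a sketch of how such a measurement could be made with time.process_time. The helper cpu_time is a hypothetical name, and the input is a built-in range that is much smaller than the NumPy array used for the figures above.

```python
import time

def squares(num):
    results = []
    for i in num:
        results.append(i**2)
    return results

def squares_gen(num):
    for i in num:
        yield i**2

def cpu_time(func, *args):
    """Return (result, CPU seconds used) for one call of func(*args)."""
    start = time.process_time()
    result = func(*args)
    return result, time.process_time() - start

num = range(1, 200001)

resl, t_list = cpu_time(squares, num)       # computes every square
resg, t_gen = cpu_time(squares_gen, num)    # only creates the generator object
consumed, t_consume = cpu_time(list, resg)  # the squares are computed here

print(f"Elapsed time for list: {t_list:.6f} seconds")
print(f"Elapsed time for creating the generator: {t_gen:.6f} seconds")
print(f"Elapsed time for consuming the generator: {t_consume:.6f} seconds")
```

Timing the consumption step separately shows where the work goes: creating the generator is nearly free, while turning it into a full list costs about as much as building the list directly.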

NOTE: The “time taken” shown in this study depends on the computer and varies from run to run with the state of the CPU. In every run, however, creating the generator was much faster than building the list.

References:

  1. Python Tutorial: Iterators and Iterables - What Are They and How Do They Work? by Corey Schafer

