Iterables vs Iterators in Python - Writing efficient code

Do you know what an iterable is? And an iterator? How do you recognize these structures in Python? Answering these questions is the goal of this article.

Iterables

Let’s start by creating a function that receives a collection of values, squares them, and returns a list with the squares:

def square_of_numbers(iterable_of_numbers):
    result = []
    for number in iterable_of_numbers:
        result.append(number**2)
    return result

Note that the function parameter was named iterable_of_numbers. The goal is to demonstrate what can be iterable and how to recognize an iterable.

Let’s start with the language’s documentation. In its glossary, an iterable is defined as:

An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method that implements sequence semantics.

Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with the iterator objects yourself. The for statement does this automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop.

Nothing beats an example for understanding these technicalities. Let’s start by creating a variable numbers that will be a tuple of integers, since the glossary says that sequence types are naturally iterable:

numbers = (1, 2, 3)

Note that the documentation says that an iterable can be recognized by its __iter__() or __getitem__() method. To list the attributes and methods available on an object, use the built-in dir() function:

dir(numbers)

['__add__',
 '__class__',
 '__class_getitem__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'count',
 'index']

Note that both methods are present. If you have difficulty finding them in the list, the following code confirms their presence (and may leave you curious to read up on set in Python 🙂):

set(('__iter__', '__getitem__')) & set(dir(numbers))

{'__getitem__', '__iter__'}
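A less manual check can be sketched as follows. This is only an illustration: is_iterable is a hypothetical helper name, not something from the standard library. It simply asks iter() whether it accepts the object, which covers both __iter__() and the __getitem__() fallback. The isinstance() test against collections.abc.Iterable, in contrast, only looks for __iter__().

```python
from collections.abc import Iterable


def is_iterable(obj):
    """Return True if iter() accepts obj (hypothetical helper)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


numbers = (1, 2, 3)

print(is_iterable(numbers))           # a tuple is iterable
print(is_iterable(42))                # an int is not
print(isinstance(numbers, Iterable))  # ABC check, looks only for __iter__()
```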

OK, numbers is an iterable. Following the glossary text, it is possible to go through each item using a for loop. Look again at the body of the function created at the beginning of the article and notice that this is exactly what happens.

square_of_numbers(numbers)

[1, 4, 9]

As an exercise, define numbers as a list and check that it works just the same. Create your own exercises with the types mentioned in the glossary to understand the subject even better.
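One possible way to approach that exercise is sketched below. The function is repeated so the snippet runs on its own, and Countdown is a hypothetical class invented for this illustration: it has no __iter__() method and is iterable only because its __getitem__() follows sequence semantics (integer indexes starting at 0, IndexError at the end).

```python
def square_of_numbers(iterable_of_numbers):
    result = []
    for number in iterable_of_numbers:
        result.append(number**2)
    return result


class Countdown:
    """Hypothetical class that is iterable only through __getitem__()."""

    def __init__(self, start):
        self.start = start

    def __getitem__(self, index):
        # Raising IndexError tells the for loop that the sequence is over.
        if index >= self.start:
            raise IndexError(index)
        return self.start - index


from_list = square_of_numbers([1, 2, 3])
from_dict = square_of_numbers({1: "a", 2: "b"})  # iterating a dict yields its keys
from_range = square_of_numbers(range(1, 4))
from_custom = square_of_numbers(Countdown(3))    # iterates over 3, 2, 1

print(from_list, from_dict, from_range, from_custom)
```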

Another way to recognize numbers as an iterable is through a list comprehension, since, in this case, a for loop also occurs:

[number**2 for number in numbers]

[1, 4, 9]

The glossary also says that an iterable can be passed to built-in functions such as map(), which applies a function to every item of an iterable. Let’s use an anonymous function that also squares each item:

map(lambda x : x**2, numbers)

Note that map returns an object in memory that is actually an iterator, our next subject. But, before that, just to show that it worked, let’s pass the map object to list():

list(map(lambda x : x**2, numbers))

[1, 4, 9]
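Since the title promises efficient code, here is a small sketch of why it matters that map returns an object instead of a ready-made list: nothing is computed until something consumes it. The square function and the calls list are names made up for this illustration; they only record when the function actually runs.

```python
calls = []


def square(x):
    # Record every call so we can see when the work actually happens.
    calls.append(x)
    return x**2


numbers = (1, 2, 3)
lazy = map(square, numbers)

calls_before = list(calls)  # still empty: map has not called square yet
squares = list(lazy)        # consuming the map object triggers the calls
calls_after = list(calls)

print(calls_before, squares, calls_after)
```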

Iterators

Let’s consult the glossary again:

An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available, a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators must have a __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.

To understand, let’s create an iterator by passing our numbers variable to the iter() function and assigning it to a new variable num_iter:

num_iter = iter(numbers)

Let’s check the type of the created variable:

type(num_iter)

tuple_iterator

As the type itself shows, it is an iterator. Being an iterator, following the documentation, there must be a __iter__() method and also a __next__(). Let’s check:

dir(num_iter)

['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getstate__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__length_hint__',
 '__lt__',
 '__ne__',
 '__new__',
 '__next__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__subclasshook__']

set(('__iter__', '__next__')) & set(dir(num_iter))

{'__iter__', '__next__'}

Now, let’s understand what the documentation means by successive calls to next():

next(num_iter)

Note that only the first number was displayed. Let’s make three more calls:

next(num_iter)

next(num_iter)

next(num_iter)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[16], line 1
----> 1 next(num_iter)

StopIteration:

As the documentation describes, when the iterator is exhausted, a StopIteration exception is raised, indicating that the entire iterator has been consumed.
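If you would rather not see the traceback, there are two common options, sketched here with a fresh iterator: catch StopIteration yourself, or give next() a second argument, which is returned as a default once the iterator is exhausted. The choice of None as the default is just an assumption for this example.

```python
numbers = (1, 2, 3)
num_iter = iter(numbers)

# Five calls on a three-item iterator: the last two return the default.
items = [next(num_iter, None) for _ in range(5)]
print(items)

# Without a default, the exhausted iterator raises StopIteration.
try:
    next(num_iter)
    exhausted = False
except StopIteration:
    exhausted = True

print(exhausted)
```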

Just to clarify a question that may be going through your head: an iterable is not necessarily an iterator. Example:

next(numbers)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 next(numbers)

TypeError: 'tuple' object is not an iterator

See that numbers itself is not an iterator although it is an iterable.
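The difference also shows up when you try to iterate more than once, as the glossary warns. A minimal sketch, recreating the variables so it runs on its own: the tuple hands out a fresh iterator on every iter() call, while the iterator just returns itself and looks empty on the second pass.

```python
numbers = (1, 2, 3)
num_iter = iter(numbers)

# An iterator returns itself from iter(); a container builds a new iterator.
same_object = iter(num_iter) is num_iter
iter_a = iter(numbers)
iter_b = iter(numbers)
fresh_objects = iter_a is not iter_b

first_pass = list(num_iter)
second_pass = list(num_iter)  # already exhausted, so nothing is left
tuple_twice = (list(numbers), list(numbers))

print(same_object, fresh_objects)
print(first_pass, second_pass)
print(tuple_twice)
```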

Now let’s check that the map object generated in the previous section is also an iterator. Remember:

map(lambda x : x**2, numbers)

Let’s assign it to a variable to make it easier to handle the object:

num_map = map(lambda x : x**2, numbers)

Let’s see the type of the object and make successive calls to next:

type(num_map)

map

next(num_map)

next(num_map)

next(num_map)

next(num_map)

---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[24], line 1
----> 1 next(num_map)

StopIteration:

It is the same behavior seen with the object created via iter().

Since the documentation says that an iterator is also an iterable, we can pass the map object as an argument to our function:

square_of_numbers(map(lambda x : x**2, numbers))

[1, 16, 81]

The anonymous function has already squared each item, and the for loop inside square_of_numbers squares them again. Hence the values obtained.

The for loop automatically handles StopIteration, so we don’t see the exception when the iterator is consumed.
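Roughly speaking, a for loop behaves like the while loop below. This is only a hand-written approximation for illustration, not how the interpreter is actually implemented: it calls iter() once, keeps calling next(), and leaves quietly when StopIteration shows up.

```python
numbers = (1, 2, 3)
result = []

iterator = iter(numbers)  # what the for statement does first
while True:
    try:
        number = next(iterator)
    except StopIteration:
        break  # the loop ends silently, no traceback
    result.append(number**2)

print(result)
```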

When the iterator is passed into a container type, it is consumed completely, also without raising the exception:

list(map(lambda x : x**2, numbers))

[1, 4, 9]

tuple(map(lambda x : x**2, numbers))

(1, 4, 9)

If an iterator has already been partially consumed when it is passed to a container, only the remaining items end up in the container. For example, let’s recreate our map iterator and call next once:

num_map = map(lambda x : x**2, numbers)

next(num_map)

Now, let’s pass the iterator to a list:

list(num_map)

[4, 9]

Note that only the last two items appear in the list, because an iterator keeps no record of the items it has already returned. This is what the glossary means by “an object representing a stream of data”: the iterator had already returned 1, so the next items in the stream were 4 and 9, and these were passed to the list when it consumed the rest of the iterator.
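To make the protocol concrete, here is a minimal sketch of a hand-written iterator. Squares is a hypothetical class created only for this illustration: __iter__() returns the object itself, and __next__() returns the next item or raises StopIteration. The only state it keeps is its current position, which is why items already returned cannot be recovered.

```python
class Squares:
    """Hypothetical iterator over the squares of 1..limit, one at a time."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        # An iterator returns itself, which is what makes it iterable too.
        return self

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current**2


squares = Squares(3)
first = next(squares)  # consumes the first item
rest = list(squares)   # only the items not yet consumed
again = list(squares)  # exhausted: nothing left

print(first, rest, again)
```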

In case you want a historical context about iterators in the language, I recommend reading PEP 234.

Conclusion

As you may have noticed, iterators and iterables appear everywhere in the language, and you may have been using them without even realizing it. Understanding these concepts is crucial for understanding more complex structures of the language, such as generators, which will soon be the subject of an article on the site.

Did you like this article? It is part of Python Drops, a set of shorter posts focused on fundamentals of the Python language and programming in general. You can read more of these articles by searching for the “drops” tag here on the site. Until next time!

About The Author

Francisco Bustamante
