Do you know what an iterable is? And an iterator? How to recognize these structures in Python? Answering these questions is the goal of this article.
Let’s start by creating a function that receives a collection of values, squares each one, and returns a list of the squares:
def square_of_numbers(iterable_of_numbers):
    result = []
    for number in iterable_of_numbers:
        result.append(number**2)
    return result
Note that the function parameter was named iterable_of_numbers. The goal is to demonstrate what can be iterable and how to recognize an iterable.
Let’s start with the language’s documentation. In its glossary, an iterable is defined as:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method that implements Sequence semantics.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with the iterator objects yourself. The for statement does this automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop.
Nothing better than an example to understand these technicalities. Let’s start by creating a variable numbers that will be a tuple of integers, since the glossary says that sequence types are naturally iterable:
numbers = (1, 2, 3)
Note that the documentation says that an iterable can be recognized by the presence of an __iter__() or __getitem__() method. To check the attributes and methods available on an object, use the dir function:
dir(numbers)
['__add__', '__class__', '__class_getitem__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'count', 'index']
Note that both methods are present. If you have difficulty finding them in the list, the following code confirms their presence (and may leave you curious to read up on set in Python 🙂):
set(('__iter__', '__getitem__')) & set(dir(numbers))
{'__getitem__', '__iter__'}
OK, numbers is an iterable. Following the glossary text, it is possible to go through each item using a for loop. Look again at the body of the function created at the beginning of the article and notice that this is exactly what happens.
square_of_numbers(numbers)
[1, 4, 9]
As an exercise, define numbers as a list and see that it works equally well. Create your own exercises with the types mentioned in the glossary to understand the subject even better.
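One possible version of that exercise is sketched below. It passes a list, a dict, and an instance of a class that defines only __getitem__() to the function from the beginning of the article. The FirstThree class is a hypothetical example written for this illustration, not something from the standard library.

```python
def square_of_numbers(iterable_of_numbers):
    result = []
    for number in iterable_of_numbers:
        result.append(number**2)
    return result


class FirstThree:
    """Hypothetical class: no __iter__(), only __getitem__() with sequence semantics."""

    def __getitem__(self, index):
        # Integer indexes start at 0; raising IndexError signals the end.
        if index < 0 or index >= 3:
            raise IndexError(index)
        return index + 1


from_list = square_of_numbers([1, 2, 3])
from_dict = square_of_numbers({1: "a", 2: "b"})  # iterating a dict yields its keys
from_getitem = square_of_numbers(FirstThree())

print(from_list)     # [1, 4, 9]
print(from_dict)     # [1, 4]
print(from_getitem)  # [1, 4, 9]
```

The last call works because, in the absence of __iter__(), Python falls back to calling __getitem__() with 0, 1, 2, ... until IndexError is raised.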
Another way to recognize numbers as an iterable is through a list comprehension, since, in this case, a for loop also occurs:
[number**2 for number in numbers]
[1, 4, 9]
The glossary also says that it is possible to pass an iterable to built-in functions such as map. The map function applies a function to every item of an iterable. Let’s use an anonymous function that also squares each item:
map(lambda x : x**2, numbers)
Note that map returns an object in memory that is actually an iterator, our next subject. But, before that, just to show that it worked, let’s pass the map object to list:
list(map(lambda x : x**2, numbers))
[1, 4, 9]
Iterators
Let’s consult the glossary again:
An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available, a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators must have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
To understand, let’s create an iterator by passing our numbers variable to the iter() function and assigning it to a new variable num_iter:
num_iter = iter(numbers)
Let’s check the type of the created variable:
type(num_iter)
tuple_iterator
As the type itself shows, it is an iterator. According to the documentation, an iterator must have an __iter__() method and also a __next__() method. Let’s check:
dir(num_iter)
['__class__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__length_hint__', '__lt__', '__ne__', '__new__', '__next__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__']
set(('__iter__', '__next__')) & set(dir(num_iter))
{'__iter__', '__next__'}
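The glossary also states that an iterator’s __iter__() method returns the iterator itself, while a container produces a new iterator on each call. A minimal sketch to check both statements, reusing the same variable names (same_object and fresh_each_time are names introduced only for this illustration):

```python
numbers = (1, 2, 3)
num_iter = iter(numbers)

# An iterator's __iter__() returns the iterator object itself.
same_object = iter(num_iter) is num_iter
print(same_object)  # True

# A tuple, in contrast, hands out a new iterator on every call to iter().
fresh_each_time = iter(numbers) is not iter(numbers)
print(fresh_each_time)  # True
```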
Now, let’s understand what the documentation means by successive calls to next():
next(num_iter)
1
Note that only the first number was displayed. Let’s make three more calls:
next(num_iter)
2
next(num_iter)
3
next(num_iter)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[16], line 1
----> 1 next(num_iter)

StopIteration:
As the documentation describes, when the iterator is exhausted, a StopIteration exception is raised, indicating that the entire iterator has been consumed.
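If the exception is not wanted, the built-in next() also accepts an optional second argument that is returned when the iterator is exhausted. The sketch below uses a sentinel object named DONE, a name chosen only for this illustration:

```python
numbers = (1, 2, 3)
num_iter = iter(numbers)

# Unique sentinel, so that no real item can be mistaken for "end of stream".
DONE = object()

collected = []
while True:
    item = next(num_iter, DONE)  # returns DONE instead of raising StopIteration
    if item is DONE:
        break
    collected.append(item)

print(collected)  # [1, 2, 3]

# The iterator stays exhausted, so the default is returned again.
after_exhaustion = next(num_iter, "empty")
print(after_exhaustion)  # empty
```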
Just to clarify a question that may be going through your head: an iterable is not necessarily an iterator. Example:
next(numbers)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 next(numbers)

TypeError: 'tuple' object is not an iterator
See that numbers itself is not an iterator, although it is an iterable.
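The same distinction can be checked without triggering an exception, using the abstract base classes Iterable and Iterator from the standard library module collections.abc. A minimal sketch with the same two objects:

```python
from collections.abc import Iterable, Iterator

numbers = (1, 2, 3)
num_iter = iter(numbers)

# The tuple is iterable, but it is not an iterator.
print(isinstance(numbers, Iterable))   # True
print(isinstance(numbers, Iterator))   # False

# The object returned by iter() is both.
print(isinstance(num_iter, Iterable))  # True
print(isinstance(num_iter, Iterator))  # True
```

One caveat: isinstance(obj, Iterable) detects classes that define __iter__(), but not classes that are iterable only through __getitem__(). For those, calling iter(obj) is the reliable test.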
Now let’s check that the map object generated in the previous section is also an iterator. Remember:
map(lambda x : x**2, numbers)
Let’s assign it to a variable to make it easier to handle the object:
num_map = map(lambda x : x**2, numbers)
Let’s see the type of the object and make successive calls to next:
type(num_map)
map
next(num_map)
1
next(num_map)
4
next(num_map)
9
next(num_map)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
Cell In[24], line 1
----> 1 next(num_map)

StopIteration:
It is the same behavior seen with the object created via iter().
Since the documentation says that an iterator is also an iterable, we can pass the map object as an argument to our function:
square_of_numbers(map(lambda x : x**2, numbers))
[1, 16, 81]
Since the anonymous function already squared the values, the for loop inside the square_of_numbers function squares them again. Hence the values obtained.
The for loop automatically handles StopIteration, so we don’t see the exception when the iterator is consumed.
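To make that concrete, here is a rough sketch of what the for loop in square_of_numbers does behind the scenes, written with explicit calls to iter() and next(). It is an approximation for illustration, not how the interpreter is actually implemented:

```python
numbers = (1, 2, 3)

result = []
iterator = iter(numbers)          # the for statement calls iter() once
while True:
    try:
        number = next(iterator)   # ...and then next() on every pass
    except StopIteration:
        break                     # the exception silently ends the loop
    result.append(number**2)

print(result)  # [1, 4, 9]
```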
When the iterator is passed to a container type such as list or tuple, it is consumed completely, also without raising the exception:
list(map(lambda x : x**2, numbers))
[1, 4, 9]
tuple(map(lambda x : x**2, numbers))
(1, 4, 9)
If the iterator has already been partially consumed when it is passed to a container, only the remaining items will be included. For example, let’s recreate our map iterator and call next once:
num_map = map(lambda x : x**2, numbers)
next(num_map)
1
Now, let’s pass the iterator to a list:
list(num_map)
[4, 9]
See that only the last two items appear in the list. After all, the iterator keeps no record of the items it has already returned. This is the meaning of the phrase “an object representing a stream of data” from the glossary. The iterator had already returned the item 1; the next items in the stream were 4 and 9, and these were passed to the list when it consumed the rest of the iterator.
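The same reasoning explains the glossary’s warning about multiple iteration passes. In the sketch below (first_pass and second_pass are names introduced only for this illustration), the second attempt to consume the map iterator finds it already exhausted, while the tuple can be traversed as many times as needed:

```python
numbers = (1, 2, 3)
num_map = map(lambda x: x**2, numbers)

first_pass = list(num_map)
second_pass = list(num_map)   # the iterator is already exhausted here

print(first_pass)   # [1, 4, 9]
print(second_pass)  # []

# The tuple produces a new iterator for each pass, so nothing is lost.
print(list(numbers))  # [1, 2, 3]
print(list(numbers))  # [1, 2, 3]
```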
If you want some historical context on iterators in the language, I recommend reading PEP 234.
Conclusion
As you may have noticed, iterators and iterables appear frequently in the language, and you may have been using them without even realizing it. Understanding these concepts is crucial for understanding more complex structures of the language such as generators, which will soon be the subject of an article on the site.
Did you like this article? It is part of Python Drops, a set of shorter posts focused on fundamentals of the Python language and programming in general. You can read more of these articles by searching for the “drops” tag here on the site. Until next time!