Do you know what an iterable is? And an iterator? How to recognize these structures in Python? Answering these questions is the goal of this article.
Iterables #
Let’s start by creating a function that receives a set of values, squares them, and returns a list with such squares:
def square_of_numbers(iterable_of_numbers):
    result = []
    for number in iterable_of_numbers:
        result.append(number**2)
    return result
Note that the function parameter was named iterable_of_numbers. The goal is to
demonstrate what can be iterable and how to recognize an iterable.
Let’s start with the language’s documentation. In its glossary, an iterable is defined as:
An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict, file objects, and objects of any classes you define with an __iter__() or __getitem__() method that implements Sequence semantics.
Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with the iterator objects yourself. The for statement does this automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop.
Nothing beats an example for understanding these technicalities. Let’s start
by creating a variable numbers that will be a tuple of integers, since the
glossary says that sequence types are naturally iterable:
numbers = (1, 2, 3)
Note that the documentation says that an iterable can be recognized from a
__iter__() or __getitem__() method. To check the attributes and methods
available for an object, there is the
dir function:
dir(numbers)
['__add__',
'__class__',
'__class_getitem__',
'__contains__',
'__delattr__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getitem__',
'__getnewargs__',
'__getstate__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__iter__',
'__le__',
'__len__',
'__lt__',
'__mul__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__rmul__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'count',
'index']
Both methods are present. If you have difficulty finding them in the
list, the following code confirms their presence, and may leave you curious
to read about set in Python :-)
set(('__iter__', '__getitem__')) & set(dir(numbers))
{'__getitem__', '__iter__'}
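Inspecting dir() is not the only way to check. As a rough sketch: collections.abc.Iterable from the standard library works with isinstance(), but it only detects objects that define __iter__(). Calling iter() and catching TypeError also covers classes that rely on __getitem__(). The helper name is_iterable below is an assumption made for this illustration, not a built-in.

```python
from collections.abc import Iterable


def is_iterable(obj):
    """Return True if iter() accepts obj (hypothetical helper, not a built-in)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


numbers = (1, 2, 3)

print(isinstance(numbers, Iterable))  # True: tuple defines __iter__()
print(is_iterable(numbers))           # True
print(is_iterable(42))                # False: int has neither method
```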
OK, numbers is an iterable. Following the glossary text, it is possible to go
through each item using a for loop. Look again at the body of the function
created at the beginning of the article and notice that this is exactly what
happens.
square_of_numbers(numbers)
[1, 4, 9]
As an exercise, define numbers as a list and see that it works the same way.
Create your own exercises with the types mentioned in the glossary to
understand the subject even better.
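One possible version of that exercise is sketched below. It repeats the function from the beginning of the article and feeds it a list, a dict, and an object that is iterable only through __getitem__(). The class FirstOdds is hypothetical and exists only for this illustration.

```python
def square_of_numbers(iterable_of_numbers):
    result = []
    for number in iterable_of_numbers:
        result.append(number**2)
    return result


class FirstOdds:
    """Hypothetical class: iterable only through __getitem__()."""

    def __init__(self, count):
        self.count = count

    def __getitem__(self, index):
        # Raising IndexError tells the for loop where the sequence ends.
        if not 0 <= index < self.count:
            raise IndexError(index)
        return 2 * index + 1


print(square_of_numbers([1, 2, 3]))         # [1, 4, 9]
print(square_of_numbers({1: "a", 2: "b"}))  # [1, 4] (a dict iterates over its keys)
print(square_of_numbers(FirstOdds(3)))      # [1, 9, 25]
```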
Another way to recognize numbers as an iterable is through a list
comprehension,
since, in this case, a for loop also occurs:
[number**2 for number in numbers]
[1, 4, 9]
The glossary also says that an iterable can be passed to built-in
functions such as map.
map applies a function to every item of an iterable. Let’s create an
anonymous function that also squares all items:
map(lambda x : x**2, numbers)
<map at 0x7f80d42ccbb0>
Note that map returns an object that is actually an iterator, our
next subject. But, before that, just to show that it worked, let’s pass the
map object to list:
list(map(lambda x : x**2, numbers))
[1, 4, 9]
Iterators #
Let’s consult the glossary again:
An object representing a stream of data. Repeated calls to the iterator’s __next__() method (or passing it to the built-in function next()) return successive items in the stream. When no more data are available, a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its __next__() method just raise StopIteration again. Iterators must have a __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as a list) produces a new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container.
To understand, let’s create an iterator by passing our numbers variable to the
iter() function and assigning it to a new variable num_iter:
num_iter = iter(numbers)
Let’s check the type of the created variable:
type(num_iter)
tuple_iterator
As the type itself shows, it is an iterator. Being an iterator, following the
documentation, there must be a __iter__() method and also a __next__().
Let’s check:
dir(num_iter)
['__class__',
'__delattr__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__getstate__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__iter__',
'__le__',
'__length_hint__',
'__lt__',
'__ne__',
'__new__',
'__next__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__setstate__',
'__sizeof__',
'__str__',
'__subclasshook__']
set(('__iter__', '__next__')) & set(dir(num_iter))
{'__iter__', '__next__'}
Now, let’s understand what the documentation means by successive calls to
next():
next(num_iter)
1
Note that only the first number was displayed. Let’s make three more calls:
next(num_iter)
2
next(num_iter)
3
next(num_iter)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[16], line 1
----> 1 next(num_iter)
StopIteration:
As the documentation describes, when the iterator is exhausted, a
StopIteration exception is raised, indicating that the entire iterator has
been consumed.
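To see where that exception comes from, here is a minimal sketch of a hand-written iterator. The class SquareIterator is hypothetical and is not how tuple_iterator is actually implemented. It only follows the protocol the glossary describes: __iter__() returns the object itself and __next__() raises StopIteration when nothing is left.

```python
class SquareIterator:
    """Hypothetical iterator that yields the square of each value once."""

    def __init__(self, values):
        self._values = values
        self._position = 0

    def __iter__(self):
        # An iterator returns itself, which also makes it an iterable.
        return self

    def __next__(self):
        if self._position >= len(self._values):
            # Signal exhaustion; every later call ends up here again.
            raise StopIteration
        value = self._values[self._position]
        self._position += 1
        return value**2


squares = SquareIterator((1, 2, 3))
print(next(squares))               # 1
print(next(squares))               # 4
print(next(squares))               # 9
print(next(squares, "exhausted"))  # exhausted
```

The last call uses the optional second argument of next(), a default value that is returned instead of raising StopIteration.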
Just to clarify a question that may be going through your head: an iterable is not necessarily an iterator. For example:
next(numbers)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[17], line 1
----> 1 next(numbers)
TypeError: 'tuple' object is not an iterator
See that numbers itself is not an iterator although it is an iterable.
Now let’s check that the map object generated in the previous section is also
an iterator. Remember:
map(lambda x : x**2, numbers)
<map at 0x7f80d41ceef0>
Let’s assign it to a variable to make it easier to handle the object:
num_map = map(lambda x : x**2, numbers)
Let’s see the type of the object and make successive calls to next:
type(num_map)
map
next(num_map)
1
next(num_map)
4
next(num_map)
9
next(num_map)
---------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
Cell In[24], line 1
----> 1 next(num_map)
StopIteration:
It is the same behavior seen with the object created via iter().
As the documentation says that an iterator is also an iterable, we can pass the
map object as an argument to our function:
square_of_numbers(map(lambda x : x**2, numbers))
[1, 16, 81]
Since the anonymous function has already squared each item, the for loop
inside square_of_numbers squares them again, hence the values obtained.
The for loop automatically handles StopIteration, so we don’t see the
exception when the iterator is consumed.
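What the for loop does behind the scenes can be approximated by hand. The sketch below is a rough equivalent written with iter(), next(), and an explicit try/except. It is an illustration of the behavior, not the interpreter's actual implementation.

```python
numbers = (1, 2, 3)

# Roughly what "for number in numbers: squares.append(number**2)" does.
iterator = iter(numbers)  # ask the iterable for an iterator
squares = []
while True:
    try:
        number = next(iterator)  # fetch the next item
    except StopIteration:
        break  # exhausted: leave the loop silently
    squares.append(number**2)

print(squares)  # [1, 4, 9]
```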
When an iterator is passed to a container type such as list or tuple, it is consumed completely, also without raising the exception:
list(map(lambda x : x**2, numbers))
[1, 4, 9]
tuple(map(lambda x : x**2, numbers))
(1, 4, 9)
If an iterator has already been partially consumed when it is passed to a
container, only the remaining items end up in the container. For example, let’s
recreate our map iterator and call next once:
num_map = map(lambda x : x**2, numbers)
next(num_map)
1
Now, let’s pass the iterator to a list:
list(num_map)
[4, 9]
See that only the last two items appear in the list. After all, the iterator
keeps no record of the items it has already returned. This is the meaning of
the phrase “an object representing a stream of data” from the glossary. The
iterator had already returned the item 1, so the next items in the stream were
4 and 9, and these were passed to the list when it consumed the rest of the iterator.
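The same property explains the glossary's warning about multiple iteration passes. As a small sketch, using a list here instead of the tuple from earlier: a container hands out a new iterator on every pass, while an iterator returns itself and therefore looks empty the second time.

```python
numbers = [1, 2, 3]
num_iter = iter(numbers)

# A container hands out a fresh iterator on every pass.
print(list(numbers), list(numbers))    # [1, 2, 3] [1, 2, 3]

# An iterator returns itself from iter(), so the second pass finds nothing.
print(iter(num_iter) is num_iter)      # True
print(list(num_iter), list(num_iter))  # [1, 2, 3] []
```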
In case you want a historical context about iterators in the language, I recommend reading PEP 234.
Conclusion #
As you may have noticed, iterators and iterables appear frequently in the language, and you may have been using them without realizing it. Understanding these concepts is crucial for understanding more complex structures of the language such as generators, which will soon be the subject of an article on the site.
Did you like this article? It is part of Python Drops, a set of shorter posts focused on fundamentals of the Python language and programming in general. You can read more of these articles by searching for the “drops” tag here on the site. Until next time!