
Python-iterator and generator

2017-04-23 10:43


Iterator

iterable:

An object capable of returning its members one at a time. Examples of iterables include all sequence types (such as list, str, and tuple) and some non-sequence types like dict and file and objects of any classes you define with an __iter__() or __getitem__() method. Iterables can be used in a for loop and in many other places where a sequence is needed (zip(), map(), …). When an iterable object is passed as an argument to the built-in function iter(), it returns an iterator for the object. This iterator is good for one pass over the set of values. When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop [1].

iterator:

An object representing a stream of data. Repeated calls to the iterator’s next() method return successive items in the stream. When no more data are available a StopIteration exception is raised instead. At this point, the iterator object is exhausted and any further calls to its next() method just raise StopIteration again. Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted. One notable exception is code which attempts multiple iteration passes. A container object (such as list) produces a fresh new iterator each time you pass it to the iter() function or use it in a for loop. Attempting this with an iterator will just return the same exhausted iterator object used in the previous iteration pass, making it appear like an empty container [1].
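
The difference between a container and an iterator on a second pass can be sketched as follows. This is a minimal illustration in Python 3 syntax; the names letters and letters_iter are made up for the example and do not come from [1].

```python
# Illustrative names only; Python 3 syntax.
letters = ['a', 'b', 'c']

# A container hands out a fresh iterator on every pass.
first_pass = [x for x in letters]
second_pass = [x for x in letters]

# An iterator is good for one pass only.
letters_iter = iter(letters)
iter_first_pass = [x for x in letters_iter]
iter_second_pass = [x for x in letters_iter]   # already exhausted

# Calling iter() on an iterator returns the iterator itself.
same_object = iter(letters_iter) is letters_iter

print(first_pass, second_pass)
print(iter_first_pass, iter_second_pass)
print(same_object)
```

The second pass over letters_iter produces an empty list, which is the "appears like an empty container" effect described above.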

In formal terms, an object that implements the __iter__ method is iterable, and the object implementing next is the iterator [2].

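As described above, iterable objects include types such as list, str, dict, and file, as well as objects that define an __iter__() or __getitem__() method. Such an object is only an iterable; an object is an iterator only if it defines both __iter__() and next() (__next__() in Python 3.*). If itr is an iterable object, calling the built-in function iter(itr) returns an iterator. Calling the next() method of the returned iterator yields the next value of itr. When next() cannot obtain another value it raises a StopIteration exception, and every later call on that iterator raises StopIteration as well. This is because next() only moves forward, that is, it cannot revisit values it has already returned. Test code follows: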

>>> itr = ['a', 'b', 'c', 'd']
>>> cnt = iter(itr)
>>> cnt.next()
'a'
>>> cnt.next()
'b'
>>> cnt.next()
'c'
>>> cnt.next()
'd'
>>> cnt.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> cnt.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
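
For readers on Python 3, where the method is named __next__() and is normally reached through the built-in next(), the same session might look roughly like the sketch below. The variable names collected and after_exhaustion are illustrative, and the optional default argument of next() is used only to show one way of avoiding the exception.

```python
# Python 3 sketch of the session above.
items = ['a', 'b', 'c', 'd']
cnt = iter(items)

collected = []
while True:
    try:
        collected.append(next(cnt))   # built-in next() calls cnt.__next__()
    except StopIteration:
        break                         # raised once the iterator is exhausted

# The exhausted iterator keeps raising StopIteration on later calls;
# next() with a default returns the default instead of raising.
after_exhaustion = next(cnt, 'exhausted')

print(collected)
print(after_exhaustion)
```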


for statement:

The for statement is used to iterate over the elements of a sequence or other iterable object:



#!/usr/bin/env python

class iterator(object):
    def __init__(self, init, limit):
        super(iterator, self).__init__()
        self.cnt = init
        self.limit = limit

    def __iter__(self):
        print 'call to __iter__'
        return self

    def next(self):  # This would be __next__(self) in Python 3.*.
        print 'call to __next__'
        self.cnt += 1
        if self.cnt >= self.limit:
            raise StopIteration
        return self.cnt

def main():
    for i in iterator(0, 5):
        print i

if __name__ == '__main__':
    main()


The output is as follows:

call to __iter__
call to __next__
1
call to __next__
2
call to __next__
3
call to __next__
4
call to __next__


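The output shows that __iter__() is called only once, on entering the for statement, and that each iteration of the loop calls next() exactly once to obtain a value. Note that for an iterable object, the iterator is obtained through the iterable's __iter__() method, and the next() method called on each iteration is the one defined by that iterator, as the following test code shows: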

#!/usr/bin/env python

class iterator(object):
    def __init__(self, init, limit):
        print 'call to the iterator\'s __init__()'
        super(iterator, self).__init__()
        self.cnt = init
        self.limit = limit

    def next(self):
        print 'call to the iterator\'s next()'
        self.cnt += 1
        if self.cnt >= self.limit:
            raise StopIteration
        return self.cnt

class iterable(object):
    def __init__(self, init, limit):
        print 'call to the iterable\'s __init__()'
        self.init, self.limit = init, limit
        super(iterable, self).__init__()

    def __iter__(self):
        print 'call to the iterable\'s __iter__()'
        return iterator(self.init, self.limit)

def main():
    for i in iterable(0, 5):
        print i

if __name__ == '__main__':
    main()


The output is as follows:

call to the iterable's __init__()
call to the iterable's __iter__()
call to the iterator's __init__()
call to the iterator's next()
1
call to the iterator's next()
2
call to the iterator's next()
3
call to the iterator's next()
4
call to the iterator's next()
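
The call sequence in this output matches what a for loop does behind the scenes. A rough Python 3 sketch of that expansion is given below; manual_for is a hypothetical helper written for this illustration, not something defined by Python, and it ignores details such as the loop's else clause.

```python
def manual_for(iterable, body):
    """Hypothetical helper: call body(item) for each item, as a for loop would."""
    it = iter(iterable)          # calls iterable.__iter__() once, on entry
    while True:
        try:
            item = next(it)      # calls it.__next__() once per iteration
        except StopIteration:
            break                # the loop ends quietly on StopIteration
        body(item)

seen = []
manual_for(range(1, 5), seen.append)
print(seen)
```

The final next() call that raises StopIteration corresponds to the last "call to the iterator's next()" line, which is not followed by a printed value.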


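[1] states the following requirements for __iter__(). The iterable class in the two-class script above meets the container requirement by returning a new iterator object from __iter__(). Its iterator class, however, defines no __iter__() of its own, which [1] requires of iterators, yet the program still runs correctly because the for statement only calls next() on the object returned by the iterable's __iter__().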

object.__iter__(self):

This method is called when an iterator is required for a container. This method should return a new iterator object that can iterate over all the objects in the container. For mappings, it should iterate over the keys of the container, and should also be made available as the method iterkeys().


Iterator objects also need to implement this method; they are required to return themselves [1].
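
A version of the two-class example that follows both requirements might look like the sketch below, written in Python 3 syntax. The class names CountIterator and CountRange are invented for this illustration. Because the container returns a new iterator from every __iter__() call, it supports repeated and nested passes.

```python
class CountIterator:
    """Iterator: defines __next__() and an __iter__() that returns itself."""
    def __init__(self, start, limit):
        self.cnt = start
        self.limit = limit

    def __iter__(self):
        return self

    def __next__(self):
        self.cnt += 1
        if self.cnt >= self.limit:
            raise StopIteration
        return self.cnt


class CountRange:
    """Container-like iterable: __iter__() returns a new iterator each time."""
    def __init__(self, start, limit):
        self.start = start
        self.limit = limit

    def __iter__(self):
        return CountIterator(self.start, self.limit)


numbers = CountRange(0, 5)
first = list(numbers)     # one pass
second = list(numbers)    # a second pass gets a fresh iterator
pairs = [(a, b) for a in CountRange(0, 3) for b in CountRange(0, 3)]

it = iter(numbers)
returns_self = iter(it) is it

print(first, second)
print(pairs)
```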

Generator

generator function

A function or method which uses the yield statement is called a generator function. Such a function, when called, always returns an iterator object which can be used to execute the body of the function: calling the iterator’s next() method will cause the function to execute until it provides a value using the yield statement. When the function executes a return statement or falls off the end, a StopIteration exception is raised and the iterator will have reached the end of the set of values to be returned [1].

generator

A function which returns an iterator. It looks like a normal function except that it contains yield statements for producing a series of values usable in a for-loop or that can be retrieved one at a time with the next() function. Each yield temporarily suspends processing, remembering the location execution state (including local variables and pending try-statements). When the generator resumes, it picks-up where it left-off (in contrast to functions which start fresh on every invocation) [1].
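
As a small illustration of state being kept between yields, consider the Python 3 sketch below. The function running_total is a made-up example, not taken from [1].

```python
def running_total(values):
    """Hypothetical generator: yield the cumulative sum after each value."""
    total = 0                 # local state survives between yields
    for v in values:
        total += v
        yield total           # suspend here and hand total to the caller

gen = running_total([1, 2, 3, 4])
first = next(gen)             # runs the body up to the first yield
second = next(gen)            # resumes after the yield, total is preserved
rest = list(gen)              # drains the remaining values
is_own_iterator = iter(gen) is gen   # a generator is its own iterator

print(first, second, rest)
```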

yield expression

yield_expression ::= "yield" [expression_list]


The yield expression is only used when defining a generator function, and can only be used in the body of a function definition. Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.

When a generator function is called, it returns an iterator known as a generator. That generator then controls the execution of a generator function. The execution starts when one of the generator’s methods is called. At that time, the execution proceeds to the first yield expression, where it is suspended again, returning the value of expression_list to the generator’s caller. By suspended we mean that all local state is retained, including the current bindings of local variables, the instruction pointer, and the internal evaluation stack. When the execution is resumed by calling one of the generator’s methods, the function can proceed exactly as if the yield expression was just another external call. The value of the yield expression after resuming depends on the method which resumed the execution.

All of this makes generator functions quite similar to coroutines; they yield multiple times, they have more than one entry point and their execution can be suspended. The only difference is that a generator function cannot control where should the execution continue after it yields; the control is always transferred to the generator’s caller [1].

StopIteration

Raised by built-in function next() and an iterator’s __next__() method to signal that there are no further items produced by the iterator. The exception object has a single attribute value, which is given as an argument when constructing the exception, and defaults to None. When a generator or coroutine function returns, a new StopIteration instance is raised, and the value returned by the function is used as the value parameter to the constructor of the exception. If a generator function defined in the presence of a from __future__ import generator_stop directive raises StopIteration, it will be converted into a RuntimeError (retaining the StopIteration as the new exception’s cause) [1].
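
The two behaviors described here can be observed with the Python 3 sketch below. The function names two_then_done and leaky are invented for the illustration. The sketch assumes Python 3.7 or later, where the generator_stop conversion applies without the __future__ import; on older versions the second part would behave differently.

```python
def two_then_done():
    yield 1
    yield 2
    return 'done'             # becomes the value attribute of StopIteration

gen = two_then_done()
yielded = [next(gen), next(gen)]
try:
    next(gen)
    stop_value = None
except StopIteration as exc:
    stop_value = exc.value    # the generator's return value

def leaky():
    raise StopIteration       # raised explicitly inside the generator body
    yield                     # unreachable; only makes this a generator function

try:
    next(leaky())
    converted = False
except RuntimeError as exc:
    # Assumes Python 3.7+: the StopIteration is kept as the cause.
    converted = isinstance(exc.__cause__, StopIteration)
except StopIteration:
    converted = False         # older Pythons without generator_stop

print(yielded, stop_value, converted)
```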

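Put simply, a generator function is a function that contains a yield expression. The generator it returns implements __iter__() and next() automatically, so we do not need to define these two methods ourselves. Each time execution reaches a yield expression the generator is suspended: its execution state (instruction pointer, stack, and so on) is saved and control returns to the function that called the generator. When that function calls any of the generator's methods again, control passes back to the generator, which resumes from the point where it was last suspended. Test code follows: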

#!/usr/bin/env python

def generator(init, limit):
    print 'In the generator: enter generator.'
    cnt = init
    while cnt < limit:
        print 'In the generator: before yield.'
        cnt += 1
        print 'Control switch.'
        yield cnt
        print 'In the generator: after yield'

def main():
    gen = generator(0, 5)
    for i in range(6):
        print 'Control switch.'
        print gen.next()
        print 'In the main loop'

if __name__ == '__main__':
    main()


The test output is as follows:

Control switch.
In the generator: enter generator.
In the generator: before yield.
Control switch.
1
In the main loop
Control switch.
In the generator: after yield
In the generator: before yield.
Control switch.
2
In the main loop
Control switch.
In the generator: after yield
In the generator: before yield.
Control switch.
3
In the main loop
Control switch.
In the generator: after yield
In the generator: before yield.
Control switch.
4
In the main loop
Control switch.
In the generator: after yield
In the generator: before yield.
Control switch.
5
In the main loop
Control switch.
In the generator: after yield
Traceback (most recent call last):
  File "./test3.py", line 21, in <module>
    main()
  File "./test3.py", line 17, in main
    print gen.next()
StopIteration


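Beyond the behavior described above, the output also shows that a StopIteration exception is raised when the generator returns (here by falling off the end of the function, which has the same effect as a return statement). Since Python 2.5, a generator can also receive a value val that the caller sends with generator.send(val). These generator methods are defined as follows: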

generator.next()

Starts the execution of a generator function or resumes it at the last executed yield expression. When a generator function is resumed with a next() method, the current yield expression always evaluates to None. The execution then continues to the next yield expression, where the generator is suspended again, and the value of the expression_list is returned to next()’s caller. If the generator exits without yielding another value, a StopIteration exception is raised [1].

generator.send(value)

Resumes the execution and “sends” a value into the generator function. The value argument becomes the result of the current yield expression. The send() method returns the next value yielded by the generator, or raises StopIteration if the generator exits without yielding another value. When send() is called to start the generator, it must be called with None as the argument, because there is no yield expression that could receive the value [1].

Here generator.next() hands control back to the generator, while generator.send(val) hands control back and also passes a value to the yield expression inside the generator. Test code follows:

#!/usr/bin/env python

def repeater(value):
    new = None
    print 'Enter generator.'
    while True:
        print 'Before yield: new =', new, 'value =', value
        new = (yield value)  # receives the argument of send(), or None from next()
        print 'After yield: new =', new, 'value =', value
        if new is not None:
            print 'Now, update value.'
            value = new

def main():
    r = repeater(42)
    print r.send(None)  # start the generator; the first send() must pass None
    while True:
        value = raw_input('>').strip()
        if value == 'q':
            break
        if value == 'None':
            print r.next()
        else:
            print r.send(value)

if __name__ == '__main__':
    main()


The test output is as follows:

Enter generator.
Before yield: new = None value = 42
42
>This is a test
After yield: new = This is a test value = 42
Now, update value.
Before yield: new = This is a test value = This is a test
This is a test
>None
After yield: new = None value = This is a test
Before yield: new = None value = This is a test
This is a test
>None
After yield: new = None value = This is a test
Before yield: new = None value = This is a test
This is a test
>78
After yield: new = 78 value = This is a test
Now, update value.
Before yield: new = 78 value = 78
78
>q
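
A non-interactive variant of the same idea, in Python 3 syntax, is sketched below. The averager function is a hypothetical example chosen for this illustration. It also shows the restriction quoted from [1]: a generator that has not yet started accepts only None.

```python
def averager():
    """Hypothetical generator: yields the mean of the values sent so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        value = yield average     # value is whatever send() passed in
        if value is None:         # a plain next() resumes with None
            continue
        total += value
        count += 1
        average = total / count

avg = averager()
primed = next(avg)        # same as avg.send(None): run to the first yield
a1 = avg.send(10)
a2 = avg.send(20)
a3 = next(avg)            # resumes with None, so the average is unchanged

try:
    averager().send(1)    # a just-started generator accepts only None
    start_error = None
except TypeError as exc:
    start_error = exc

print(primed, a1, a2, a3)
```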


Generator Expression

A generator expression is a compact generator notation in parentheses:

generator_expression ::= "(" expression comp_for ")"


A generator expression yields a new generator object. Its syntax is the same as for comprehensions, except that it is enclosed in parentheses instead of brackets or curly braces.

Variables used in the generator expression are evaluated lazily when the __next__() method is called for the generator object (in the same fashion as normal generators). However, the leftmost for clause is immediately evaluated, so that an error produced by it can be seen before any other possible error in the code that handles the generator expression. Subsequent for clauses cannot be evaluated immediately since they may depend on the previous for loop. The parentheses can be omitted on calls with only one argument.
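
These evaluation rules can be checked with a short Python 3 sketch. All variable names below are illustrative.

```python
# Nothing is computed when the generator expression is created.
squares = (n * n for n in range(5))
first = next(squares)             # computes only the first value
total = sum(squares)              # consumes the rest: 1 + 4 + 9 + 16

# Parentheses can be omitted when it is the only argument of a call.
total_again = sum(n * n for n in range(5))

# The leftmost for clause is evaluated immediately, so iterating over
# None fails when the expression is created, not when it is consumed.
try:
    bad = (x for x in None)
    eager_error = None
except TypeError as exc:
    eager_error = exc

# The element expression is lazy: 1 / 0 runs only when a value is requested.
lazy = (1 / n for n in [0])       # no error yet
try:
    next(lazy)
    lazy_error = None
except ZeroDivisionError as exc:
    lazy_error = exc

print(first, total, total_again)
```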

References:

[1] “The Python Language Reference,” release 2.7.12.

[2] M. L. Hetland, “Beginning Python: From Novice to Professional,” Second Edition, Apress.

[3] W. J. Chun, “Core Python Programming,” Second Edition, Prentice Hall.