
python-2-1 How to filter data in lists, dicts, and sets by condition: list comprehensions and filter

2017-01-14 10:18
2-1 How to filter data in lists, dicts, and sets by condition

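Prerequisites:

This section uses randint, lambda, timeit, and filter. (Only lambda is a keyword; randint and timeit are library functions and filter is a builtin.)

The usual approach is to iterate over the list and store each element that satisfies the condition in a second list.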

from random import randint

randint(a,b) Return random integer in range [a, b], including both end points.

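Usage of timeit: timeit.timeit('func()', setup, number) times the statement 'func()'. setup is a statement that runs once beforehand, typically an import that makes func visible to the timed statement; number is how many times the statement is executed and defaults to one million.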

from timeit import timeit
print timeit('filter(lambda x: x > 0, [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2])',number=1)
print timeit('[x for x in [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2] if x >0 ]',number=1)
print timeit('filter(lambda x: x > 0, datalist)',setup='from __main__ import datalist',number=1)
print timeit('[x for x in datalist if x >0 ]',setup='from __main__ import datalist',number=1)

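from timeit import Timer
t3=Timer("test3()","from __main__ import test3")   #test3 is a function defined in __main__
print t3.timeit(1000000)
or, leaving number at its default of one million:
print timeit('[x for x in [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2] if x >0 ]')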
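The snippets above are Python 2. As a rough sketch of the same comparison on Python 3 (an assumption of this example, not something the original notes cover), the `globals` argument of `timeit` can replace the `setup` import, and `filter` has to be wrapped in `list()` because it is lazy there. The sample data and `number=1000` are arbitrary choices.

```python
from timeit import timeit

# Arbitrary sample data chosen for this sketch.
datalist = [1, 3, -5, 5, 3, -2, 0, 33, -7, 4]

# The statement is passed as a string; `globals` lets it see `datalist`
# without a setup import (the globals parameter exists since Python 3.5).
t_filter = timeit('list(filter(lambda x: x > 0, datalist))',
                  globals={'datalist': datalist}, number=1000)
t_listcomp = timeit('[x for x in datalist if x > 0]',
                    globals={'datalist': datalist}, number=1000)

print('filter:    %.6f s' % t_filter)
print('list comp: %.6f s' % t_listcomp)
```

The absolute numbers depend on the machine, so only the relative order of the two timings is meaningful.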


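Solution:

Lists:

randint uses a closed interval at both ends: randint(-10,10) returns a random integer between -10 and 10, including the two endpoints -10 and 10.
Generate a random list: [ randint(-10,10) for _ in range(10)]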
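To check the closed-interval behaviour, the sketch below draws many values and looks at the smallest and largest one. The seed and the sample size are arbitrary choices made here for repeatability; they are not part of the original notes.

```python
import random

random.seed(0)  # arbitrary seed, only so the run is repeatable

# Draw many values; every one must lie in the closed interval [-10, 10].
samples = [random.randint(-10, 10) for _ in range(10000)]
lowest, highest = min(samples), max(samples)

# A short random list like the one used in the rest of this section.
datalist = [random.randint(-10, 10) for _ in range(10)]
print(lowest, highest, len(datalist))
```

With this many draws both endpoints show up, which is what distinguishes randint from range-style half-open intervals.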

Method 1: iterate over the list
res=[]
for x in datalist:
    if x >= 0:
        res.append(x)

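Method 2: list comprehension

[expr for iter_var in iterable] The key is the for clause: it iterates over every item of iterable, expr is applied to each item, and the result is the list of values that expr produces.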

[x for x in datalist if x>=0]

[(x ** 2) for x in xrange(6)]
===>map(lambda x :x **2 ,xrange(6))
[expr for iter_var in iterable if cond_expr] #supports multiple for clauses and multiple if clauses
(expr for iter_var in iterable if cond_expr) #generator expression
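The forms listed above can be sketched as follows. The nested list `matrix` is made-up sample data for illustration only.

```python
# Made-up sample data for this sketch.
matrix = [[1, -2, 3], [-4, 5, -6]]

# Two for clauses: flatten the nested list, outer loop first.
flat = [x for row in matrix for x in row]

# Two if clauses: an item is kept only when both conditions hold.
odd_positive = [x for x in flat if x > 0 if x % 2 == 1]

# Generator expression: same syntax in parentheses, evaluated lazily,
# so no intermediate list of squares is built before sum() consumes it.
total_of_squares = sum(x ** 2 for x in flat)

print(flat, odd_positive, total_of_squares)
```

Chained if clauses behave like a single condition joined with `and`; writing them separately is mostly a readability choice.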

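A generator expression saves memory because it does not build the whole list first
f = open('test.txt','r')
len([word for line in f for word in line.split()])

f.seek(0)   #the first pass exhausted f, so rewind before iterating again
sum(len(word) for line in f for word in line.split())
f.close()
max(len(x.strip()) for x in open('test.txt'))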
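A self-contained sketch of the same idea follows. It assumes an in-memory `io.StringIO` object in place of test.txt (whose contents the notes do not show), and uses `sys.getsizeof` only as a rough indicator of the container's own size.

```python
import io
import sys

# io.StringIO stands in for test.txt so the example needs no file on disk.
text = io.StringIO(u"one two three\nfour five\n")

# List comprehension: every word is stored before len() runs.
word_count = len([word for line in text for word in line.split()])

# A file-like object is exhausted after one pass, so rewind before reuse.
text.seek(0)

# Generator expression: words are handed to sum() one at a time.
char_count = sum(len(word) for line in text for word in line.split())

# The generator object stays small no matter how many items it will yield.
big_list = [x for x in range(100000)]
big_gen = (x for x in range(100000))
list_is_bigger = sys.getsizeof(big_list) > sys.getsizeof(big_gen)

print(word_count, char_count, list_is_bigger)
```

The trade-off is that a generator can be consumed only once, while the list can be indexed and iterated repeatedly.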

Method 3: the filter function

filter(...)  #the function passed to filter acts as a predicate: its return value is tested for truth
filter(function or None, sequence) -> list, tuple, or string

Return those items of sequence for which function(item) is true.  If
function is None, return the items that are true.  If sequence is a tuple
or string, return the same type, else return a list.

def func(x):
    if x > 't':
        return x

filter(lambda x: x>=0,datalist)
filter(lambda x:x >='t','strxxx')
filter(None,'strxxx')
filter(func,'strxxx')
filter(lambda x : x >4,tuple1)
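The help text above describes Python 2, where filter returns a list, tuple, or string. As a sketch of the equivalent calls on Python 3 (an assumption of this example), filter returns a lazy iterator, so the result has to be rebuilt into the container type that is wanted. All sample values are made up.

```python
# Made-up sample data for this sketch.
datalist = [3, -1, 0, 7, -8, 2]

# In Python 3 filter() is lazy, so wrap it in list() to see the items.
non_negative = list(filter(lambda x: x >= 0, datalist))

# With None as the function, filter keeps the items that are truthy.
truthy = list(filter(None, [0, 1, '', 'a', None, [], [2]]))

# Strings and tuples are not rebuilt automatically in Python 3,
# so join or convert the filtered items explicitly.
late_letters = ''.join(filter(lambda c: c >= 't', 'strxxx'))
big_items = tuple(filter(lambda x: x > 4, (1, 5, 9, 2)))

print(non_negative, truthy, late_letters, big_items)
```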

help(map)
Help on built-in function map in module __builtin__:

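map(...) map is a quick way to build a list: function is applied to every item of the sequence that follows. If several sequences are passed in, map calls function with the corresponding items of each sequence.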
map(function, sequence[, sequence, ...]) -> list

Return a list of the results of applying the function to the items of
the argument sequence(s).  If more than one sequence is given, the
function is called with an argument list consisting of the corresponding
item of each sequence, substituting None for missing values when not all
sequences have the same length.  If the function is None, return a list of
the items of the sequence (or a list of tuples if more than one sequence).

map(lambda x,y,z: str(x) + str(y) +str(z),('xxx','yyyzzz'),('123','456'),('abc','def'))
==>['xxx123abc', 'yyyzzz456def']
Another example, removing the spaces from every string in a list:
str1 =["aa","bb","c c","d d","e e"]
str1 = map(lambda h: h.replace(' ',''),str1)
print str1
['aa', 'bb', 'cc', 'dd', 'ee']
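The help text above is for Python 2. A sketch of the same calls on Python 3 (an assumption of this example) shows two differences: map returns a lazy iterator, and it stops at the shortest input instead of padding with None. `itertools.zip_longest` is used here to reproduce the padding when it is wanted.

```python
from itertools import zip_longest

# Several iterables: the function receives one item from each.
joined = list(map(lambda x, y, z: str(x) + str(y) + str(z),
                  ('xxx', 'yyyzzz'), ('123', '456'), ('abc', 'def')))

# Python 3's map stops at the shortest input instead of padding with None.
short = list(map(lambda a, b: (a, b), [1, 2, 3], ['a', 'b']))

# zip_longest fills the missing positions with None, like Python 2's map did.
padded = list(zip_longest([1, 2, 3], ['a', 'b']))

# One iterable: strip the spaces from every string.
cleaned = list(map(lambda h: h.replace(' ', ''),
                   ["aa", "bb", "c c", "d d", "e e"]))

print(joined, short, padded, cleaned)
```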

>>> help(reduce)
Help on built-in function reduce in module __builtin__:

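reduce(...) #reduce takes the first two items of sequence and combines them into result 1, then combines result 1 with the third item into result 2, and so on until every item has been consumed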
reduce(function, sequence[, initial]) -> value

Apply a function of two arguments cumulatively to the items of a sequence,
from left to right, so as to reduce the sequence to a single value.
For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
of the sequence in the calculation, and serves as a default when the
sequence is empty.

>>>

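reduce(lambda a,b: a&b,map(dict.viewkeys,[dict1,dict2,dict3]))  #intersection of the keys of the three dicts; note that map is used here as well
print reduce(func,map(dict.viewkeys,[dict1,dict2,dict3]))  #func must take two arguments here, for example func = lambda a,b: a&b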
print reduce(lambda x,y:x + y,[1,2,3,4,5])   #reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates ((((1+2)+3)+4)+5)
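A runnable sketch of the key-intersection idea on Python 3 follows (an assumption of this example: there reduce lives in functools and dict.keys() plays the role of viewkeys()). The three dicts are hypothetical sample data, since the notes leave dict1, dict2, and dict3 undefined.

```python
from functools import reduce  # reduce is no longer a builtin in Python 3

# Hypothetical sample dicts for this sketch.
dict1 = {'a': 1, 'b': 2, 'c': 3}
dict2 = {'b': 5, 'c': 6, 'd': 7}
dict3 = {'c': 8, 'b': 9, 'e': 0}

# dict.keys() returns a set-like view in Python 3 (viewkeys() in Python 2),
# so & computes the intersection of the key sets pair by pair.
common = reduce(lambda a, b: a & b, map(dict.keys, [dict1, dict2, dict3]))

# The documented example: ((((1+2)+3)+4)+5)
total = reduce(lambda x, y: x + y, [1, 2, 3, 4, 5])

# With an initial value, reduce also works on an empty sequence.
empty_total = reduce(lambda x, y: x + y, [], 0)

print(sorted(common), total, empty_total)
```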


import time
import timeit
from random import randint

#iterate over the list

datalist = [ randint(-10,10)  for _ in xrange(10) ]
res=[]
for x in datalist:
    if x >=0:
        res.append(x)
print res

#filter function

res2 = filter(lambda x: x > 0,datalist)
print res2

#list comprehension
print [x for x in datalist if x >0 ]

#filter with a lambda versus a list comprehension: the list comprehension is usually faster
#usage of timeit
from timeit import timeit
print timeit('filter(lambda x: x > 0, [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2])',number=1)
print timeit('[x for x in [1,3,-5,5,3,3,3,3,3,33,1,2,3,4,5,5,5,5,5,2] if x >0 ]',number=1)
print timeit('filter(lambda x: x > 0, datalist)',setup='from __main__ import datalist',number=1)
print timeit('[x for x in datalist if x >0 ]',setup='from __main__ import datalist',number=1)

'''
Filtering a dict

'''
dict1={x:randint(60,100) for x in xrange(20014540,20014550)}
print dict1
print {k:v for k,v in dict1.iteritems() if v >80}

'''
Filtering a set

'''
set1=set(datalist)
print {x for x in set1 if x%3==0}
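The script above targets Python 2 (print statements, xrange, iteritems). A condensed sketch of the same steps for Python 3 is given below as an assumption of this example. The seed, the `scores` dict, and the contents of `set1` are fixed sample values chosen so the run is repeatable, and all three list methods use the same condition (x >= 0) so their results can be compared.

```python
from random import randint, seed

seed(1)  # arbitrary seed so repeated runs give the same data
datalist = [randint(-10, 10) for _ in range(10)]

# Method 1: explicit loop
res = []
for x in datalist:
    if x >= 0:
        res.append(x)

# Method 2: list comprehension
res_comp = [x for x in datalist if x >= 0]

# Method 3: filter (lazy in Python 3, hence list())
res_filter = list(filter(lambda x: x >= 0, datalist))

# Dict: keep the entries whose value is above 80 (items() replaces iteritems()).
scores = {20014540: 95, 20014541: 72, 20014542: 81, 20014543: 80}
high = {k: v for k, v in scores.items() if v > 80}

# Set: keep the members divisible by 3.
set1 = {-9, -4, 0, 3, 5, 6}
threes = {x for x in set1 if x % 3 == 0}

print(res == res_comp == res_filter, high, threes)
```

Dict and set comprehensions follow the same pattern as the list comprehension; only the brackets and the shape of the expression change.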