
Notes on the Python re Module

2014-10-17 10:44
Here are my notes on a few important parts of Python's re module.

My usual workflow with the re module

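1. Compile the pattern

pattern = re.compile(r'hello')

2. Call one of the pattern's search or match functions and keep the result

match = pattern.search("hello world")

3. Read the match result

if match:

    # group() is the whole matched text; groups() would be () here because the pattern has no groups
    print match.group()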
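
The three steps above can be put together in a small sketch. It is written in Python 3 syntax, and the helper name `find_hello` is hypothetical, chosen only for illustration. It shows that `search()` returns `None` when nothing is found, and that for a pattern without capturing groups `groups()` is an empty tuple while `group()` holds the matched text.

```python
import re

# Step 1: compile the pattern once so it can be reused.
pattern = re.compile(r'hello')


def find_hello(text):
    """Return the matched text, or None when the pattern is not found.

    find_hello is a hypothetical helper used only for illustration.
    """
    # Step 2: search() scans the string and returns a match object or None.
    match = pattern.search(text)
    # Step 3: check for None before reading the result.
    if match:
        return match.group()
    return None


print(find_hello("hello world"))  # hello
print(find_hello("goodbye"))      # None
# The pattern has no capturing groups, so groups() is an empty tuple.
print(pattern.search("hello world").groups())  # ()
```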

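The re matching functions are match, search, findall, finditer, and split; these five are the ones I use most.

match returns a match object (None if there is no match); its groups() method returns a tuple

search returns a match object (None if there is no match); its groups() method returns a tuple

findall returns a list

finditer returns an iterator of match objects

split returns a list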
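
As a quick check of the return types listed above, here is a minimal sketch in Python 3 syntax. The pattern and the sample text are made up for illustration and are not taken from the log example. It also shows the difference between match(), which only tries at the start of the string, and search(), which scans forward.

```python
import re

pattern = re.compile(r'(\d+)-(\d+)')
text = 'ranges: 10-20 and 30-40'

# match() only tries at the start of the string, so it fails here.
m = pattern.match(text)
print(m)  # None

# search() scans forward and returns the first match object.
s = pattern.search(text)
print(type(s.groups()).__name__, s.groups())  # tuple ('10', '20')

# findall() returns a list; with several groups each item is a tuple.
found = pattern.findall(text)
print(found)  # [('10', '20'), ('30', '40')]

# finditer() returns an iterator that yields match objects.
spans = [mo.group() for mo in pattern.finditer(text)]
print(spans)  # ['10-20', '30-40']

# split() returns a list; text captured by groups is included in it.
parts = pattern.split(text)
print(parts)  # ['ranges: ', '10', '20', ' and ', '30', '40', '']
```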

The concrete test example is shown below:

[xluren@test re_compile]$ cat demo.py
# Python 2 syntax (print statement)
import re

# str1 has no leading hostname field, so the pattern below does not match it
str1='218.205.750.157 46 TCP_MISS [16/Oct/2014:19:29:38 +0800] "GET /i.jpg HTTP/1.1" 200 4576 "-" "-" "GT-droid" "2297768042"'

# str2 starts with a hostname and matches the pattern
str2='www.baidu.cn 220.162.917.199 9 TCP_HIT [16/Oct/2014:21:01:39 +0800] "GET /r.gif HTTP/0.0" 200 13815 "-" "-" "vroid" ""'

pattern=re.compile(r'([\w\d.]{0,})\s([0-9.]+)\s(\d+|-)\s(\w+)\s\[([^\[\]]+)\s\+\d+\]\s"((?:[^"]|\")+)"\s(\d{3})\s(\d+|-)\s"((?:[^"]|\")+|-)"\s"(.+|-)"\s"((?:[^"]|\")+)"\s"(.{0,}|-)"$')

print "="*10
print "match test"
# match() returns a match object or None; groups() returns a tuple
match=pattern.match(str1)
if match:
    print match.groups()

match=pattern.match(str2)
if match:
    print match.groups()
    print "return type is :",type(match.groups()).__name__

print "="*10
print "search"
search=pattern.search(str1)
if search:
    print search.groups()

search=pattern.search(str2)
if search:
    print search.groups()
    print "return type is ",type(search.groups()).__name__

print "="*10
print "split"
# with no match, split() returns the whole string as a one-item list
split=pattern.split(str1)
if split:
    print split
    print 'return type is ',type(split).__name__

print "="*10
print "finditer"
# finditer() yields match objects
finditer=pattern.finditer(str1)
if finditer:
    for i in finditer:
        print i.group()
finditer=pattern.finditer(str2)
if finditer:
    for i in finditer:
        print i.group()
print "return type is ",type(finditer).__name__

print "="*10
print "findall"
# findall() returns a list of tuples, one tuple of groups per match
findall=pattern.findall(str1)
if findall:
    print findall
findall=pattern.findall(str2)
if findall:
    print findall
print "return type is ",type(findall).__name__

print "="*10
# sub() with backreferences: swap each pair of words
p = re.compile(r'(\w+) (\w+)')
s = 'i say, hello world!'
print p.sub(r'\2 \1', s)
[xluren@test re_compile]$
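
The last three lines of demo.py swap adjacent words with sub() and backreferences. The sketch below, in Python 3 syntax, reuses the same pattern and string to show which pairs are matched and how `\1` and `\2` map to the captured groups. The `\g<N>` spelling is an equivalent form that I added for comparison; it is not used in demo.py.

```python
import re

p = re.compile(r'(\w+) (\w+)')
s = 'i say, hello world!'

# Each match is a pair of words separated by one space.
pairs = p.findall(s)
print(pairs)  # [('i', 'say'), ('hello', 'world')]

# \2 and \1 refer to the second and first captured group of each match.
swapped = p.sub(r'\2 \1', s)
print(swapped)  # say i, world hello!

# \g<2> and \g<1> are an equivalent, more explicit spelling.
same = p.sub(r'\g<2> \g<1>', s) == swapped
print(same)  # True
```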

Test output:
[xluren@test re_compile]$ python demo.py
==========
match test
('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT', '16/Oct/2014:21:01:39', 'GET /r.gif HTTP/0.0', '200', '13815', '-', '-', 'vroid', '')
return type is : tuple
==========
search
('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT', '16/Oct/2014:21:01:39', 'GET /r.gif HTTP/0.0', '200', '13815', '-', '-', 'vroid', '')
return type is tuple
==========
split
['218.205.750.157 46 TCP_MISS [16/Oct/2014:19:29:38 +0800] "GET /i.jpg HTTP/1.1" 200 4576 "-" "-" "GT-droid" "2297768042"']
return type is list
==========
finditer
www.baidu.cn 220.162.917.199 9 TCP_HIT [16/Oct/2014:21:01:39 +0800] "GET /r.gif HTTP/0.0" 200 13815 "-" "-" "vroid" ""
return type is callable-iterator
==========
findall
[('www.baidu.cn', '220.162.917.199', '9', 'TCP_HIT', '16/Oct/2014:21:01:39', 'GET /r.gif HTTP/0.0', '200', '13815', '-', '-', 'vroid', '')]
return type is list
==========
say i, world hello!
[xluren@test re_compile]$
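
The output above contains results only for str2, and split returns str1 unchanged as a single list item. That is because str1 has no leading hostname field, so the pattern never matches it. The sketch below, in Python 3 syntax, reproduces this with shortened copies of the two lines and only the leading part of the pattern. The `relaxed` pattern is a hypothetical variant, not part of demo.py, that makes the host field optional so both line formats match.

```python
import re

# Shortened copies of the two sample lines from demo.py.
line1 = '218.205.750.157 46 TCP_MISS [16/Oct/2014:19:29:38 +0800]'
line2 = 'www.baidu.cn 220.162.917.199 9 TCP_HIT [16/Oct/2014:21:01:39 +0800]'

# Leading fields of the original pattern: host, IP, number, cache status.
strict = re.compile(r'([\w\d.]{0,})\s([0-9.]+)\s(\d+|-)\s(\w+)\s\[')
# Hypothetical variant: the host field is wrapped in an optional group.
relaxed = re.compile(r'(?:([\w.]+)\s)?([0-9.]+)\s(\d+|-)\s(\w+)\s\[')

print(strict.match(line1))            # None: line1 has no host field
print(strict.match(line2).groups())
print(relaxed.match(line1).groups())  # the host group is None here
print(relaxed.match(line2).groups())

# When nothing matches, split() returns the whole string as a single item.
print(strict.split(line1) == [line1])  # True
```

With the optional group, a missing host shows up as None in the first position of groups(), so code that reads the tuple needs to handle that case.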
Tags: python re