
Regular Expressions in Python 3.x (Part 1)

2016-05-09 13:48

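What is a regular expression? It is code that describes a rule for matching text. It is not a feature specific to Python; it is a general-purpose technique.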

1.1 No special symbols: an exact match of plain letters or digits. Example: match "is" in the text

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses. "
# findall returns every non-overlapping match as a list of strings
m = re.findall(r"is", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['is', 'is', 'is', 'is']

Process finished with exit code 0

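1.2 In a regular expression, "\b" matches a word boundary, that is, the position where a word starts or ends. Spaces, punctuation, and line breaks all count as word separators. Example: match the word "is"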

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses. "
# \b on both sides rejects "is" inside longer words such as "Disbelief"
m = re.findall(r"\bis\b", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['is', 'is']

Process finished with exit code 0
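
To see that punctuation and line breaks delimit words just as spaces do, here is a small sketch. The sample string is made up for illustration and does not come from the example above.

```python
import re

# Hypothetical sample text: "is" sits next to a period, a newline,
# a comma and parentheses, and also appears inside longer words.
sample = "It is.\nThis is, (is) island"

# \b matches between a word character and a non-word character,
# so only the standalone "is" occurrences are found.
standalone = re.findall(r"\bis\b", sample)
print(standalone)  # ['is', 'is', 'is']
```

The "is" inside "This" and "island" is skipped because a letter sits directly next to it, so there is no boundary on that side.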

1.3 "[]" matches any single character listed inside the brackets. Example: match the characters "i" and "s"

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses. "
# [is] matches one character at a time: either "i" or "s"
m = re.findall(r"[is]", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['i', 's', 'i', 'i', 's', 's', 'i', 's', 'i', 's', 'i', 'i', 's', 's', 's', 'i', 's', 's', 's']

Process finished with exit code 0

1.4 "." matches any single character except a newline. Example: match "is."

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses. "
# "is" followed by exactly one more character (a space counts)
m = re.findall(r"is.", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['isb', 'is ', 'ist', 'is ']

Process finished with exit code 0
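
The "except a newline" part is easy to overlook, so here is a minimal sketch of it. The two-line sample string is an assumption made up for this illustration.

```python
import re

# Hypothetical two-line sample: the first "is" is followed by a newline.
sample = "is\nis!"

# "." refuses to match "\n", so only the second "is" gets a third character.
dot_matches = re.findall(r"is.", sample)
print(dot_matches)  # ['is!']
```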

1.5 "\S" matches any single character that is not whitespace (note the uppercase "S"). Example: "\Sis\S"

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses. "
# "is" with a non-whitespace character on each side
m = re.findall(r"\Sis\S", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['Disb', 'sist']

Process finished with exit code 0

……
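2.1 Greedy matching ".*" and lazy matching ".*?"
"." matches any single character except a newline;
"*" repeats the preceding element zero or more times, taking as many characters as possible (greedy);
"?" placed after a quantifier makes it lazy, so it takes as few characters as possible.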

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses"
# Greedy: runs from the first "i" to the last "s" in the text
m = re.findall(r"i.*s", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['isbelief is more resistant than faith because it is sustained by the senses']

Process finished with exit code 0

import re

text = "Disbelief is more resistant than faith because it is sustained by the senses"
# Lazy: each match stops at the first "s" after an "i"
m = re.findall(r"i.*?s", text)
if m:
    print(m)
else:
    print("not match")

C:\Python34\python.exe C:/Python34/1.py
['is', 'ief is', 'is', 'ith becaus', 'it is', 'ined by the s']

Process finished with exit code 0
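
The difference between the two is often clearer on delimited text. The following is a minimal sketch on a made-up markup snippet (an assumption for illustration, not part of the examples above).

```python
import re

# Hypothetical markup snippet used only for illustration.
snippet = "<b>bold</b> and <i>italic</i>"

# Greedy: ".*" runs to the last ">" in the string.
greedy_tags = re.findall(r"<.*>", snippet)
# Lazy: ".*?" stops at the first ">" after each "<".
lazy_tags = re.findall(r"<.*?>", snippet)

print(greedy_tags)  # ['<b>bold</b> and <i>italic</i>']
print(lazy_tags)    # ['<b>', '</b>', '<i>', '</i>']
```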

2.2 Regular expression modifiers (option flags):

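Flag	Description
re.I	Performs case-insensitive matching.
re.L	Interprets word characters according to the current locale. This affects the character groups \w and \W as well as word-boundary behavior (\b and \B). Since Python 3.6 it can only be used with bytes patterns.
re.M	Makes $ match at the end of every line (not only at the end of the string) and ^ match at the start of every line (not only at the start of the string).
re.S	Makes the dot "." match any character, including a newline.
re.U	Interprets letters according to the Unicode character set. This affects \w, \W, \b, and \B. In Python 3 this is already the default for str patterns.
re.X	Permits "cuter" (verbose) regular expression syntax. Whitespace in the pattern is ignored (except inside a [] set or when escaped with a backslash), and an unescaped # starts a comment.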
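
A short sketch of how these flags change a match may help. The sample text and variable names below are made up for illustration; flags are passed as the third argument and can be combined with "|".

```python
import re

# Hypothetical three-line sample text.
sample = "First line\nsecond LINE\nthird line"

# re.I ignores case.
any_case = re.findall(r"line", sample, re.I)        # ['line', 'LINE', 'line']

# re.M lets ^ match at the start of every line.
first_only = re.findall(r"^\w+", sample)            # ['First']
each_line = re.findall(r"^\w+", sample, re.M)       # ['First', 'second', 'third']

# re.S lets "." cross line breaks.
without_s = re.findall(r"First.*third", sample)     # []
with_s = re.findall(r"First.*third", sample, re.S)  # one match spanning all lines

# re.X allows whitespace and comments inside the pattern.
word_before_line = re.compile(r"""
    (\w+)   # capture one word
    \s+     # one or more whitespace characters
    line    # the literal text "line"
    """, re.X | re.I)
captured = word_before_line.findall(sample)         # ['First', 'second', 'third']

print(any_case, first_only, each_line, without_s, with_s, captured)
```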

2.3 Regular expression patterns:

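Pattern	Description
^	Matches the start of the string (with re.M, also the start of every line).
$	Matches the end of the string or just before a trailing newline (with re.M, also the end of every line).
.	Matches any single character except a newline. The re.S flag lets it match a newline as well.
[...]	Matches any single character inside the brackets.
[^...]	Matches any single character not inside the brackets.
re*	Matches 0 or more occurrences of the preceding expression.
re+	Matches 1 or more occurrences of the preceding expression.
re?	Matches 0 or 1 occurrence of the preceding expression.
re{n}	Matches exactly n occurrences of the preceding expression.
re{n,}	Matches n or more occurrences of the preceding expression.
re{n,m}	Matches at least n and at most m occurrences of the preceding expression.
a|b	Matches either a or b.
(re)	Groups the expression and remembers the matched text.
(?imx)	Turns on the i, m, or x option. In Python this applies to the whole pattern and belongs at its start.
(?-imx)	Turns off the i, m, or x option (Perl/Ruby syntax; Python accepts only the scoped form (?-imx:re)).
(?:re)	Groups the expression without remembering the matched text.
(?imx:re)	Turns on the i, m, or x option inside the parentheses only (Python 3.6 and later).
(?-imx:re)	Turns off the i, m, or x option inside the parentheses only (Python 3.6 and later).
(?#...)	Comment.
(?=re)	Lookahead: matches at a position that is followed by re, without consuming any characters.
(?!re)	Negative lookahead: matches at a position that is not followed by re, without consuming any characters.
(?>re)	Matches an independent pattern without backtracking (Python 3.11 and later).
\w	Matches a word character.
\W	Matches a non-word character.
\s	Matches whitespace. For ASCII text, equivalent to [ \t\n\r\f\v].
\S	Matches non-whitespace.
\d	Matches a digit. For ASCII text, equivalent to [0-9].
\D	Matches a non-digit.
\A	Matches the start of the string.
\Z	Matches the end of the string. In Perl it also matches just before a trailing newline; in Python it matches only at the very end.
\z	Matches the very end of the string (Perl/Ruby syntax; in Python use \Z).
\G	Matches the point where the last match finished (Perl/Ruby syntax; not supported by Python's re module).
\b	Matches a word boundary when outside brackets. Matches a backspace (0x08) when inside brackets.
\B	Matches a non-word boundary.
\n, \r, \t, etc.	Match a newline, carriage return, tab, etc.
\1...\9	Matches the text of the nth group.
\10	Matches the text of the 10th group if that group exists. In Perl it otherwise denotes the octal representation of a character code; Python always treats a two-digit \10 as a group reference.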
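
A few rows of the table can be tried out directly. The sketch below uses made-up sample strings and variable names (assumptions for illustration only) and sticks to constructs that Python's re module supports.

```python
import re

# Quantifiers: {n}, {n,} and ?
three_digits = re.findall(r"\d{3}", "12 345 6789")    # ['345', '678']
two_or_more = re.findall(r"\d{2,}", "1 22 333")       # ['22', '333']
optional_u = re.findall(r"colou?r", "color colour")   # ['color', 'colour']

# Alternation and a negated character class
pets = re.findall(r"cat|dog", "cat dog bird")         # ['cat', 'dog']
consonants = re.findall(r"[^aeiou\s]", "a big cat")   # ['b', 'g', 'c', 't']

# Capturing groups remember their text; (?:...) groups without remembering
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", "2016-05-09 13:48")
date_parts = date.groups()                            # ('2016', '05', '09')
repeated_ab = re.findall(r"(?:ab)+", "ababab ab")     # ['ababab', 'ab']

# Lookahead checks what follows without consuming it
before_bang = re.findall(r"\w+(?=!)", "hello! world wow!")          # ['hello', 'wow']
not_before_bang = re.findall(r"\b\w+\b(?!!)", "hello! world wow!")  # ['world']

# Backreference: \1 must repeat whatever group 1 matched
doubled = re.findall(r"\b(\w+) \1\b", "the the cat sat sat down")   # ['the', 'sat']

print(three_digits, two_or_more, optional_u, pets, consonants)
print(date_parts, repeated_ab, before_bang, not_before_bang, doubled)
```

Note that when a pattern contains one capturing group, findall returns the group's text rather than the whole match, which is why the last call yields single words.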
