Shell: Basic and Extended Regular Expressions

2020-07-25 15:11, 736 views

I. Basic Regular Expressions

Finding a specific string

[root@promote ~]# grep -ni 'the' test.txt
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.
3:the tongue is boneless but it breaks bones.12! google is the best tools for search keyword.
4:The year ahead will test our political establishment to the limit.

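“-n” shows line numbers and “-i” makes the match case-insensitive.
To invert the selection, for example to find lines that do not contain “the”, use grep's “-v” option, combined here with “-n” to show line numbers. Without “-i” the match is case-sensitive, so line 2, which contains only “The”, is still listed.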

[root@promote ~]# grep -vn 'the' test.txt
1:s short and fat.
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.
5:PI=3.141592653589793238462643383249901429
6:a wood cross!
7:Actions speak louder than words
8:AxyzxyzxyzxyzC
9:I bet this place is really spooky late at night! Misfortunes never come alone/single.
10:w
11:wo
12:woo
13:wooood
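
The three options used so far can be compared side by side. This is a minimal sketch on a hypothetical three-line sample (not the article's test.txt); the variable names are illustrative only.

```shell
# Hypothetical sample text, piped to grep instead of being read from test.txt.
sample='The home of Football
the tongue is boneless
a wood cross!'

# -n prefixes each matching line with its line number; the match is case-sensitive.
exact=$(printf '%s\n' "$sample" | grep -n 'the')

# -i ignores case, so "The" on line 1 matches as well.
nocase=$(printf '%s\n' "$sample" | grep -ni 'the')

# -v inverts the selection: only lines that do NOT contain lowercase "the".
inverted=$(printf '%s\n' "$sample" | grep -vn 'the')

printf 'exact:\n%s\n\nnocase:\n%s\n\ninverted:\n%s\n' "$exact" "$nocase" "$inverted"
```

Because “-v” is applied without “-i” here, line 1 (“The”) still appears in the inverted result.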

Using square brackets to match a set of characters

[root@promote ~]# grep -n 'sh[io]rt' test.txt
1:s short and fat.
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.

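The strings “shirt” and “short” both contain “sh” and “rt”, so the command above finds both at once. No matter how many characters appear inside “[]”, the brackets match exactly one character, so “[io]” matches either “i” or “o”.

To find a character repeated twice in a row, simply write it twice in the pattern: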

[root@promote ~]# grep -ni 'bb' test.txt
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.

To require that “oo” is not preceded by a lowercase letter, use “grep -n '[^a-z]oo' test.txt”. Here “a-z” denotes the lowercase letters; uppercase letters are written “A-Z”.

[root@promote ~]# grep -n '[^a-z]oo' test.txt
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.

Lines containing a digit can be found with “grep -n '[0-9]' test.txt”:

[root@promote ~]# grep -n '[0-9]' test.txt
3:the tongue is boneless but it breaks bones.12! google is the best tools for search keyword.
5:PI=3.141592653589793238462643383249901429
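
The bracket forms above (a set, a negated set, and a range) can be tried together. This is a sketch on a hypothetical sample; setting LC_ALL=C is an assumption made here so that “a-z” means exactly the ASCII lowercase letters regardless of the system locale.

```shell
# Assumption: force the C locale so that a-z covers only ASCII lowercase letters.
export LC_ALL=C

# Hypothetical sample text (not the article's test.txt).
sample='he was short and fat
a blue polo shirt
The home of Football
google is the best
PI=3.14'

# [io] stands for exactly one character: either i or o.
set_match=$(printf '%s\n' "$sample" | grep -n 'sh[io]rt')

# [^a-z] is one character that is NOT a lowercase letter,
# so "Foo" matches while "goo" does not.
negated=$(printf '%s\n' "$sample" | grep -n '[^a-z]oo')

# [0-9] is one digit.
digits=$(printf '%s\n' "$sample" | grep -n '[0-9]')

printf '%s\n---\n%s\n---\n%s\n' "$set_match" "$negated" "$digits"
```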

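Matching the start “^” and end “$” of a line

Basic regular expressions have two anchor metacharacters: “^” (start of line) and “$” (end of line).
To find lines that begin with the string “the”, use the “^” metacharacter.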

[root@promote ~]# grep -n '^the' test.txt
3:the tongue is boneless but it breaks bones.12! google is the best tools for search keyword.

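To find lines that do not begin with a letter, use the pattern “^[^a-zA-Z]”. The command below also has a space inside the brackets, so it additionally excludes lines that begin with a space.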

[root@promote ~]# grep -n '^[^a-z A-Z]' test.txt
14:12elqhwe
15:333service

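The “^” symbol behaves differently inside and outside the bracket expression “[]”: inside “[]” it negates the set, while outside “[]” it anchors the match to the start of the line. Conversely, to find lines that end with a particular character, use the “$” anchor. To find blank lines, run “grep -n '^$' test.txt”.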
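
The anchor cases can be sketched on a small hypothetical sample. The “\.$” pattern is an addition for illustration: since “.” is itself a metacharacter, it is escaped with “\” to match a literal dot at the end of a line. LC_ALL=C is again an assumption to keep the letter ranges ASCII-only.

```shell
# Assumption: C locale so that a-zA-Z means exactly the ASCII letters.
export LC_ALL=C

# Hypothetical sample; the third line is intentionally empty.
sample='the tongue is boneless.
The year ahead

12elqhwe
a wood cross!'

# ^ outside brackets anchors the match to the start of the line.
starts_the=$(printf '%s\n' "$sample" | grep -n '^the')

# ^ inside brackets negates the set: the first character is not a letter.
not_letter=$(printf '%s\n' "$sample" | grep -n '^[^a-zA-Z]')

# $ anchors to the end of the line; \. is a literal dot.
ends_dot=$(printf '%s\n' "$sample" | grep -n '\.$')

# ^$ has nothing between start and end, so it matches blank lines only.
blank=$(printf '%s\n' "$sample" | grep -n '^$')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$starts_the" "$not_letter" "$ends_dot" "$blank"
```

The blank-line match prints only the line number and a colon, because the matched line has no content.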

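Matching any single character “.” and repetition “*”

In a regular expression the dot (.) is a metacharacter that stands for any single character.
“*” repeats the preceding single character zero or more times, so it can also match nothing at all.
For example, the following command finds strings of the form “t?e” (where “?” stands for any one character): three characters in total, starting with t and ending with e.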

[root@promote ~]# grep -ni 't.e' test.txt
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.
3:the tongue is boneless but it breaks bones.12! google is the best tools for search keyword.
4:The year ahead will test our political establishment to the limit.

The following command finds lines containing a run of one or more digits:

[root@promote ~]# grep -n '[0-9][0-9]*' test.txt
3:the tongue is boneless but it breaks bones.12! google is the best tools for search keyword.
5:PI=3.141592653589793238462643383249901429
14:12elqhwe
15:333service
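
The difference between “.” (exactly one character) and “*” (zero or more of the preceding character) is easiest to see on short inputs. This is a sketch with a hypothetical sample, not the article's test.txt.

```shell
# Assumption: C locale so that 0-9 means exactly the ASCII digits.
export LC_ALL=C

# Hypothetical sample text.
sample='the best tools
tie
wd
wooood
12elqhwe'

# t.e: a "t", then exactly one arbitrary character, then an "e".
dot=$(printf '%s\n' "$sample" | grep -n 't.e')

# wo*d: "o*" may repeat zero times, so "wd" matches as well as "wooood".
star=$(printf '%s\n' "$sample" | grep -n 'wo*d')

# [0-9][0-9]*: one digit followed by zero or more digits, i.e. at least one digit.
digit_run=$(printf '%s\n' "$sample" | grep -n '[0-9][0-9]*')

printf '%s\n---\n%s\n---\n%s\n' "$dot" "$star" "$digit_run"
```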

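Matching a repetition range “{}”

In basic regular expressions the braces must be escaped with “\”, written “\{” and “\}”, to act as the repetition-count operator; unescaped “{” and “}” are matched as ordinary characters. They are used as follows.

Find lines containing two consecutive “o” characters.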

[root@promote ~]# grep -n 'o\{2\}' test.txt
2:He was wearing a blue polo shirt with black pants. The home of Football on BBC Sport online.
3:the tongue is boneless but it breaks bones.12! google is the best tools for search keyword.
6:a wood cross!
9:I bet this place is really spooky late at night! Misfortunes never come alone/single.
12:woo
13:wooood

Find strings that begin with w, end with d, and have 2 to 5 “o” characters in between.

[root@promote ~]# grep -n 'wo\{2,5\}d' test.txt
6:a wood cross!
13:wooood

Find strings that begin with w, end with d, and have 2 or more “o” characters in between.

[root@promote ~]# grep -n 'wo\{2,\}d' test.txt
6:a wood cross!
13:wooood
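
In the article's test.txt the last two commands return the same lines, so the upper bound is not visible. The sketch below uses a hypothetical sample that includes a word with seven “o” characters to show where “\{2,5\}” and “\{2,\}” differ.

```shell
# Hypothetical sample: 0, 1, 2, 4 and 7 "o" characters between w and d.
sample='wd
wod
wood
wooood
woooooood'

# \{2\}: exactly two "o" between w and d.
exactly_two=$(printf '%s\n' "$sample" | grep -n 'wo\{2\}d')

# \{2,5\}: between two and five "o"; the seven-o word is rejected.
two_to_five=$(printf '%s\n' "$sample" | grep -n 'wo\{2,5\}d')

# \{2,\}: two or more "o", with no upper bound.
at_least_two=$(printf '%s\n' "$sample" | grep -n 'wo\{2,\}d')

printf '%s\n---\n%s\n---\n%s\n' "$exactly_two" "$two_to_five" "$at_least_two"
```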

Metacharacter summary

The simple examples above cover the most common basic regular expression metacharacters, which are listed below.

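Metacharacter	Meaning
^	Matches the start of the line. Inside a bracket expression it instead negates the character set. To match a literal “^”, use “\^”
$	Matches the end of the line. In engines with a multiline mode (such as the Multiline property of the RegExp object), “$” also matches before “\n” or “\r”. To match a literal “$”, use “\$”
.	Matches any single character except a newline
\	Backslash, the escape character: removes the special meaning of the metacharacter or wildcard that immediately follows it
*	Matches the preceding subexpression zero or more times
[]	Character set: matches any one of the enclosed characters
[^]	Negated character set: matches any one character that is not enclosed
[n1-n2]	Character range: matches any one character in the given range
{n}	n is a non-negative integer; matches exactly n times (written “\{n\}” in basic regular expressions)
{n,}	n is a non-negative integer; matches at least n times (written “\{n,\}” in basic regular expressions)
{n,m}	m and n are non-negative integers with n<=m; matches at least n and at most m times (written “\{n,m\}” in basic regular expressions)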

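II. Extended Regular Expressions

Basic regular expressions are usually sufficient, but extended regular expressions, which offer a wider set of operators, can sometimes simplify a command. For example, to list all lines of a file except blank lines and lines beginning with “#” (commonly done to view the effective settings in a configuration file) using basic regular expressions, run “grep -v '^$' test.txt | grep -v '^#'”. This requires a pipe between two grep commands. With extended regular expressions it reduces to “egrep -v '^$|^#' test.txt”, where the pipe symbol inside the single quotes means “or”.

By default grep interprets patterns as basic regular expressions; to use extended regular expressions, use egrep (equivalent to “grep -E”) or awk. awk is covered in a later section, so egrep is used here. egrep is used in much the same way as grep: it searches one or more files for a pattern, and the pattern may be a single character, a string, a word, or a sentence. Like basic regular expressions, extended regular expressions have several metacharacters; the most common are listed below.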
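
The two approaches described above can be checked against each other. This is a sketch on hypothetical configuration content; it uses “grep -E” rather than egrep, on the assumption that the two behave the same for this pattern.

```shell
# Hypothetical configuration content standing in for a real file.
config='# listen port
port=8080

# log level
level=info'

# Basic regular expressions: two passes joined by a pipe.
two_pass=$(printf '%s\n' "$config" | grep -v '^$' | grep -v '^#')

# Extended regular expressions: one pass; "|" inside the quotes means "or".
one_pass=$(printf '%s\n' "$config" | grep -Ev '^$|^#')

printf 'two passes:\n%s\n\none pass:\n%s\n' "$two_pass" "$one_pass"
```

Both variables end up holding only the two setting lines, with the comments and the blank line removed.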

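Metacharacter	Meaning
+	Matches the preceding character one or more times
?	Matches the preceding character zero or one time
|	Alternation (“or”): matches any one of several alternatives
()	Groups a subexpression
()+	Matches one or more repetitions of a group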
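
Each metacharacter in the table can be exercised with a one-line command. This is a sketch on a hypothetical sample, using “grep -E” in place of egrep; the variable names are illustrative only.

```shell
# Hypothetical sample text.
sample='wd
wod
wood
of
if
AxyzxyzxyzxyzC
AC'

# +: one or more "o" between w and d ("wd" is rejected).
plus=$(printf '%s\n' "$sample" | grep -En 'wo+d')

# ?: zero or one "o" between w and d ("wood" is rejected).
optional=$(printf '%s\n' "$sample" | grep -En 'wo?d')

# |: either "of" or "if".
either=$(printf '%s\n' "$sample" | grep -En 'of|if')

# (): group the alternatives so that the trailing "f" is shared.
grouped=$(printf '%s\n' "$sample" | grep -En '(o|i)f')

# ()+: the group "xyz" repeated one or more times between A and C ("AC" is rejected).
repeated=$(printf '%s\n' "$sample" | grep -En 'A(xyz)+C')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n---\n%s\n' \
  "$plus" "$optional" "$either" "$grouped" "$repeated"
```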
