
Advanced Linux Text Processing Tools: sed (Part 2)

2017-01-07 00:04

Basic Commands

The basic commands covered in this article are the following:

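1. Substitute command: s
The syntax of the substitute command is:
[address]s/pattern/replacement/flags
Here [address] is the address, pattern is the regular expression to match, replacement is the text that replaces each match, and flags controls how the substitution is applied. flags may contain one or more of the following values:
● n: a number (1 to 512), meaning that only the n-th match of pattern on the line is replaced;
● g: global replacement; every match of pattern on the line is replaced;
● p: print the pattern space, but only if a substitution was made on the line;
● w file: write the pattern space to the file named file, but only if a substitution was made on the line.
Example 1: with no flags, only the first match is replaced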
[root@localhost ~]# echo 'a b c d' | sed 's/ /;/'
a;b c d
Example 2: with the g flag, every match is replaced
[root@localhost ~]# echo 'a b c d'|sed 's/ /;/g'
a;b;c;d
Example 3: flags names the n-th match to replace, here n=2
[root@localhost ~]# echo 'a b c d'|sed 's/ /;/2'
a b;c d
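Examples 1 to 3 cover the default behaviour and the g and n flags. The p and w flags can be sketched the same way. This is a minimal sketch that assumes a POSIX-style shell with mktemp available; the input lines are made up for illustration.

```shell
# Made-up sample input: two lines contain spaces, one does not.
input='a b c d
abcd
e f'

# -n turns off the automatic print, so only lines on which a
# substitution happened are printed, by the p flag.
printed=$(printf '%s\n' "$input" | sed -n 's/ /;/gp')
echo "$printed"

# The w flag writes the same kind of lines to a file instead.
# The file name runs to the end of the command line.
tmp=$(mktemp)
printf '%s\n' "$input" | sed -n "s/ /;/w $tmp"
written=$(cat "$tmp")
echo "$written"
rm -f "$tmp"
```

The line abcd appears in neither result because no substitution was made on it.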
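Example 4: when the pattern of the substitute command is identical to the address, as in /regexp/s/regexp/replacement/, the pattern can be left empty (/regexp/s//replacement/). Here we append ("s") after every occurrence of "substitute command" and put a + sign in front of each modified line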
[root@localhost ~]# cat -nE paragraph.txt
1  The substitute command is applied to the lines matching the address. If no address is specified, it is applied to all lines that match the pattern, a regular expression. If a regular expression is supplied as an address, and no pattern is specified, the substitute command matches what is matched by the address. This can be useful when the substitute command is one of multiple commands applied at the same address. For an example, see the section “Checking Out Reference Pages” later in this chapter.$

[root@localhost ~]# sed '/substitute command/{s//&("s")/g;s/^/+ /}' paragraph.txt
+ The substitute command("s") is applied to the lines matching the address. If no address is specified, it is applied to all lines that match the pattern, a regular expression. If a regular expression is supplied as an address, and no pattern is specified, the substitute command("s") matches what is matched by the address. This can be useful when the substitute command("s") is one of multiple commands applied at the same address. For an example, see the section “Checking Out Reference Pages” later in this chapter.
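Note: the replacement uses the metacharacter &, which stands for the text that was matched. The replacement part recognizes a few special metacharacters:
● &: the text matched by pattern
● \num: the text matched by the num-th group (a regular-expression concept: the part enclosed in \( and \) is called a group)
● \: the escape character, used to escape &, \, newlines and so on
Example 5: edit the configuration shown below to reduce the number of ttys created at boot, for example to keep only 2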
[root@localhost ~]# grep -B 1 ACTIVE_CONSOLES /etc/sysconfig/init
# What ttys should gettys be started on?
ACTIVE_CONSOLES=/dev/tty[1-6]

[root@localhost ~]# sed -r -i 's@(ACTIVE_CONSOLES=/dev/tty\[1-)6\]@\12\]@' /etc/sysconfig/init

[root@localhost ~]# grep -B 1 ACTIVE_CONSOLES /etc/sysconfig/init
# What ttys should gettys be started on?
ACTIVE_CONSOLES=/dev/tty[1-2]
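The back-reference used in Example 5 can be tried safely on a sample string instead of the real /etc/sysconfig/init. This is a minimal sketch, assuming GNU sed (for the -r option); the second sample string is made up to show & and its escaped form.

```shell
# Sample line copied from the configuration above; no file is modified.
line='ACTIVE_CONSOLES=/dev/tty[1-6]'

# \1 re-inserts the text captured by the first (...) group,
# and the literal 2] that follows replaces the matched 6].
with_group=$(printf '%s\n' "$line" |
  sed -r 's@(ACTIVE_CONSOLES=/dev/tty\[1-)6\]@\12]@')
echo "$with_group"

# & stands for the whole match; \& is a literal ampersand.
with_amp=$(printf '%s\n' 'tty1' | sed 's/tty[0-9]/<&> \& more/')
echo "$with_amp"
```

Testing the expression on a pipe first is a cheap way to check it before running it with -i on a system file.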
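Note on Example 5: the -i option edits the file in place, and -r selects extended regular expressions (ERE), in which the grouping parentheses do not need backslashes. [ has a special meaning in a regular expression (it opens a bracket expression), so it must be escaped. In the replacement, \1 refers to the text matched by a group; the 1 is the group number, so it means the first group.
2. Delete command: d
The syntax of the delete command is: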
[address]d
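The delete command can remove several lines at once; for example, 1,3d deletes lines 1 to 3. It deletes the entire pattern space, and because the pattern space is then empty, the remaining commands in the script are skipped and sed reads the next line.
Example 1:
[root@localhost ~]# sed '2,$d;=' list  # delete lines 2 to the end and print line numbers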
1
John Daggett, 341 King Road, Plymouth MA
Example 2:
[root@localhost ~]# sed '=;2,$d' list
1
John Daggett, 341 King Road, Plymouth MA
2
3
4
5
6
7
8
Example 3:
[root@localhost ~]# sed '2,${d;=}' list
John Daggett, 341 King Road, Plymouth MA
Example 4:
[root@localhost ~]# sed -e '2,$d' -e '1=' list
1
John Daggett, 341 King Road, Plymouth MA
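The rule that d ends the current cycle, so that later commands never see a deleted line, can be checked with any second command in place of =. This is a minimal sketch with made-up input lines.

```shell
# The s command comes after d in the script, so it only runs on
# lines that d did not delete.
out=$(printf 'keep 1\ndrop 2\nkeep 3\n' | sed '/drop/d;s/$/ (seen)/')
echo "$out"
```

Only the two surviving lines receive the " (seen)" suffix; the deleted line produces no output at all.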
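Note: compare Examples 1 to 4 carefully. In Example 1, sed reads the first line of list; it is outside the range 2,$, so d is not executed and processing continues. The semicolon marks the end of the d command. The = command has no address, so it applies to every line that reaches it: it prints the line number 1, and then the line itself is printed. For lines 2 to 8, d empties the pattern space and starts the next cycle, so = is never reached. In Example 2, = comes first and runs on every line before d deletes lines 2 to 8, which is why all eight line numbers appear. In Example 3, {} forms a block whose commands apply only to lines 2 through the last; d runs first inside the block, so = is never reached, and line 1 is printed without a number. Example 4 splits the script across two -e options and produces the same output as Example 1.
3. Insert / append / change commands: i/a/c
The syntax of these three commands is: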
# Append
[line-address]a\
text
# Insert
[line-address]i\
text
# Change (replace lines)
[address]c\
text
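Note: of these three commands, change (c) accepts an address range, while the other two traditionally accept only a single address (GNU sed also accepts a range for a and i as an extension).
The append command inserts text after the matched line; the insert command inserts text before the matched line; the change command replaces the matched lines with text. The text is not added to the pattern space; it is written directly to the output, so later commands in the script are not applied to it. Note that even the -n option does not suppress the added text.
Example 1: add a line of '------' after the second line of list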
[root@localhost ~]# cat list
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA

[root@localhost ~]# sed  '2a-------------------------------------------' list
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
-------------------------------------------
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
Example 2: add a line of '--------' before the third line of list
[root@localhost ~]# sed '3i----------------------------------------' list
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
----------------------------------------
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Terry Kalkas, 402 Lans Road, Beaver Falls PA
Eric Adams, 20 Post Road, Sudbury MA
Hubert Sims, 328A Brook Road, Roanoke VA
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
Sal Carpenter, 73 6th Street, Boston MA
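Let's check that the text really is not added to the pattern space, given that the contents of the pattern space are printed to the screen by default: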
[root@localhost ~]# sed -n '2a-------------' list
-------------
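Even with the automatic output suppressed by -n, the appended text is still printed, so we can conclude that it is not added to the pattern space.
Example 3: use the change command to replace lines 2 through the last with '------'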
[root@localhost ~]# sed '2,$c------------------' list
John Daggett, 341 King Road, Plymouth MA
------------------
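The examples in this section use the one-line form (2a text), which relies on GNU sed. The syntax listed at the start of the section, with a backslash and a newline before the text, can be sketched as follows; the three input lines are made up for illustration.

```shell
# Each command is followed by a backslash and a newline, and the
# text to add sits on the next line of the script.
out=$(printf 'one\ntwo\nthree\n' | sed '1i\
--- before one ---
2a\
--- after two ---
3c\
THREE')
echo "$out"
```

The inserted text appears before line 1, the appended text after line 2, and line 3 is replaced outright.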
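4. Print commands: p/l/=
Strictly speaking only p is the print command, but l and = behave similarly and are just as simple, so all three are covered together here. The syntax of these three commands is: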
[address]p
[address]=
[address]l
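The p command prints the contents of the pattern space. It neither clears the pattern space nor changes the control flow of the script. For example, to print the first line of list:
Example 1: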
[root@localhost ~]# cat -n list
1  John Daggett, 341 King Road, Plymouth MA
2  Alice Ford, 22 East Broadway, Richmond VA
3  Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
4  Terry Kalkas, 402 Lans Road, Beaver Falls PA
5  Eric Adams, 20 Post Road, Sudbury MA
6  Hubert Sims, 328A Brook Road, Roanoke VA
7  Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
8  Sal Carpenter, 73 6th Street, Boston MA

[root@localhost ~]# sed -n '1p' list # print the first line of list
John Daggett, 341 King Road, Plymouth MA
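Because p does not clear the pattern space, leaving out -n makes the selected line appear twice: once from p and once from the automatic print at the end of the cycle. This is a minimal sketch with made-up input.

```shell
# Without -n: line 1 is printed by p and again by the automatic print.
without_n=$(printf 'one\ntwo\n' | sed '1p')
echo "$without_n"

# With -n: only the explicit p produces output.
with_n=$(printf 'one\ntwo\n' | sed -n '1p')
echo "$with_n"
```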
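The l command is similar to p, but it shows control characters in a visible form and marks the end of each line with $; it resembles the list option in vim.
Example 2: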
[root@localhost ~]# echo -e 'a\n b\n c d' | sed -n 'l'
a$
 b$
 c d$
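The input in Example 2 contains no control characters, so only the trailing $ is visible there. A tab shows the difference more clearly; this is a minimal sketch, assuming GNU sed, with a made-up input line.

```shell
# printf turns \t into a real tab; l prints it back as the
# two-character escape \t and ends the line with $.
out=$(printf 'a\tb\n' | sed -n 'l')
echo "$out"
```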
The = command prints the current line number.
Example 3:
[root@localhost ~]# sed '=' list
1
John Daggett, 341 King Road, Plymouth MA
2
Alice Ford, 22 East Broadway, Richmond VA
3
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
4
Terry Kalkas, 402 Lans Road, Beaver Falls PA
5
Eric Adams, 20 Post Road, Sudbury MA
6
Hubert Sims, 328A Brook Road, Roanoke VA
7
Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
8
Sal Carpenter, 73 6th Street, Boston MA
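Combined with -n and an address, = becomes a way to ask where lines are rather than to print them. This is a minimal sketch that uses three lines taken from the list file as inline sample data.

```shell
sample='John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Eric Adams, 20 Post Road, Sudbury MA'

# $ addresses the last line, so $= prints the total number of lines.
total=$(printf '%s\n' "$sample" | sed -n '$=')
echo "$total"

# A regex address prints the number of every matching line.
ma_lines=$(printf '%s\n' "$sample" | sed -n '/ MA$/=')
echo "$ma_lines"
```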
5. Transform command: y
The syntax of the transform command is:
[address]y/SET1/SET2/
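On the matched lines, it replaces each character that appears in SET1 with the character at the same position in SET2. For example, 1,3y/abc/xyz/ replaces a with x, b with y and c with z on lines 1 to 3. If this sounds familiar, that is because it works the same way as the tr command. The y command can be used to convert lowercase letters to uppercase.
Example 1: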
[root@localhost ~]# echo "hello, world" | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/'
HELLO, WORLD
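The similarity to tr, and the one thing y adds (an address), can be sketched side by side. This is a minimal sketch with made-up input; only the letters that occur in the sample are listed in the sets.

```shell
# Same character-for-character mapping with y and with tr.
with_sed=$(echo 'hello, world' | sed 'y/helowrd/HELOWRD/')
with_tr=$(echo 'hello, world' | tr 'helowrd' 'HELOWRD')
echo "$with_sed"
echo "$with_tr"

# Unlike tr, y can be limited to selected lines by an address.
second_only=$(printf 'abc\nabc\n' | sed '2y/abc/xyz/')
echo "$second_only"
```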
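Note: SET1 and SET2 are lists of literal characters; pattern matching is not supported in them.
6. Next command: n
The syntax of the next command is: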
[address]n
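The n command reads the next line of input into the pattern space ahead of time, first printing the line that was already there (unless -n is in effect); the commands that follow are then applied to the newly read line. Like d, the n command therefore changes sed's control flow.
Example 1: delete the blank line that follows .H1 in the file text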
[root@localhost ~]# cat text
.H1 "On Egypt"

Napoleon, pointing to the Pyramids, said to his troops:
"Soldiers, forty centuries have their eyes upon you."
[root@localhost ~]# sed '/.H1/{n;/^$/d}' text
.H1 "On Egypt"
Napoleon, pointing to the Pyramids, said to his troops:
"Soldiers, forty centuries have their eyes upon you."
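Because n swaps in the next line partway through the script, it is also a common way to treat alternating lines differently. This is a minimal sketch, assuming GNU sed behaviour at end of input; the numbered input is made up.

```shell
# n prints the current line and loads the next one, which d then
# deletes: only the odd-numbered lines survive.
odd=$(printf '1\n2\n3\n4\n5\n' | sed 'n;d')
echo "$odd"

# With -n nothing is printed by n itself; p prints the line that n
# loaded, so only the even-numbered lines appear.
even=$(printf '1\n2\n3\n4\n5\n' | sed -n 'n;p')
echo "$even"
```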
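Note: the address regular expressions in the Example 1 command are delimited with /. A different delimiter such as @ or # cannot simply be substituted; in an address it must be introduced with a backslash, as in \@^$@d.
7. Read and write file commands: r/w
The syntax of the read and write commands is: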
[line-address]r file
[address]w file
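The read command reads the named file and outputs its contents after the matched line, much like the append command (a).
Example 1: insert the contents of a file containing a list of company names after [Company-list] in the file text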
[root@localhost ~]# cat text
For service, contact any of the following companies:
[Company-list]
Thank you.
[root@localhost ~]# cat company.list
Allied
Mayflower
United
[root@localhost ~]# sed '/^\[Company-list\]/r company.list' text # note that [ and ] must be escaped
For service, contact any of the following companies:
[Company-list]
Allied
Mayflower
United
Thank you.
Example 2: read the file as in Example 1 and also delete the [Company-list] line
[root@localhost ~]# sed -e '/^\[Company-list\]/r company.list' -e '/^\[Company-list\]/d;' text
For service, contact any of the following companies:
Allied
Mayflower
United
Thank you.
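The file name given to r runs to the end of the script line, so a newline can end it in place of a second -e. This is a minimal sketch that recreates the two files from Example 1 in a temporary directory (the directory comes from mktemp and is an assumption of this sketch, not part of the original example).

```shell
# Recreate the sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'For service, contact any of the following companies:' \
  '[Company-list]' 'Thank you.' > "$dir/text"
printf '%s\n' 'Allied' 'Mayflower' 'United' > "$dir/company.list"

# The newline after company.list terminates the file name;
# the d command starts on the next script line.
out=$(cd "$dir" && sed '/^\[Company-list\]/r company.list
/^\[Company-list\]/d' text)
echo "$out"
rm -rf "$dir"
```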
Note: the two commands cannot be joined with a semicolon here, as the following shows
[root@localhost ~]# sed '/^\[Company-list\]/r company.list;/^\[Company-list\]/d;' text
For service, contact any of the following companies:
[Company-list]
Thank you.
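This is because the r command takes company.list;/^\[Company-list\]/d; as a whole to be the file name. No such file exists, and r silently ignores a file it cannot read, so no error is reported and the input is printed unchanged.
Example 3: write the names from different regions in the file text to separate files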
[root@localhost ~]# cat text
Adams, Henrietta Northeast
Banks, Freda South
Dennis, Jim Midwest
Garvey, Bill Northeast
Jeffries, Jane West
Madison, Sylvia Midwest
Sommes, Tom South

[root@localhost ~]# sed '/Northeast$/w region.northeast
/South$/w region.south
/Midwest$/w region.midwest
/West$/w region.west' text
Adams, Henrietta Northeast
Banks, Freda South
Dennis, Jim Midwest
Garvey, Bill Northeast
Jeffries, Jane West
Madison, Sylvia Midwest
Sommes, Tom South
[root@localhost ~]# ls region.*
region.midwest  region.northeast  region.south  region.west
[root@localhost ~]# cat region.*
Dennis, Jim Midwest
Madison, Sylvia Midwest
Adams, Henrietta Northeast
Garvey, Bill Northeast
Banks, Freda South
Sommes, Tom South
Jeffries, Jane West
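In Example 3 every input line is still echoed to the screen, because w only adds a copy in the file. With -n the screen output disappears and only the file is written. This is a minimal sketch that uses three of the sample lines and a temporary directory from mktemp (an assumption of this sketch).

```shell
dir=$(mktemp -d)

# -n suppresses the automatic print; w still writes matching lines.
# \$ keeps the shell from expanding the $ inside double quotes.
shown=$(printf '%s\n' 'Adams, Henrietta Northeast' 'Banks, Freda South' \
  'Garvey, Bill Northeast' |
  sed -n "/Northeast\$/w $dir/region.northeast")
saved=$(cat "$dir/region.northeast")
echo "screen: [$shown]"
echo "$saved"
rm -rf "$dir"
```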
8. Quit command: q
The syntax of the quit command is:
[line-address]q
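When sed reaches a matching line it quits: no further lines are read, and the current contents of the pattern space are printed.
Example 1: print the first three lines of list
[root@localhost ~]# cat -n list # show the contents of list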
1  John Daggett, 341 King Road, Plymouth MA
2  Alice Ford, 22 East Broadway, Richmond VA
3  Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
4  Terry Kalkas, 402 Lans Road, Beaver Falls PA
5  Eric Adams, 20 Post Road, Sudbury MA
6  Hubert Sims, 328A Brook Road, Roanoke VA
7  Amy Wilde, 334 Bayshore Pkwy, Mountain View CA
8  Sal Carpenter, 73 6th Street, Boston MA
[root@localhost ~]# sed '3q' list
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
Example 2: print the first three lines using the p command
[root@localhost ~]# sed -n '1,3p' list
John Daggett, 341 King Road, Plymouth MA
Alice Ford, 22 East Broadway, Richmond VA
Orville Thomas, 11345 Oak Bridge Road, Tulsa OK
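Note: for large files, Example 1 is more efficient than Example 2, because it exits as soon as line N has been read. Example 2 also prints only the first N lines, but it still reads every remaining line, only without doing anything with them.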
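The two approaches can be compared on generated input, and q can be added to a p script so that it also stops early. This is a minimal sketch, assuming the seq utility is available to produce numbered lines; the line counts are arbitrary.

```shell
# Both commands print the same three lines; only the first one
# stops reading after line 3.
with_q=$(seq 1 100000 | sed '3q')
with_p=$(seq 1 100000 | sed -n '1,3p')
echo "$with_q"

# p selects a range in the middle, and q ends the run as soon as
# the last wanted line has been printed.
middle=$(seq 1 100000 | sed -n '2,3p;3q')
echo "$middle"
```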