
Awk Basics [1]: Awk Syntax and Basic Commands

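2013-06-05 11:31
awk is a powerful tool for processing text files, especially record-oriented text in which each line contains several fields separated by a delimiter. It can even perform some logic when there is no input text at all.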

The examples that follow repeatedly use these sample files as input:

employee.txt is a comma delimited file that contains 5 employee records in the following format:

employee-number,employee-name,employee-title


Create the file:

$ vi employee.txt
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager


items.txt sample file
items.txt is a comma delimited text file that contains 5 item records in the following format:

item-number,item-description,item-category,cost,quantity-available


Create the file:

$ vi items.txt
101,HD Camcorder,Video,210,10
102,Refrigerator,Appliance,850,2
103,MP3 Player,Audio,270,15
104,Tennis Racket,Sports,190,20
105,Laser Printer,Office,475,5


items-sold.txt sample file
items-sold.txt is a space delimited text file that contains 5 item records. Each record describes one item: the item number, followed by the number of units sold in each of the last 6 months.

Create the file:

$ vi items-sold.txt
101 2 10 5 8 10 12
102 0 1 4 3 0 2
103 10 6 11 20 5 13
104 2 3 4 0 6 5
105 10 2 5 7 12 6


1、Awk Command Syntax

Basic Awk Syntax:

awk -Fs '/pattern/ {action}' input-file
(or)
awk -Fs '{action}' input-file


In the above syntax:

-F sets the field separator (the s in -Fs stands for the separator character). If you don't specify it, awk splits fields on whitespace, that is, runs of spaces and tabs.

The /pattern/ and the {action} should be enclosed inside single quotes.

/pattern/ is optional. If you don't provide it, awk will process all the records from the input-file. If you specify a pattern, it will process only those records from the input-file that match the given pattern.

{action} - These are the awk programming commands, which can be one or multiple awk commands. The whole action block (including all the awk commands together) should be enclosed between { and }

input-file - The input file that needs to be processed.
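
As a small illustration of the default separator, the sketch below runs awk without -F against the space-delimited items-sold.txt sample. It recreates the file with a here-document so the block is self-contained; printing fields 1 and 2 is just a choice made for this demonstration.

```shell
# Recreate the space-delimited sample file shown earlier.
cat > items-sold.txt <<'EOF'
101 2 10 5 8 10 12
102 0 1 4 3 0 2
103 10 6 11 20 5 13
104 2 3 4 0 6 5
105 10 2 5 7 12 6
EOF

# No -F option: awk splits each line on whitespace.
# Print the item number and the first month's sales figure.
awk '{print $1, $2}' items-sold.txt
# 101 2
# 102 0
# 103 10
# 104 2
# 105 10
```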

You can also put the commands to run in a separate file and invoke it with the following syntax:

awk -Fs -f myscript.awk input-file


In this syntax, myscript.awk holds the commands to execute.
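
A minimal sketch of the script-file form follows. The contents of myscript.awk here are an assumption made for illustration (the original does not show them); the block also recreates employee.txt so it can run on its own.

```shell
# Recreate the comma-delimited sample file shown earlier.
cat > employee.txt <<'EOF'
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
EOF

# Hypothetical script contents: a single pattern/action pair.
# Inside a script file the program is not wrapped in single quotes,
# because no shell is parsing it.
cat > myscript.awk <<'EOF'
/Manager/ { print $2, $3 }
EOF

# -f reads the awk program from the file instead of the command line.
awk -F ',' -f myscript.awk employee.txt
# Jason Smith IT Manager
# Jane Miller Sales Manager
```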

2、Awk Program Structure (BEGIN, body, END block)

A typical awk program has the following three blocks.

1. BEGIN Block
Syntax of begin block:

BEGIN { awk-commands }


The begin block gets executed only once at the beginning, before awk starts executing the body block for all the lines in the input file.

2. Body Block
Syntax of body block:

/pattern/ {action}


The body block gets executed once for every line in the input file or, when a pattern is given, once for every line that matches the pattern.

3. END Block
Syntax of end block:

END { awk-commands }


The end block gets executed only once at the end, after awk completes executing the body block for all the lines in the input-file.
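
The three blocks can be combined in one program. The sketch below is one possible arrangement, not the only one: the header text and the variable name count are made up for this example.

```shell
# Recreate the comma-delimited sample file shown earlier.
cat > employee.txt <<'EOF'
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
EOF

# BEGIN prints a header once, the body prints each name and counts it,
# and END prints the total once after the last line has been processed.
awk -F ',' '
BEGIN { print "Employee list:" }
      { print $2; count++ }
END   { print "Total employees:", count }
' employee.txt
# Employee list:
# John Doe
# Jason Smith
# Raj Reddy
# Anand Ram
# Jane Miller
# Total employees: 5

# A program consisting only of a BEGIN block runs without any input file.
awk 'BEGIN { print 2 + 3 }'
# 5
```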

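[Figure: awk execution flowchart. The BEGIN block runs once, the body block runs for each input line, then the END block runs once.]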



3、Print Command

By default, the print command (with no arguments) prints the entire record, as in the example below, which is equivalent to the command 'cat employee.txt':

$ awk '{print}' employee.txt
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager


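You can also pass specific field numbers to print so that only those fields are printed. For example, to print only the employee names (the second column), use the following command: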

$ awk -F ',' '{print $2}' employee.txt
John Doe
Jason Smith
Raj Reddy
Anand Ram
Jane Miller


Because the default separator is whitespace, we need -F ',' to set the separator to a comma so that the desired column is extracted correctly.
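
To make the effect of the separator concrete, the sketch below runs the same kind of command with and without -F ','. The choice of fields is only for illustration; the block recreates employee.txt so it is self-contained.

```shell
# Recreate the comma-delimited sample file shown earlier.
cat > employee.txt <<'EOF'
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
EOF

# Without -F, fields are split on whitespace, so $2 is not the name.
awk '{print $2}' employee.txt
# Doe,CEO
# Smith,IT
# Reddy,Sysadmin
# Ram,Developer
# Miller,Sales

# With -F ',', several fields can be printed at once; the comma inside
# print separates them with a single space in the output.
awk -F ',' '{print $1, $3}' employee.txt
# 101 CEO
# 102 IT Manager
# 103 Sysadmin
# 104 Developer
# 105 Sales Manager
```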

4、Pattern Matching

You can run a command only on the lines that match a specific pattern, for example:

$ awk -F ',' '/Manager/ {print $2, $3}' employee.txt
Jason Smith IT Manager
Jane Miller Sales Manager


This example prints the names and titles of the managers.

$ awk -F ',' '/^102/ {print "Emp id 102 is", $2}' employee.txt
Emp id 102 is Jason Smith


This example prints only the name of the employee whose number starts with 102.
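
One point worth seeing in action is that a /pattern/ is matched against the whole line, not against a single field. The sketch below contrasts an unanchored pattern with one anchored by ^; the search string "Ra" is an arbitrary choice for this illustration.

```shell
# Recreate the comma-delimited sample file shown earlier.
cat > employee.txt <<'EOF'
101,John Doe,CEO
102,Jason Smith,IT Manager
103,Raj Reddy,Sysadmin
104,Anand Ram,Developer
105,Jane Miller,Sales Manager
EOF

# Unanchored: "Ra" may appear anywhere in the line.
awk -F ',' '/Ra/ {print $2}' employee.txt
# Raj Reddy
# Anand Ram

# Anchored with ^: the line itself must start with "Ra". Every line in
# employee.txt starts with a number, so this prints nothing.
awk -F ',' '/^Ra/ {print $2}' employee.txt
```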
