
Splitting large files on Linux with the split command (study notes)

2016-10-17 17:19

0x01 Introduction

On Ubuntu 16.04 LTS, running split --help prints the following:

Usage: split [OPTION]... [FILE [PREFIX]]
Output pieces of FILE to PREFIXaa, PREFIXab, ...;
default size is 1000 lines, and default PREFIX is 'x'.

With no FILE, or when FILE is -, read standard input.

Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   generate suffixes of length N (default 2)
      --additional-suffix=SUFFIX  append an additional SUFFIX to file names
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of records per output file
  -d                      use numeric suffixes starting at 0, not alphabetic
      --numeric-suffixes[=FROM]  same as -d, but allow setting the start value
  -e, --elide-empty-files  do not generate empty output files with '-n'
      --filter=COMMAND    write to shell COMMAND; file name is $FILE
  -l, --lines=NUMBER      put NUMBER lines/records per output file
  -n, --number=CHUNKS     generate CHUNKS output files; see explanation below
  -t, --separator=SEP     use SEP instead of newline as the record separator;
                            '\0' (zero) specifies the NUL character
  -u, --unbuffered        immediately copy input to output with '-n r/...'
      --verbose           print a diagnostic just before each
                            output file is opened
      --help     display this help and exit
      --version  output version information and exit

The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).

CHUNKS may be:
  N       split into N files based on size of input
  K/N     output Kth of N to stdout
  l/N     split into N files without splitting lines/records
  l/K/N   output Kth of N to stdout without splitting lines/records
  r/N     like 'l' but use round robin distribution
  r/K/N   likewise but only output Kth of N to stdout

GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/split>
or available locally via: info '(coreutils) split invocation'
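
The CHUNKS forms listed at the end of the help text are easiest to follow on a tiny input. The sketch below is only an illustration: the file name sample.txt and the prefix part_ are made up for the example, and it assumes GNU coreutils split, since -n is a GNU extension.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 10-line input: the numbers 1..10, one per line.
seq 10 > sample.txt

# l/3: create three output files without cutting any line in half.
split -n l/3 sample.txt part_
ls part_*

# l/2/3: write only the 2nd of the 3 chunks to stdout; no file is created.
split -n l/2/3 sample.txt

# Concatenating the pieces in name order gives back the original input.
cat part_* | cmp - sample.txt && echo "pieces match the original"
```

The K/N forms are handy when only one slice of the input is needed, because nothing is written to disk.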


0x02 Parameter notes

Note: the default piece size is 1000 lines.

      The default output file prefix is x.
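
Both defaults can be checked on a small input. This is a minimal sketch, assuming GNU split; sample.txt is a made-up 2500-line file created just for the test.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input with 2500 lines.
seq 2500 > sample.txt

# No options: 1000 lines per piece, output names start with 'x'.
split sample.txt

ls x*          # lists xaa, xab and xac
wc -l < xaa    # 1000
wc -l < xac    # 500 (the remainder)
```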

0x03 Splitting a small file

The test file is about 36 MB.

Splitting it into 1 MiB pieces (-b 1M) took about 10 seconds.

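Notes on the options:

-a, --suffix-length=N     length of the generated suffix (default 2)
--additional-suffix=SUFFIX  append an extra SUFFIX to each output file name
-b, --bytes=SIZE          put SIZE bytes in each output file
-C, --line-bytes=SIZE     put at most SIZE bytes of whole lines in each output file (cannot be combined with -b)
-d                        use numeric suffixes starting at 0 instead of alphabetic ones
-e, --elide-empty-files   do not create empty output files (relevant with -n)
--filter=COMMAND      pipe each piece to the shell COMMAND; the output file name is available as $FILE
-l, --lines=NUMBER        put NUMBER lines in each output file
-n, --number=CHUNKS       generate CHUNKS output files
-t, --separator=SEP       use SEP instead of newline as the record separator
-u, --unbuffered          copy input to output immediately, without buffering (with -n r/...)
--verbose             print a message just before each output file is opened
--help                show the help text
--version             show version information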
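
The difference between -b and -C is easy to misread, so here is a small sketch of it. The input file and the b_ / c_ prefixes are hypothetical, and GNU split is assumed.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 20-byte input: three lines of 6, 6 and 8 bytes.
printf 'alpha\nbravo\ncharlie\n' > sample.txt

# -b 10: exactly 10 bytes per piece, so lines are cut in the middle.
split -b 10 sample.txt b_
wc -c b_*

# -C 10: at most 10 bytes per piece, but only whole lines.
split -C 10 sample.txt c_
wc -c c_*
```

With this input, -b produces two 10-byte pieces, while -C produces three pieces holding one line each, because no two lines fit within 10 bytes.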

split -b 1M -d ooo.txt result --verbose
creating file 'result00'
creating file 'result01'
creating file 'result02'
creating file 'result03'
creating file 'result04'
creating file 'result05'
creating file 'result06'
creating file 'result07'
creating file 'result08'
creating file 'result09'
creating file 'result10'
creating file 'result11'
creating file 'result12'
creating file 'result13'
creating file 'result14'
creating file 'result15'
creating file 'result16'
creating file 'result17'
creating file 'result18'
creating file 'result19'
creating file 'result20'
creating file 'result21'
creating file 'result22'
creating file 'result23'
creating file 'result24'
creating file 'result25'
creating file 'result26'
creating file 'result27'
creating file 'result28'
creating file 'result29'
creating file 'result30'
creating file 'result31'
creating file 'result32'
creating file 'result33'
creating file 'result34'
creating file 'result35'
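
These notes do not show how to put the pieces back together. A minimal sketch follows, assuming the pieces keep their fixed-width numeric suffixes so that they sort in name order. It uses a small stand-in file (sample.bin, a made-up name) rather than the original ooo.txt.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real input: a bit over 3 MiB of random bytes.
head -c 3150000 /dev/urandom > sample.bin

# Same options as above: 1 MiB pieces with numeric suffixes.
split -b 1M -d sample.bin result

# The shell expands result?? in sorted order (result00, result01, ...),
# so cat rebuilds the file in the right sequence.
cat result?? > rebuilt.bin

# cmp is silent and returns 0 when the two files are byte-for-byte equal.
cmp sample.bin rebuilt.bin && echo "identical"
```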






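0x04 Splitting a large file

The test file is about 2.1 GB.

Splitting it into 100 MiB pieces took about 2 minutes, which is still reasonably fast. (The -e flag in the command below has no effect in this mode; as the help text says, it only matters together with -n.)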

split -e -b 100M -d demo.vmem result_demo_ --verbose
creating file 'result_demo_00'
creating file 'result_demo_01'
creating file 'result_demo_02'
creating file 'result_demo_03'
creating file 'result_demo_04'
creating file 'result_demo_05'
creating file 'result_demo_06'
creating file 'result_demo_07'
creating file 'result_demo_08'
creating file 'result_demo_09'
creating file 'result_demo_10'
creating file 'result_demo_11'
creating file 'result_demo_12'
creating file 'result_demo_13'
creating file 'result_demo_14'
creating file 'result_demo_15'
creating file 'result_demo_16'
creating file 'result_demo_17'
creating file 'result_demo_18'
creating file 'result_demo_19'
creating file 'result_demo_20'
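
For large inputs, the pieces can also be compressed as they are written by using the --filter option from the help text. This was not part of the test above; it is only a sketch, assuming GNU split and gzip are installed, with sample.txt and the piece_ prefix made up for the example.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical, highly compressible input of roughly 2.5 MiB.
seq 400000 > sample.txt

# Each 1 MiB piece is piped to gzip instead of being written directly.
# Single quotes matter: $FILE is set by split, not by the calling shell.
split -b 1M -d --filter='gzip > $FILE.gz' sample.txt piece_
ls piece_*.gz

# gzip -dc decompresses the pieces in name order and concatenates them.
gzip -dc piece_*.gz > rebuilt.txt
cmp sample.txt rebuilt.txt && echo "identical"
```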



