MapReduce Benchmarking

2017-12-17 21:45
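I. Introduction to benchmarking
1. Testing is essential for verifying that a system behaves correctly and for analyzing its performance: it gives a fuller picture of the system, exposes its bottlenecks, and shows where performance can be improved.
2. Hadoop ships with several benchmarks, packaged in a few jar files such as hadoop-test.jar and hadoop-examples.jar, which are easy to run in a Hadoop environment.
3. In Hadoop 2.7.4 the benchmarks mainly live in hadoop-mapreduce-client-jobclient-2.7.4-tests.jar.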

II. Listing the available benchmarks
[root@centos hadoop-2.7.4]# ./bin/yarn jar /opt/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4-tests.jar
An example program must be given as the first argument.
Valid program names are:
DFSCIOTest: Distributed i/o benchmark of libhdfs.
DistributedFSCheck: Distributed checkup of the file system consistency.
JHLogAnalyzer: Job History Log analyzer.
MRReliabilityTest: A program that tests the reliability of the MR framework by injecting faults/failures
NNdataGenerator: Generate the data to be used by NNloadGenerator
NNloadGenerator: Generate load on Namenode using NN loadgenerator run WITHOUT MR
NNloadGeneratorMR: Generate load on Namenode using NN loadgenerator run as MR job
NNstructureGenerator: Generate the structure to be used by NNdataGenerator
SliveTest: HDFS Stress Test and Live Data Verification.
TestDFSIO: Distributed i/o benchmark.
fail: a job that always fails
filebench: Benchmark SequenceFile(Input|Output)Format (block,record compressed and uncompressed), Text(Input|Output)Format (compressed and uncompressed)
largesorter: Large-Sort tester
loadgen: Generic map/reduce load generator
mapredtest: A map/reduce test check.
minicluster: Single process HDFS and MR cluster.
mrbench: A map/reduce benchmark that can create many small jobs
nnbench: A benchmark that stresses the namenode.
sleep: A job that sleeps at each map and reduce task.
testbigmapoutput: A map/reduce program that works on a very big non-splittable file and does identity map/reduce
testfilesystem: A test for FileSystem read/write.
testmapredsort: A map/reduce program that validates the map-reduce framework's sort.
testsequencefile: A test for flat files of binary key value pairs.
testsequencefileinputformat: A test for sequence file input format.
testtextinputformat: A test for text input format.
threadedmapbench: A map/reduce benchmark that compares the performance of maps with multiple spills over maps with 1 spill


III. Viewing a benchmark's help
[root@centos hadoop-2.7.4]# ./bin/yarn jar /opt/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4-tests.jar testfilesystem
Usage: TestFileSystem -files N -megaBytes M [-noread] [-nowrite] [-noseek] [-fastcheck]
[root@centos hadoop-2.7.4]# ./bin/yarn jar /opt/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4-tests.jar TestDFSIO
17/12/17 20:54:43 INFO fs.TestDFSIO: TestDFSIO.1.8
Missing arguments.
Usage: TestDFSIO [genericOptions] -read [-random | -backward | -skip [-skipSize Size]] | -write | -append | -truncate | -clean [-compression codecClassName] [-nrFiles N] [-size Size[B|KB|MB|GB|TB]] [-resFile resultFileName] [-bufferSize Bytes] [-rootDir]
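
The usage line suggests a typical cycle: write the test files, read them back, then clean up. Below is a minimal Python sketch that assembles those three command lines. The helper name `dfsio_command` is hypothetical, the jar path is copied from the session above, and only flags that appear in the usage output are used.

```python
import shlex

# Jar path copied from the session above; adjust for your installation.
TESTS_JAR = ("/opt/hadoop-2.7.4/share/hadoop/mapreduce/"
             "hadoop-mapreduce-client-jobclient-2.7.4-tests.jar")


def dfsio_command(mode, nr_files=None, size=None, jar=TESTS_JAR):
    """Build the argv list for one TestDFSIO run (hypothetical helper).

    mode is one of the operations printed in the usage line, for example
    "write", "read" or "clean". nr_files and size map to -nrFiles and -size.
    """
    argv = ["./bin/yarn", "jar", jar, "TestDFSIO", "-" + mode]
    if nr_files is not None:
        argv += ["-nrFiles", str(nr_files)]
    if size is not None:
        argv += ["-size", size]
    return argv


# Write first so the read test has files to read, then clean up.
cycle = [
    dfsio_command("write", nr_files=10, size="1MB"),
    dfsio_command("read", nr_files=10, size="1MB"),
    dfsio_command("clean"),
]

for argv in cycle:
    # Only print the commands here. To execute them, pass argv to
    # subprocess.run(argv, check=True) from the Hadoop home directory.
    print(shlex.join(argv))
```

Building the arguments as a list rather than a single string avoids quoting problems if the jar path ever contains spaces.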


IV. Benchmark examples

V. Benchmark: TestDFSIO

VI. TestDFSIO write mode: creating the data

Example
[root@centos hadoop-2.7.4]# ./bin/yarn jar /opt/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4-tests.jar TestDFSIO -write -nrFiles 10 -size 1MB
17/12/17 21:18:38 INFO fs.TestDFSIO: TestDFSIO.1.8
17/12/17 21:18:38 INFO fs.TestDFSIO: nrFiles = 10
17/12/17 21:18:38 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
17/12/17 21:18:38 INFO fs.TestDFSIO: bufferSize = 1000000
17/12/17 21:18:38 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
17/12/17 21:18:40 INFO fs.TestDFSIO: creating control file: 1048576 bytes, 10 files
17/12/17 21:18:42 INFO fs.TestDFSIO: created control files for: 10 files
17/12/17 21:18:42 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/12/17 21:18:43 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/12/17 21:18:45 INFO mapred.FileInputFormat: Total input paths to process : 10
17/12/17 21:18:45 INFO mapreduce.JobSubmitter: number of splits:10
17/12/17 21:18:45 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1513516522356_0001
17/12/17 21:18:48 INFO impl.YarnClientImpl: Submitted application application_1513516522356_0001
17/12/17 21:18:48 INFO mapreduce.Job: The url to track the job: http://centos:8088/proxy/application_1513516522356_0001/
17/12/17 21:18:48 INFO mapreduce.Job: Running job: job_1513516522356_0001
17/12/17 21:19:14 INFO mapreduce.Job: Job job_1513516522356_0001 running in uber mode : false
17/12/17 21:19:14 INFO mapreduce.Job:  map 0% reduce 0%
17/12/17 21:22:49 INFO mapreduce.Job:  map 20% reduce 0%
17/12/17 21:24:34 INFO mapreduce.Job:  map 27% reduce 0%
17/12/17 21:24:43 INFO mapreduce.Job:  map 33% reduce 0%
17/12/17 21:26:00 INFO mapreduce.Job:  map 40% reduce 0%
17/12/17 21:26:30 INFO mapreduce.Job:  map 47% reduce 0%
17/12/17 21:26:39 INFO mapreduce.Job:  map 50% reduce 0%
17/12/17 21:26:42 INFO mapreduce.Job:  map 53% reduce 0%
17/12/17 21:26:44 INFO mapreduce.Job:  map 57% reduce 0%
17/12/17 21:26:46 INFO mapreduce.Job:  map 60% reduce 0%
17/12/17 21:27:01 INFO mapreduce.Job:  map 30% reduce 0%
17/12/17 21:27:02 INFO mapreduce.Job: Task Id : attempt_1513516522356_0001_m_000002_0, Status : FAILED
17/12/17 21:31:13 INFO mapreduce.Job:  map 37% reduce 0%
17/12/17 21:31:20 INFO mapreduce.Job:  map 43% reduce 0%
17/12/17 21:31:55 INFO mapreduce.Job:  map 63% reduce 0%
17/12/17 21:32:13 INFO mapreduce.Job:  map 67% reduce 0%
17/12/17 21:32:14 INFO mapreduce.Job:  map 77% reduce 0%
17/12/17 21:32:15 INFO mapreduce.Job:  map 80% reduce 0%
17/12/17 21:32:42 INFO mapreduce.Job:  map 80% reduce 27%
17/12/17 21:33:14 INFO mapreduce.Job:  map 87% reduce 27%
17/12/17 21:33:15 INFO mapreduce.Job:  map 93% reduce 27%
17/12/17 21:33:48 INFO mapreduce.Job:  map 97% reduce 27%
17/12/17 21:33:58 INFO mapreduce.Job:  map 97% reduce 30%
17/12/17 21:34:00 INFO mapreduce.Job:  map 100% reduce 30%
17/12/17 21:34:07 INFO mapreduce.Job:  map 100% reduce 33%
17/12/17 21:34:11 INFO mapreduce.Job:  map 100% reduce 67%
17/12/17 21:34:14 INFO mapreduce.Job:  map 100% reduce 100%
17/12/17 21:34:31 INFO mapreduce.Job: Job job_1513516522356_0001 completed successfully
17/12/17 21:34:35 INFO mapreduce.Job: Counters: 52
File System Counters
FILE: Number of bytes read=849
FILE: Number of bytes written=1335282
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2320
HDFS: Number of bytes written=10485836
HDFS: Number of read operations=43
HDFS: Number of large read operations=0
HDFS: Number of write operations=12
Job Counters
Failed map tasks=3
Killed map tasks=2
Launched map tasks=15
Launched reduce tasks=1
Other local map tasks=5
Data-local map tasks=10
Total time spent by all maps in occupied slots (ms)=4607642
Total time spent by all reduces in occupied slots (ms)=434337
Total time spent by all map tasks (ms)=4607642
Total time spent by all reduce tasks (ms)=434337
Total vcore-milliseconds taken by all map tasks=4607642
Total vcore-milliseconds taken by all reduce tasks=434337
Total megabyte-milliseconds taken by all map tasks=4718225408
Total megabyte-milliseconds taken by all reduce tasks=444761088
Map-Reduce Framework
Map input records=10
Map output records=50
Map output bytes=743
Map output materialized bytes=903
Input split bytes=1200
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=903
Reduce input records=50
Reduce output records=5
Spilled Records=100
Shuffled Maps =10
Failed Shuffles=3
Merged Map outputs=10
GC time elapsed (ms)=657833
CPU time spent (ms)=55460
Physical memory (bytes) snapshot=839290880
Virtual memory (bytes) snapshot=22878818304
Total committed heap usage (bytes)=1240178688
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=1
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1120
File Output Format Counters
Bytes Written=76
17/12/17 21:34:37 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
17/12/17 21:34:37 INFO fs.TestDFSIO:            Date & time: Sun Dec 17 21:34:37 CST 2017
17/12/17 21:34:37 INFO fs.TestDFSIO:        Number of files: 10
17/12/17 21:34:37 INFO fs.TestDFSIO: Total MBytes processed: 10.0
17/12/17 21:34:37 INFO fs.TestDFSIO:      Throughput mb/sec: 0.023679231656291218
17/12/17 21:34:37 INFO fs.TestDFSIO: Average IO rate mb/sec: 0.026267534121870995
17/12/17 21:34:37 INFO fs.TestDFSIO:  IO rate std deviation: 0.007813000393755735
17/12/17 21:34:37 INFO fs.TestDFSIO:     Test exec time sec: 953.499
17/12/17 21:34:37 INFO fs.TestDFSIO:
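
A little arithmetic makes the summary easier to read. The sketch below parses three of the summary lines and derives two figures from them. It assumes that "Throughput mb/sec" is the total megabytes divided by the summed I/O time of all map tasks rather than by wall-clock time. That is how the TestDFSIO source appears to compute it, but it is worth verifying for your version. The function name `parse_summary` is hypothetical.

```python
import re

# Summary lines copied from the write run above.
SUMMARY = """\
17/12/17 21:34:37 INFO fs.TestDFSIO: Total MBytes processed: 10.0
17/12/17 21:34:37 INFO fs.TestDFSIO:      Throughput mb/sec: 0.023679231656291218
17/12/17 21:34:37 INFO fs.TestDFSIO:     Test exec time sec: 953.499
"""


def parse_summary(text):
    """Return {label: float} for the numeric 'label: value' summary lines."""
    values = {}
    for line in text.splitlines():
        match = re.search(r"fs\.TestDFSIO:\s*(.+?):\s*([0-9.]+)\s*$", line)
        if match:
            values[match.group(1)] = float(match.group(2))
    return values


stats = parse_summary(SUMMARY)
total_mb = stats["Total MBytes processed"]

# Assumption: Throughput = total MB / summed per-task I/O seconds,
# so dividing the other way recovers the summed I/O time.
summed_io_sec = total_mb / stats["Throughput mb/sec"]

# Rate seen from outside: total MB over the wall-clock time of the whole job,
# which also includes job startup, scheduling and retried task attempts.
wall_clock_rate = total_mb / stats["Test exec time sec"]

print(f"summed task I/O time: {summed_io_sec:.1f} s")
print(f"wall-clock rate:      {wall_clock_rate:.4f} MB/s")
```

Under that assumption the map tasks spent roughly 422 seconds in total on I/O, while the job as a whole took about 953 seconds, so much of the elapsed time in this run went to overhead rather than to writing data.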


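VII. TestDFSIO read mode: reading the data
Read mode reads back the files created by the write test, so the write test must be run first.
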
Example
[root@centos hadoop-2.7.4]# ./bin/yarn jar /opt/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4-tests.jar TestDFSIO -read -nrFiles 10 -size 1MB
17/12/17 21:40:50 INFO fs.TestDFSIO: TestDFSIO.1.8
17/12/17 21:40:50 INFO fs.TestDFSIO: nrFiles = 10
17/12/17 21:40:50 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
17/12/17 21:40:50 INFO fs.TestDFSIO: bufferSize = 1000000
17/12/17 21:40:50 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
17/12/17 21:40:52 INFO fs.TestDFSIO: creating control file: 1048576 bytes, 10 files
17/12/17 21:40:53 INFO fs.TestDFSIO: created control files for: 10 files
17/12/17 21:40:53 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/12/17 21:40:54 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/12/17 21:40:55 INFO mapred.FileInputFormat: Total input paths to process : 10
17/12/17 21:40:55 INFO mapreduce.JobSubmitter: number of splits:10
17/12/17 21:40:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1513516522356_0002
17/12/17 21:40:56 INFO impl.YarnClientImpl: Submitted application application_1513516522356_0002
17/12/17 21:40:56 INFO mapreduce.Job: The url to track the job: http://centos:8088/proxy/application_1513516522356_0002/
17/12/17 21:40:56 INFO mapreduce.Job: Running job: job_1513516522356_0002
17/12/17 21:41:06 INFO mapreduce.Job: Job job_1513516522356_0002 running in uber mode : false
17/12/17 21:41:06 INFO mapreduce.Job:  map 0% reduce 0%
17/12/17 21:44:24 INFO mapreduce.Job:  map 7% reduce 0%
17/12/17 21:44:27 INFO mapreduce.Job:  map 13% reduce 0%
17/12/17 21:44:30 INFO mapreduce.Job:  map 20% reduce 0%
17/12/17 21:44:34 INFO mapreduce.Job:  map 27% reduce 0%
17/12/17 21:45:02 INFO mapreduce.Job:  map 40% reduce 0%
17/12/17 21:45:19 INFO mapreduce.Job:  map 60% reduce 0%
17/12/17 21:46:14 INFO mapreduce.Job:  map 60% reduce 20%
17/12/17 21:47:24 INFO mapreduce.Job:  map 67% reduce 20%
17/12/17 21:47:26 INFO mapreduce.Job:  map 73% reduce 20%
17/12/17 21:47:34 INFO mapreduce.Job:  map 80% reduce 20%
17/12/17 21:47:50 INFO mapreduce.Job:  map 87% reduce 20%
17/12/17 21:47:57 INFO mapreduce.Job:  map 90% reduce 20%
17/12/17 21:47:59 INFO mapreduce.Job:  map 97% reduce 20%
17/12/17 21:48:02 INFO mapreduce.Job:  map 100% reduce 20%
17/12/17 21:48:43 INFO mapreduce.Job:  map 100% reduce 23%
17/12/17 21:48:46 INFO mapreduce.Job:  map 100% reduce 33%
17/12/17 21:48:55 INFO mapreduce.Job:  map 100% reduce 100%
17/12/17 21:49:07 INFO mapreduce.Job: Job job_1513516522356_0002 completed successfully
17/12/17 21:49:09 INFO mapreduce.Job: Counters: 50
File System Counters
FILE: Number of bytes read=836
FILE: Number of bytes written=1335234
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=10488080
HDFS: Number of bytes written=77
HDFS: Number of read operations=53
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Killed map tasks=2
Launched map tasks=11
Launched reduce tasks=1
Data-local map tasks=11
Total time spent by all maps in occupied slots (ms)=2385578
Total time spent by all reduces in occupied slots (ms)=211717
Total time spent by all map tasks (ms)=2385578
Total time spent by all reduce tasks (ms)=211717
Total vcore-milliseconds taken by all map tasks=2385578
Total vcore-milliseconds taken by all reduce tasks=211717
Total megabyte-milliseconds taken by all map tasks=2442831872
Total megabyte-milliseconds taken by all reduce tasks=216798208
Map-Reduce Framework
Map input records=10
Map output records=50
Map output bytes=730
Map output materialized bytes=890
Input split bytes=1200
Combine input records=0
Combine output records=0
Reduce input groups=5
Reduce shuffle bytes=890
Reduce input records=50
Reduce output records=5
Spilled Records=100
Shuffled Maps =10
Failed Shuffles=0
Merged Map outputs=10
GC time elapsed (ms)=280776
CPU time spent (ms)=30790
Physical memory (bytes) snapshot=1092681728
Virtual memory (bytes) snapshot=22867001344
Total committed heap usage (bytes)=1232293888
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=1120
File Output Format Counters
Bytes Written=77
17/12/17 21:49:09 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
17/12/17 21:49:09 INFO fs.TestDFSIO:            Date & time: Sun Dec 17 21:49:09 CST 2017
17/12/17 21:49:09 INFO fs.TestDFSIO:        Number of files: 10
17/12/17 21:49:09 INFO fs.TestDFSIO: Total MBytes processed: 10.0
17/12/17 21:49:09 INFO fs.TestDFSIO:      Throughput mb/sec: 0.0585576089757103
17/12/17 21:49:09 INFO fs.TestDFSIO: Average IO rate mb/sec: 0.1249798983335495
17/12/17 21:49:09 INFO fs.TestDFSIO:  IO rate std deviation: 0.14414168891838972
17/12/17 21:49:09 INFO fs.TestDFSIO:     Test exec time sec: 495.526
17/12/17 21:49:09 INFO fs.TestDFSIO:
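
In this read run the average IO rate (about 0.125) is roughly twice the throughput (about 0.059), and the standard deviation is larger than the mean, which suggests that a few tasks ran much faster than the rest. The sketch below illustrates why the two figures can diverge. It assumes that throughput is the total size over the total task time, while the average IO rate is the mean of the per-task rates. The per-task timings are invented for illustration and are not taken from this run, and `dfsio_style_summary` is a hypothetical helper.

```python
import math

# Hypothetical per-task results: (megabytes read, seconds spent on I/O).
# These numbers are made up; the log above does not show per-task timings.
tasks = [(1.0, 2.0), (1.0, 20.0), (1.0, 25.0), (1.0, 40.0)]


def dfsio_style_summary(results):
    """Summarise per-task (MB, seconds) pairs the way TestDFSIO is assumed to."""
    total_mb = sum(mb for mb, _ in results)
    total_sec = sum(sec for _, sec in results)
    rates = [mb / sec for mb, sec in results]
    mean_rate = sum(rates) / len(rates)
    # Population variance of the per-task rates: E[r^2] - (E[r])^2.
    variance = sum(r * r for r in rates) / len(rates) - mean_rate ** 2
    return {
        "throughput": total_mb / total_sec,
        "average_io_rate": mean_rate,
        "io_rate_std_dev": math.sqrt(abs(variance)),
    }


summary = dfsio_style_summary(tasks)
for name, value in summary.items():
    print(f"{name}: {value:.4f} MB/s")
```

One fast task pulls the mean of the rates up sharply, whereas the throughput is dominated by the slow tasks because their time makes up most of the denominator.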


VIII. TestDFSIO: cleaning up the data

Example
[root@centos hadoop-2.7.4]# ./bin/yarn jar /opt/hadoop-2.7.4/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.4-tests.jar TestDFSIO -clean
17/12/17 21:51:31 INFO fs.TestDFSIO: TestDFSIO.1.8
17/12/17 21:51:31 INFO fs.TestDFSIO: nrFiles = 1
17/12/17 21:51:31 INFO fs.TestDFSIO: nrBytes (MB) = 1.0
17/12/17 21:51:31 INFO fs.TestDFSIO: bufferSize = 1000000
17/12/17 21:51:31 INFO fs.TestDFSIO: baseDir = /benchmarks/TestDFSIO
17/12/17 21:51:32 INFO fs.TestDFSIO: Cleaning up test files


IX. References
http://www.jikexueyuan.com/course/2116.html