
Hadoop TeraSort test

2013-02-25 18:36
Hardware configuration:

node configuration: 2*4-core 16GB-ram 4*1T-storage

node number: 11
Software configuration (all other settings are defaults):
replication: 1
---------------------------------

Parameters adjusted during the test:
mapred.tasktracker.map.tasks.maximum=4 (eight cores in total, one of which is left for the datanode and tasktracker)
mapred.tasktracker.reduce.tasks.maximum=3
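
The slot settings above fix the cluster-wide task capacity. As a rough sketch in Python, assuming all 11 nodes run a tasktracker with identical settings (the constant and function names are illustrative, not Hadoop identifiers):

```python
# Cluster-wide slot capacity implied by the per-node settings.
# Assumption: all 11 nodes run a tasktracker with the same configuration.
NODES = 11
CORES_PER_NODE = 2 * 4        # 2 CPUs x 4 cores each
MAP_SLOTS_PER_NODE = 4        # mapred.tasktracker.map.tasks.maximum
REDUCE_SLOTS_PER_NODE = 3     # mapred.tasktracker.reduce.tasks.maximum


def cluster_slots(nodes, map_slots, reduce_slots):
    """Return (total map slots, total reduce slots) for the whole cluster."""
    return nodes * map_slots, nodes * reduce_slots


total_map, total_reduce = cluster_slots(NODES, MAP_SLOTS_PER_NODE, REDUCE_SLOTS_PER_NODE)
# Cores per node not claimed by a map or reduce slot
spare_cores = CORES_PER_NODE - MAP_SLOTS_PER_NODE - REDUCE_SLOTS_PER_NODE

print(total_map, total_reduce, spare_cores)  # 44 33 1
```

With these values, at most 44 map tasks and 33 reduce tasks can run at the same time, and one core per node stays free for the daemons.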

---------------------------------

Parameters varied to test performance:

File block size: 64MB->128MB

Number of map tasks:

<property>
<name>mapred.map.tasks</name>
<value>2</value>
<description>The default number of map tasks per job.
Ignored when mapred.job.tracker is "local".
</description>
</property>

[bin/hadoop fs -rmr terasort/input-GB001]
bin/hadoop jar hadoop-0.20.2-examples.jar teragen 10000000 terasort/input-GB001

Generating 10000000 using 2 maps with step of 5000000

10/07/27 12:27:39 INFO mapred.JobClient: Running job: job_201007271223_0003
10/07/27 12:27:40 INFO mapred.JobClient:  map 0% reduce 0%
10/07/27 12:27:54 INFO mapred.JobClient:  map 53% reduce 0%
10/07/27 12:28:00 INFO mapred.JobClient:  map 100% reduce 0%
10/07/27 12:28:02 INFO mapred.JobClient: Job complete: job_201007271223_0003
10/07/27 12:28:02 INFO mapred.JobClient: Counters: 6
10/07/27 12:28:02 INFO mapred.JobClient:   Job Counters
10/07/27 12:28:02 INFO mapred.JobClient:     Launched map tasks=2
10/07/27 12:28:02 INFO mapred.JobClient:   FileSystemCounters
10/07/27 12:28:02 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
10/07/27 12:28:02 INFO mapred.JobClient:   Map-Reduce Framework
10/07/27 12:28:02 INFO mapred.JobClient:     Map input records=10000000
10/07/27 12:28:02 INFO mapred.JobClient:     Spilled Records=0
10/07/27 12:28:02 INFO mapred.JobClient:     Map input bytes=10000000
10/07/27 12:28:02 INFO mapred.JobClient:     Map output records=10000000
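
The teragen argument is a row count, not a byte count. The log above reports HDFS_BYTES_WRITTEN=1000000000 for 10000000 rows, i.e. 100 bytes per row, and a step of 5000000 rows for each of the 2 maps. A small Python sketch of that arithmetic, assuming the row count divides evenly among the maps (`teragen_plan` is a hypothetical helper, not part of Hadoop):

```python
# Row size inferred from the log: 1000000000 bytes written / 10000000 rows
ROW_BYTES = 100


def teragen_plan(rows, maps):
    """Return (total output bytes, rows per map) for a teragen run.

    Assumes `rows` is evenly divisible by `maps`.
    """
    return rows * ROW_BYTES, rows // maps


total_bytes, step = teragen_plan(10_000_000, 2)
print(total_bytes, step)  # 1000000000 5000000
```

By the same arithmetic, the 10-row and 10000-row runs produce about 1 KB and 1 MB, which is presumably where the input-KB001 and input-MB001 names come from.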
teragen test:

hadoop jar hadoop/hadoop-*-examples.jar teragen 10 terasort/input-KB001

15s

hadoop jar hadoop/hadoop-*-examples.jar teragen 10000 terasort/input-MB001

13s

hadoop jar hadoop/hadoop-*-examples.jar teragen 10000000 terasort/input-GB001

22s

hadoop jar hadoop/hadoop-*-examples.jar teragen 20000000 terasort/input-GB002

34s

hadoop jar hadoop/hadoop-*-examples.jar teragen 30000000 terasort/input-GB003

46s

hadoop jar hadoop/hadoop-*-examples.jar teragen 40000000 terasort/input-GB004

55s

hadoop jar hadoop/hadoop-*-examples.jar teragen 50000000 terasort/input-GB005

70s

hadoop jar hadoop/hadoop-*-examples.jar teragen 100000000 terasort/input-GB010

122s(mapred.map.tasks=02)

066s(mapred.map.tasks=04)

048s(mapred.map.tasks=06)

045s(mapred.map.tasks=08)

041s(mapred.map.tasks=09)

038s(mapred.map.tasks=10)

034s(mapred.map.tasks=11), equal to the node number

034s(mapred.map.tasks=12)

034s(mapred.map.tasks=13)

030s(mapred.map.tasks=14)

030s(mapred.map.tasks=15)

030s(mapred.map.tasks=16)

028s(mapred.map.tasks=20)

028s(mapred.map.tasks=22), 2CPU*11Node=22CPU

028s(mapred.map.tasks=23)

028s(mapred.map.tasks=24)

028s(mapred.map.tasks=25)

028±1s(mapred.map.tasks=26)

028±1s(mapred.map.tasks=27)

028±1s(mapred.map.tasks=28)

028±1s(mapred.map.tasks=28)

028±1s(mapred.map.tasks=28)

028s(mapred.map.tasks=30)

029s(mapred.map.tasks=35)

030±1s(mapred.map.tasks=44), available map slots = 4Map*11Node

043s(mapred.map.tasks=100)

067s(mapred.map.tasks=200)
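
To make the trend in these timings easier to read, the sketch below converts them into write throughput and speedup relative to the 2-map run. The timings are copied from the list above, with the ±1s entries taken at their central value. The 100-byte row size is an assumption (it matches the 1 GB run's HDFS_BYTES_WRITTEN), and the names are illustrative.

```python
# 100000000 rows at an assumed 100 bytes per row
TOTAL_BYTES = 100_000_000 * 100

# mapred.map.tasks -> elapsed seconds for the input-GB010 teragen run
timings = {
    2: 122, 4: 66, 6: 48, 8: 45, 9: 41, 10: 38, 11: 34, 12: 34, 13: 34,
    14: 30, 15: 30, 16: 30, 20: 28, 22: 28, 23: 28, 24: 28, 25: 28,
    26: 28, 27: 28, 28: 28, 30: 28, 35: 29, 44: 30, 100: 43, 200: 67,
}


def throughput_mb_s(seconds):
    """Aggregate write throughput in MB/s (1 MB = 10**6 bytes)."""
    return TOTAL_BYTES / seconds / 1e6


best = min(timings.values())
# All map-task settings that reached the best time
fastest = sorted(m for m, s in timings.items() if s == best)
print("best time:", best, "s at mapred.map.tasks in", fastest)

for maps in (2, 11, 22, 44, 200):
    secs = timings[maps]
    speedup = timings[2] / secs
    print(f"{maps:>3} maps: {throughput_mb_s(secs):6.0f} MB/s, speedup {speedup:.2f}x")
```

On these numbers, the run time flattens at 28s between 20 and 30 map tasks and rises again once the task count goes well beyond the 44 available map slots.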

------------------------------------------------------------------------------------

bin/hadoop fs -cat terasort/input-GB001/part-00000

.t^#\|v$2\
0AAAAAAAAAABBBBBBBBBBCCCCCCCCCCDDDDDDDDDDEEEEEEEEEEFFFFFFFFFFGGGGGGGGGGHHHHHHHH

75@~?'WdUF
1IIIIIIIIIIJJJJJJJJJJKKKKKKKKKKLLLLLLLLLLMMMMMMMMMMNNNNNNNNNNOOOOOOOOOOPPPPPPPP

w[o||:N&H,
2QQQQQQQQQQRRRRRRRRRRSSSSSSSSSSTTTTTTTTTTUUUUUUUUUUVVVVVVVVVVWWWWWWWWWWXXXXXXXX

------------------------------------------------------------------------------------

bin/hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB001 terasort/output-GB001

10/07/27 00:11:05 INFO terasort.TeraSort: starting
10/07/27 00:11:05 INFO mapred.FileInputFormat: Total input paths to process : 2
10/07/27 00:11:06 INFO util.NativeCodeLoader: Loaded the native-hadoop library
10/07/27 00:11:06 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
10/07/27 00:11:06 INFO compress.CodecPool: Got brand-new compressor
Making 1 from 100000 records
Step size is 100000.0
10/07/27 00:11:06 INFO mapred.JobClient: Running job: job_201007270004_0003
10/07/27 00:11:07 INFO mapred.JobClient:  map 0% reduce 0%
10/07/27 00:11:21 INFO mapred.JobClient:  map 50% reduce 0%
10/07/27 00:11:24 INFO mapred.JobClient:  map 100% reduce 0%
10/07/27 00:11:33 INFO mapred.JobClient:  map 100% reduce 14%
10/07/27 00:11:36 INFO mapred.JobClient:  map 100% reduce 25%
10/07/27 00:11:39 INFO mapred.JobClient:  map 100% reduce 33%
10/07/27 00:11:54 INFO mapred.JobClient:  map 100% reduce 69%
10/07/27 00:11:57 INFO mapred.JobClient:  map 100% reduce 74%
10/07/27 00:12:00 INFO mapred.JobClient:  map 100% reduce 79%
10/07/27 00:12:03 INFO mapred.JobClient:  map 100% reduce 83%
10/07/27 00:12:06 INFO mapred.JobClient:  map 100% reduce 88%
10/07/27 00:12:09 INFO mapred.JobClient:  map 100% reduce 93%
10/07/27 00:12:15 INFO mapred.JobClient:  map 100% reduce 100%
10/07/27 00:12:17 INFO mapred.JobClient: Job complete: job_201007270004_0003
10/07/27 00:12:17 INFO mapred.JobClient: Counters: 19
10/07/27 00:12:17 INFO mapred.JobClient:   Job Counters
10/07/27 00:12:17 INFO mapred.JobClient:     Launched reduce tasks=1
10/07/27 00:12:17 INFO mapred.JobClient:     Rack-local map tasks=4
10/07/27 00:12:17 INFO mapred.JobClient:     Launched map tasks=16
10/07/27 00:12:17 INFO mapred.JobClient:     Data-local map tasks=12
10/07/27 00:12:17 INFO mapred.JobClient:   FileSystemCounters
10/07/27 00:12:17 INFO mapred.JobClient:     FILE_BYTES_READ=2382257412
10/07/27 00:12:17 INFO mapred.JobClient:     HDFS_BYTES_READ=1000057358
10/07/27 00:12:17 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=3402255956
10/07/27 00:12:17 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=1000000000
10/07/27 00:12:17 INFO mapred.JobClient:   Map-Reduce Framework
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce input groups=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Combine output records=0
10/07/27 00:12:17 INFO mapred.JobClient:     Map input records=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce shuffle bytes=951549114
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce output records=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Spilled Records=33355441
10/07/27 00:12:17 INFO mapred.JobClient:     Map output bytes=1000000000
10/07/27 00:12:17 INFO mapred.JobClient:     Map input bytes=1000000000
10/07/27 00:12:17 INFO mapred.JobClient:     Combine input records=0
10/07/27 00:12:17 INFO mapred.JobClient:     Map output records=10000000
10/07/27 00:12:17 INFO mapred.JobClient:     Reduce input records=10000000
10/07/27 00:12:17 INFO terasort.TeraSort: done
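
A few ratios derived from the counters above help when comparing runs. The Python sketch below uses values copied from this log; the ratio names are made up for illustration and are not Hadoop counters.

```python
# Counter values copied from the terasort job log above
counters = {
    "Launched map tasks": 16,
    "Data-local map tasks": 12,
    "Map output records": 10_000_000,
    "Spilled Records": 33_355_441,
    "HDFS_BYTES_READ": 1_000_057_358,
    "FILE_BYTES_WRITTEN": 3_402_255_956,
}


def ratios(c):
    """Derive a few illustrative ratios from the raw job counters."""
    return {
        # Share of map tasks that read their input from a local datanode
        "data_local_fraction": c["Data-local map tasks"] / c["Launched map tasks"],
        # Average number of times a record was spilled to local disk
        "spills_per_record": c["Spilled Records"] / c["Map output records"],
        # Local-disk bytes written per byte of HDFS input
        "local_write_amplification": c["FILE_BYTES_WRITTEN"] / c["HDFS_BYTES_READ"],
    }


for name, value in ratios(counters).items():
    print(f"{name}: {value:.2f}")
```

Here 12 of the 16 map tasks were data-local, and each record was spilled a little over three times on average. With only one reduce task launched, all 10000000 records pass through a single reducer.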

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-KB001 terasort/output-KB001

22s (2 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-MB001 terasort/output-MB001

22s (2 maps; because this is batch processing, the 1s of network connection time is saved)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB001 terasort/output-GB001

76s=22s+54s (16 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB002 terasort/output-GB002

136s=22s+114s (30 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB003 terasort/output-GB003

187s=22s+165s (46 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB004 terasort/output-GB004

250s=22s+228s (60 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB005 terasort/output-GB005

307s=22s+285s (76 maps)

hadoop jar hadoop-0.20.2-examples.jar terasort terasort/input-GB010 terasort/output-GB010

793s=22s+771s (150 maps)
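
The map counts reported for the GB-sized runs can be reproduced with a simple model of one map per HDFS block per input file. The Python sketch below rests on three assumptions that the post does not state for these runs: a 64 MB block size, 100-byte rows, and input written as two equal files (teragen with mapred.map.tasks=2). It rounds up per file, which simplifies Hadoop's actual split logic.

```python
import math

BLOCK_BYTES = 64 * 1024 * 1024   # assumed HDFS block size (64 MB)
ROW_BYTES = 100                  # assumed teragen row size
INPUT_FILES = 2                  # assumed: teragen wrote two equal part files


def expected_maps(rows):
    """Estimate terasort map tasks as one split per block of each input file."""
    bytes_per_file = rows * ROW_BYTES // INPUT_FILES
    return INPUT_FILES * math.ceil(bytes_per_file / BLOCK_BYTES)


# rows -> map count reported in the timings above
reported = {
    10_000_000: 16,
    20_000_000: 30,
    30_000_000: 46,
    40_000_000: 60,
    50_000_000: 76,
    100_000_000: 150,
}

for rows, maps in reported.items():
    print(f"{rows:>11} rows: estimated {expected_maps(rows):>3} maps, reported {maps}")
```

Under these assumptions the estimate matches all six reported map counts. That suggests, without proving it, that these terasort runs read two-file inputs stored with 64 MB blocks.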