Hadoop Installation Lab and MapReduce Programming Lab Guide
Lab environment: Red Hat 6.5, Hadoop 2.6.5, JDK 1.7.0
This document is a supplement to the teaching lab manual; see the manual for the detailed commands.
1. Hadoop Installation Lab
- Preparation: configure hostnames
To change the hostnames:
Check the IP address of each virtual machine.
Edit the /etc/hosts file on every server and add the hostname mappings.
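For example, with one master and three slaves the file might contain entries like the following (the IP addresses here are placeholders; substitute the addresses shown by your own virtual machines):

```
192.168.1.100  master
192.168.1.101  slave1
192.168.1.102  slave2
192.168.1.103  slave3
```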
- Install and configure the JDK
yum on Red Hat requires a registered subscription, so we download the JDK manually.
The JDK archive used is jdk-7u67-linux-x64.tar.gz.
Configuration: add the following to the ~/.bashrc file:
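A typical set of lines to add, assuming the archive was extracted to /usr/java (adjust JAVA_HOME to your actual install path):

```shell
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
```

Run `source ~/.bashrc` afterwards and check with `java -version`.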
- Create a user to run Hadoop programs
Add a user named hadoop for running Hadoop programs, and set its login password to hadoop.
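One way to do this as root (a sketch; passwd prompts interactively for the new password):

```shell
useradd hadoop   # create the hadoop user
passwd hadoop    # enter "hadoop" twice when prompted
```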
————————————
Granting a normal user password-free sudo privileges on Linux:
vim /etc/sudoers
Find the line: root ALL=(ALL) ALL
Below it, add: xxx ALL=(ALL) ALL (where xxx is the username)
Alternatively, add any one of the following lines to sudoers:
root ALL=(ALL) ALL
user ALL=(ALL) NOPASSWD: ALL
user ALL=(root) ALL    # grants the user the privileges of root
————————————
- Configure passwordless SSH authentication
To enable passwordless SSH login, configure public/private key authentication on the hosts. Log in as root and edit the /etc/ssh/sshd_config file on every host, removing the leading "#" from the lines RSAAuthentication yes and PubkeyAuthentication yes.
Log back in as the hadoop user you created and run ssh-keygen -t rsa on every host to generate a local public/private key pair.
The same step must be performed on the 3 slave hosts. Then copy the public keys of the 3 slaves to the master host with scp.
Here id_rsa.pub is the hadoop user's public key on the master host, while slave1.pub, slave2.pub, and slave3.pub are the slaves' public keys.
Next, append the public keys of all 4 hosts to the master's authorized_keys file, which afterwards looks like this:
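The copy-and-append steps above can be sketched as follows (the slave1.pub naming is just the convention used in this guide):

```shell
# On each slave, send its public key to the master (shown for slave1):
scp ~/.ssh/id_rsa.pub master:/home/hadoop/.ssh/slave1.pub

# On the master, append all four public keys to authorized_keys:
cd ~/.ssh
cat id_rsa.pub slave1.pub slave2.pub slave3.pub >> authorized_keys
```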
Then scp the authorized_keys file on the master to each of the 3 slave hosts:
scp authorized_keys slave1:/home/hadoop/.ssh/
In addition, the permissions of the relevant directories and files must be corrected; on every server:
Disable the firewall.
Set the selinux value to disabled.
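On RHEL 6 these three steps can be carried out as follows (a sketch; run the permission fixes as the hadoop user and the firewall/SELinux commands as root):

```shell
# Fix SSH file permissions (run as hadoop on every host)
chmod 700 ~/.ssh
chmod 600 ~/.ssh/authorized_keys

# Disable the firewall (RHEL 6)
service iptables stop
chkconfig iptables off

# Disable SELinux for the current session; to make it permanent,
# set SELINUX=disabled in /etc/selinux/config and reboot
setenforce 0
```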
- Install the Hadoop cluster: installation on the master
Upload the file hadoop-2.6.5.tar.gz to the master node and extract it:
[hadoop@master ~]$ tar zxvf hadoop-2.6.5.tar.gz
Create a symbolic link:
[hadoop@master ~]$ ln -s hadoop-2.6.5 hadoop
To simplify the configuration process, create a hadoop-config directory and set the environment variable:
[hadoop@master ~]$ mkdir hadoop-config
[hadoop@master ~]$ cp hadoop/etc/hadoop/* ./hadoop-config/
[hadoop@master ~]$ export HADOOP_CONF_DIR=/home/hadoop/hadoop-config/
To make the environment variable permanent, add the export line to ~/.bashrc.
Edit the configuration files copied from etc/hadoop:
core-site.xml
hdfs-site.xml
masters
slaves
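A minimal sketch of the two XML files, consistent with the rest of this guide (the hostname master, port 9000 as seen in the job output, and the /var/lib/cassandra/data/hadoop directories created on each node below); treat the exact values as assumptions to adapt:

```xml
<!-- core-site.xml -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/cassandra/data/hadoop/tmp</value>
  </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/var/lib/cassandra/data/hadoop/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/var/lib/cassandra/data/hadoop/data</value>
  </property>
</configuration>
```

The masters file lists the secondary NameNode host, and the slaves file lists one DataNode hostname per line (slave1, slave2, slave3).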
- Set up the slave hosts
Copy the configuration and installation files from the master host:
[hadoop@master ~]$ scp -r .bashrc hadoop-config/ hadoop-2.6.5.tar.gz slave1:/home/hadoop/
[hadoop@slave1 ~]$ mkdir -p /var/lib/cassandra/data/hadoop
[hadoop@slave1 ~]$ mkdir -p /var/lib/cassandra/data/hadoop/tmp
[hadoop@slave1 ~]$ mkdir -p /var/lib/cassandra/data/hadoop/data
[hadoop@slave1 ~]$ mkdir -p /var/lib/cassandra/data/hadoop/name
Extract the installation file and create the symbolic link:
tar zxvf hadoop-2.6.5.tar.gz && ln -s hadoop-2.6.5 hadoop
- Start the services
When starting the services for the first time, the NameNode must be formatted first:
[hadoop@master ~]$ hadoop namenode -format
If this reports an error, the cause is insufficient permission to create files inside the directory; run:
sudo chmod -R a+w /var/lib/
and the format will then succeed.
If jps shows that the DataNode did not start:
run: ${HADOOP_HOME}/sbin/hadoop-daemon.sh start datanode
Other nodes that failed to start can be brought up the same way.
Stop Hadoop: stop-all.sh
Upload a file:
[hadoop@master ~]$ hadoop fs -put hadoop-config/ config
List it:
hadoop fs -ls /usr/hadoop/
Run the example program:
[hadoop@master hadoop]$ hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /usr/hadoop/config usr/hadoop/results
Result:
…
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000027_0 decomp: 41 len: 45 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 41 bytes from map-output for attempt_local2077883811_0001_m_000027_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 41, inMemoryMapOutputs.size() -> 11, commitMemory -> 25025, usedMemory ->25066
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000003_0 decomp: 3942 len: 3946 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 3942 bytes from map-output for attempt_local2077883811_0001_m_000003_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 3942, inMemoryMapOutputs.size() -> 12, commitMemory -> 25066, usedMemory ->29008
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000016_0 decomp: 1723 len: 1727 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1723 bytes from map-output for attempt_local2077883811_0001_m_000016_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1723, inMemoryMapOutputs.size() -> 13, commitMemory -> 29008, usedMemory ->30731
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000002_0 decomp: 4795 len: 4799 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 4795 bytes from map-output for attempt_local2077883811_0001_m_000002_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 4795, inMemoryMapOutputs.size() -> 14, commitMemory -> 30731, usedMemory ->35526
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000029_0 decomp: 15 len: 19 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 15 bytes from map-output for attempt_local2077883811_0001_m_000029_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 15, inMemoryMapOutputs.size() -> 15, commitMemory -> 35526, usedMemory ->35541
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000017_0 decomp: 1777 len: 1781 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1777 bytes from map-output for attempt_local2077883811_0001_m_000017_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1777, inMemoryMapOutputs.size() -> 16, commitMemory -> 35541, usedMemory ->37318
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000011_0 decomp: 2140 len: 2144 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2140 bytes from map-output for attempt_local2077883811_0001_m_000011_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2140, inMemoryMapOutputs.size() -> 17, commitMemory -> 37318, usedMemory ->39458
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000001_0 decomp: 4637 len: 4641 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 4637 bytes from map-output for attempt_local2077883811_0001_m_000001_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 4637, inMemoryMapOutputs.size() -> 18, commitMemory -> 39458, usedMemory ->44095
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000025_0 decomp: 938 len: 942 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 938 bytes from map-output for attempt_local2077883811_0001_m_000025_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 938, inMemoryMapOutputs.size() -> 19, commitMemory -> 44095, usedMemory ->45033
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000024_0 decomp: 1019 len: 1023 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1019 bytes from map-output for attempt_local2077883811_0001_m_000024_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1019, inMemoryMapOutputs.size() -> 20, commitMemory -> 45033, usedMemory ->46052
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000012_0 decomp: 2144 len: 2148 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2144 bytes from map-output for attempt_local2077883811_0001_m_000012_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2144, inMemoryMapOutputs.size() -> 21, commitMemory -> 46052, usedMemory ->48196
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000000_0 decomp: 12150 len: 12154 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 12150 bytes from map-output for attempt_local2077883811_0001_m_000000_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 12150, inMemoryMapOutputs.size() -> 22, commitMemory -> 48196, usedMemory ->60346
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000026_0 decomp: 386 len: 390 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 386 bytes from map-output for attempt_local2077883811_0001_m_000026_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 386, inMemoryMapOutputs.size() -> 23, commitMemory -> 60346, usedMemory ->60732
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000013_0 decomp: 2240 len: 2244 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2240 bytes from map-output for attempt_local2077883811_0001_m_000013_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2240, inMemoryMapOutputs.size() -> 24, commitMemory -> 60732, usedMemory ->62972
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000008_0 decomp: 2387 len: 2391 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2387 bytes from map-output for attempt_local2077883811_0001_m_000008_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2387, inMemoryMapOutputs.size() -> 25, commitMemory -> 62972, usedMemory ->65359
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000021_0 decomp: 1323 len: 1327 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1323 bytes from map-output for attempt_local2077883811_0001_m_000021_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1323, inMemoryMapOutputs.size() -> 26, commitMemory -> 65359, usedMemory ->66682
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000009_0 decomp: 2992 len: 2996 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2992 bytes from map-output for attempt_local2077883811_0001_m_000009_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2992, inMemoryMapOutputs.size() -> 27, commitMemory -> 66682, usedMemory ->69674
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000023_0 decomp: 1212 len: 1216 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1212 bytes from map-output for attempt_local2077883811_0001_m_000023_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1212, inMemoryMapOutputs.size() -> 28, commitMemory -> 69674, usedMemory ->70886
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000022_0 decomp: 1202 len: 1206 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 1202 bytes from map-output for attempt_local2077883811_0001_m_000022_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 1202, inMemoryMapOutputs.size() -> 29, commitMemory -> 70886, usedMemory ->72088
19/06/16 21:15:08 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local2077883811_0001_m_000010_0 decomp: 2111 len: 2115 to MEMORY
19/06/16 21:15:08 INFO reduce.InMemoryMapOutput: Read 2111 bytes from map-output for attempt_local2077883811_0001_m_000010_0
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 2111, inMemoryMapOutputs.size() -> 30, commitMemory -> 72088, usedMemory ->74199
19/06/16 21:15:08 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
19/06/16 21:15:08 INFO mapred.LocalJobRunner: 30 / 30 copied.
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: finalMerge called with 30 in-memory map-outputs and 0 on-disk map-outputs
19/06/16 21:15:08 INFO mapred.Merger: Merging 30 sorted segments
19/06/16 21:15:08 INFO mapred.Merger: Down to the last merge-pass, with 30 segments left of total size: 73995 bytes
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: Merged 30 segments, 74199 bytes to disk to satisfy reduce memory limit
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: Merging 1 files, 74145 bytes from disk
19/06/16 21:15:08 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
19/06/16 21:15:08 INFO mapred.Merger: Merging 1 sorted segments
19/06/16 21:15:08 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 74136 bytes
19/06/16 21:15:08 INFO mapred.LocalJobRunner: 30 / 30 copied.
19/06/16 21:15:08 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
19/06/16 21:15:08 INFO mapred.Task: Task:attempt_local2077883811_0001_r_000000_0 is done. And is in the process of committing
19/06/16 21:15:08 INFO mapred.LocalJobRunner: 30 / 30 copied.
19/06/16 21:15:08 INFO mapred.Task: Task attempt_local2077883811_0001_r_000000_0 is allowed to commit now
19/06/16 21:15:08 INFO output.FileOutputCommitter: Saved output of task 'attempt_local2077883811_0001_r_000000_0' to hdfs://master:9000/user/hadoop/usr/hadoop/results/_temporary/0/task_local2077883811_0001_r_000000
19/06/16 21:15:08 INFO mapred.LocalJobRunner: reduce > reduce
19/06/16 21:15:08 INFO mapred.Task: Task 'attempt_local2077883811_0001_r_000000_0' done.
19/06/16 21:15:08 INFO mapred.LocalJobRunner: Finishing task: attempt_local2077883811_0001_r_000000_0
19/06/16 21:15:08 INFO mapred.LocalJobRunner: reduce task executor complete.
19/06/16 21:15:09 INFO mapreduce.Job: map 100% reduce 100%
19/06/16 21:15:09 INFO mapreduce.Job: Job job_local2077883811_0001 completed successfully
19/06/16 21:15:09 INFO mapreduce.Job: Counters: 38
File System Counters
FILE: Number of bytes read=10542805
FILE: Number of bytes written=19367044
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1847773
HDFS: Number of bytes written=36492
HDFS: Number of read operations=1117
HDFS: Number of large read operations=0
HDFS: Number of write operations=33
Map-Reduce Framework
Map input records=2087
Map output records=7887
Map output bytes=105178
Map output materialized bytes=74319
Input split bytes=3547
Combine input records=7887
Combine output records=3940
Reduce input groups=1570
Reduce shuffle bytes=74319
Reduce input records=3940
Reduce output records=1570
Spilled Records=7880
Shuffled Maps =30
Failed Shuffles=0
Merged Map outputs=30
GC time elapsed (ms)=1088
CPU time spent (ms)=0
Physical memory (bytes) snapshot=0
Virtual memory (bytes) snapshot=0
Total committed heap usage (bytes)=4331405312
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=77522
File Output Format Counters
Bytes Written=36492
View the contents of the part-r-00000 file in HDFS:
hadoop fs -cat /user/hadoop/results/part-r-00000
The output looks like this:
…
java 3
javadoc 1
job 6
jobs 10
jobs, 1
jobs. 2
jsvc 2
jvm 3
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
jvm.period=10 1
jvm.servers=localhost:8649 1
key 10
key. 1
key="capacity" 1
key="user-limit" 1
keys 1
keystore 9
keytab 2
killing 1
kms 2
kms-audit 1
language 24
last 1
law 24
leaf 2
level 6
levels 2
library 1
license 12
licenses 12
like 3
limit 1
limitations 24
line 1
links 1
list 44
location 2
log 14
log4j.additivity.kms-audit=false 1
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false 1
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false 1
log4j.additivity.org.apache.hadoop.mapred.JobInProgress$JobSummary=false 1
log4j.additivity.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=false 1
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd 1
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file} 1
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd 1
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} 1
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout 1
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter 1
log4j.appender.JSA.File=${hadoop.log.dir}/${hadoop.mapreduce.jobsummary.log.file} 1
log4j.appender.JSA.MaxBackupIndex=${hadoop.mapreduce.jobsummary.log.maxbackupindex} 1
log4j.appender.JSA.MaxFileSize=${hadoop.mapreduce.jobsummary.log.maxfilesize} 1
log4j.appender.JSA.layout.ConversionPattern=%d{yy/MM/dd 1
log4j.appender.JSA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.JSA=org.apache.log4j.RollingFileAppender 1
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log 1
log4j.appender.MRAUDIT.MaxBackupIndex=${mapred.audit.log.maxbackupindex} 1
log4j.appender.MRAUDIT.MaxFileSize=${mapred.audit.log.maxfilesize} 1
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout 1
log4j.appender.MRAUDIT=org.apache.log4j.RollingFileAppender 1
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender 1
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file} 1
log4j.appender.RFA.MaxBackupIndex=${hadoop.log.maxbackupindex} 1
log4j.appender.RFA.MaxFileSize=${hadoop.log.maxfilesize} 1
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RFA=org.apache.log4j.RollingFileAppender 1
log4j.appender.RFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log 1
log4j.appender.RFAAUDIT.MaxBackupIndex=${hdfs.audit.log.maxbackupindex} 1
log4j.appender.RFAAUDIT.MaxFileSize=${hdfs.audit.log.maxfilesize} 1
log4j.appender.RFAAUDIT.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RFAAUDIT.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RFAAUDIT=org.apache.log4j.RollingFileAppender 1
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file} 1
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex} 1
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize} 1
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender 1
log4j.appender.RMSUMMARY.File=${hadoop.log.dir}/${yarn.server.resourcemanager.appsummary.log.file} 1
log4j.appender.RMSUMMARY.MaxBackupIndex=20 1
log4j.appender.RMSUMMARY.MaxFileSize=256MB 1
log4j.appender.RMSUMMARY.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.RMSUMMARY.layout=org.apache.log4j.PatternLayout 1
log4j.appender.RMSUMMARY=org.apache.log4j.RollingFileAppender 1
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup} 1
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout 1
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid} 1
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize} 1
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender 1
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd 1
log4j.appender.console.layout=org.apache.log4j.PatternLayout 1
log4j.appender.console.target=System.err 1
log4j.appender.console=org.apache.log4j.ConsoleAppender 1
log4j.appender.httpfs.Append=true 1
log4j.appender.httpfs.DatePattern='.'yyyy-MM-dd 1
log4j.appender.httpfs.File=${httpfs.log.dir}/httpfs.log 1
log4j.appender.httpfs.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.httpfs.layout=org.apache.log4j.PatternLayout 1
log4j.appender.httpfs=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.httpfsaudit.Append=true 1
log4j.appender.httpfsaudit.DatePattern='.'yyyy-MM-dd 1
log4j.appender.httpfsaudit.File=${httpfs.log.dir}/httpfs-audit.log 1
log4j.appender.httpfsaudit.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.httpfsaudit.layout=org.apache.log4j.PatternLayout 1
log4j.appender.httpfsaudit=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.kms-audit.Append=true 1
log4j.appender.kms-audit.DatePattern='.'yyyy-MM-dd 1
log4j.appender.kms-audit.File=${kms.log.dir}/kms-audit.log 1
log4j.appender.kms-audit.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.kms-audit.layout=org.apache.log4j.PatternLayout 1
log4j.appender.kms-audit=org.apache.log4j.DailyRollingFileAppender 1
log4j.appender.kms.Append=true 1
log4j.appender.kms.DatePattern='.'yyyy-MM-dd 1
log4j.appender.kms.File=${kms.log.dir}/kms.log 1
log4j.appender.kms.layout.ConversionPattern=%d{ISO8601} 1
log4j.appender.kms.layout=org.apache.log4j.PatternLayout 1
log4j.appender.kms=org.apache.log4j.DailyRollingFileAppender 1
log4j.category.SecurityLogger=${hadoop.security.logger} 1
log4j.logger.com.amazonaws.http.AmazonHttpClient=ERROR 1
log4j.logger.com.amazonaws=ERROR 1
log4j.logger.com.sun.jersey.server.wadl.generators.WadlGeneratorJAXBGrammarGenerator=OFF 1
log4j.logger.httpfsaudit=INFO, 1
log4j.logger.kms-audit=INFO, 1
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN 1
log4j.logger.org.apache.hadoop.conf=ERROR 1
log4j.logger.org.apache.hadoop.fs.http.server=INFO, 1
log4j.logger.org.apache.hadoop.fs.s3a.S3AFileSystem=WARN 1
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger} 1
log4j.logger.org.apache.hadoop.lib=INFO, 1
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger} 1
log4j.logger.org.apache.hadoop.mapred.JobInProgress$JobSummary=${hadoop.mapreduce.jobsummary.logger} 1
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.RMAppManager$ApplicationSummary=${yarn.server.resourcemanager.appsummary.logger} 1
log4j.logger.org.apache.hadoop=INFO 1
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR 1
log4j.rootLogger=${hadoop.root.logger}, 1
log4j.rootLogger=ALL, 1
log4j.threshold=ALL 1
logger 2
logger. 1
logging 4
logs 2
logs.</description> 1
loops 1
manage 1
manager 3
map 2
mapping 1
mapping]* 1
mappings 1
mappings. 1
mapred 1
mapred.audit.log.maxbackupindex=20 1
mapred.audit.log.maxfilesize=256MB 1
mapred.audit.logger=INFO,NullAppender 1
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
mapred.class=org.apache.hadoop.metrics.spi.NullContext 1
mapred.period=10 1
mapred.servers=localhost:8649 1
mapreduce.cluster.acls.enabled 3
mapreduce.cluster.administrators 2
maps 1
master 1
masters 1
match="configuration"> 1
material 3
max 3
maximum 4
may 52
means 18
message 1
messages 2
metadata 1
method="html"/> 1
metrics 3
midnight 1
might 1
milliseconds. 4
min.user.id=1000#Prevent 1
missed 1
modified. 1
modify 1
modifying 1
more 12
mradmin 1
ms) 1
multi-dimensional 1
multiple 4
must 1
name 3
name. 1
name="{name}"><xsl:value-of 1
namenode 1
namenode-metrics.out 1
namenode. 2
namenode.</description> 1
namenode: 1
names. 19
nesting 1
new 2
no 4
nodes 2
nodes. 2
non-privileged 2
normal 1
not 51
null 5
number 6
numerical 3
obtain 24
of 138
off 4
on 30
one 15
one: 2
only 9
operation. 3
operations 8
operations. 9
opportunities 1
option 6
optional. 2
options 11
options. 2
or 71
ordinary 1
org.apache.hadoop.metrics2 1
other 1
other. 6
others 2
overridden 4
override 5
overrides 4
owner 1
ownership. 12
package-info.java 1
parameters 4
parent 1
part 2
password 3
path 2
pending 1
per 1
percent 1
percentage 1
period 1
period, 1
permissions 24
picked 3
pid 3
place 1
please 1
policy 3
port 5
ports 2
ports. 2
potential 2
preferred 3
prefix. 1
present, 1
principal 4
principal. 1
printed 1
priorities. 1
priority 1
privileged 2
privileges 1
privileges. 1
properties 7
property 11
protocol 6
protocol, 2
protocol. 2
provide 3
q1. 1
q2 2
q2. 1
quashed 1
query 1
queue 12
queue). 1
queue, 1
queue. 10
queues 9
queues, 1
queues. 3
rack 1
rack-local 1
recovery. 1
reduce 2
refresh 2
regarding 12
reload 2
remote 2
representing 3
required 29
resolve 2
resources 2
response. 2
restore 1
retrieve 1
return 1
returned 2
rolling 1
rollover-key 1
root 2
rootlogger 1
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
rpc.class=org.apache.hadoop.metrics.spi.NullContext 1
rpc.period=10 1
rpc.servers=localhost:8649 1
run 10
running 3
running, 1
running. 1
runs 3
runtime 2
same 1
sample 1
sampling 2
scale 3
schedule 1
scheduler. 1
schedulers, 1
scheduling 2
secondary 1
seconds 1
seconds). 2
secret 3
secure 7
security 1
segment 1
select="description"/></td> 1
select="name"/></a></td> 1
select="property"> 1
select="value"/></td> 1
send 1
sending 1
separate 2
separated 20
separated. 1
server 3
service 2
service-level 2
set 60
set." 1
sets 2
setting 7
setup 1
severity 1
should 6
sign 1
signature 2
similar 3
single 1
sinks 1
site-specific 4
sizes 1
slave1 1
slave2 1
slave3 1
so 3
softlink 1
software 24
some 2
sometimes 1
source 2
space 1
space), 2
spaces 1
special 19
specific 33
specification. 1
specified 13
specified, 3
specified. 6
specifiying 1
specify 3
specifying 2
split 1
stand-by 1
start 3
started 2
starting 3
state 4
states 1
status 2
stopped, 1
store 1
stored 2
stored. 7
string 3
string, 1
submission 1
submit 2
submitting 1
such 1
summary 5
super-users 1
support 2
supported 1
supports 1
supportsparse 1
suppress 1
symlink 2
syntax 1
syntax: 1
system 3
tag 1
tags 3
target 1
tasks 2
tasktracker. 1
template 1
temporary 2
than 1
that 19
the 370
them 1
then 8
there 2
therefore 3
this 78
threads 1
time 4
timeline 2
timestamp. 1
to 163
top 1
traffic. 1
transfer 4
true. 3
turn 1
two. 3
type 1
type, 1
type="text/xsl" 6
typically 1
u:%user:%user 1
ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext 1
ugi.class=org.apache.hadoop.metrics.ganglia.GangliaContext31 1
ugi.class=org.apache.hadoop.metrics.spi.NullContext 1
ugi.period=10 1
ugi.servers=localhost:8649 1
uncommented 1
under 84
unset 1
up 2
updating 1
usage 2
use 30
use, 2
use. 4
used 36
used. 3
user 48
user. 2
user1,user2 2
user? 1
users 27
users,wheel". 18
uses 2
using 14
value 45
value="20"/> 1
value="30"/> 1
values 4
variable 4
variables 4
version 1
version="1.0" 5
version="1.0"> 1
version="1.0"?> 7
via 3
view, 1
viewing 1
w/ 1
want 1
warnings. 1
when 9
where 4
which 7
while 1
who 6
will 23
window 1
window, 1
with 59
within 4
without 1
work 12
writing, 24
written 2
xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 1
yarn.nodemanager.linux-container-executor.group 1
yarn.nodemanager.linux-container-executor.group=#configured 1
yarn.server.resourcemanager.appsummary.log.file 1
yarn.server.resourcemanager.appsummary.log.file=rm-appsummary.log 1
yarn.server.resourcemanager.appsummary.logger 2
yarn.server.resourcemanager.appsummary.logger=${hadoop.root.logger} 1
you 28
Download the results to the local filesystem:
hadoop fs -get /user/hadoop/usr/hadoop/results count-results
ls
Inspect the downloaded files locally. If everything checks out, Lab 1 is complete.
2. MapReduce Programming Lab
Lab objectives
Understand how MapReduce works
Master basic MapReduce programming techniques
Learn MapReduce programming under Eclipse
Learn to design, implement, and run MapReduce programs
Lab content
Note: because of version differences, the teaching lab manual is dated. Hadoop has since moved on to 2.x and 3.x, and many of the classes, methods, and APIs used in the manual's Hadoop 0.20 examples have changed, so this lab follows the steps below while still referring to the manual.
The example program WordCount counts how often each word appears in a set of text files. The complete code ships with the Hadoop distribution (in the src/examples directory). (This has changed in Hadoop 2.x, where it can no longer be found there.)
WordCount code:
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements.  See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership.  The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class WordCount {

  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    if (otherArgs.length < 2) {
      System.err.println("Usage: wordcount <in> [<in>...] <out>");
      System.exit(2);
    }
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    for (int i = 0; i < otherArgs.length - 1; ++i) {
      FileInputFormat.addInputPath(job, new Path(otherArgs[i]));
    }
    FileOutputFormat.setOutputPath(job,
        new Path(otherArgs[otherArgs.length - 1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
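To run this version outside Eclipse, one way to compile and package it is sketched below (assumptions: HADOOP_HOME points at the Hadoop 2.6.5 installation, WordCount.java is in the current directory, and the input/output HDFS paths are examples to adapt):

```shell
mkdir -p wordcount_classes
javac -cp "$($HADOOP_HOME/bin/hadoop classpath)" -d wordcount_classes WordCount.java
jar -cvf wordcount.jar -C wordcount_classes .
hadoop jar wordcount.jar WordCount /user/hadoop/input /user/hadoop/output
```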
- Install and configure Eclipse
- Install the Hadoop Eclipse Plugin
The plugin used here is hadoop-eclipse-plugin-2.7.3.jar. Copy the compiled hadoop-eclipse-plugin jar into the plugins directory under the Eclipse installation directory and restart Eclipse. Open Window -> Preferences; if the plugin installed successfully, the dialog shown in the figure below appears, and you select the Hadoop installation directory in the highlighted area.
Click Window -> Show View -> Other -> Map/Reduce Locations.
In the Map/Reduce Locations view at the bottom, right-click the blank area and choose New Hadoop location, or click the blue elephant icon with the plus sign on the right, then fill in the connection details.
The settings are shown in the figure below:
- Developing and debugging in Eclipse
Eclipse makes it convenient to develop and debug Hadoop parallel programs. IBM MapReduce Tools for Eclipse is recommended: this Eclipse plugin simplifies developing and deploying Hadoop parallel programs. With it you can create a Hadoop MapReduce application inside Eclipse, use wizards for developing classes based on the MapReduce framework, package the application into a JAR file, deploy it to a Hadoop server (local or remote), and inspect the state of the Hadoop server, the Hadoop distributed file system (DFS), and the currently running tasks through a dedicated perspective.
The MapReduce Tools can be downloaded from the IBM alphaWorks website or from this article's download list. Extract the downloaded archive into your Eclipse installation directory and restart Eclipse. Then open Window -> Preferences, select Hadoop Home Directory on the left, and set your Hadoop home directory.
With configuration complete, run the example in Eclipse.
File -> New -> Project, choose Map/Reduce Project, and enter a project name such as WordCount.
Enter the WordCount code.
Right-click WordCount.java, choose Run As -> Run Configurations, and set the program arguments to: hdfs://localhost:9000/user/hadoop/input hdfs://localhost:9000/user/hadoop/output, which are the input and output paths respectively.
After the run finishes, view the results:
Method 1: use commands in the terminal.
Method 2: view them directly in Eclipse: under DFS Locations, double-click part-r-00000 to open it.
- An improved WordCount program
The WordCount program is improved below, with two goals:
- The original WordCount splits words only on spaces, so all kinds of punctuation end up mixed in with the words. The improved program should extract words correctly and should not distinguish upper and lower case.
- Sort the final results in descending order of word frequency.
WordCount descending-order output code:
[code]
package desc;

/**
 * WordCount
 * Counts how often each word appears in the input files.
 * "Stop words" (read from a text file) are excluded from the count.
 * The final output is sorted by frequency, from high to low.
 *
 * Note: when importing a class, make sure you pick the one
 * from the package you actually intend to use.
 */
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.*;

import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.map.InverseMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

    /**
     * Map: turns the input text into <word, 1> key-value pairs.
     */
    public static class WordCountMap
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        String regex = "[.,\"!--;:?'\\]]";   // punctuation to strip
        Text word = new Text();
        final static IntWritable one = new IntWritable(1);
        HashSet<String> stopWordSet = new HashSet<String>();

        /**
         * Read the stop words from a file into the HashSet.
         */
        private void parseStopWordFile(String path) {
            try {
                String word = null;
                BufferedReader reader = new BufferedReader(new FileReader(path));
                while ((word = reader.readLine()) != null) {
                    stopWordSet.add(word);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        /**
         * Map-side initialization: load the stop-word file(s)
         * that main() registered in the DistributedCache.
         */
        public void setup(Context context) throws IOException {
            Path[] cacheFiles =
                    DistributedCache.getLocalCacheFiles(context.getConfiguration());
            if (cacheFiles == null || cacheFiles.length == 0) {
                System.out.println("have no stopfile");
                return;
            }
            // read stop words into the HashSet
            for (Path cacheFile : cacheFiles) {
                parseStopWordFile(cacheFile.toString());
            }
        }

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String s = null;
            String line = value.toString().toLowerCase();
            line = line.replaceAll(regex, " ");   // remove all punctuation
            // split the line into words
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                s = tokenizer.nextToken();
                if (!stopWordSet.contains(s)) {
                    word.set(s);
                    context.write(word, one);
                }
            }
        }
    }

    /**
     * Reduce: add up all counts for a key.
     */
    public static class WordCountReduce
            extends Reducer<Text, IntWritable, Text, IntWritable> {

        int min_num = 0;   // minimum frequency a word needs to be emitted

        public void setup(Context context) {
            min_num = Integer.parseInt(context.getConfiguration().get("min_num"));
            System.out.println(min_num);
        }

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            if (sum < min_num) return;   // drop words below the threshold
            context.write(key, new IntWritable(sum));
        }
    }

    /**
     * IntWritable comparator that sorts in decreasing order.
     */
    private static class IntWritableDecreasingComparator
            extends IntWritable.Comparator {
        public int compare(WritableComparable a, WritableComparable b) {
            return -super.compare(a, b);
        }
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return -super.compare(b1, s1, l1, b2, s2, l2);
        }
    }

    /**
     * main: runs two jobs, the first to count, the second to sort.
     */
    public static void main(String[] args) {
        boolean exit = false;
        int min_num = 0;
        String tempDir = "wordcount-temp-"
                + Integer.toString(new Random().nextInt(Integer.MAX_VALUE));
        Configuration conf = new Configuration();

        // put the stop-word file path into the DistributedCache
        for (int i = 0; i < args.length; i++) {
            if ("-skip".equals(args[i])) {
                DistributedCache.addCacheFile(new Path(args[++i]).toUri(), conf);
                System.out.println(args[i]);
            }
        }
        // read the minimum frequency to display
        for (int i = 0; i < args.length; i++) {
            if ("-greater".equals(args[i])) {
                min_num = Integer.parseInt(args[++i]);
                System.out.println(args[i]);
            }
        }
        // share the minimum frequency through the Configuration
        conf.set("min_num", String.valueOf(min_num));   // set global parameter

        try {
            // first round: count the words
            Job job = new Job(conf, "jiq-wordcountjob-1");
            job.setJarByClass(WordCount.class);
            // input/output formats
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(SequenceFileOutputFormat.class);
            // output key/value classes
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // mapper and reducer
            job.setMapperClass(WordCountMap.class);
            job.setReducerClass(WordCountReduce.class);
            // input/output paths
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(tempDir));

            if (job.waitForCompletion(true)) {
                // second round: sort by frequency
                Job job2 = new Job(conf, "jiq-wordcountjob-2");
                job2.setJarByClass(WordCount.class);
                // input/output formats
                job2.setInputFormatClass(SequenceFileInputFormat.class);
                job2.setOutputFormatClass(TextOutputFormat.class);
                // output key/value classes
                job2.setOutputKeyClass(IntWritable.class);
                job2.setOutputValueClass(Text.class);
                // InverseMapper swaps the key and value of each map output pair.
                job2.setMapperClass(InverseMapper.class);
                // Only one reducer: there is a point to this. Hadoop cannot
                // sort keys globally, only locally within each reduce task,
                // so the most direct way to get a single, globally sorted
                // result file is to use exactly one reducer.
                job2.setNumReduceTasks(1);
                // input/output paths
                FileInputFormat.addInputPath(job2, new Path(tempDir));
                FileOutputFormat.setOutputPath(job2, new Path(args[1]));
                // Hadoop sorts IntWritable keys in ascending order by default,
                // but we need descending order, so we implemented
                // IntWritableDecreasingComparator and tell the job to use it
                // to sort the output keys (the word frequencies).
                job2.setSortComparatorClass(IntWritableDecreasingComparator.class);
                exit = job2.waitForCompletion(true);
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                // delete the temporary directory
                FileSystem.get(conf).deleteOnExit(new Path(tempDir));
                System.exit(exit ? 0 : 1);   // exit 0 on success
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
[/code]
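The two chained jobs above can be hard to trace. The snippet below is a minimal plain-Java sketch, not part of the lab code, that reproduces the same logic locally: stop-word filtering in the mapper, the minimum-frequency cut-off in the reducer, and the descending sort by count performed by the second job. The class and method names here are invented for illustration.

```java
import java.util.*;

public class WordCountSketch {
    // Local sketch of what the two MapReduce jobs compute: job 1 counts
    // non-stop words and drops those below minNum, job 2 inverts the
    // <word, count> pairs and sorts the counts in descending order.
    public static List<Map.Entry<String, Integer>> countDesc(
            String text, Set<String> stopWords, int minNum) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String tok : text.toLowerCase().split("[^a-z0-9]+")) {
            if (tok.isEmpty() || stopWords.contains(tok)) continue;
            Integer c = counts.get(tok);
            counts.put(tok, c == null ? 1 : c + 1);
        }
        List<Map.Entry<String, Integer>> result =
                new ArrayList<Map.Entry<String, Integer>>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() >= minNum) result.add(e);  // same cut-off as the reducer
        }
        // descending by count, mirroring IntWritableDecreasingComparator
        Collections.sort(result, new Comparator<Map.Entry<String, Integer>>() {
            public int compare(Map.Entry<String, Integer> a,
                               Map.Entry<String, Integer> b) {
                return b.getValue() - a.getValue();
            }
        });
        return result;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<String>(Arrays.asList("and", "the"));
        for (Map.Entry<String, Integer> e :
                countDesc("the cat and the dog chased the cat", stop, 1)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

On the cluster, the real job would be submitted along the lines of `hadoop jar wordcount.jar desc.WordCount <input> <output> -skip stopwords.txt -greater 2`, where the jar and stop-word file names are hypothetical.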
Run results:
This completes Experiment 1 and Experiment 2.
References:
[1]https://www.geek-share.com/detail/2672055746.html
[2]https://blog.csdn.net/xingyyn78/article/details/81085100
[3]https://www.geek-share.com/detail/2606983100.html
[4]https://www.cnblogs.com/StevenSun1991/p/6931500.html
[5]https://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop2/index.html