
Hadoop Installation Tutorial on Ubuntu

2016-04-18 13:47


Install Hadoop 2.2.0 on Ubuntu Linux 13.04 (Single-Node Cluster)

This tutorial explains how to install Hadoop 2.2.0/2.3.0/2.4.0/2.4.1 on Ubuntu 13.04/13.10/14.04 (Single-Node Cluster). This setup does not require an additional user for
Hadoop. All files related to Hadoop will be stored inside the ~/hadoop directory.

Install a JRE. If you want the Oracle JRE, follow this post.
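If the stock OpenJDK is acceptable, installing it from the repositories should be enough (openjdk-7 is the version available on these Ubuntu releases), for example:
sudo apt-get install openjdk-7-jre
java -version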

Install SSH:
sudo apt-get install openssh-server
Generate an SSH key:
ssh-keygen -t rsa -P ""
Enable the SSH key:
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
(Optional) Disable SSH login from remote addresses by setting in /etc/ssh/sshd_config:
ListenAddress 127.0.0.1
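Restart the SSH service so the change takes effect (on these Ubuntu releases):
sudo service ssh restart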
Test the local connection:
ssh localhost
If it works, exit:
exit
Otherwise, debug your SSH setup before continuing.


Download Hadoop 2.2.0 (or a newer version)
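For example, from the Apache archive (any Apache mirror works equally well):
wget https://archive.apache.org/dist/hadoop/common/hadoop-2.2.0/hadoop-2.2.0.tar.gz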

Unpack, rename and move to the home directory:
tar xvf hadoop-2.2.0.tar.gz
mv hadoop-2.2.0 ~/hadoop


Create the HDFS data directories:
mkdir -p ~/hadoop/data/namenode
mkdir -p ~/hadoop/data/datanode


In file ~/hadoop/etc/hadoop/hadoop-env.sh insert (after the comment "The java implementation to use."):
export JAVA_HOME="$(dirname $(readlink /etc/alternatives/java))/../"
export HADOOP_COMMON_LIB_NATIVE_DIR="$HOME/hadoop/lib"
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HOME/hadoop/lib"
(Note: ~ is not expanded inside double quotes, so $HOME is used for the library paths.)
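To check what the JAVA_HOME expression resolves to on your machine, you can evaluate it in a shell first:
echo "$(dirname $(readlink /etc/alternatives/java))/../"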


In file ~/hadoop/etc/hadoop/core-site.xml (inside <configuration> tag):
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>


In file ~/hadoop/etc/hadoop/hdfs-site.xml (inside <configuration> tag):
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>${user.home}/hadoop/data/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>${user.home}/hadoop/data/datanode</value>
</property>


In file ~/hadoop/etc/hadoop/yarn-site.xml (inside <configuration> tag):
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>


Create file ~/hadoop/etc/hadoop/mapred-site.xml:
cp ~/hadoop/etc/hadoop/mapred-site.xml.template ~/hadoop/etc/hadoop/mapred-site.xml
And insert (inside <configuration> tag):
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>


Add Hadoop binaries to PATH:
echo "export PATH=\$PATH:~/hadoop/bin:~/hadoop/sbin" >> ~/.bashrc
source ~/.bashrc
(Escape the $ so that PATH is expanded each time the shell starts, rather than frozen to its current value when the line is written.)
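To confirm the shell now finds the Hadoop binaries:
which hdfs
This should print the path to ~/hadoop/bin/hdfs.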


Format HDFS:
hdfs namenode -format
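On success, the format output should end with a line similar to (the exact path reflects the dfs.namenode.name.dir setting above):
INFO common.Storage: Storage directory /home/<user>/hadoop/data/namenode has been successfully formatted.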


Start Hadoop:
start-dfs.sh && start-yarn.sh
If you get the warning:

WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

It appears because you are running a 64-bit OS while the bundled Hadoop native library is 32-bit. This is not a big issue. If you want to fix it (optional), check this.

Check status:
jps
Expected output (PIDs will differ):
10969 DataNode
11745 NodeManager
11292 SecondaryNameNode
10708 NameNode
11483 ResourceManager
13096 Jps
N.B. The old JobTracker has been replaced by the ResourceManager.

Access web interfaces:

Cluster status: http://localhost:8088
HDFS status: http://localhost:50070
Secondary NameNode status: http://localhost:50090
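On a headless machine you can instead check that the ports answer, for example:
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070
A 200 means the HDFS status page is up.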
Test Hadoop:
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -write -nrFiles 20 -fileSize 10
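TestDFSIO appends a summary (throughput, average I/O rate, execution time) to TestDFSIO_results.log in the directory where you ran the command; a matching read benchmark can be run the same way:
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -read -nrFiles 20 -fileSize 10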
Check the results and remove files:
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar TestDFSIO -clean
And:
hadoop jar ~/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
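The pi job should finish with a line of the form "Estimated value of Pi is ..."; with only 2 maps and 5 samples per map the estimate is rough, the point is simply to confirm that MapReduce jobs run end to end.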


Stop Hadoop:
stop-dfs.sh && stop-yarn.sh


Some of these steps are taken from this tutorial.