Deploying Hadoop on Docker
2016-03-31 17:00
I. Building the Docker image
1. mkdir hadoop
2. Copy hadoop-2.6.2.tar.gz into the hadoop directory
3. vim Dockerfile

FROM ubuntu
MAINTAINER Docker tianlei <393743083@qq.com>
ADD ./hadoop-2.6.2.tar.gz /usr/local/

Build the image:
docker build -t "ubuntu:base" .
Run the image to start a container (the freshly built image is tagged ubuntu:base; the ubuntu:hadoop image is only produced later by docker commit):
docker run -d -it --name hadoop ubuntu:base
Attach to the container:
docker exec -i -t hadoop /bin/bash

1. Install Java in the image
sudo apt-get update
sudo apt-get install openjdk-7-jre openjdk-7-jdk
Set the environment variable:
vim ~/.bashrc
Add this line:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64
source ~/.bashrc
2. Install Hadoop in the image
Hadoop has already been unpacked into /usr/local/ by the ADD instruction in the Dockerfile.
vim ~/.bashrc
Add:
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
Apply the changes:
source ~/.bashrc
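If the image setup script may be re-run (e.g. when rebuilding the image), the exports above can be appended idempotently. A minimal sketch; BASHRC is a stand-in variable (my addition, not in the original) so the snippet can also be dry-run against a copy:

```shell
# Append the Hadoop environment variables to the shell profile only once;
# BASHRC defaults to ~/.bashrc but can point anywhere for testing.
BASHRC=${BASHRC:-$HOME/.bashrc}
if ! grep -q 'HADOOP_CONFIG_HOME' "$BASHRC" 2>/dev/null; then
  cat >> "$BASHRC" <<'EOF'
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
EOF
fi
```

Re-running the snippet leaves ~/.bashrc unchanged, since the guard finds the marker variable on the second pass.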
Next, edit the Hadoop environment script:
cd /usr/local/hadoop/etc/hadoop/
vim hadoop-env.sh
Change the JAVA_HOME line to:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64
In the hadoop directory, create three directories that the configuration below will reference:
tmp: Hadoop's temporary directory
namenode: the NameNode's storage directory
datanode: the DataNode's storage directory
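The three directories can be created in one command. A minimal sketch, assuming Hadoop was unpacked at /usr/local/hadoop (HADOOP_HOME can be overridden):

```shell
# Create the working directories that the Hadoop XML configuration refers to.
HADOOP_HOME=${HADOOP_HOME:-/usr/local/hadoop}
mkdir -p "$HADOOP_HOME/tmp" "$HADOOP_HOME/namenode" "$HADOOP_HOME/datanode"
```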
Now edit three XML files in the /usr/local/hadoop/etc/hadoop directory.
1) core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
    <final>true</final>
    <description>The name of the default file system. A URI whose scheme
    and authority determine the FileSystem implementation. The uri's scheme
    determines the config property (fs.SCHEME.impl) naming the FileSystem
    implementation class. The uri's authority is used to determine the host,
    port, etc. for a filesystem.</description>
  </property>
</configuration>
Note:
hadoop.tmp.dir is set to the temporary directory created in the previous step.
fs.default.name is set to hdfs://master:9000, which points at the Master node's host (we will configure that node later when we build the cluster; it is written here ahead of time).
2) hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <final>true</final>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/namenode</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/datanode</value>
    <final>true</final>
  </property>
</configuration>
Note:
The cluster we build later will have one Master node and two Slave nodes, so dfs.replication is set to 2.
dfs.namenode.name.dir and dfs.datanode.data.dir are set to the NameNode and DataNode directory paths created earlier.
3) mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>master:9001</value>
    <description>The host and port that the MapReduce job tracker runs at.
    If "local", then jobs are run in-process as a single map and reduce task.
    </description>
  </property>
</configuration>
There is only one property here, mapred.job.tracker, which we point at the master node.
Format the NameNode:
hadoop namenode -format
3. Install SSH
sudo apt-get install ssh
Add to ~/.bashrc so that sshd starts automatically:
#autorun
/usr/sbin/sshd
Generate a key pair and authorize it:
cd ~/
ssh-keygen -t rsa -P '' -f ~/.ssh/id_dsa
cd .ssh
cat id_dsa.pub >> authorized_keys
Note: sshd sometimes complains that /var/run/sshd cannot be found; creating an sshd directory under /var/run fixes this.
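The /var/run/sshd workaround from the note can be folded into the startup step itself. A sketch; the /usr/sbin/sshd path assumes Ubuntu's openssh-server layout:

```shell
# sshd refuses to start when its privilege-separation directory is missing,
# so create it before launching the daemon.
mkdir -p /var/run/sshd
# Launch sshd only if it is actually installed at the expected path.
[ -x /usr/sbin/sshd ] && /usr/sbin/sshd || true
```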
Edit /etc/ssh/ssh_config and add:
StrictHostKeyChecking no
UserKnownHostsFile /dev/null
4. Commit the image with Hadoop installed
docker commit -m "hadoop install" hadoop ubuntu:hadoop
II. Deploying the Hadoop distributed cluster
Start the master container:
docker run -d -ti -h master ubuntu:hadoop
Start the slave1 container:
docker run -d -ti -h slave1 ubuntu:hadoop
Start the slave2 container:
docker run -d -ti -h slave2 ubuntu:hadoop
In each container, add to /etc/hosts:
10.0.0.5 master
10.0.0.6 slave1
10.0.0.7 slave2
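Docker rewrites a container's /etc/hosts on restart, so it helps to keep this step re-runnable. A sketch using the example IPs above; HOSTS_FILE is a stand-in variable (my addition) so the snippet can be exercised against a copy:

```shell
# Append each cluster entry to the hosts file only if it is not already there.
HOSTS_FILE=${HOSTS_FILE:-/etc/hosts}
for entry in "10.0.0.5 master" "10.0.0.6 slave1" "10.0.0.7 slave2"; do
  grep -qF "$entry" "$HOSTS_FILE" 2>/dev/null || echo "$entry" >> "$HOSTS_FILE"
done
```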
Add to the /usr/local/hadoop/etc/hadoop/slaves file:
slave1
slave2
Note: since the virtual machine is short on memory, add the following to mapred-site.xml to cap each map task's memory:
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>500</value>
</property>
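If map tasks need a cap, the reduce side has an analogous knob. mapreduce.reduce.memory.mb is the standard Hadoop 2.x property name; the 500 MB value below simply mirrors the map setting above (my assumption, not from the original):

```xml
<!-- Counterpart cap for reduce tasks; 500 MB mirrors the map value above. -->
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>500</value>
</property>
```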