Fully distributed deployment of a Hadoop cluster with HBase, ZooKeeper, and Hive
VM cloning and the single-node Hadoop installation are covered in separate posts; this guide assumes both are already done.
Set up passwordless SSH login between the nodes:
ssh-copy-id -i .ssh/id_rsa.pub -p22 root@192.168.106.101
Test the remote login:
ssh -p 22 root@192.168.106.101
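The key only needs to be copied to every node; a minimal dry-run sketch, assuming three nodes at the IPs below (substitute your own):

```shell
# Hypothetical node IPs for this cluster -- replace with your own.
NODES="192.168.106.101 192.168.106.102 192.168.106.103"
for node in $NODES; do
  # Dry run: print each command; delete the leading 'echo' to execute.
  echo ssh-copy-id -i ~/.ssh/id_rsa.pub -p 22 "root@$node"
done
```

Removing the `echo` runs the real copy; afterwards `ssh root@<node>` should log in without a password prompt.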
Edit the HDFS configuration:
vi etc/hadoop/hdfs-site.xml
List the worker hostnames in the slaves file:
vi etc/hadoop/slaves
hadoop04
hadoop05
hadoop06
Then copy the configuration to the other two nodes.
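A dry-run sketch of that copy step, assuming /opt/hadoop is the install directory and hadoop05/hadoop06 are the other two nodes (both the path and the hostnames are assumptions, adjust to your layout):

```shell
# Assumed install path and worker hostnames -- adjust to your cluster.
HADOOP_CONF=/opt/hadoop/etc/hadoop
OTHERS="hadoop05 hadoop06"
for host in $OTHERS; do
  # Dry run: print each command; delete the leading 'echo' to execute.
  echo scp -r "$HADOOP_CONF" "root@$host:/opt/hadoop/etc/"
done
```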
Format HDFS
hadoop namenode -format
Start Hadoop
start-all.sh (check the processes with jps)
Install ZooKeeper
Edit zookeeper/conf/zoo.cfg (the sample config file needs to be renamed to zoo.cfg).
The server.N entries in the configuration are the hostnames of the ZooKeeper servers.
# The number of milliseconds of each tick
tickTime=2000
maxClientCnxns=0
# The number of ticks that the initial
# synchronization phase can take
initLimit=50
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/opt/hadoop/zookeeperdata
# the port at which the clients will connect
clientPort=2181
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
Create the data directory
On each node, create the dataDir configured above (/opt/hadoop/zookeeperdata) and add a myid file inside it containing that node's server number. In the example above, the myid file on hadoop01 contains:
1
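Creating the directory and its myid file can be scripted per node. This sketch defaults to a scratch path so it can run anywhere; on a real node set DATADIR to the dataDir from zoo.cfg and MYID to that node's server number:

```shell
# Scratch defaults for illustration; on hadoop01 this would be
# DATADIR=/opt/hadoop/zookeeperdata and MYID=1.
DATADIR="${DATADIR:-/tmp/zookeeperdata-demo}"
MYID="${MYID:-1}"
mkdir -p "$DATADIR"
echo "$MYID" > "$DATADIR/myid"
cat "$DATADIR/myid"
```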
Start ZooKeeper
Run zkServer.sh start on each ZooKeeper node:
cd /opt/zookeeper
./bin/zkServer.sh start
Clock-skew errors can occur at this point; see the appendix on configuring an NTP time server on Linux.
Install HBase
Edit hbase/conf/hbase-site.xml:
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop01:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>hadoop01,hadoop02,hadoop03</value>
  </property>
  <property>
    <name>hbase.regionserver.handler.count</name>
    <value>300</value>
  </property>
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>70</value>
  </property>
  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
  </property>
  <property>
    <name>hbase.regionserver.restart.on.zk.expire</name>
    <value>true</value>
    <description>
      Zookeeper session expired will force regionserver exit.
      Enable this will make the regionserver restart.
    </description>
  </property>
  <property>
    <name>hbase.replication</name>
    <value>false</value>
  </property>
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.4</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.35</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.block.multiplier</name>
    <value>8</value>
  </property>
  <property>
    <name>hbase.server.thread.wakefrequency</name>
    <value>100</value>
  </property>
  <property>
    <name>hbase.master.distributed.log.splitting</name>
    <value>false</value>
  </property>
  <property>
    <name>hbase.regionserver.hlog.splitlog.writer.threads</name>
    <value>3</value>
  </property>
  <!-- Note: hbase.hstore.blockingStoreFiles was already set to 70 above;
       this later definition (20) is the one that takes effect. -->
  <property>
    <name>hbase.hstore.blockingStoreFiles</name>
    <value>20</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.flush.size</name>
    <value>134217728</value>
  </property>
  <property>
    <name>hbase.hregion.memstore.mslab.enabled</name>
    <value>true</value>
  </property>
</configuration>
Of these, hbase.rootdir and hbase.zookeeper.quorum are the two settings that must be changed to match your cluster.
Edit hbase/conf/hbase-env.sh:
export HBASE_OFFHEAPSIZE=1G
export HBASE_HEAPSIZE=4000
export JAVA_HOME=/opt/j2sdk1.6.29
export HBASE_OPTS="-Xmx4g -Xms4g -Xmn128m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
export HBASE_MANAGES_ZK=false
export HBASE_CLASSPATH=/opt/hadoop/etc/hadoop
Edit hbase/conf/log4j.properties, changing the following lines:
hbase.root.logger=WARN,console
log4j.logger.org.apache.hadoop.hbase=WARN
Add every DataNode host to conf/regionservers:
hadoop01
hadoop02
hadoop03
Start HBase
cd /opt/hbase
bin/start-hbase.sh
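To confirm the daemons came up, jps can be checked on every node over ssh; a dry-run sketch using the hostnames from the configs above. Expect HMaster on the master and HRegionServer on each regionserver:

```shell
NODES="hadoop01 hadoop02 hadoop03"
for node in $NODES; do
  # Dry run: print each command; delete the leading 'echo' to execute.
  echo ssh "root@$node" jps
done
```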
Configure Hive
Switch directories: cd /opt/hive/conf
Rename the template first: mv hive-env.sh.template hive-env.sh
Configure hive-env.sh:
export HADOOP_HOME=/opt/hadoop
export HIVE_CONF_DIR=/opt/hive/conf
export HIVE_AUX_JARS_PATH=/opt/hive/lib
export JAVA_HOME=/opt/java8
Create hive-site.xml: vi hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/opt/hive/warehouse</value>
  </property>
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
  </property>
  <!-- For a remote MySQL database, put the remote IP or hostname in the
       JDBC connection properties here -->
</configuration>
Save and exit.
Place mysql-connector-java-5.1.0-bin under the software directory, then move it into /opt/hive/lib:
mv mysql-connector-java-5.1.0-bin /opt/hive/lib
Switch directories:
cd /opt/hive
List the root directory to check:
hadoop fs -ls /
Create the warehouse directories:
hadoop fs -mkdir -p /usr/hive/warehouse
hadoop fs -mkdir -p /opt/hive/warehouse
Grant permissions on the warehouse directory:
hadoop fs -chmod 777 /opt/hive/warehouse
Grant permissions recursively on /opt/hive:
hadoop fs -chmod -R 777 /opt/hive
Initialize the MySQL metastore:
schematool -dbType mysql -initSchema
Start Hive
cd /opt/hive
bin/hive --service hiveserver
Startup order:
start-all.sh
Start ZooKeeper
cd /opt/zookeeper
./bin/zkServer.sh start
Start HBase
cd /opt/hbase
bin/start-hbase.sh
Start Hive
cd /opt/hive
bin/hive --service hiveserver
P.S. Shut everything down in the reverse order.
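The rule of thumb above is: start from the storage layer up, stop in reverse. A small sketch that derives the stop order from the start order:

```shell
START="hadoop zookeeper hbase hive"
STOP=""
for svc in $START; do
  # Prepend each service so the list comes out reversed.
  STOP="$svc $STOP"
done
echo "start order: $START"
echo "stop order: $STOP"
```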