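Hadoop 3 + Hive 3 Installation Notes (Distributed Cluster on Virtual Machines, with Error Fixes)
Note: this article is my own study record and draws on material shared online. If anything infringes your rights, contact me and I will remove it. For technical sharing only, not for commercial use.
Overview
Tool: VMware 14
Goal: create three virtual machines in bridged network mode, all on the same subnet, so that the three machines can ping each other.
Steps:
① Download the CentOS 7 ISO image and create the first virtual machine in VMware;
② Clone it in VMware to create the other two virtual machines;
③ Set the hostname and network on all three systems, and set up passwordless SSH login between them;
④ Install the JDK;
⑤ Install Hadoop 3.2;
⑥ Install Hive 3.1;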
Configure the Linux hosts
- Change the hostname (CentOS 7)
[code]
# This command takes effect immediately and persists across reboots
[root@smallsuperman ~]# hostnamectl set-hostname outman00
[root@smallsuperman ~]# hostname outman00
# Edit the hosts file and add the hostname to the 127.0.0.1 line
# (this mapping later makes the NameNode listen on 127.0.0.1 only; see the HDFS troubleshooting steps further down)
[root@smallsuperman ~]# vi /etc/hosts
[root@smallsuperman ~]# cat /etc/hosts
127.0.0.1 localhost smallsuperman.centos localhost4 localhost4.localdomain4 outman00
::1 localhost smallsuperman.centos localhost6 localhost6.localdomain6
- Intranet hostname mapping
[code]
# Use hostnames instead of IP addresses
[root@smallsuperman ~]# sed -i '$a\192.168.233.132 outman00' /etc/hosts
[root@smallsuperman ~]# sed -i '$a\192.168.233.130 outman01' /etc/hosts
[root@smallsuperman ~]# sed -i '$a\192.168.233.131 outman02' /etc/hosts
[root@smallsuperman ~]# ping outman00 # test connectivity
PING localhost (127.0.0.1) 56(84) bytes of data
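The sed commands above append a line on every run, so repeating them leaves duplicate entries. A minimal sketch of an idempotent variant follows. The function name `add_host_entry` and its file argument are hypothetical helpers introduced here for illustration; they are not part of the original setup.

```shell
#!/usr/bin/env bash
# add_host_entry FILE IP NAME   (hypothetical helper)
# Appends "IP NAME" to FILE only when no uncommented line
# already maps IP to NAME.
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # awk exits 0 when a line whose first field is IP also lists NAME.
  if ! awk -v ip="$ip" -v name="$name" '
         $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
         END { exit !found }' "$file"; then
    printf '%s %s\n' "$ip" "$name" >> "$file"
  fi
}

# Example (as root, against the real file):
#   add_host_entry /etc/hosts 192.168.233.132 outman00
#   add_host_entry /etc/hosts 192.168.233.130 outman01
#   add_host_entry /etc/hosts 192.168.233.131 outman02
```

Running the same call twice leaves exactly one entry, which makes the step safe to repeat on cloned machines.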
- Disable the firewall
[code]
# Check the firewall status
# "active (running)" shown in green means the firewall is on
# "inactive (dead)" means the firewall is off
[root@smallsuperman ~]# systemctl status firewalld.service
# Stop the running firewall
[root@smallsuperman ~]# systemctl stop firewalld.service
# Disable the firewall service so it does not start again on reboot
[root@smallsuperman ~]# systemctl disable firewalld.service
- Create the hadoop user and grant privileges
[code]
# Add a user
[root@smallsuperman ~]# adduser hadoop
# Grant privileges to the hadoop user as root
# In a root terminal, run: vi /etc/sudoers
# Find the line
#   root ALL=(ALL) ALL
# and add this line below it
hadoop ALL=(ALL) ALL
- Set up passwordless SSH login
[code]
# Generate a key pair on every host (press Enter at each prompt)
[hadoop@outman02 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:rKWn0xjk+J3ZShFMO3pYx6sBxCpl1YkUojNsG4HQ1iE hadoop@outman02
The key's randomart image is:
+---[RSA 2048]----+
|ooE.o==+.. |
|..o++ooooo |
| .Bo .. * o |
| ..=. .* + . |
| .. +o S . |
| . o= + |
| .o=++ |
| ++= . |
| .... |
+----[SHA256]-----+
# Set up passwordless connections to the other hosts by copying this host's public key into their authorized_keys files
[hadoop@outman02 ~]$ ssh-copy-id outman01
[hadoop@outman02 ~]$ ssh-copy-id outman00
[hadoop@outman01 ~]$ ssh-copy-id outman00
[hadoop@outman01 ~]$ ssh-copy-id outman02
[hadoop@outman00 ~]$ ssh-copy-id outman01
[hadoop@outman00 ~]$ ssh-copy-id outman02
# Test passwordless login
[hadoop@outman00 ~]$ ssh hadoop@outman01
Last failed login: Mon Jun 3 02:16:04 CST 2019 from outman00 on ssh:notty
There were 3 failed login attempts since the last successful login.
Last login: Mon Jun 3 02:13:15 2019
[hadoop@outman01 ~]$ pwd
/home/hadoop
Install the JDK
1. Download the installation archive from the official website;
2. Upload a copy to a temporary directory on one host (mine is /tmp/tar_gz), then scp it to the same location on the other two servers;
- Extract it to the target directory
[code]
[root@outman02 tar_gz]# tar -zxvf jdk-8u211-linux-x64.tar.gz -C /usr/local/my_app
- Configure environment variables
[code]
[root@outman00 jdk1.8.0_211]# sed -i '$a\\nexport JAVA_HOME=/usr/local/my_app/jdk1.8.0_211\nexport PATH=$PATH:$JAVA_HOME/bin' /etc/profile
# Reload the environment variables
[root@outman00 jdk1.8.0_211]# source /etc/profile
# Check that the installation succeeded
[root@outman00 jdk1.8.0_211]# java -version
java version "1.8.0_211"
Java(TM) SE Runtime Environment (build 1.8.0_211-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.211-b12, mixed mode)
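Because the same JAVA_HOME value is reused later in hadoop-env.sh and hive-env.sh, it can help to confirm on each node that the directory really contains a JDK before moving on. The following is a small sketch; `check_java_home` is a hypothetical helper name, not something shipped with the JDK or Hadoop.

```shell
#!/usr/bin/env bash
# check_java_home DIR   (hypothetical helper)
# Succeeds when DIR looks like a JDK root, i.e. DIR/bin/java is executable.
check_java_home() {
  local dir="$1"
  if [ -n "$dir" ] && [ -x "$dir/bin/java" ]; then
    echo "OK: $dir/bin/java is executable"
  else
    echo "ERROR: no executable java under '$dir/bin'" >&2
    return 1
  fi
}

# Example, after `source /etc/profile`:
#   check_java_home "$JAVA_HOME"
```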
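Install Hadoop
Main directories
(1) bin: scripts for operating Hadoop services (HDFS, YARN)
(2) etc: Hadoop's configuration directory, holding its configuration files
(3) lib: Hadoop's native libraries (used for data compression and decompression)
(4) sbin: scripts that start or stop Hadoop services
(5) share: Hadoop's dependency jars, documentation, and official examples
- Extract it to the target directory
[code]
[root@outman00 tar_gz]# tar -zxvf hadoop-3.2.0.tar.gz -C /usr/local/my_app/hadoop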
Edit the configuration files
- Go to the configuration directory under the extracted path
[code]
[root@outman00 hadoop-3.2.0]# cd /usr/local/my_app/hadoop/hadoop-3.2.0/etc/hadoop/
- Set the JDK path in the configuration
[code]
[root@outman00 hadoop]# vi /usr/local/my_app/hadoop/hadoop-3.2.0/etc/hadoop/hadoop-env.sh
# Change: add the JDK path
# The java implementation to use. By default, this environment
# variable is REQUIRED on ALL platforms except OS X!
export JAVA_HOME=/usr/local/my_app/jdk1.8.0_211
- Edit core-site.xml
fs.defaultFS: the address and port of the NameNode in HDFS. hadoop.tmp.dir: the directory where Hadoop stores files generated at runtime.
[code]
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://outman00:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/my_app/hadoop/hadoop_data</value>
  </property>
</configuration>
- Edit hdfs-site.xml
[code]
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/my_app/hadoop/hadoop_data/namenode_data</value>
    <description>Metadata storage directory; for safety it can be placed in a different directory</description>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/my_app/hadoop/hadoop_data/datanode_data</value>
    <description>DataNode data storage directory</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Number of replicas of each HDFS data block</description>
  </property>
  <property>
    <name>dfs.secondary.http.address</name>
    <value>outman01:50090</value>
    <description>SecondaryNameNode host; preferably a different node from the NameNode</description>
  </property>
</configuration>
- Edit yarn-site.xml
yarn.nodemanager.aux-services: the shuffle service that the YARN cluster provides to MapReduce programs. yarn.resourcemanager.hostname: the host running the ResourceManager.
[code]
<configuration>
  <!-- How reducers fetch data -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- Address of the YARN ResourceManager -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>outman00</value>
  </property>
</configuration>
- Edit mapred-site.xml
Use YARN as the resource scheduling framework (run MapReduce on YARN)
[code]
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
- Edit workers (this file was called slaves before 3.0)
[code]
outman00
outman01
outman02
- Distribute the installed Hadoop directory to the other two nodes (with scp)
[code]
[hadoop@outman00 hadoop]$ scp -r hadoop-3.2.0/ hadoop@outman02:/usr/local/my_app/hadoop
[hadoop@outman00 hadoop]$ scp -r hadoop-3.2.0/ hadoop@outman01:/usr/local/my_app/hadoop
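With more than a couple of nodes, typing one scp per host gets error-prone. As a sketch, the list of targets can be taken from the workers file instead. `print_scp_commands` is a hypothetical helper; it only prints the commands so they can be reviewed before anything is copied.

```shell
#!/usr/bin/env bash
# print_scp_commands WORKERS_FILE SRC_DIR DEST_DIR [SELF]   (hypothetical helper)
# Prints one scp command per host listed in WORKERS_FILE, skipping blank
# lines, comment lines, and the host named SELF (default: `hostname`).
# Nothing is copied; pipe the output to `sh` to actually run it.
print_scp_commands() {
  local workers="$1" src="$2" dest="$3" self="${4:-$(hostname)}"
  local host
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;
    esac
    if [ "$host" = "$self" ]; then
      continue
    fi
    echo "scp -r $src hadoop@$host:$dest"
  done < "$workers"
}

# Example, from /usr/local/my_app/hadoop on outman00:
#   print_scp_commands hadoop-3.2.0/etc/hadoop/workers hadoop-3.2.0/ /usr/local/my_app/hadoop
```

The `hadoop@` user and the destination path mirror the commands above; adjust them if your layout differs.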
- Configure environment variables (/etc/profile on all three nodes)
[code]
[root@outman00 hadoop]# sed -i '$a\export HADOOP_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0\nexport PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' /etc/profile
# Reload the environment variables
[root@outman00 hadoop]# source /etc/profile
# Verify
[root@outman02 hadoop]# hadoop --help
- Format the NameNode
On the HDFS master node (the host configured as fs.defaultFS in core-site.xml), run the format command. On success it creates the data directories according to the configuration. To format again, delete those directories and rerun the command.
[code]
[root@outman00 hadoop]# hadoop namenode -format
# Key line that indicates success
2019-06-05 01:58:12,198 INFO common.Storage: Storage directory /usr/local/my_app/hadoop/hadoop_data/namenode_data has been successfully formatted.
- Start HDFS (the start script can be run from any node)
[code]
[hadoop@outman00 ~]$ start-dfs.sh
[hadoop@outman00 ~]$ jps
8949 DataNode
8840 NameNode
9229 Jps
[hadoop@outman01 hadoop_data]$ jps
8071 SecondaryNameNode
8137 Jps
7997 DataNode
[hadoop@outman02 hadoop_data]$ jps
7973 Jps
7817 DataNode
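Reading jps output by eye on three nodes is easy to get wrong. Below is a sketch that compares jps output against the daemons expected on a node. `missing_daemons` is a hypothetical helper, and the expected lists are simply the ones visible in the output above.

```shell
#!/usr/bin/env bash
# missing_daemons JPS_OUTPUT NAME...   (hypothetical helper)
# Prints every expected daemon NAME that does not appear as a process name
# (second column) in JPS_OUTPUT. Returns non-zero if any are missing.
missing_daemons() {
  local jps_output="$1"; shift
  local name missing=0
  for name in "$@"; do
    # Exact match on column 2, so "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$jps_output" |
         awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}

# Examples:
#   on outman00: missing_daemons "$(jps)" NameNode DataNode
#   on outman01: missing_daemons "$(jps)" SecondaryNameNode DataNode
#   on outman02: missing_daemons "$(jps)" DataNode
```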
- Problem encountered
The NameNode master can access HDFS, but the other two nodes cannot
[code]
# The master node works (in the errors below, 拒绝连接 means "Connection refused")
[hadoop@outman00 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2019-06-06 00:34 /dyp
# A worker node cannot connect
[hadoop@outman01 ~]$ hadoop fs -ls /
ls: Call From outman01/192.168.233.130 to outman00:9000 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
# The other worker node cannot connect either
[hadoop@outman02 ~]$ hadoop fs -ls /
ls: Call From localhost/127.0.0.1 to outman00:9000 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
- Following core-site.xml, check whether port 9000 on the master node is reachable
[code]
# Works from the master node
[root@outman00 datanode_data]# telnet outman00 9000
Trying 127.0.0.1...
Connected to outman00.
Escape character is '^]'.
# Fails from the worker nodes
[root@outman01 ~]# telnet outman00 9000
Trying 192.168.233.132...
telnet: connect to address 192.168.233.132: Connection refused
[root@outman02 xinetd.d]# telnet outman00 9000
Trying 192.168.233.132...
telnet: connect to address 192.168.233.132: Connection refused
- Check what is listening on port 9000 on the master node
Port 9000 is bound to 127.0.0.1, so only local connections are accepted. The NameNode binds to whatever address the fs.defaultFS hostname resolves to, and on this machine outman00 resolves to 127.0.0.1 because of the 127.0.0.1 line in /etc/hosts.
[code]
[root@outman00 datanode_data]# lsof -i:9000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 7339 hadoop 269u IPv4 42236 0t0 TCP localhost:cslistener (LISTEN)
java 7339 hadoop 279u IPv4 44037 0t0 TCP localhost:cslistener->localhost:51560 (ESTABLISHED)
java 7413 hadoop 328u IPv4 44036 0t0 TCP localhost:51560->localhost:cslistener (ESTABLISHED)
[root@outman00 datanode_data]# netstat -tunlp |grep 9000
tcp 0 0 127.0.0.1:9000 0.0.0.0:* LISTEN 7339/java
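The netstat result suggests checking which address the master's own hostname maps to in the hosts file. A rough sketch of that check follows. `first_hosts_ip` and `is_loopback` are hypothetical helpers, and the assumption that the first matching line is the one the resolver uses is typical for a hosts-file lookup but not guaranteed on every system.

```shell
#!/usr/bin/env bash
# first_hosts_ip FILE NAME   (hypothetical helper)
# Prints the IP from the first uncommented line of FILE that lists NAME.
first_hosts_ip() {
  awk -v name="$2" '
    /^[[:space:]]*#/ { next }            # skip fully commented lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break             # ignore trailing comments
        if ($i == name) { print $1; exit }
      }
    }' "$1"
}

# is_loopback IP   (hypothetical helper): succeeds for 127.x.x.x and ::1
is_loopback() {
  case "$1" in
    127.*|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Example on the master node:
#   ip=$(first_hosts_ip /etc/hosts outman00)
#   is_loopback "$ip" && echo "outman00 maps to loopback ($ip): remote nodes cannot reach port 9000"
```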
- Try changing the hosts configuration
[code]
#127.0.0.1 localhost smallsuperman.centos localhost4 localhost4.localdomain4 outman00
#::1 localhost smallsuperman.centos localhost6 localhost6.localdomain6
192.168.233.132 outman00
192.168.233.130 outman01
192.168.233.131 outman02
- Restart Hadoop
[code]
[hadoop@outman00 ~]$ stop-all.sh
[hadoop@outman00 ~]$ start-all.sh
- Check port 9000 on the NameNode master again (it is now bound to the master's own IP, so the other nodes can connect)
[code]
[root@outman00 datanode_data]# netstat -tunlp | grep 9000
tcp 0 0 192.168.233.132:9000 0.0.0.0:* LISTEN 10843/java
- Access HDFS from the other nodes
[code]
[hadoop@outman01 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2019-06-06 00:34 /dyp
[hadoop@outman02 ~]$ hadoop fs -ls /
Found 1 items
drwxr-xr-x - hadoop supergroup 0 2019-06-06 00:34 /dyp
- Open the HDFS web UI
http://192.168.233.132:9870
Important
Before Hadoop 3.0 the NameNode web UI port was 50070
From Hadoop 3.0 on, the NameNode web UI port is 9870
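The port change above can be captured in a tiny helper for scripts that need to build the web UI address for clusters of different versions. This is only a sketch of the default ports; `namenode_web_port` is a hypothetical name, and a cluster that overrides the NameNode HTTP address in hdfs-site.xml will not follow it.

```shell
#!/usr/bin/env bash
# namenode_web_port VERSION   (hypothetical helper)
# Prints the default NameNode web UI port for a Hadoop version such as "3.2.0".
namenode_web_port() {
  local major="${1%%.*}"   # text before the first dot
  if [ "$major" -ge 3 ]; then
    echo 9870
  else
    echo 50070
  fi
}

# Example:
#   echo "http://192.168.233.132:$(namenode_web_port 3.2.0)"
```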
- Start YARN (must be run on the YARN master node)
[code]
[hadoop@outman00 hadoop_data]$ start-yarn.sh
Starting resourcemanager
Starting nodemanagers
# Check the processes (the master gains ResourceManager and NodeManager; the other nodes gain NodeManager)
[hadoop@outman00 hadoop_data]$ jps
9811 Jps
8949 DataNode
9463 NodeManager
8840 NameNode
9353 ResourceManager
[hadoop@outman01 hadoop_data]$ jps
8227 NodeManager
8071 SecondaryNameNode
8327 Jps
7997 DataNode
- Test YARN
[code]
[hadoop@outman00 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar wordcount /dyp/test/test /dyp/test/test_out
# Error (the last line means: could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster)
[2019-06-06 02:01:09.415]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
错误: 找不到或无法加载主类 org.apache.hadoop.mapreduce.v2.app.MRAppMaster
# Fix: add the classpath used by MapReduce programs to mapred-site.xml, as follows
# /usr/local/my_app/hadoop/hadoop-3.2.0/ is the Hadoop installation path
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.application.classpath</name>
    <value>/usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/*, /usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/mapreduce/lib/*</value>
  </property>
</configuration>
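To avoid mistyping the long classpath value, it can be generated from the installation path. This is a sketch under the assumption that the jars live in the standard share/hadoop/mapreduce layout used in these notes; `mr_classpath` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# mr_classpath HADOOP_HOME   (hypothetical helper)
# Prints a comma-separated value for mapreduce.application.classpath that
# covers the MapReduce jars and their lib directory.
mr_classpath() {
  local home="${1%/}"   # drop one trailing slash, if present
  printf '%s/share/hadoop/mapreduce/*,%s/share/hadoop/mapreduce/lib/*\n' "$home" "$home"
}

# Example:
#   mr_classpath "$HADOOP_HOME"
# Paste the printed line into the <value> element of
# mapreduce.application.classpath in mapred-site.xml on every node.
```

For comparison, the `hadoop classpath` command prints the full classpath that the installation itself computes.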
- Test again
[code]
[hadoop@outman00 mapreduce]$ hadoop jar hadoop-mapreduce-examples-3.2.0.jar wordcount /dyp/test/test /dyp/test/test_out
[hadoop@outman00 mapreduce]$ hadoop fs -ls /dyp/test/test_out
Found 2 items
-rw-r--r-- 2 hadoop supergroup 0 2019-06-06 02:16 /dyp/test/test_out/_SUCCESS
-rw-r--r-- 2 hadoop supergroup 29 2019-06-06 02:16 /dyp/test/test_out/part-r-00000
[hadoop@outman00 mapreduce]$ hadoop fs -cat /dyp/test/test_out/part-r-00000
1|2|3 1
A|B|C 1
A|B|C1|2|3 1
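Install MySQL
I use a MySQL instance that runs in Docker on a Tencent Cloud server, so the virtual machine only needs the MySQL client, which is used purely for access.
- Install mysql-client
[code]
[root@outman00 ~]# yum install mysql
# Connect to the Tencent Cloud MySQL instance
[root@outman00 ~]# mysql -h <Tencent Cloud MySQL IP> -u root -p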
Install Hive 3
- Extract it to the target directory
[code]
[root@outman00 tar_gz]# tar -zxvf apache-hive-3.1.1-bin.tar.gz -C /usr/local/my_app/hive
# The rest of these notes refer to the directory as hive-3.1.1, so the extracted apache-hive-3.1.1-bin directory was presumably renamed
- Download the MySQL Java driver and upload it to Hive's lib directory
Official download page
[code]
-rw-r--r--. 1 root root 2293144 Jun 7 02:07 mysql-connector-java-8.0.16.jar
[root@outman00 lib]# pwd
/usr/local/my_app/hive/hive-3.1.1/lib
- Configure the Hive environment variables
[code]
[root@outman00 lib]# sed -i '$a\export HIVE_HOME=/usr/local/my_app/hive/hive-3.1.1\nexport PATH=$PATH:$HIVE_HOME/bin' /etc/profile
# Reload so the change takes effect
[root@outman00 lib]# source /etc/profile
- Edit the configuration files
[code]
[root@outman00 conf]# cd /usr/local/my_app/hive/hive-3.1.1/conf
[root@outman00 conf]# cp hive-env.sh.template hive-env.sh
[root@outman00 conf]# cp hive-default.xml.template hive-site.xml
hive-env.sh
Add the following lines
[code]
export JAVA_HOME=/usr/local/my_app/jdk1.8.0_211
export HADOOP_HOME=/usr/local/my_app/hadoop/hadoop-3.2.0
export HIVE_HOME=/usr/local/my_app/hive/hive-3.1.1
- Reload so the change takes effect
[code]
[root@outman00 conf]# source hive-env.sh
- Edit hive-site.xml
- Create the directories first
[code]
[root@outman00 conf]# mkdir -p /usr/local/my_app/hive/hive_data/warehouse
[root@outman00 conf]# mkdir -p /usr/local/my_app/hive/hive_data/tmp
[root@outman00 conf]# mkdir -p /usr/local/my_app/hive/hive_data/log
- Change the following properties in hive-site.xml
[code]
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>your_username</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>your_password</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/usr/local/my_app/hive/hive_data/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
<property>
  <name>hive.exec.scratchdir</name>
  <value>/usr/local/my_app/hive/hive_data/tmp</value>
  <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
  <name>hive.querylog.location</name>
  <value>/usr/local/my_app/hive/hive_data/log</value>
  <description>Location of Hive run time structured log file</description>
</property>
# Also replace every occurrence of the variable ${system:java.io.tmpdir} with our temporary data directory /usr/local/my_app/hive/hive_data/tmp
- Edit the hive-log4j.properties file
[code]
Starting hive fails with the following error
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3186,96,"file:/usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml"
The character at line 3186, column 96 of the configuration file hive-site.xml is illegal
[code]
# Full error
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3186,96,"file:/usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml"]
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2981)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2930)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2805)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1459)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:4990)
    at org.apache.hadoop.hive.conf.HiveConf.getVar(HiveConf.java:5063)
    at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5150)
    at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5093)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4jCommon(LogUtils.java:97)
    at org.apache.hadoop.hive.common.LogUtils.initHiveLog4j(LogUtils.java:81)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:699)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:683)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3186,96,"file:/usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml"]
    at com.ctc.wstx.sr.StreamScanner.constructWfcException(StreamScanner.java:621)
    at com.ctc.wstx.sr.StreamScanner.throwParseError(StreamScanner.java:491)
    at com.ctc.wstx.sr.StreamScanner.reportIllegalChar(StreamScanner.java:2456)
    at com.ctc.wstx.sr.StreamScanner.validateChar(StreamScanner.java:2403)
    at com.ctc.wstx.sr.StreamScanner.resolveCharEnt(StreamScanner.java:2369)
    at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1515)
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2828)
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123)
    at org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3277)
    at org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3071)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2964)
    ... 17 more
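The parser points at a character entity that expands to code 0x8, which XML 1.0 does not allow. One way to deal with it, sketched below, is to print the reported line, confirm what the entity looks like, and then strip it. I am assuming the entity is written literally as `&#8;` in the file copied from hive-default.xml.template; check the printed line before removing anything. `show_xml_line` and `strip_char_entity` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# show_xml_line FILE LINE   (hypothetical helper)
# Prints a single line of FILE so the offending text can be inspected.
show_xml_line() {
  sed -n "${2}p" "$1"
}

# strip_char_entity FILE ENTITY   (hypothetical helper)
# Removes every literal occurrence of ENTITY (for example '&#8;') from FILE
# and keeps the previous content as FILE.bak.
strip_char_entity() {
  local file="$1" entity="$2"
  cp "$file" "$file.bak"
  # '&' and '#' are ordinary characters on the pattern side of sed's s command.
  sed "s/${entity}//g" "$file.bak" > "$file"
}

# Example:
#   cd /usr/local/my_app/hive/hive-3.1.1/conf
#   show_xml_line hive-site.xml 3186
#   strip_char_entity hive-site.xml '&#8;'
```

The entity sits inside a description element in the case assumed here, so removing it should not change any configuration value.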
- Starting the metastore fails
Error output
[code]
[hadoop@outman00 hive-3.1.1]$ hive --service metastore
2019-06-07 23:25:27: Starting Hive Metastore Server
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/my_app/hive/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
MetaException(message:Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:84)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8661)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8656)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:8926)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:8843)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Analyzing the error
- Jar conflict; the key lines are
[code]
SLF4J: Found binding in [jar:file:/usr/local/my_app/hive/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/my_app/hadoop/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
- Delete the jar under HIVE_HOME/lib
- Do not delete the jar on the Hadoop side (under HADOOP_HOME/share/hadoop/common/lib); otherwise start-all.sh reports that the log4j package cannot be found when it starts Hadoop remotely.
[code]
rm -rf /usr/local/my_app/hive/hive-3.1.1/lib/log4j-slf4j-impl-2.10.0.jar
- Starting the metastore still fails, so the jar conflict was not the cause; the output is now
[code]
2019-06-07 23:41:43: Starting Hive Metastore Server
MetaException(message:Version information not found in metastore.)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:84)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:93)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8661)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:8656)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.startMetaStore(HiveMetaStore.java:8926)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.main(HiveMetaStore.java:8843)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
- Attempted fix in hive-site.xml (the error persisted)
[code]
<!-- Turn off metadata validation -->
<property>
  <name>datanucleus.metadata.validate</name>
  <value>false</value>
</property>
<!-- Turn off metastore schema verification -->
<property>
  <name>hive.metastore.schema.verification</name>
  <value>false</value>
</property>
<property>
  <name>datanucleus.schema.autoCreateAll</name>
  <value>true</value>
</property>
<!-- hive.metastore.schema.verification blocks Metastore operations when the schema version is incompatible. Consider setting it to true to reduce the chance of schema corruption during Metastore operations. -->
- On first use the schema has to be initialized with: schematool -dbType mysql -initSchema
[code]
[hadoop@outman00 hive-3.1.1]$ schematool -dbType mysql -initSchema
# The hive database now exists in MySQL
MySQL [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| dyp                |
| hive               |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
After the metastore starts normally, entering the hive shell produces the following error
[code]
hive> show databases;
OK
Failed with exception java.io.IOException:java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:user.name%7D
- Change the configuration
[code]
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/usr/local/my_app/hive/hive_data/tmp/${user.name}</value>
  <description>Local scratch space for Hive jobs</description>
</property>
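The generated hive-site.xml uses the `${system:java.io.tmpdir}` and `${system:user.name}` placeholders in several properties, so editing them one by one is tedious. A sketch of doing both substitutions in a single pass follows. `replace_system_vars` is a hypothetical helper, and it assumes the target directory contains no `|` or `&` characters, since those would interfere with the sed expression.

```shell
#!/usr/bin/env bash
# replace_system_vars FILE TMP_DIR   (hypothetical helper)
# Rewrites two placeholders in FILE, keeping the old content as FILE.bak:
#   ${system:java.io.tmpdir} -> TMP_DIR
#   ${system:user.name}      -> ${user.name}
replace_system_vars() {
  local file="$1" tmp_dir="$2"
  cp "$file" "$file.bak"
  sed -e 's|\${system:java\.io\.tmpdir}|'"$tmp_dir"'|g' \
      -e 's|\${system:user\.name}|${user.name}|g' \
      "$file.bak" > "$file"
}

# Example:
#   replace_system_vars /usr/local/my_app/hive/hive-3.1.1/conf/hive-site.xml \
#       /usr/local/my_app/hive/hive_data/tmp
```

Applied to a value of `${system:java.io.tmpdir}/${system:user.name}`, this yields the same `/usr/local/my_app/hive/hive_data/tmp/${user.name}` form used in the property above.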
Node roles after startup
outman00 | outman01 | outman02 |
---|---|---|
DataNode NameNode NodeManager ResourceManager | DataNode NodeManager SecondaryNameNode | DataNode NodeManager |
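Concepts
- NameNode
Manages the metadata of the whole HDFS file system: it configures the replica policy, manages the mapping of stored data blocks, manages the HDFS namespace, and handles client read and write requests.
- SecondaryNameNode
Assists the NameNode and takes over part of its work: it periodically merges the fsimage and edits files, and can help recover the NameNode.
- DataNode
Manages the blocks of user file data: it carries out the operations that the NameNode instructs it to perform (storing the actual data blocks and serving block reads and writes).
A file is split into blocks of a fixed size (blocksize), and the blocks are stored in a distributed fashion across several DataNodes.
Each file block can have several replicas, which are placed on different DataNodes. DataNodes periodically report the blocks they hold to the NameNode, and the NameNode is responsible for keeping the replica count. These reports are how the NameNode learns the current replica state, for example when a DataNode drops out, so that it can restore the missing replicas.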
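- YARN allocates system resources to the applications running in the Hadoop cluster and schedules the tasks to be executed on the different cluster nodes
- YARN component: ResourceManager
The ResourceManager is the ultimate authority that arbitrates resources among all applications in the system.
- YARN component: NodeManager
The NodeManager is the per-machine framework agent. It is responsible for containers, monitors their resource usage (CPU, memory, disk, network), and reports it to the ResourceManager.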
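The block and replica description under DataNode can be made concrete with a little arithmetic. The sketch below assumes a block size of 128 MB (134217728 bytes), which I believe is the Hadoop 3 default but is not set anywhere in these notes, and uses the replication factor of 2 from the hdfs-site.xml above. `block_count` and `raw_storage` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# block_count FILE_BYTES BLOCK_BYTES   (hypothetical helper)
# Number of HDFS blocks needed for a file: ceiling of FILE_BYTES / BLOCK_BYTES.
block_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# raw_storage FILE_BYTES REPLICATION   (hypothetical helper)
# Bytes consumed across all DataNodes once every block is replicated.
raw_storage() {
  echo $(( $1 * $2 ))
}

# Example: a 300 MB file with 128 MB blocks (assumed) and replication 2
#   block_count 314572800 134217728   # 3 blocks: 128 MB + 128 MB + 44 MB
#   raw_storage 314572800 2           # 629145600 bytes, i.e. 600 MB in total
```

The last block only occupies the space it actually needs, which is why raw storage is computed from the file size rather than from the block count.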