
Deploying Hive 2.1.1 with Hadoop 2.7.3

2017-04-17 13:11
This article installs Hive 2.1.1 in remote mode, with the Hive metastore metadata stored in a MySQL database.
1 Install the MySQL database
sudo apt-get install mysql-server

Restart the MySQL service so the configuration takes effect
sudo service mysql restart
Create a dedicated hive account
CREATE USER 'hive'@'%' IDENTIFIED BY '123456';
Grant all privileges to the hive account
grant all privileges on *.* to 'hive'@'%' identified by '123456' with grant option;
Flush the privilege tables so the grants take effect
flush privileges;
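To confirm the new account can actually reach MySQL over the network, a quick check (a sketch, assuming the metastore host 192.168.80.130 used in hive-site.xml later in this article):
mysql -h 192.168.80.130 -u hive -p123456 -e "SELECT VERSION();"   # prints the MySQL version if the grant works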
2 Unpack and install Hive
cd /usr/local
sudo tar -xvzf apache-hive-2.1.1-bin.tar.gz
sudo mv apache-hive-2.1.1-bin/ hive-2.1.1
Configure the system environment variables
sudo gedit ~/.bashrc
export HIVE_HOME=/usr/local/hive-2.1.1
export PATH=$HIVE_HOME/bin:$HIVE_HOME/lib:$PATH

Apply the environment variable changes
source ~/.bashrc
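A quick way to confirm the variables took effect in the current shell (a minimal check):
echo $HIVE_HOME    # should print /usr/local/hive-2.1.1
which hive         # should resolve to /usr/local/hive-2.1.1/bin/hive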
3 Configure Hive

3.1 Edit the conf/hive-env.sh file
cd /usr/local/hive-2.1.1/conf/
sudo cp hive-env.sh.template hive-env.sh
sudo chown hadoop:hadoop hive-env.sh
sudo vi hive-env.sh
HADOOP_HOME=/usr/local/hadoop-2.7.3
export HIVE_CONF_DIR=/usr/local/hive-2.1.1/conf
export HIVE_AUX_JARS_PATH=/usr/local/hive-2.1.1/lib

3.2 Edit the log properties files to configure the log directory

Edit hive-log4j2.properties
sudo cp hive-log4j2.properties.template hive-log4j2.properties
sudo chown hadoop:hadoop hive-log4j2.properties
sudo vi hive-log4j2.properties
property.hive.log.dir = /usr/local/hive-2.1.1/logs
Edit llap-cli-log4j2.properties (copy it from its .template file in the same way)
property.hive.log.dir = /usr/local/hive-2.1.1/logs
property.hive.log.file = llap-cli.log
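The log directory configured above is not created automatically. A minimal sketch of creating it, using the same hadoop owner as the config files above:
sudo mkdir -p /usr/local/hive-2.1.1/logs
sudo chown hadoop:hadoop /usr/local/hive-2.1.1/logs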
3.3 Edit the hive-site.xml configuration file (created by copying hive-default.xml.template); the main items to change are the following:
<property>
<name>hive.exec.local.scratchdir</name>
<value>/usr/local/hive-2.1.1/tmp</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/usr/local/hive-2.1.1/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/usr/local/hive-2.1.1/logs</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/usr/local/hive-2.1.1/logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/usr/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://192.168.80.130:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://192.168.80.130:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
<description>
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>Username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.hwi.listen.host</name>
<value>0.0.0.0</value>
<description>This is the host address the Hive Web Interface will listen on</description>
</property>
<property>
<name>hive.hwi.listen.port</name>
<value>9999</value>
<description>This is the port the Hive Web Interface will listen on</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>0.0.0.0</value>
<description>Bind host on which to run the HiveServer2 Thrift service.</description>
</property>
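The local scratch directory and the HDFS directories referenced in hive-site.xml also need to exist. A sketch of creating them, with paths taken from the values above (run the hdfs commands as the hadoop user while HDFS is running):
# local directory used by hive.exec.local.scratchdir and hive.downloaded.resources.dir
sudo mkdir -p /usr/local/hive-2.1.1/tmp
sudo chown hadoop:hadoop /usr/local/hive-2.1.1/tmp
# HDFS warehouse and scratch directories
hdfs dfs -mkdir -p /usr/hive/warehouse
hdfs dfs -chmod g+w /usr/hive/warehouse
hdfs dfs -mkdir -p /tmp/hive
hdfs dfs -chmod 733 /tmp/hive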
4 Copy the MySQL connector JAR into Hive's lib directory
sudo mv ~/Downloads/mysql-connector-java-5.1.40-bin.jar /usr/local/hive-2.1.1/lib/
5 Configure Hive's HWI web interface

Download the Hive 2.1.1 source package
wget http://www-us.apache.org/dist/hive/hive-2.1.1/apache-hive-2.1.1-src.tar.gz
tar -zxvf apache-hive-2.1.1-src.tar.gz
cd apache-hive-2.1.1-src
cd hwi/web
zip hive-hwi-2.1.1.zip ./*
mv hive-hwi-2.1.1.zip hive-hwi-2.1.1.war
mv hive-hwi-2.1.1.war $HIVE_HOME/lib
Copy the JDK tools.jar into Hive's lib
sudo cp /usr/lib/jdk1.8.0_121/lib/tools.jar /usr/local/hive-2.1.1/lib
Delete ant-1.6.5.jar from the lib directory; otherwise the HWI page shows an error message and has to be refreshed twice before it displays.

6 Initialize the Hive metastore schema
schematool -dbType mysql -initSchema
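To verify that the schema was created, schematool can also report the schema version (a quick check):
schematool -dbType mysql -info    # prints the Hive distribution and metastore schema versions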
7 Start the Hive services

Start the metastore service
hive --service metastore &
Start the Hive web interface (HWI)
hive --service hwi &
Start the HiveServer2 (Thrift) service
hive --service hiveserver2 &
Start the Hive shell
hive
HWI access URL
http://localhost:9999/hwi/
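A quick sanity check that the three services are listening on their ports (9083 for the metastore, 9999 for HWI, 10000 for HiveServer2), as a sketch:
sudo netstat -tlnp | grep -E '9083|9999|10000'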
---------------------------------------------------------------------------------------------------------------------------------------------
If the above does not work, you can go through the following alternative procedure from scratch.

Step 1:

Download the latest Hive; just go to the Apache site and download the Hive 2.1.0 release.

Step 2: unpack it on the server
tar zxvf apache-hive-2.1.0-bin.tar.gz
mv apache-hive-2.1.0-bin /home/hive
cd /home/hive

Step 3: edit the environment configuration. Only the Hadoop and Hive settings matter here; adjust the JAVA and HBASE entries to your own setup.
vi /etc/profile

#for hadoop
export HADOOP_HOME=/home/hadoop/hadoop-2.7.3
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$PATH:/home/hadoop/hadoop-2.7.3/bin
export PATH=$PATH:/home/hadoop/hadoop-2.7.3/sbin

#for hive
export HIVE_HOME=/home/hive
export PATH=$HADOOP_HOME/bin:$JAVA_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH
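After editing /etc/profile, reload it so the variables take effect in the current shell (an easy step to miss):
source /etc/profile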

Step 4: download and set up the JDBC connector

I used the latest mysql-connector-java-5.1.40.tar.gz here.

Remember to put the extracted JAR into Hive's lib directory:
cp mysql-connector-java-5.1.40-bin.jar $HIVE_HOME/lib/

Step 5: configure hive-site.xml

The file is derived from hive-default.xml.template:


cp hive-default.xml.template hive-site.xml

Then find
<name>javax.jdo.option.ConnectionURL</name>

and change its value to
<value>jdbc:mysql://139.196.xxx.xxx:3306/hive?characterEncoding=UTF8&amp;useSSL=false&amp;createDatabaseIfNotExist=true</value>

Also remember to update the database username and password accordingly, or Hive will fail when you run it:
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>

Step 6: run the Hive client
cd /home/hive/bin
hive

Step 7: initialize the DB

schematool -initSchema -dbType mysql

Step 8: inspect the metadata after a successful initialization

You should see that the corresponding hive database now contains the various initial metastore tables.
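For example, you can list them directly in MySQL (a sketch, using the root account configured in Step 5 and the hive database from the JDBC URL):
mysql -u root -p -e "USE hive; SHOW TABLES;"    # expect metastore tables such as DBS, TBLS, COLUMNS_V2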

Step 9: start on the master and node machines

Start in single-machine mode

hive 

Start against the cluster

hive -hiveconf hbase.zookeeper.quorum=slave1,slave2,slave3

———————————————————————————————————————————— 

Possible errors you may encounter:

1. If running hive reports a username/password error

then remember to fix the username and password in hive-site.xml; see Step 5 above.

2. If the DB has not been initialized, e.g.
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Hive metastore database is not initialized. Please use schematool (e.g. ./schematool -initSchema -dbType ...) to create the schema. If needed, don't forget to include the option to auto-create the underlying database in your JDBC connection string (e.g. ?createDatabaseIfNotExist=true for mysql))

then run the DB initialization first (Step 7).

3. If you are prompted with

SSL-related warnings, set useSSL=false when configuring the JDBC connection URL:

jdbc:mysql://139.196.xxx.xxx:3306/hive?useSSL=false&createDatabaseIfNotExist=true
Wed Nov 30 14:24:50 CST 2016 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
Wed Nov 30 14:24:55 CST 2016 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.

4. When Hive fails with errors such as
[Fatal Error] hive-site.xml:26:5: The element type "value" must be terminated by the matching end-tag "</value>".
Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException; systemId: file:/home/hive/conf/hive-site.xml; lineNumber: 26; columnNumber: 5; The element type "value" must be terminated by the matching end-tag "</value>".
Logging initialized using configuration in jar:file:/home/hive/lib/hive-common-2.1.0.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D


The first error means hive-site.xml is not well-formed XML (for example an unescaped & in a value; see Step 5). For the second, edit hive-site.xml and replace the ${system:java.io.tmpdir} references with an existing directory on the system.

Two more ways of launching Hive are added here for convenience when testing.

Hive provides four ways of running it:

Hive CLI

HiveServer2 and Beeline

HCatalog 

WebHCat (Templeton)

The first two are covered here.

First: the Hive CLI

Since Hive's bin directory is already on the PATH, you can start it simply with the hive command:
hive
 

Once the command is entered you can carry out Hive operations directly.

Second: HiveServer2 and Beeline

Beeline provides multi-user and more secure access, so it is the more widely used client.

HiveServer2 listens on localhost:10000 by default, so when connecting with Beeline use jdbc:hive2://localhost:10000 as the connection parameter.

The relevant commands are:
hiveserver2
beeline -u jdbc:hive2://localhost:10000

You can also start Beeline and HiveServer2 inside the same process, which is handy for testing:

beeline -u jdbc:hive2:// 

However, if a custom username and password are used here, they must be configured in hive-site.xml.

Step 5 above covered the relevant settings; refer to it.
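For example, the credentials can be passed on the Beeline command line with the -n/-p options (the values shown are the hive/123456 account from the first part of this article; substitute whatever you configured):
beeline -u jdbc:hive2://localhost:10000 -n hive -p 123456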
----------------------------------------------------------------------------------------------------------------------------------------------------------


Fix for the "Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D" error at Hive startup

Hive reports the following error at startup:

Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D

Solution:

Create a temporary IO directory under the Hive installation, then point the following hive-site.xml parameters at it.
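A minimal sketch of creating that directory (the path matches the values in the snippet below; make sure the user running Hive can write to it):
mkdir -p /usr/local/hive/iotmp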

<property>
  <name>hive.querylog.location</name>
  <value>/usr/local/hive/iotmp</value>
  <description>Location of Hive run time structured log file</description>
</property>
<property>
  <name>hive.exec.local.scratchdir</name>
  <value>/usr/local/hive/iotmp</value>
  <description>Local scratch space for Hive jobs</description>
</property>
<property>
  <name>hive.downloaded.resources.dir</name>
  <value>/usr/local/hive/iotmp</value>
  <description>Temporary local directory for added resources in the remote file system.</description>
</property>

Save the file and restart Hive.

[root@master ~]# hive

Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-1.2.0.jar!/hive-log4j.properties

hive> show databases;

OK

default

Time taken: 3.684 seconds, Fetched: 1 row(s)

hive>

HiveServer startup modes

1. hive command-line mode: run the /hive/bin/hive executable directly, or run hive --service cli

       Used for command-line queries on Linux; the query syntax is broadly similar to MySQL's.

2. hive web-interface mode: hive --service hwi

      Used to access Hive through a browser; not of much practical use.

3. hive remote-service mode (port 10000): hive --service hiveserver & (removed in Hive 2.x; use hiveserver2 instead)

      This is the mode to use when Java or other programs access Hive through JDBC and similar drivers; it is the mode programmers need most.

   You can also specify the port yourself: hive --service hiveserver -p 50000 & (the & runs it in the background).

  After entering these commands the terminal keeps running the server and appears to hang; it is in fact running, so there is no need to worry.