Problems encountered when integrating HBase with Hive
2016-03-08 09:32
1. Environment
CentOS 6.5, hadoop-2.6.0, hbase-1.1.2, hive-1.2.1 and zookeeper-3.4.6
2. Preparation before integrating HBase with Hive
1) Set up the Hadoop cluster
2) Set up the ZooKeeper cluster
3) Set up the HBase cluster
4) Install Hive (in this test environment Hive is installed only on the cluster's master node)
5) Switch Hive's metastore database to MySQL
3. Integrating HBase with Hive
With the prerequisites above in place, start the integration.
1) Copy the following jars from HBase's lib directory into Hive's lib directory:
hbase-client-1.0.0.jar
hbase-common-1.0.0.jar
hbase-server-1.0.0.jar
hbase-protocol-1.0.0.jar
netty-all-4.0.23.Final.jar
htrace-core-3.1.0-incubating.jar
Note: the versions above may differ from what is shown; copy the versions that match your own environment, for example with the commands below.
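A minimal sketch of that copy step, assuming $HBASE_HOME and $HIVE_HOME point at your HBase and Hive installations (adjust the wildcards if your file names differ):
# copy the HBase client-side jars into Hive's lib directory
cd $HBASE_HOME/lib
cp hbase-client-*.jar hbase-common-*.jar hbase-server-*.jar hbase-protocol-*.jar \
   netty-all-*.Final.jar htrace-core-*-incubating.jar $HIVE_HOME/lib/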
4. Problems that may arise during integration
1) Creating the Hive table mapped to HBase fails with java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace:
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:615) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4057) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:276) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:884) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:874) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: MetaException(message:MetaException(message:java.io.IOException: java.lang.reflect.InvocationTargetException at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:389) at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:366) at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:247) at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:183) at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:84) at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:546) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:539) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at $Proxy7.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:609) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4057) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:276) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:884) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:874) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:387) ... 34 more Caused by: java.lang.NoClassDefFoundError: org/cloudera/htrace/Trace at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:195) at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:479) at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65) at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:83) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.retrieveClusterId(HConnectionManager.java:801) at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:633) ... 39 more Caused by: java.lang.ClassNotFoundException: org.cloudera.htrace.Trace at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:356) ... 
45 more ) at org.apache.hadoop.hive.hbase.HBaseStorageHandler.getHBaseAdmin(HBaseStorageHandler.java:88) at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:162) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:546) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:539) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at $Proxy7.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:609) at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:4057) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:276) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1472) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1239) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1057) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:884) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:874) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:424) at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:793) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:687) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.util.RunJar.main(RunJar.java:212) ) at org.apache.hadoop.hive.hbase.HBaseStorageHandler.preCreateTable(HBaseStorageHandler.java:212) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:546) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:539) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at $Proxy7.createTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:609) ... 20 more
The cause of the problem above: htrace-core-3.1.0-incubating.jar was not copied over into Hive's lib directory.
2) Creating an HBase connection fails with java.lang.NoClassDefFoundError: io/netty/channel/EventLoopGroup:
DEBUG - Codec=org.apache.hadoop.hbase.codec.KeyValueCodec@82c71, compressor=null, tcpKeepAlive=true, tcpNoDelay=true, connectTO=10000, readTO=20000, writeTO=60000, minIdleTimeBeforeClose=120000, maxRetries=0, fallbackAllowed=false, bind address=null Exception in thread "main" java.io.IOException: java.lang.reflect.InvocationTargetException at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240) at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218) at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119) at newone.OperateTable.main(OperateTable.java:22) Caused by: java.lang.reflect.InvocationTargetException at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:525) at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238) ... 3 more Caused by: java.lang.NoClassDefFoundError: io/netty/channel/EventLoopGroup at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:264) at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2013) at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1978) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2072) at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2098) at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:631) ... 8 more Caused by: java.lang.ClassNotFoundException: io.netty.channel.EventLoopGroup at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:423) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) at java.lang.ClassLoader.loadClass(ClassLoader.java:356) ... 15 more DEBUG - stopping client from cache: org.apache.hadoop.ipc.Client@6eb29d DEBUG - removing client from cache: org.apache.hadoop.ipc.Client@6eb29d DEBUG - stopping actual client because no more references remain: org.apache.hadoop.ipc.Client@6eb29d DEBUG - Stopping client
The likely cause above: netty-all-4.0.23.Final.jar was not copied, or the copied version does not match.
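Both failures come down to jars missing from Hive's classpath; a quick check, assuming $HIVE_HOME and $HBASE_HOME are set as above, is:
# verify the two jars the errors complain about are present in Hive's lib directory
ls $HIVE_HOME/lib | egrep 'htrace-core|netty-all'
# if either one is missing, copy it again from HBase's lib directory
cp $HBASE_HOME/lib/htrace-core-*-incubating.jar $HBASE_HOME/lib/netty-all-*.Final.jar $HIVE_HOME/lib/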
3) The detailed integration steps below can be used as a reference:
1. How Hive-HBase integration works
Hive and HBase are integrated through the public APIs each system exposes; the communication between them is handled mainly by the hive-hbase-handler jar shipped with Hive (apache-hive-0.13.1-bin\lib\hive-hbase-handler-0.9.0.jar), which is responsible for the traffic between HBase and Hive. The communication between Hive and HBase is illustrated in the figure below:
![](http://static.open-open.com/lib/uploadImg/20141030/20141030095623_376.png)
2. Installing Hive
Step01: Upload apache-hive-0.13.1-bin to a directory on Linux
Notes:
The latest stable release at the time is used here; download it from http://mirrors.hust.edu.cn/apache/hive/
Upload it with a remote FTP tool to the /home/yf/software directory on the Linux machine.
Step02: Extract it into the installation directory:
cd /home/yf/software
# switch to the root user:
su
# (enter the root password when prompted)
# create the installation directory
mkdir -p /usr/share/hive
# extract into the installation directory
tar -zxvf apache-hive-0.13.1-bin.tar.gz -C /usr/share/hive
# change ownership
chown -R yf:yf /usr/share/hive
# switch back to the yf user
su yf
cd /usr/share/hive
ll
Step03: Configure the environment variables
sudo vi /etc/profile
Edit it as follows:
![](http://static.open-open.com/lib/uploadImg/20141030/20141030095624_720.png)
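The screenshot is not reproduced here; the additions would typically look like the following (HIVE_HOME matches the install path above, the rest is the usual pattern and may need adjusting for your environment):
# append to /etc/profile
export HIVE_HOME=/usr/share/hive
export PATH=$PATH:$HIVE_HOME/bin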
# make the changes take effect
source /etc/profile
Step04: Copy the jar packages
# remove the ZooKeeper jars shipped under $HIVE_HOME/lib
rm -rf $HIVE_HOME/lib/zookeeper*
# copy the ZooKeeper jar that your cluster actually uses into $HIVE_HOME/lib
cp $ZOOKEEPER_HOME/zookeeper-3.4.6.jar $HIVE_HOME/lib
Step05: Edit $HIVE_HOME/conf/hive-site.xml
cd $HIVE_HOME/conf
# make a working copy from the template
cp hive-default.xml.template hive-site.xml
# edit hive-site.xml
vi hive-site.xml
The file is large, so search for the properties you need: in vi's command mode type '/' followed by the text to find, e.g. /hive.querylog.location
Set hive.querylog.location:
<property>
  <name>hive.querylog.location</name>
  <value>/usr/share/hive/logs</value>
</property>
Remember to create the logs directory:
mkdir $HIVE_HOME/logs
# set hive.zookeeper.quorum to the ZooKeeper quorum hosts:
<property>
  <name>hive.zookeeper.quorum</name>
  <value>yf001,yf002,yf003,yf004,yf005,yf006,yf007</value>
</property>
Step06: Edit the file hive-config.sh in the $HIVE_HOME/bin directory
# append the following at the end:
export JAVA_HOME=/usr/lib/jvm/java7
export HIVE_HOME=/usr/share/hive
export HADOOP_HOME=/usr/share/hadoop/static
Step07: Edit $HADOOP_HOME/conf/hadoop-env.sh
# add HADOOP_CLASSPATH
export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin
# remember to copy the modified file to every other node afterwards.
Note: if HADOOP_CLASSPATH is not added to hadoop-env.sh, the following error is thrown:
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/CommandNeedRetryException
Step08: Verify
# type hive at the command line
yf@yf007:/usr/share/hive/conf$ hive
14/10/29 11:18:16 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
Logging initialized using configuration in jar:file:/usr/share/hive/lib/hive-common-0.13.1.jar!/hive-log4j.properties
# list the databases
hive> show databases;
OK
default
Time taken: 0.03 seconds, Fetched: 1 row(s)
Step09: Create a table
hive> use default;
hive> create table student(id string, name string);
Once the table is created, a /user/hive/warehouse/student directory appears in HDFS. At this point Hive is installed.
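A quick way to confirm this, assuming the default warehouse location mentioned above, is:
# the table directory should now exist in HDFS
hadoop fs -ls /user/hive/warehouse/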
3. Switching the metastore database to MySQL
Why switch the metastore database?
1. The metastore is where Hive's metadata is stored centrally. By default it uses the embedded Derby database.
2. Derby's drawback: only one session can be open at a time.
3. With MySQL as an external metastore database, multiple users can access the metastore concurrently.
Step01: Install MySQL
I installed it online; the machine runs Ubuntu.
# install the server
sudo apt-get install mysql-server
You will be prompted for the root user's password during the installation.
# install the client
sudo apt-get install mysql-client
Step02: Start the MySQL service
mysqld_safe &
Step03: Log in to the database
mysql -uroot -p1234
# create the database
create database hive;
# grant the yf user full privileges on everything in the hive database, from any host, authenticating with the password 'admin'
mysql> GRANT all ON hive.* TO yf@'%' IDENTIFIED BY 'admin';
mysql> flush privileges; -- reload the privilege tables
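To confirm the grant works, you can log in as yf with the password created above and list the databases:
# verify the yf account can reach the new hive database
mysql -uyf -padmin -e "show databases;"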
Step04: Upload mysql-connector-java-5.1.20-bin.jar into the $HIVE_HOME/lib directory
Step05: Update Hive's database configuration in hive-site.xml as follows:
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive</value>
  <description>location of default database for the warehouse</description>
</property>
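The post does not show the metastore connection properties themselves; for a MySQL-backed metastore they are normally the four javax.jdo.option.* properties below (database name, user and password follow what was created in Step03; the host is assumed to be the node running MySQL and is a placeholder):
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>yf</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>admin</value>
</property>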
4. Starting Hive
# start Hive with the HBase handler and ZooKeeper jars on the aux path, pointing it at the HBase master
hive --auxpath /usr/share/hive/lib/hive-hbase-handler-0.9.0.jar,/usr/share/hive/lib/zookeeper-3.4.6.jar -hiveconf hbase.master=yf007:60000
5. Creating a Hive external table mapped to an HBase table
Note: an HBase table 'bidask_quote' already exists; we now create a Hive external table mapped to it. The statement is:
hive> CREATE EXTERNAL TABLE bidask_quote_hive(key string, ProdCode string, ProdName string, TradingDay string, ExchangeID string, ExchangeInstID string, LastPrice string, PreSettlementPrice string, PreClosePrice string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = "info:ProdCode,info:ProdName,info:TradingDay,info:ExchangeID,info:ExchangeInstID,info:LastPrice,info:PreSettlementPrice,info:PreClosePrice")
TBLPROPERTIES("hbase.table.name" = "bidask_quote");
Notes:
* bidask_quote_hive is the Hive table.
* "hbase.table.name" = "bidask_quote" names the table that already exists in HBase.
* bidask_quote_hive(key string, ProdCode string, ProdName string, ...) is the Hive table's schema.
* "hbase.columns.mapping" = "info:ProdCode,info:ProdName,info:TradingDay,..." lists the mapped HBase columns; here there is only a single column family (info).
Now check whether it was created successfully:
hive> show tables;
If the bidask_quote_hive table we created shows up, query a few rows:
hive> select * from bidask_quote_hive limit 3;
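If the HBase side is still empty (or does not exist yet), a quick round trip verifies the mapping end to end. The table and column family names come from the example above; the row key and values below are made up purely for illustration:
# from the HBase shell: create the table if needed, then insert a test row
#   create 'bidask_quote', 'info'
#   put 'bidask_quote', 'row1', 'info:ProdCode', 'AG1606'
#   put 'bidask_quote', 'row1', 'info:LastPrice', '3350'
# the row should then be visible through the mapped Hive table:
hive -e "select key, ProdCode, LastPrice from bidask_quote_hive limit 3;"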
Follow-up posts will cover working with the Hive API.
Errors encountered
Error 1:
yf@yf007:/usr/share/hive/bin$ ./hive
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/CommandNeedRetryException
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:205)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.CommandNeedRetryException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 3 more
Solution:
HADOOP_CLASSPATH had already been added in /usr/share/hadoop/etc/hadoop/hadoop-env.sh, set as follows:
export HADOOP_CLASSPATH=$HBASE_HOME/hbase/hbase-0.20.3.jar:$HABSE_HOME/hbase-config:$ZOOKEEPER/zookeeper-3.2.2.jar
Change it to:
export HADOOP_CLASSPATH=.:$CLASSPATH:$HADOOP_CLASSPATH:$HADOOP_HOME/bin