
Integrating HBase and Hive

2014-11-12 16:14
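1. Why integrate Hive with HBase?

We use HBase as the database, but HBase has no SQL-like query interface, which makes querying and computing over the data very inconvenient. Integrating Hive lets us run HQL queries directly against the data stored in HBase, with Hive acting as the data warehouse.

The integration works by having the two systems communicate through their public APIs. That communication relies mainly on the hive-hbase-handler jar (Hive Storage Handlers).
[Figure: overview of Hive communicating with HBase through hive-hbase-handler]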



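2. Setting up the environment

Environment: hadoop-1.1.2.tar.gz, hbase-0.94.7-security.tar.gz, hive-0.9.0.tar.gz, zookeeper-3.4.5.tar.gz

I won't go into how to set up the Hadoop, ZooKeeper, and HBase clusters or how to install Hive. Each of those is straightforward; the focus here is the integration.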



3. Start the ZooKeeper cluster (run this on every node where ZooKeeper is installed)

zkServer.sh start



4. Start the HBase cluster

start-hbase.sh



5. Start Hive

hive --auxpath /usr/local/hive/lib/zookeeper-3.4.3.jar,/usr/local/hive/lib/hbase-0.92.0.jar,/usr/local/hive/lib/hive-hbase-handler-0.9.0.jar

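Note: if Hive is started without the argument --auxpath /usr/local/hive/lib/zookeeper-3.4.3.jar,/usr/local/hive/lib/hbase-0.92.0.jar,/usr/local/hive/lib/hive-hbase-handler-0.9.0.jar, it reports:

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver

The jar versions in this path (zookeeper-3.4.3, hbase-0.92.0) differ from the cluster versions listed in step 2, so use the jar file names that are actually present in your hive/lib directory.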
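
Typing three long jar paths by hand is error-prone, so the comma-separated --auxpath value can be assembled by a small helper. The following is a minimal sketch: the function names build_auxpath and check_jars are hypothetical (they are not part of Hive), and the jar paths in the example are the ones from the command above.

```shell
# Join the given jar paths with commas, the format that `hive --auxpath` expects.
# build_auxpath is a hypothetical helper name, not part of Hive.
build_auxpath() {
  local IFS=,
  echo "$*"
}

# Report every jar that does not exist; returns non-zero if any is missing.
# check_jars is likewise a hypothetical helper name.
check_jars() {
  local missing=0
  local jar
  for jar in "$@"; do
    if [ ! -f "$jar" ]; then
      echo "missing jar: $jar" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (paths taken from the command above):
#   set -- /usr/local/hive/lib/zookeeper-3.4.3.jar \
#          /usr/local/hive/lib/hbase-0.92.0.jar \
#          /usr/local/hive/lib/hive-hbase-handler-0.9.0.jar
#   check_jars "$@" && hive --auxpath "$(build_auxpath "$@")"
```

Checking the files first means a wrong version number in a file name shows up before Hive starts, not later as a failed job.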

6. Create a table in Hive

create external table ods_ida_users(

uid string,

username string,

idacard string,

truename string,

password string,

pass string,

tele string,

areaid string,

shequid string,

otheraddrs string,

integral string,

img string,

qq string,

sex string,

email string,

bianma string,

login string,

reg_ip string,

reg_time bigint,

last_login_ip string,

last_login_time bigint,

status string)

row format delimited fields terminated by '\t' stored as TEXTFILE;



7. Load data into the Hive table

LOAD DATA LOCAL INPATH '/usr/tmp/i_IDA_01001_20141026_001.dat' OVERWRITE INTO TABLE ods_ida_users;
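
Hive's delimited text format does not reject malformed rows: missing trailing fields simply come back as NULL. It can therefore help to check the file before loading it. The sketch below counts the lines whose number of tab-separated fields differs from the expected count (22 for the ods_ida_users table above). The function name count_bad_rows is hypothetical.

```shell
# Count the lines of a tab-delimited file whose field count differs from the expected one.
# count_bad_rows is a hypothetical helper name.
count_bad_rows() {
  # $1 = file path, $2 = expected number of fields per line
  awk -F '\t' -v n="$2" 'NF != n { bad++ } END { print bad + 0 }' "$1"
}

# Example (file path taken from the LOAD DATA statement above):
#   count_bad_rows /usr/tmp/i_IDA_01001_20141026_001.dat 22
```

A result of 0 means every line has exactly the expected number of fields.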



8. Create the Hive table that maps to HBase

CREATE TABLE h2b_ods_ida_users(

uid string,

username string,

idacard string,

truename string,

password string,

pass string,

tele string,

areaid string,

shequid string,

otheraddrs string,

integral string,

img string,

qq string,

sex string,

email string,

bianma string,

login string,

reg_ip string,

reg_time bigint,

last_login_ip string,

last_login_time bigint,

status string)

STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'

WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:username,info:idacard,info:truename,info:password,info:pass,info:tele,info:areaid,info:shequid,info:otheraddrs,info:integral,info:img,info:qq,info:sex,info:email,info:bianma,info:login,info:reg_ip,info:reg_time,info:last_login_ip,info:last_login_time,info:status")
TBLPROPERTIES ("hbase.table.name" = "h2b_ods_ida_users");
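
The hbase.columns.mapping value needs one entry per Hive column, in the same order as the column list; Hive's HBase integration documentation writes the row-key entry as :key. With 22 columns the string is easy to get wrong by hand, so it can be generated from the column names. This is a minimal sketch assuming the first Hive column is the row key and all other columns go into a single column family. The function name build_mapping is hypothetical.

```shell
# Build an hbase.columns.mapping value: the first Hive column becomes the row key (:key),
# and every remaining column is mapped to <family>:<column>.
# build_mapping is a hypothetical helper name.
build_mapping() {
  # $1 = column family, $2 = row-key column, remaining arguments = other columns
  local family=$1
  shift
  shift   # drop the row-key column name; it is represented by :key
  local mapping=":key"
  local col
  for col in "$@"; do
    mapping="$mapping,$family:$col"
  done
  echo "$mapping"
}

# Example with the first three columns of the table above:
#   build_mapping info uid username idacard
```

Passing the full column list of h2b_ods_ida_users with the family info yields a mapping with the same number of entries as the table has columns.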



9. Run the hbase shell command to enter the HBase shell

list    # show which tables exist

HBase now has a new table, h2b_ods_ida_users.




10. In Hive, insert data into the h2b_ods_ida_users table

insert overwrite table h2b_ods_ida_users

select

uid,

username,

idacard,

truename,

password,

pass,

tele,

areaid,

shequid,

otheraddrs,

integral,

img,

qq,

sex,

email,

bianma,

login,

reg_ip,

reg_time,

last_login_ip,

last_login_time,

status

from ods_ida_users;

If the job output looks like the following, the integration succeeded:
[Screenshot: output of the insert job]





11. Check in HBase that the data is there

scan 'h2b_ods_ida_users'

The data appears:
[Screenshot: scan output]
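
A full scan prints every row, which is unwieldy for a large table. As a sketch, the scan can be limited to a few rows and run non-interactively by piping the command into hbase shell. The function name scan_cmd is hypothetical; LIMIT is a standard scan option in the HBase shell.

```shell
# Print an HBase shell scan command restricted to the first N rows.
# scan_cmd is a hypothetical helper name.
scan_cmd() {
  # $1 = table name, $2 = maximum number of rows to print
  printf "scan '%s', {LIMIT => %s}\n" "$1" "$2"
}

# Example, run from the operating-system shell rather than inside hbase shell:
#   scan_cmd h2b_ods_ida_users 5 | hbase shell
```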






That completes the integration of Hive and HBase!

If you repost this article, please be sure to credit the original source. Thanks!