
[Original] Installing Hive in Standalone Mode

2014-05-28 08:44

A fully distributed Hadoop cluster is already installed on three CentOS virtual machines, server1, server2, and server3, with server1 as the NameNode and server2 and server3 as DataNodes. I installed Hive on server1 (the NameNode) and use a MySQL database to store the metadata.

1. Configure MySQL

Install MySQL:

[root@server1 ~]# yum install mysql mysql-server


Answer yes to all prompts.

Register the MySQL service:

[root@server1 ~]# /sbin/chkconfig --add mysqld

Start MySQL:

[root@server1 ~]# service mysqld start

Starting mysqld: [ OK ]

Log in to MySQL locally as root:

[root@server1 ~]# mysql -u root


The welcome banner appears and you are in the MySQL monitor.

Create the database hive:

mysql> CREATE DATABASE hive;


Create the user hive:

mysql> CREATE USER 'hive' IDENTIFIED BY 'hive';


Grant the hive user the necessary access and read/write privileges:

mysql> GRANT ALL ON hive.* TO hive@localhost;


2. Configure Hive

Download hive-0.8.1 and extract it:

[admin@server1 hive-0.8.1]$ pwd

/home/admin/hive-0.8.1

Edit the system environment variables, adding the Hive executable path to PATH:

[root@server1 ~]# vim /etc/profile


...................

export HIVE_HOME=/home/admin/hive-0.8.1

export PATH=$PATH:$HIVE_HOME/bin

Apply the changes:

[root@server1 ~]# source /etc/profile
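After sourcing the profile, a quick check that the Hive bin directory really landed on PATH can save debugging later. A minimal sketch, assuming the install path used in this article:

```shell
# Sketch: confirm HIVE_HOME/bin is on PATH after sourcing /etc/profile.
# The path below matches this article's install location; adjust to yours.
export HIVE_HOME=/home/admin/hive-0.8.1
export PATH=$PATH:$HIVE_HOME/bin
case ":$PATH:" in
  *":$HIVE_HOME/bin:"*) echo "hive on PATH" ;;
  *)                    echo "hive NOT on PATH" ;;
esac
```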


Download the MySQL JDBC driver jar, mysql-connector-java-5.1.18-bin.jar, and copy it into the lib directory of hive-0.8.1.

In Hive's conf directory, make a copy of hive-default.xml.template and rename it hive-site.xml:

[admin@server1 conf]$ cp hive-default.xml.template hive-site.xml


Move the copied hive-site.xml into the conf directory of the installed Hadoop:

[admin@server1 conf]$ mv hive-site.xml /home/admin/hadoop-0.20.2/conf

Edit hive-site.xml; the main properties to modify are the following.

The MySQL database instance to connect to:

<property>

<name>javax.jdo.option.ConnectionURL</name>

<value>jdbc:mysql://localhost:3306/hive</value>

<description>JDBC connect string for a JDBC metastore</description>

</property>

The JDBC driver for the MySQL database:

<property>

<name>javax.jdo.option.ConnectionDriverName</name>

<value>com.mysql.jdbc.Driver</value>

<description>Driver class name for a JDBC metastore</description>

</property>

The MySQL database username:

<property>

<name>javax.jdo.option.ConnectionUserName</name>

<value>hive</value>

<description>username to use against metastore database</description>

</property>

The MySQL database password:

<property>

<name>javax.jdo.option.ConnectionPassword</name>

<value>hive</value>

<description>password to use against metastore database</description>

</property>
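For reference, the four metastore properties above can be collected in a scratch file and sanity-checked before being merged by hand into hive-site.xml. A sketch (the /tmp path is only an illustration, not part of the original setup):

```shell
# Sketch: write the four metastore properties from above to a scratch
# file and count them; merge them into hive-site.xml by hand rather
# than using this file directly.
cat > /tmp/metastore-props.xml <<'XML'
<property><name>javax.jdo.option.ConnectionURL</name><value>jdbc:mysql://localhost:3306/hive</value></property>
<property><name>javax.jdo.option.ConnectionDriverName</name><value>com.mysql.jdbc.Driver</value></property>
<property><name>javax.jdo.option.ConnectionUserName</name><value>hive</value></property>
<property><name>javax.jdo.option.ConnectionPassword</name><value>hive</value></property>
XML
grep -c '<property>' /tmp/metastore-props.xml
```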

3. Test Hive

Type hive at the console to enter the Hive CLI:

[admin@server1 ~]$ hive

Logging initialized using configuration in jar:file:/home/admin/hive-0.8.1/lib/hive-common-0.8.1.jar!/hive-log4j.properties

Hive history file=/tmp/admin/hive_job_log_admin_201212011113_1138680566.txt

hive>

List the existing tables:

hive> SHOW TABLES;


OK

Time taken: 0.063 seconds

Create the table records:

hive> CREATE TABLE records (year STRING, temperature INT, quality INT)
    > ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';


OK

Time taken: 0.253 seconds
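The table expects tab-separated rows of year, temperature, and quality. A sketch of what the input file might look like, using the five rows that appear in the query output later in this article (the real NCDC sample file may differ):

```shell
# Sketch: build a local sample.txt of tab-separated (year, temperature,
# quality) rows; values mirror the SELECT * output shown later in the
# article. Upload to HDFS before LOAD DATA INPATH.
printf '1950\t0\t1\n1950\t22\t1\n1950\t-11\t1\n1949\t111\t1\n1949\t78\t1\n' > sample.txt
# Every row should have exactly 3 tab-separated fields:
awk -F'\t' '{ print NF }' sample.txt | sort -u
```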

List the tables again; records now appears:

hive> SHOW TABLES;

OK

records

Time taken: 0.089 seconds

Inspect the definition of the table records:

hive> DESCRIBE records;


OK

year string

temperature int

quality int

Time taken: 0.139 seconds

Load data into the table records:

hive> LOAD DATA INPATH '/user/admin/in/ncdc/micro/sample.txt'
    > INTO TABLE records;


Loading data to table default.records

OK

Time taken: 0.337 seconds

View the data in the table records:

hive> SELECT * FROM records;


OK

1950 0 1

1950 22 1

1950 -11 1

1949 111 1

1949 78 1

Time taken: 0.264 seconds

Compute the highest temperature for each year in records:

hive> SELECT year, MAX(temperature) FROM records GROUP BY year;


Total MapReduce jobs = 1

Launching Job 1 out of 1

Number of reduce tasks not specified. Estimated from input data size: 1

In order to change the average load for a reducer (in bytes):

set hive.exec.reducers.bytes.per.reducer=<number>

In order to limit the maximum number of reducers:

set hive.exec.reducers.max=<number>

In order to set a constant number of reducers:

set mapred.reduce.tasks=<number>

Starting Job = job_201211240040_0037, Tracking URL = http://server1:50030/jobdetails.jsp?jobid=job_201211240040_0037

Kill Command = /home/admin/hadoop-0.20.2/bin/../bin/hadoop job -Dmapred.job.tracker=server1:9001 -kill job_201211240040_0037

Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1

2012-12-01 11:30:57,089 Stage-1 map = 0%, reduce = 0%

2012-12-01 11:31:15,188 Stage-1 map = 100%, reduce = 0%

2012-12-01 11:31:24,235 Stage-1 map = 100%, reduce = 100%

Ended Job = job_201211240040_0037

MapReduce Jobs Launched:

Job 0: Map: 1 Reduce: 1 HDFS Read: 51 HDFS Write: 17 SUCESS

Total MapReduce CPU Time Spent: 0 msec

OK

1949 111

1950 22

Time taken: 47.238 seconds
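As a cross-check, the same per-year maximum can be reproduced outside Hive with awk over the tab-separated rows shown in the SELECT * output above:

```shell
# Sketch: per-year maximum temperature computed with awk, mirroring
# SELECT year, MAX(temperature) FROM records GROUP BY year.
printf '1950\t0\t1\n1950\t22\t1\n1950\t-11\t1\n1949\t111\t1\n1949\t78\t1\n' |
awk -F'\t' '!($1 in max) || $2 > max[$1] { max[$1] = $2 }
            END { for (y in max) print y, max[y] }' | sort
```

This prints the same two result rows as the Hive query, without launching a MapReduce job.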