
Installing and Testing Sqoop 1.99.6

2016-08-12 16:18
I. Test Environment

hadoop-2.7.2

apache-hive-1.2.1-bin

sqoop-1.99.6-bin-hadoop200

II. Installing Sqoop

1. Extract the archive

Run the following command in the /opt/ directory:

tar -zxvf sqoop-1.99.6-bin-hadoop200.tar.gz


2. Add environment variables to /etc/profile

Open the profile file:

vim /etc/profile


export SQOOP2_HOME=/opt/sqoop-1.99.6-bin-hadoop200
export CATALINA_BASE=$SQOOP2_HOME/server
export PATH=.:$SQOOP2_HOME/bin:$PATH


Then reload it:

source /etc/profile


3. Create a logs directory

Switch to the /opt/sqoop-1.99.6-bin-hadoop200/ directory and create the folder:

mkdir logs
cd logs

Confirm the path:

pwd
/opt/sqoop-1.99.6-bin-hadoop200/logs


4. Copy mysql-connector-java-5.1.25-bin.jar into /opt/sqoop-1.99.6-bin-hadoop200/server/lib

5. Edit the files under server/conf

(1) sqoop.properties

Replace @LOGDIR@ with the path of the logs directory created earlier: /opt/sqoop-1.99.6-bin-hadoop200/logs

Replace @BASEDIR@ (the base path for the Derby repository) with: /opt/sqoop-1.99.6-bin-hadoop200

Point the Hadoop configuration directory setting at your cluster's conf directory:

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/opt/hadoop-2.7.2/etc/hadoop
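If you prefer to script the substitutions instead of editing by hand, sed can do it. The sketch below runs against a scratch copy (the two property lines are simplified stand-ins, not the full file); on a real install you would target $SQOOP2_HOME/server/conf/sqoop.properties instead of the scratch file.

```shell
# Illustrative only: apply the @LOGDIR@/@BASEDIR@ substitutions to a scratch
# copy so the sed commands can be tried without a real install. The two
# property lines below are stand-ins for lines in the real sqoop.properties.
DEMO=$(mktemp -d)
printf '%s\n' \
  'org.apache.sqoop.log4j.appender.file.File=@LOGDIR@/sqoop.log' \
  'org.apache.sqoop.repository.jdbc.url=jdbc:derby:@BASEDIR@/repository/db;create=true' \
  > "$DEMO/sqoop.properties"
SQOOP2_HOME=/opt/sqoop-1.99.6-bin-hadoop200
sed -i "s|@LOGDIR@|$SQOOP2_HOME/logs|g; s|@BASEDIR@|$SQOOP2_HOME|g" \
  "$DEMO/sqoop.properties"
cat "$DEMO/sqoop.properties"
```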


(2) catalina.properties

Comment out the original common.loader line and replace it with one that also pulls in the Hadoop jars:

common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar,/opt/hadoop-2.7.2/share/hadoop/common/*.jar,/opt/hadoop-2.7.2/share/hadoop/common/lib/*.jar,/opt/hadoop-2.7.2/share/hadoop/hdfs/*.jar,/opt/hadoop-2.7.2/share/hadoop/hdfs/lib/*.jar,/opt/hadoop-2.7.2/share/hadoop/mapreduce/*.jar,/opt/hadoop-2.7.2/share/hadoop/mapreduce/lib/*.jar,/opt/hadoop-2.7.2/share/hadoop/tools/*.jar,/opt/hadoop-2.7.2/share/hadoop/tools/lib/*.jar,/opt/hadoop-2.7.2/share/hadoop/yarn/*.jar,/opt/hadoop-2.7.2/share/hadoop/yarn/lib/*.jar
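The Hadoop portion of that long value follows a regular pattern, so it can be generated rather than typed by hand. A hedged sketch, assuming the Hadoop layout used in this article:

```shell
# Build the common.loader value from the Tomcat defaults plus the Hadoop
# share/ subdirectories; HADOOP_HOME matches the path used in this article.
HADOOP_HOME=/opt/hadoop-2.7.2
LOADER='${catalina.base}/lib,${catalina.base}/lib/*.jar,${catalina.home}/lib,${catalina.home}/lib/*.jar,${catalina.home}/../lib/*.jar'
for d in common common/lib hdfs hdfs/lib mapreduce mapreduce/lib \
         tools tools/lib yarn yarn/lib; do
  LOADER="$LOADER,$HADOOP_HOME/share/hadoop/$d/*.jar"
done
echo "common.loader=$LOADER"
```

Paste the echoed line into catalina.properties in place of the original common.loader entry.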


III. Testing

1. Start the server

(1) Go to the /opt/sqoop-1.99.6-bin-hadoop200/bin directory

Start and stop commands:

$SQOOP2_HOME/bin/sqoop.sh server start
$SQOOP2_HOME/bin/sqoop.sh server stop
or
$SQOOP2_HOME/bin/sqoop2-server start
$SQOOP2_HOME/bin/sqoop2-server stop


You should see output similar to:

Sqoop home directory: /opt/sqoop-1.99.6-bin-hadoop200
Setting SQOOP_HTTP_PORT:     12000
Setting SQOOP_ADMIN_PORT:     12001
Using   CATALINA_OPTS:
Adding to CATALINA_OPTS:    -Dsqoop.http.port=12000 -Dsqoop.admin.port=12001
Using CATALINA_BASE:   /opt/sqoop-1.99.6-bin-hadoop200/server
Using CATALINA_HOME:   /opt/sqoop-1.99.6-bin-hadoop200/server
Using CATALINA_TMPDIR: /opt/sqoop-1.99.6-bin-hadoop200/server/temp
Using JRE_HOME:        /opt/jdk1.7.0_45/jre
Using CLASSPATH:       /opt/sqoop-1.99.6-bin-hadoop200/server/bin/bootstrap.jar


(2) Verify the installation

Visit http://{master}:12000/sqoop/version; a version response means the server is up.
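With a running server you can also check from the command line, e.g. curl http://master:12000/sqoop/version. The endpoint returns JSON; the snippet below extracts the version from a canned response, whose field names are an assumption about the response shape rather than a capture from a live server:

```shell
# RESPONSE is a canned stand-in for the version endpoint's JSON reply;
# the exact field names are an assumption, not taken from a live server.
RESPONSE='{"api-versions":["v1"],"build-version":"1.99.6","user":"root"}'
VERSION=$(echo "$RESPONSE" | sed -n 's/.*"build-version":"\([^"]*\)".*/\1/p')
echo "server build version: $VERSION"
```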


(3) Connect with the client

$SQOOP2_HOME/bin/sqoop.sh client
or
$SQOOP2_HOME/bin/sqoop2-shell


Point the client at the server:

sqoop:000> set server --host master --port 12000 --webapp sqoop


Show server information:

sqoop:000> show server --all


List all connectors:

show connector --all


List all links:

show link
+----+--------+--------------+------------------------+---------+
| Id |  Name  | Connector Id |     Connector Name     | Enabled |
+----+--------+--------------+------------------------+---------+
| 3  | mysql  | 4            | generic-jdbc-connector | true    |
| 5  | hdfs   | 3            | hdfs-connector         | true    |
| 6  | mysql2 | 4            | generic-jdbc-connector | true    |
+----+--------+--------------+------------------------+---------+


Delete a link:

delete link --lid x

where x is the Id column shown by show link.


List all jobs:

show job
+----+-------------+----------------+--------------+---------+
| Id |    Name     | From Connector | To Connector | Enabled |
+----+-------------+----------------+--------------+---------+
| 1  | mysqltohdfs | 4              | 3            | true    |
| 2  | sqooptest   | 4              | 3            | true    |
+----+-------------+----------------+--------------+---------+


Delete a job:

delete job --jid 1


Create a link for the generic-jdbc-connector:

create link --cid 4
Name: mysql
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://master:3306/hive
Username: root
Password: ****
JDBC Connection Properties:
There are currently 0 values in the map:
entry# protocol=tcp
There are currently 1 values in the map:
protocol = tcp
entry#
New link was successfully created with validation status OK and persistent id 3


show link


+----+--------+--------------+------------------------+---------+
| Id |  Name  | Connector Id |     Connector Name     | Enabled |
+----+--------+--------------+------------------------+---------+
| 3  | mysql  | 4            | generic-jdbc-connector | true    |
+----+--------+--------------+------------------------+---------+


Create a link for the hdfs-connector:

sqoop:000> create link -cid 3
Creating link for connector with id 3
Please fill following values to create new link object
Name: hdfs

Link configuration

HDFS URI: hdfs://master:9000
Hadoop conf directory: /opt/hadoop-2.7.2/etc/hadoop
New link was successfully created with validation status OK and persistent id 5


show link
+----+--------+--------------+------------------------+---------+
| Id |  Name  | Connector Id |     Connector Name     | Enabled |
+----+--------+--------------+------------------------+---------+
| 3  | mysql  | 4            | generic-jdbc-connector | true    |
| 5  | hdfs   | 3            | hdfs-connector         | true    |
+----+--------+--------------+------------------------+---------+


show link -all
2 link(s) to show:
link with id 3 and name mysql (Enabled: true, Created by root at 8/12/16 2:15 PM, Updated by root at 8/12/16 2:15 PM)
Using Connector generic-jdbc-connector with id 4
Link configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://master:3306/hive
Username: root
Password:
JDBC Connection Properties:
protocol = tcp
link with id 5 and name hdfs (Enabled: true, Created by root at 8/12/16 2:47 PM, Updated by root at 8/12/16 2:47 PM)
Using Connector hdfs-connector with id 3
Link configuration
HDFS URI: hdfs://master:9000
Hadoop conf directory: /opt/hadoop-2.7.2/etc/hadoop


Create a job from the two link ids (here 3 is the mysql link and 5 the hdfs link):

sqoop:000> create job -f 3 -t 5
Creating job for links with from id 3 and to id 5
Please fill following values to create new job object
Name: mysqltohdfs

From database configuration

Schema name: hive
Table name: TBLS
Table SQL statement:
Table column names:
Partition column name:
Null value allowed for the partition column:
Boundary query:

Incremental read

Check column:
Last value:

To HDFS configuration

Override null value:
Null value:
Output format:
0 : TEXT_FILE
1 : SEQUENCE_FILE
Choose: 0
Compression format:
0 : NONE
1 : DEFAULT
2 : DEFLATE
3 : GZIP
4 : BZIP2
5 : LZO
6 : LZ4
7 : SNAPPY
8 : CUSTOM
Choose: 0
Custom compression format:
Output directory: hdfs://master:9000/test
Append mode:

Throttling resources

Extractors:
Loaders:
New job was successfully created with validation status OK  and persistent id 1


List all jobs:

show job
+----+-------------+----------------+--------------+---------+
| Id |    Name     | From Connector | To Connector | Enabled |
+----+-------------+----------------+--------------+---------+
| 1  | mysqltohdfs | 4              | 3            | true    |
+----+-------------+----------------+--------------+---------+


Start the job:

start job --jid 1

Check the status of a job:

status job --jid 1

Stop a job:

stop job --jid 1

A common error when starting a job (e.g. start job --jid 1):

Exception has occurred during processing command

Exception: org.apache.sqoop.common.SqoopException Message: CLIENT_0001:Server has returned exception

To see job details, enable verbose mode in the Sqoop client:

set option --name verbose --value true

show job --jid 1