Hadoop: Extracting Data with Sqoop

1. Installing and configuring Sqoop

Installation (prerequisite: Hadoop is already running)

[hadoop@h91 ~]$ tar -zxvf sqoop-1.3.0-cdh3u5.tar.gz
[hadoop@h91 hadoop-0.20.2-cdh3u5]$ cp hadoop-core-0.20.2-cdh3u5.jar /home/hadoop/sqoop-1.3.0-cdh3u5/lib/

[hadoop@h91 ~]$ cp ojdbc6.jar sqoop-1.3.0-cdh3u5/lib/

[hadoop@h91 ~]$ vi .bash_profile
Add the following line:
export SQOOP_HOME=/home/hadoop/sqoop-1.3.0-cdh3u5
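It is also convenient to put Sqoop's bin directory on the PATH so sqoop can be invoked from any directory; a minimal addition, assuming the same install path as above:

export PATH=$PATH:$SQOOP_HOME/bin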

[hadoop@h91 ~]$ source .bash_profile

2. Importing and exporting between Sqoop and MySQL
2.1 Create an authorized user and the sss table

mysql> insert into mysql.user(Host,User,Password) values("localhost","sqoop",password("sqoop"));
mysql> flush privileges;
mysql> grant all privileges on *.* to 'sqoop'@'%' identified by 'sqoop' with grant option;
mysql> flush privileges;
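To confirm that the account exists and the grant took effect, a quick sanity check from the mysql client:

mysql> select Host,User from mysql.user where User='sqoop';
mysql> show grants for 'sqoop'@'%';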

mysql> use test;
mysql> create table sss (id int,name varchar(10));

mysql> insert into sss values(1,'zs');
mysql> insert into sss values(2,'ls');

2.2 Test whether Sqoop can connect to MySQL
[hadoop@h91 mysql-connector-java-5.0.7]$ cp mysql-connector-java-5.0.7-bin.jar /home/hadoop/sqoop-1.3.0-cdh3u5/lib/

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop list-tables --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop

2.3 Import a MySQL table into HDFS (-m sets the number of parallel map tasks; the default is 4)
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop import --connect jdbc:mysql://192.168.8.222:3306/test --username sqoop --password sqoop --table sss -m 1
hadoop fs -cat /user/hadoop/sss/part-m-00000    (view the result)
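By default the import lands in /user/<current user>/<table name> on HDFS; to choose the location explicitly, sqoop import accepts --target-dir. A sketch, with an illustrative path:

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop import --connect jdbc:mysql://192.168.8.222:3306/test --username sqoop --password sqoop --table sss -m 1 --target-dir /user/hadoop/sss_copy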
2.4 Export data from HDFS into MySQL
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop export --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop --table sss --export-dir hdfs://h91:9000/user/hadoop/sss/part-m-00000
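Sqoop export parses the HDFS files as comma-separated by default, which matches what the import above produced; if the files use a different delimiter, it can be declared with --input-fields-terminated-by. A sketch, assuming comma-delimited input:

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop export --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop --table sss --export-dir /user/hadoop/sss --input-fields-terminated-by ','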
[root@o222 ~]# mysql -usqoop -p
mysql> use test
mysql> select * from sss;    (view the result)
3. Importing and exporting between Sqoop and Oracle
3.1 Test whether Sqoop can connect to Oracle

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop list-tables --connect jdbc:oracle:thin:@192.168.23.11:1521:ecom --username SCOTT --password 123456
(ecom is the Oracle instance name.)

3.2 Import an Oracle table into HDFS, optionally specifying a target directory (the default is under /user). Note: the target directory does not need to be created in advance.
sqoop import --connect jdbc:oracle:thin:@192.168.23.11:1521:ecom --username SCOTT --password 123456 --verbose -m 1 --table ALL_OBJECTS

3.3 Export data from HDFS into Oracle. Note: the target table must be created in Oracle in advance.
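Because the export target must already exist, the table would first be created on the Oracle side; a sketch with illustrative columns (the real TEST2 layout must match the HDFS data):

SQL> create table TEST2 (id number, name varchar2(10));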
sqoop export --connect jdbc:oracle:thin:@192.168.23.11:1521:oral --username SCOTT --password 123456 --table TEST2 --export-dir /sqoop/test/part-m-00000

4. Sqoop essentials: incremental extraction

4.1 Incremental extraction using a checkpoint column (append mode)
sqoop import --connect jdbc:mysql://192.168.8.101:3306/test --username sqoop --password sqoop --table sss -m 1 --target-dir /user/hadoop/a --check-column id --incremental append --last-value 3
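Instead of tracking --last-value by hand, Sqoop can record it in a saved job and update it after every run; a sketch, assuming the sqoop job tool (saved jobs/metastore) is available in this release, with an illustrative job name:

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop job --create sss_incr -- import --connect jdbc:mysql://192.168.8.101:3306/test --username sqoop --password sqoop --table sss -m 1 --target-dir /user/hadoop/a --check-column id --incremental append --last-value 0
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop job --exec sss_incr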
4.2 Incremental extraction by timestamp (lastmodified mode)

mysql> create table s2 (id int,sj timestamp not null default current_timestamp);
mysql> insert into s2 (id)values(123);
mysql> insert into s2 (id)values(321);
mysql> select * from s2;
+------+---------------------+
| id   | sj                  |
+------+---------------------+
|  123 | 2015-11-20 22:34:09 |
|  321 | 2015-11-20 22:34:23 |
+------+---------------------+

sqoop import --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop --table s2 -m 1 --target-dir /user/hadoop/s2 --incremental lastmodified --check-column sj --last-value '2015-11-20 22:34:15'
(The --last-value timestamp can be generated on the fly with `date +%Y-%m-%d` `date +%H:%M:%S`.)
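Following that hint, the whole incremental run can be scripted so the checkpoint is carried between runs; a minimal wrapper sketch (the checkpoint file path is illustrative):

#!/bin/bash
# Read the checkpoint saved by the previous run; fall back to the epoch on the first run
CKPT=/home/hadoop/s2.last_value
LAST_VALUE=$(cat $CKPT 2>/dev/null || echo '1970-01-01 00:00:00')
# Capture the new checkpoint before the import starts
NOW="`date +%Y-%m-%d` `date +%H:%M:%S`"
sqoop import --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop --table s2 -m 1 --target-dir /user/hadoop/s2 --incremental lastmodified --check-column sj --last-value "$LAST_VALUE"
echo "$NOW" > $CKPT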
5. Importing from MySQL into Hive

5.1 Import a MySQL table into Hive

sqoop import --connect jdbc:mysql://192.168.8.101/test --username sqoop --password sqoop --table sss --hive-import -m 1 --hive-table tb1 --fields-terminated-by ','
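After the import completes, the Hive table can be checked from the Hive CLI; a quick verification, assuming the hive client is on the PATH:

hive -e "select * from tb1;"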
6. The Sqoop eval tool: run SQL statements against a relational database

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop eval --connect jdbc:mysql://192.168.8.91:3306/test --username sqoop --password sqoop --query "select * from sss"

[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop eval --connect jdbc:mysql://192.168.8.91:3306/test --username sqoop --password sqoop --query "insert into sss values(3,'ww')"