Hadoop: Extracting Data with Sqoop
1. Installing and configuring Sqoop
Installation (Hadoop must already be running)
[hadoop@h91 ~]$ tar -zxvf sqoop-1.3.0-cdh3u5.tar.gz
[hadoop@h91 hadoop-0.20.2-cdh3u5]$ cp hadoop-core-0.20.2-cdh3u5.jar /home/hadoop/sqoop-1.3.0-cdh3u5/lib/
[hadoop@h91 ~]$ cp ojdbc6.jar sqoop-1.3.0-cdh3u5/lib/
[hadoop@h91 ~]$ vi .bash_profile
Add the following line:
export SQOOP_HOME=/home/hadoop/sqoop-1.3.0-cdh3u5
[hadoop@h91 ~]$ source .bash_profile
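Optionally, put Sqoop's bin directory on the PATH as well so the command can be run from any directory, and list the available tools to confirm the installation works; the PATH line below is an addition, not part of the original steps:
export PATH=$PATH:$SQOOP_HOME/bin    (append to .bash_profile, then run source .bash_profile again)
[hadoop@h91 ~]$ sqoop help    (should list tools such as import, export, eval and list-tables)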
2. Importing and exporting between Sqoop and MySQL
2.1 Grant a user and create the sss table
mysql> insert into mysql.user(Host,User,Password) values("localhost","sqoop",password("sqoop"));
mysql> flush privileges;
mysql> grant all privileges on *.* to 'sqoop'@'%' identified by 'sqoop' with grant option;
mysql> flush privileges;
mysql> use test;
mysql> create table sss (id int,name varchar(10));
mysql> insert into sss values(1,'zs');
mysql> insert into sss values(2,'ls');
2.2 Test whether Sqoop can connect to MySQL
[hadoop@h91 mysql-connector-java-5.0.7]$ cp mysql-connector-java-5.0.7-bin.jar /home/hadoop/sqoop-1.3.0-cdh3u5/lib/
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop list-tables --connect jdbc:mysql://192.168.23.2:3306/test --username hive --password mysql
2.3 Import a MySQL table into HDFS (-m sets the number of parallel map tasks; the default is 4)
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop import --connect jdbc:mysql://192.168.8.222:3306/test --username sqoop --password sqoop --table sss -m 1
hadoop fs -cat /user/hadoop/sss/part-m-00000    (view the result)
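The same import can also be pointed at an explicit HDFS directory and restricted to particular rows; a minimal illustrative variant, where the target directory name and the id > 1 filter are assumptions, not part of the original example:
bin/sqoop import --connect jdbc:mysql://192.168.8.222:3306/test --username sqoop --password sqoop --table sss --where "id > 1" --target-dir /user/hadoop/sss_filtered -m 1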
2.4 Export HDFS data into MySQL
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop export --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop --table sss --export-dir hdfs://h91:9000/user/hadoop/sss/part-m-00000
[root@o222 ~]# mysql -usqoop -p
mysql> use test
mysql> select * from sss;    (view the result)
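If the files being exported use a non-default field delimiter, the export needs to be told how to parse them; a small illustrative variant, assuming tab-separated input files (the original example uses Sqoop's default comma delimiter, so this flag is not needed there):
bin/sqoop export --connect jdbc:mysql://192.168.23.2:3306/test --username sqoop --password sqoop --table sss --export-dir /user/hadoop/sss --input-fields-terminated-by '\t'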
3. Importing and exporting between Sqoop and Oracle
3.1 Test whether Sqoop can connect to Oracle
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop list-tables --connect jdbc:oracle:thin:@192.168.23.11:1521:ecom --username SCOTT --password 123456    (ecom is the Oracle instance name/SID)
3.2 Import an Oracle table into HDFS and specify the HDFS directory (the default is under /user); note that the directory does not need to be created in advance
sqoop import --connect jdbc:oracle:thin:@192.168.23.11:1521:ecom --username SCOTT --password 123456 --verbose -m 1 --table ALL_OBJECTS
3.3 Export HDFS data into Oracle; note that the target table must be created in Oracle in advance
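Because sqoop export does not create the target table, TEST2 has to exist in Oracle before the export below is run; a minimal sketch of such a table, where the column names and types are assumptions chosen to match a simple (id, name) data file rather than taken from the original text:
SQL> create table TEST2 (id number, name varchar2(10));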
sqoop export --connect jdbc:oracle:thin:@192.168.23.11:1521:oral --username SCOTT --password 123456 --table TEST2 --export-dir /sqoop/test/part-m-00000
4. Sqoop essentials: incremental extraction
4.1 Incremental extraction by a key column
sqoop import --connect jdbc:mysql://192.168.8.101:3306/test --username sqoop --password sqoop --table sss -m 1 --target-dir /user/hadoop/a --check-column id --incremental append --last-value 3
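On later runs, --last-value is simply moved forward to the largest id that has already been imported, so only newer rows are appended to the same target directory; an illustrative follow-up run, assuming rows with id up to 5 have been loaded since the first import:
sqoop import --connect jdbc:mysql://192.168.8.101:3306/test --username sqoop --password sqoop --table sss -m 1 --target-dir /user/hadoop/a --check-column id --incremental append --last-value 5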
4.2 Incremental extraction by timestamp
mysql> create table s2 (id int,sj timestamp not null default current_timestamp);
mysql> insert into s2 (id)values(123);
mysql> insert into s2 (id)values(321);
mysql> select * from s2;
+------+---------------------+
| id | sj |
+------+---------------------+
| 123 | 2015-11-20 22:34:09 |
| 321 | 2015-11-20 22:34:23 |
+------+---------------------+
sqoop import --connect jdbc:mysql://192.168.23.2:3306/test --username hive --password mysql --table s2 -m 1 --target-dir /user/hadoop/s2 --incremental lastmodified --check-column sj --last-value '2015-11-20 22:34:15'
(the --last-value timestamp can also be generated on the fly with `date +%Y-%m-%d` and `date +%H:%M:%S`)
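A minimal shell sketch of generating that cutoff with date, assuming GNU date and a one-day lookback window (both the window and the per-run target directory name are assumptions made for illustration):
LAST_VALUE="$(date +'%Y-%m-%d %H:%M:%S' --date='1 day ago')"    # pick up rows modified during the last day
sqoop import --connect jdbc:mysql://192.168.23.2:3306/test --username hive --password mysql --table s2 -m 1 --target-dir /user/hadoop/s2_$(date +%Y%m%d) --incremental lastmodified --check-column sj --last-value "$LAST_VALUE"    # a fresh dated directory avoids failing on an already-existing target dir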
5. MySQL and Hive imports
5.1 Importing from MySQL into Hive
sqoop import --connect jdbc:mysql://192.168.8.101/test --username sqoop --password sqoop --table sss --hive-import -m 1 --hive-table tb1 --fields-terminated-by ','
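To confirm the rows arrived, the new Hive table can be queried from the Hive CLI (a minimal check, assuming the hive command is available on this node):
hive> select * from tb1;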
6. The Sqoop eval tool: run SQL statements directly against the relational database
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop eval --connect jdbc:mysql://192.168.8.91:3306/test --username sqoop --password sqoop --query "select * from sss"
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop eval --connect jdbc:mysql://192.168.8.91:3306/test --username sqoop --password sqoop --query "insert into sss values(3,'ww')"
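eval is also convenient for quick sanity checks around imports and exports, for example comparing row counts; a small illustrative query (what exactly you count is an assumption, not part of the original example):
[hadoop@h91 sqoop-1.3.0-cdh3u5]$ bin/sqoop eval --connect jdbc:mysql://192.168.8.91:3306/test --username sqoop --password sqoop --query "select count(*) from sss"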