Using Sqoop to export MySQL table data to text files on HDFS
2013-12-08 21:54
1) Download sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz and unpack it under /usr:
tar -zvxf sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz
2) Add the environment variables to /etc/profile:
export SQOOP_HOME=/usr/sqoop-1.4.4.bin__hadoop-1.0.0
export HADOOP_HOME=/usr/hadoop-1.1.2
export PATH=$PATH:$SQOOP_HOME/bin
After editing the file, run source /etc/profile so the settings take effect.
3) Start the Hadoop cluster and check that Sqoop is installed and configured correctly:
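One quick way to confirm that the profile change took effect is to check whether $SQOOP_HOME/bin is now an entry in PATH. The snippet below is only a sketch: the helper name path_contains is hypothetical, and the install path is the one used in step 2.

```shell
# Return 0 if directory $1 is one of the colon-separated entries in $2.
path_contains() {
    case ":$2:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Install path taken from step 2 of this walkthrough.
SQOOP_HOME=/usr/sqoop-1.4.4.bin__hadoop-1.0.0

if path_contains "$SQOOP_HOME/bin" "$PATH"; then
    echo "sqoop bin directory is on PATH"
else
    echo "sqoop bin directory is missing; re-run: source /etc/profile"
fi
```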
[hadoop@Master bin]$ sqoop version
Warning: $HADOOP_HOME is deprecated.
Sqoop 1.4.4
git commit id 050a2015514533bc25f3134a33401470ee9353ad
Compiled by vasanthkumar on Mon Jul 22 20:01:26 IST 2013
4) Connect to the MySQL database and create a test table:
[hadoop@Master bin]$ mysql -h10.24.46.4 -uhive -phive
mysql> use hive
mysql> create table tb1 as select table_schema,table_name,table_type from information_schema.TABLES;
5) Test the connection between Sqoop and MySQL:
[hadoop@Master ~]$ sqoop list-databases --connect jdbc:mysql://10.24.46.4:3306 --username hive --password hive
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: $HADOOP_HOME is deprecated.
13/12/07 17:43:59 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/12/07 17:44:00 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
hive
mysql
performance_schema
test
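The WARN line above points out that passing --password on the command line is insecure and suggests -P, which makes Sqoop prompt for the password interactively. A minimal sketch of the same connectivity check with -P follows. The function name list_databases_cmd is hypothetical; it only prints the command so it can be reviewed before being run by hand.

```shell
# Print a list-databases command that prompts for the password (-P)
# instead of embedding it in the command line.
list_databases_cmd() {
    host="$1"
    user="$2"
    printf 'sqoop list-databases --connect jdbc:mysql://%s:3306 --username %s -P\n' "$host" "$user"
}

# Host and user are the ones used in step 5.
list_databases_cmd 10.24.46.4 hive
```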
6) Import the data from MySQL into HDFS:
[hadoop@Master ~]$ sqoop import --connect jdbc:mysql://Slave2:3306/hive --username hive --password hive --table tb1 -m 1
Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: $HADOOP_HOME is deprecated.
13/12/07 17:47:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/12/07 17:47:10 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
13/12/07 17:47:10 INFO tool.CodeGenTool: Beginning code generation
13/12/07 17:47:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb1` AS t LIMIT 1
13/12/07 17:47:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb1` AS t LIMIT 1
13/12/07 17:47:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hadoop-1.1.2
13/12/07 17:47:28 INFO mapreduce.ImportJobBase: Beginning import of tb1
13/12/07 17:47:47 INFO mapred.JobClient: Running job: job_201312071702_0001
13/12/07 17:47:48 INFO mapred.JobClient:  map 0% reduce 0%
13/12/07 17:50:25 INFO mapred.JobClient:  map 100% reduce 0%
13/12/07 17:51:50 INFO mapred.JobClient: Job complete: job_201312071702_0001
13/12/07 17:51:51 INFO mapred.JobClient: Counters: 18
13/12/07 17:51:51 INFO mapred.JobClient:   Job Counters
13/12/07 17:51:51 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=143687
13/12/07 17:51:51 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/07 17:51:51 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/12/07 17:51:51 INFO mapred.JobClient:     Launched map tasks=1
13/12/07 17:51:51 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/12/07 17:51:51 INFO mapred.JobClient:   File Output Format Counters
13/12/07 17:51:51 INFO mapred.JobClient:     Bytes Written=6784
13/12/07 17:51:51 INFO mapred.JobClient:   FileSystemCounters
13/12/07 17:51:51 INFO mapred.JobClient:     HDFS_BYTES_READ=87
13/12/07 17:51:51 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=62951
13/12/07 17:51:51 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=6784
13/12/07 17:51:51 INFO mapred.JobClient:   File Input Format Counters
13/12/07 17:51:51 INFO mapred.JobClient:     Bytes Read=0
13/12/07 17:51:51 INFO mapred.JobClient:   Map-Reduce Framework
13/12/07 17:51:51 INFO mapred.JobClient:     Map input records=143
13/12/07 17:51:51 INFO mapred.JobClient:     Physical memory (bytes) snapshot=61415424
13/12/07 17:51:51 INFO mapred.JobClient:     Spilled Records=0
13/12/07 17:51:51 INFO mapred.JobClient:     CPU time spent (ms)=29210
13/12/07 17:51:51 INFO mapred.JobClient:     Total committed heap usage (bytes)=30277632
13/12/07 17:51:51 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=1251344384
13/12/07 17:51:51 INFO mapred.JobClient:     Map output records=143
13/12/07 17:51:51 INFO mapred.JobClient:     SPLIT_RAW_BYTES=87
13/12/07 17:51:51 INFO mapreduce.ImportJobBase: Transferred 6.625 KB in 260.7197 seconds (26.0203 bytes/sec)
13/12/07 17:51:51 INFO mapreduce.ImportJobBase: Retrieved 143 records.
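The summary figures in the last lines of the log can be cross-checked against the counters: HDFS_BYTES_WRITTEN is 6784 bytes, and the job reports 260.7197 seconds. The sketch below recomputes the size in KB and the throughput with awk; the function name transfer_summary is hypothetical and exists only for this illustration.

```shell
# Recompute the transfer size (KB) and rate (bytes/sec) from the
# byte count and elapsed seconds reported in the import log.
transfer_summary() {
    bytes="$1"
    seconds="$2"
    awk -v b="$bytes" -v s="$seconds" 'BEGIN {
        printf "%.3f KB, %.4f bytes/sec\n", b / 1024, b / s
    }'
}

# Values taken from the log: HDFS_BYTES_WRITTEN and the elapsed time.
transfer_summary 6784 260.7197
```

The result agrees with the "Transferred 6.625 KB in 260.7197 seconds (26.0203 bytes/sec)" line in the log.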
7) List the files that were just imported into HDFS:
[hadoop@Master ~]$ hadoop fs -ls tb1
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r--   2 hadoop supergroup          0 2013-12-07 17:51 /user/hadoop/tb1/_SUCCESS
drwxr-xr-x   - hadoop supergroup          0 2013-12-07 17:47 /user/hadoop/tb1/_logs
-rw-r--r--   2 hadoop supergroup       6784 2013-12-07 17:50 /user/hadoop/tb1/part-m-00000
8) View the file contents:
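Sqoop writes one part-m-NNNNN file per map task, so the single data file in the listing follows from the -m 1 option in step 6. The sketch below prints the file names one would expect to see for a given number of mappers; part_file_names is a hypothetical helper, not a Sqoop or Hadoop command.

```shell
# Print the part file names expected for a map-only import
# that runs with the given number of map tasks.
part_file_names() {
    mappers="$1"
    i=0
    while [ "$i" -lt "$mappers" ]; do
        printf 'part-m-%05d\n' "$i"
        i=$((i + 1))
    done
}

# Step 6 used -m 1, so a single file is expected.
part_file_names 1
```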
[hadoop@Master ~]$ hadoop fs -cat /user/hadoop/tb1/part-m-00000
Warning: $HADOOP_HOME is deprecated.
information_schema,CHARACTER_SETS,SYSTEM VIEW
information_schema,COLLATIONS,SYSTEM VIEW
information_schema,COLLATION_CHARACTER_SET_APPLICABILITY,SYSTEM VIEW
information_schema,COLUMNS,SYSTEM VIEW
information_schema,COLUMN_PRIVILEGES,SYSTEM VIEW
information_schema,ENGINES,SYSTEM VIEW
information_schema,EVENTS,SYSTEM VIEW
information_schema,FILES,SYSTEM VIEW
information_schema,GLOBAL_STATUS,SYSTEM VIEW
information_schema,GLOBAL_VARIABLES,SYSTEM VIEW
information_schema,KEY_COLUMN_USAGE,SYSTEM VIEW
information_schema,OPTIMIZER_TRACE,SYSTEM VIEW
information_schema,PARAMETERS,SYSTEM VIEW
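Each line of the imported file holds the three columns of tb1 (table_schema, table_name, table_type) separated by commas, which is Sqoop's default text format. As a rough sketch of how such output can be checked, the snippet below counts rows per table_type with awk. The function name count_by_type is hypothetical, and the approach assumes no field value itself contains a comma.

```shell
# Count rows per table_type (third comma-separated field) read from stdin.
count_by_type() {
    awk -F',' '{ count[$3]++ } END { for (t in count) print t ": " count[t] }' | sort
}

# Sample rows copied from the file contents shown in step 8.
count_by_type <<'EOF'
information_schema,CHARACTER_SETS,SYSTEM VIEW
information_schema,COLLATIONS,SYSTEM VIEW
information_schema,COLUMNS,SYSTEM VIEW
EOF
```

On the cluster the same function could be fed from HDFS, for example: hadoop fs -cat /user/hadoop/tb1/part-m-00000 | count_by_type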