您的位置：首页 > 数据库 > SQL

利用sqoop把Mysql中的表数据导出到HDFS下的文本文件里

2013-12-08 21:54 281 查看

1）下载并安装sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz到/usr目录下

tar -zvxf sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz

2）配置环境变量/etc/profile

export SQOOP_HOME=/usr/sqoop-1.4.4.bin__hadoop-1.0.0
export HADOOP_HOME=/usr/hadoop-1.1.2
export PATH=$PATH:$SQOOP_HOME/bin
配置完成后执行命令：source/etc/profile 使配置生效
3）启动hadoop集群，测试sqoop是否安装配置成功：

[hadoop@Master bin]$ sqoop version

Warning: $HADOOP_HOME is deprecated.

Sqoop 1.4.4

git commit id 050a2015514533bc25f3134a33401470ee9353ad

Compiled byvasanthkumar on Mon Jul 22 20:01:26 IST 2013

4）连接mysql数据库，创建测试表：

[hadoop@Master bin]$mysql -h10.24.46.4 -uhive –phive

mysql>use hive

mysql> createtable tb1 as select table_schema,table_name,table_type frominformation_schema.TABLES;

5）测试sqoop与mysql的连接：

[hadoop@Master ~]$ sqoop list-databases --connectjdbc:mysql://10.24.46.4:3306 --username hive --password hive

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: $HADOOP_HOME is deprecated.

13/12/07 17:43:59 WARN tool.BaseSqoopTool: Setting your password onthe command-line is insecure. Consider using -P instead.

13/12/07 17:44:00 INFO manager.MySQLManager: Preparing to use aMySQL streaming resultset.

information_schema

hive

mysql

performance_schema

test

6）从mysql导入数据到HDFS：

[hadoop@Master ~]$ sqoop import --connectjdbc:mysql://Slave2:3306/hive --username hive --password hive --table tb1 -m 1

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: $HADOOP_HOME is deprecated.

13/12/07 17:47:09 WARN tool.BaseSqoopTool: Setting your password onthe command-line is insecure. Consider using -P instead.

13/12/07 17:47:10 INFO manager.MySQLManager: Preparing to use aMySQL streaming resultset.

13/12/07 17:47:10 INFO tool.CodeGenTool: Beginning code generation

13/12/07 17:47:14 INFO manager.SqlManager: Executing SQL statement:SELECT t.* FROM `tb1` AS t LIMIT 1

13/12/07 17:47:14 INFO manager.SqlManager: Executing SQL statement:SELECT t.* FROM `tb1` AS t LIMIT 1

13/12/07 17:47:14INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hadoop-1.1.2

13/12/07 17:47:28INFO mapreduce.ImportJobBase: Beginning import of tb1
13/12/07 17:47:47INFO mapred.JobClient: Running job: job_201312071702_0001
13/12/07 17:47:48INFO mapred.JobClient: map 0% reduce 0%
13/12/07 17:50:25INFO mapred.JobClient: map 100% reduce0%
13/12/07 17:51:50INFO mapred.JobClient: Job complete: job_201312071702_0001
13/12/07 17:51:51INFO mapred.JobClient: Counters: 18
13/12/07 17:51:51INFO mapred.JobClient: Job Counters
13/12/07 17:51:51INFO mapred.JobClient: SLOTS_MILLIS_MAPS=143687
13/12/07 17:51:51INFO mapred.JobClient: Total timespent by all reduces waiting after reserving slots (ms)=0
13/12/07 17:51:51INFO mapred.JobClient: Total timespent by all maps waiting after reserving slots (ms)=0
13/12/07 17:51:51INFO mapred.JobClient: Launched maptasks=1
13/12/07 17:51:51INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
13/12/07 17:51:51INFO mapred.JobClient: File OutputFormat Counters
13/12/07 17:51:51INFO mapred.JobClient: BytesWritten=6784
13/12/07 17:51:51INFO mapred.JobClient: FileSystemCounters
13/12/07 17:51:51INFO mapred.JobClient: HDFS_BYTES_READ=87
13/12/07 17:51:51INFO mapred.JobClient: FILE_BYTES_WRITTEN=62951
13/12/07 17:51:51INFO mapred.JobClient: HDFS_BYTES_WRITTEN=6784
13/12/07 17:51:51INFO mapred.JobClient: File InputFormat Counters
13/12/07 17:51:51INFO mapred.JobClient: Bytes Read=0
13/12/07 17:51:51INFO mapred.JobClient: Map-ReduceFramework
13/12/07 17:51:51INFO mapred.JobClient: Map inputrecords=143
13/12/07 17:51:51INFO mapred.JobClient: Physicalmemory (bytes) snapshot=61415424
13/12/07 17:51:51INFO mapred.JobClient: SpilledRecords=0
13/12/07 17:51:51INFO mapred.JobClient: CPU time spent(ms)=29210
13/12/07 17:51:51INFO mapred.JobClient: Totalcommitted heap usage (bytes)=30277632
13/12/07 17:51:51INFO mapred.JobClient: Virtual memory(bytes) snapshot=1251344384
13/12/07 17:51:51INFO mapred.JobClient: Map outputrecords=143
13/12/07 17:51:51INFO mapred.JobClient: SPLIT_RAW_BYTES=87
13/12/07 17:51:51INFO mapreduce.ImportJobBase: Transferred 6.625 KB in 260.7197 seconds (26.0203bytes/sec)
13/12/07 17:51:51INFO mapreduce.ImportJobBase: Retrieved 143 records.
7）在HDFS中查看刚才导入的文件：

[hadoop@Master ~]$ hadoop fs -ls tb1

Warning: $HADOOP_HOME is deprecated.

Found 3 items

-rw-r--r-- 2 hadoopsupergroup 0 2013-12-07 17:51/user/hadoop/tb1/_SUCCESS

drwxr-xr-x - hadoopsupergroup 0 2013-12-07 17:47/user/hadoop/tb1/_logs

-rw-r--r-- 2 hadoop supergroup 6784 2013-12-07 17:50/user/hadoop/tb1/part-m-00000

8）查看文件内容：

[hadoop@Master ~]$ hadoop fs -cat /user/hadoop/tb1/part-m-00000

Warning: $HADOOP_HOME is deprecated.

information_schema,CHARACTER_SETS,SYSTEM VIEW

information_schema,COLLATIONS,SYSTEM VIEW

information_schema,COLLATION_CHARACTER_SET_APPLICABILITY,SYSTEM VIEW

information_schema,COLUMNS,SYSTEM VIEW

information_schema,COLUMN_PRIVILEGES,SYSTEM VIEW

information_schema,ENGINES,SYSTEM VIEW

information_schema,EVENTS,SYSTEM VIEW

information_schema,FILES,SYSTEM VIEW

information_schema,GLOBAL_STATUS,SYSTEM VIEW

information_schema,GLOBAL_VARIABLES,SYSTEM VIEW

information_schema,KEY_COLUMN_USAGE,SYSTEM VIEW

information_schema,OPTIMIZER_TRACE,SYSTEM VIEW

information_schema,PARAMETERS,SYSTEM VIEW

内容来自用户分享和网络整理，不保证内容的准确性，如有侵权内容，可联系管理员处理

标签：

相关文章推荐

新的分享

章节导航