
Using Sqoop to import MySQL table data into text files on HDFS

2013-12-08 21:54

1) Download sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz into the /usr directory and unpack it there:

tar -zvxf sqoop-1.4.4.bin__hadoop-1.0.0.tar.gz

2) Configure the environment variables in /etc/profile:

export SQOOP_HOME=/usr/sqoop-1.4.4.bin__hadoop-1.0.0
export HADOOP_HOME=/usr/hadoop-1.1.2
export PATH=$PATH:$SQOOP_HOME/bin
After editing the file, run the command "source /etc/profile" so that the settings take effect.
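A quick way to confirm that the new settings are active is to check whether the Sqoop bin directory ended up on PATH. The snippet below is only a sketch: it repeats the exports from step 2) so that it runs on its own, and the variable name path_status is made up for illustration.

```shell
# Same values as in /etc/profile above; adjust if Sqoop is unpacked elsewhere.
export SQOOP_HOME=/usr/sqoop-1.4.4.bin__hadoop-1.0.0
export PATH=$PATH:$SQOOP_HOME/bin

# Look for $SQOOP_HOME/bin as a complete PATH entry.
case ":$PATH:" in
  *":$SQOOP_HOME/bin:"*) path_status="ok" ;;
  *) path_status="missing" ;;
esac
echo "SQOOP_HOME/bin on PATH: $path_status"
```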
3) Start the Hadoop cluster, then check that Sqoop is installed and configured correctly:

[hadoop@Master bin]$ sqoop version

Warning: $HADOOP_HOME is deprecated.

Sqoop 1.4.4

git commit id 050a2015514533bc25f3134a33401470ee9353ad

Compiled by vasanthkumar on Mon Jul 22 20:01:26 IST 2013

4) Connect to the MySQL database and create a test table:

[hadoop@Master bin]$ mysql -h10.24.46.4 -uhive -phive

mysql> use hive

mysql> create table tb1 as select table_schema,table_name,table_type from information_schema.TABLES;

5) Test the connection between Sqoop and MySQL:

[hadoop@Master ~]$ sqoop list-databases --connect jdbc:mysql://10.24.46.4:3306 --username hive --password hive

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: $HADOOP_HOME is deprecated.

13/12/07 17:43:59 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

13/12/07 17:44:00 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

information_schema

hive

mysql

performance_schema

test
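The WARN line in this output recommends -P, which makes Sqoop prompt for the password instead of reading it from the command line. The following dry-run sketch builds that variant of the command. The variable names DB_HOST, DB_USER and RUN are illustrative assumptions; the host and user are the ones used in step 4). By default the script only prints the command; set RUN=1 to actually execute it on a machine where sqoop is installed.

```shell
DB_HOST=10.24.46.4   # MySQL host from step 4)
DB_USER=hive         # MySQL user from step 4)

# Build the command as positional parameters so each argument stays intact.
set -- sqoop list-databases \
  --connect "jdbc:mysql://${DB_HOST}:3306" \
  --username "$DB_USER" \
  -P

if [ "${RUN:-0}" = "1" ]; then
  "$@"        # runs sqoop, which then prompts for the password
else
  echo "$*"   # dry run: only show the command
fi
```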

6) Import the data from MySQL into HDFS:

[hadoop@Master ~]$ sqoop import --connect jdbc:mysql://Slave2:3306/hive --username hive --password hive --table tb1 -m 1

Warning: /usr/lib/hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: $HADOOP_HOME is deprecated.

13/12/07 17:47:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

13/12/07 17:47:10 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

13/12/07 17:47:10 INFO tool.CodeGenTool: Beginning code generation

13/12/07 17:47:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb1` AS t LIMIT 1

13/12/07 17:47:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb1` AS t LIMIT 1

13/12/07 17:47:14 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hadoop-1.1.2

13/12/07 17:47:28 INFO mapreduce.ImportJobBase: Beginning import of tb1
13/12/07 17:47:47 INFO mapred.JobClient: Running job: job_201312071702_0001
13/12/07 17:47:48 INFO mapred.JobClient: map 0% reduce 0%
13/12/07 17:50:25 INFO mapred.JobClient: map 100% reduce 0%
13/12/07 17:51:50 INFO mapred.JobClient: Job complete: job_201312071702_0001
13/12/07 17:51:51 INFO mapred.JobClient: Counters: 18
13/12/07 17:51:51 INFO mapred.JobClient: Job Counters
13/12/07 17:51:51 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=143687
13/12/07 17:51:51 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/12/07 17:51:51 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/12/07 17:51:51 INFO mapred.JobClient: Launched map tasks=1
13/12/07 17:51:51 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
13/12/07 17:51:51 INFO mapred.JobClient: File OutputFormat Counters
13/12/07 17:51:51 INFO mapred.JobClient: Bytes Written=6784
13/12/07 17:51:51 INFO mapred.JobClient: FileSystemCounters
13/12/07 17:51:51 INFO mapred.JobClient: HDFS_BYTES_READ=87
13/12/07 17:51:51 INFO mapred.JobClient: FILE_BYTES_WRITTEN=62951
13/12/07 17:51:51 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=6784
13/12/07 17:51:51 INFO mapred.JobClient: File InputFormat Counters
13/12/07 17:51:51 INFO mapred.JobClient: Bytes Read=0
13/12/07 17:51:51 INFO mapred.JobClient: Map-Reduce Framework
13/12/07 17:51:51 INFO mapred.JobClient: Map input records=143
13/12/07 17:51:51 INFO mapred.JobClient: Physical memory (bytes) snapshot=61415424
13/12/07 17:51:51 INFO mapred.JobClient: Spilled Records=0
13/12/07 17:51:51 INFO mapred.JobClient: CPU time spent (ms)=29210
13/12/07 17:51:51 INFO mapred.JobClient: Total committed heap usage (bytes)=30277632
13/12/07 17:51:51 INFO mapred.JobClient: Virtual memory (bytes) snapshot=1251344384
13/12/07 17:51:51 INFO mapred.JobClient: Map output records=143
13/12/07 17:51:51 INFO mapred.JobClient: SPLIT_RAW_BYTES=87
13/12/07 17:51:51 INFO mapreduce.ImportJobBase: Transferred 6.625 KB in 260.7197 seconds (26.0203 bytes/sec)
13/12/07 17:51:51 INFO mapreduce.ImportJobBase: Retrieved 143 records.
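The "Transferred" summary in the log can be cross-checked against the counters: HDFS_BYTES_WRITTEN is 6784 bytes, and the job reports 260.7197 seconds. A small sketch of that arithmetic follows; the variable names bytes, seconds and summary are made up for illustration.

```shell
bytes=6784          # HDFS_BYTES_WRITTEN from the job counters
seconds=260.7197    # elapsed time from the "Transferred" line

# Convert bytes to KB (1024 bytes) and compute the transfer rate.
summary=$(awk -v b="$bytes" -v s="$seconds" \
  'BEGIN { printf "%.3f KB, %.4f bytes/sec", b / 1024, b / s }')
echo "$summary"
```

Both figures agree with the log line, so the reported size is simply the byte count of the single part-m-00000 file written by the one map task.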

7) List the files that were just imported into HDFS:

[hadoop@Master ~]$ hadoop fs -ls tb1

Warning: $HADOOP_HOME is deprecated.

Found 3 items

-rw-r--r-- 2 hadoop supergroup 0 2013-12-07 17:51 /user/hadoop/tb1/_SUCCESS

drwxr-xr-x - hadoop supergroup 0 2013-12-07 17:47 /user/hadoop/tb1/_logs

-rw-r--r-- 2 hadoop supergroup 6784 2013-12-07 17:50 /user/hadoop/tb1/part-m-00000

8) View the file contents:

[hadoop@Master ~]$ hadoop fs -cat /user/hadoop/tb1/part-m-00000

Warning: $HADOOP_HOME is deprecated.

information_schema,CHARACTER_SETS,SYSTEM VIEW

information_schema,COLLATIONS,SYSTEM VIEW

information_schema,COLLATION_CHARACTER_SET_APPLICABILITY,SYSTEM VIEW

information_schema,COLUMNS,SYSTEM VIEW

information_schema,COLUMN_PRIVILEGES,SYSTEM VIEW

information_schema,ENGINES,SYSTEM VIEW

information_schema,EVENTS,SYSTEM VIEW

information_schema,FILES,SYSTEM VIEW

information_schema,GLOBAL_STATUS,SYSTEM VIEW

information_schema,GLOBAL_VARIABLES,SYSTEM VIEW

information_schema,KEY_COLUMN_USAGE,SYSTEM VIEW

information_schema,OPTIMIZER_TRACE,SYSTEM VIEW

information_schema,PARAMETERS,SYSTEM VIEW
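
Each line of the imported file holds the three columns of tb1 separated by commas, so ordinary text tools can process it. As a rough sketch, the snippet below counts rows per (table_schema, table_type) pair with awk. To stay self-contained it uses three sample rows copied from the output above; on a real cluster the same awk command could read from "hadoop fs -cat /user/hadoop/tb1/part-m-00000" instead. The variable names sample and counts are illustrative.

```shell
# Three sample rows taken from the file contents shown above.
sample='information_schema,CHARACTER_SETS,SYSTEM VIEW
information_schema,COLLATIONS,SYSTEM VIEW
information_schema,COLUMNS,SYSTEM VIEW'

# Field 1 is table_schema, field 3 is table_type; count rows per pair.
counts=$(printf '%s\n' "$sample" |
  awk -F',' '{ n[$1 "," $3]++ } END { for (k in n) print k "," n[k] }')
echo "$counts"
```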