
sqoop-1.4.6 Installation and Configuration

2017-09-26 11:12
1. Environment information

[hadoop@master sqoop-1.4.6]$ cat /etc/redhat-release
CentOS Linux release 7.1.1503 (Core)
[hadoop@master sqoop-1.4.6]$
[hadoop@master sqoop-1.4.6]$ mysql --version
mysql  Ver 14.14 Distrib 5.6.37, for Linux (x86_64) using  EditLine wrapper

[hadoop@master sqoop-1.4.6]$ hadoop version
Hadoop 2.8.1
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 20fe5304904fc2f5a18053c389e43cd26f7a70fe
Compiled by vinodkv on 2017-06-02T06:14Z
Compiled with protoc 2.5.0
From source with checksum 60125541c2b3e266cbf3becc5bda666
This command was run using /home/hadoop/hadoop-2.8.1/share/hadoop/common/hadoop-common-2.8.1.jar

[hadoop@master sqoop-1.4.6]$ hive --version
Hive 2.1.1
Subversion git://jcamachorodriguez-rMBP.local/Users/jcamachorodriguez/src/workspaces/hive/HIVE-release2/hive -r 1af77bbf8356e86cabbed92cfa8cc2e1470a1d5c
Compiled by jcamachorodriguez on Tue Nov 29 19:46:12 GMT 2016
From source with checksum 569ad6d6e5b71df3cb04303183948d90


[hadoop@master sqoop-1.4.6]$ hbase version
HBase 1.2.6
Source code repository file:///home/busbey/projects/hbase/hbase-assembly/target/hbase-1.2.6 revision=Unknown
Compiled by busbey on Mon May 29 02:25:32 CDT 2017
From source with checksum 7e8ce83a648e252758e9dae1fbe779c9

2. Download
http://mirror.bit.edu.cn/apache/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
There are several Sqoop packages in that directory; make sure to pick the version above, because the plain sqoop-1.4.6.tar.gz package is missing the sqoop-1.4.6.jar file.
2.1 Extract and configure
Extract the archive under /home/hadoop and rename the resulting directory to sqoop-1.4.6.
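A minimal sketch of the download and extraction steps, assuming the mirror URL above is still reachable (swap in another Apache mirror if it is not):

[hadoop@master ~]$ cd /home/hadoop
[hadoop@master ~]$ wget http://mirror.bit.edu.cn/apache/sqoop/1.4.6/sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@master ~]$ tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
[hadoop@master ~]$ mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop-1.4.6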
[hadoop@master ~]$ cd /home/hadoop/sqoop-1.4.6

2.1.1 Copy files
[hadoop@master sqoop-1.4.6]$ cp sqoop-1.4.6.jar lib/    # copy sqoop-1.4.6.jar into the lib directory
[hadoop@master sqoop-1.4.6]$ cp mysql-connector-java-5.1.44-bin.jar lib/    # copy the MySQL JDBC driver into the lib directory; it must be downloaded and unpacked first, as covered in the earlier Hive installation post
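If the MySQL Connector/J jar is not already at hand from the Hive setup, it can be fetched and unpacked first; a sketch, where the download URL is an assumption (any mirror carrying Connector/J 5.1.44 works):

[hadoop@master ~]$ cd /home/hadoop
[hadoop@master ~]$ wget https://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.44.tar.gz    # assumed download location
[hadoop@master ~]$ tar -zxvf mysql-connector-java-5.1.44.tar.gz
[hadoop@master ~]$ cp mysql-connector-java-5.1.44/mysql-connector-java-5.1.44-bin.jar /home/hadoop/sqoop-1.4.6/lib/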



2.2 Configuration
sqoop-env.sh
[hadoop@master conf]$ cat sqoop-env.sh
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0 #
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/hadoop-2.8.1/

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop-2.8.1/share/hadoop/mapreduce

#set the path to where bin/hbase is available
export HBASE_HOME=/home/hadoop/hbase-1.2.6

#Set the path to where bin/hive is available
export HIVE_HOME=/home/hadoop/apache-hive-2.1.1

#Set the path for where zookeper config dir is
#export ZOOCFGDIR= # use the bundled ZooKeeper
/etc/profile environment variables — append the following at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_131/
export HADOOP_HOME=/home/hadoop/hadoop-2.8.1/
export HIVE_HOME=/home/hadoop/apache-hive-2.1.1
export HBASE_HOME=/home/hadoop/hbase-1.2.6
export SQOOP_HOME=/home/hadoop/sqoop-1.4.6 # add the Sqoop home variable
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$SQOOP_HOME/bin # add Sqoop to PATH
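Reload the profile and run a quick sanity check; sqoop version will still print HCatalog/Accumulo warnings until the next step:

[hadoop@master ~]$ source /etc/profile
[hadoop@master ~]$ sqoop version    # should report Sqoop 1.4.6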

configure-sqoop 

Comment out the section below (otherwise warnings are printed at startup):

[hadoop@master bin]$ cat -n configure-sqoop 
...........
134 ## Moved to be a runtime check in sqoop.
135  #if [ ! -d "${HCAT_HOME}" ]; then
136  #  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
137  #  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
138  #fi
139
140  #if [ ! -d "${ACCUMULO_HOME}" ]; then
141  #  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
142  #  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
143  #fi
..........
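The same edit can be scripted instead of done by hand; a sketch, assuming the line numbers in your copy of configure-sqoop match the cat -n output above:

[hadoop@master bin]$ cp configure-sqoop configure-sqoop.bak    # keep a backup
[hadoop@master bin]$ sed -i -e '135,138s/^/#/' -e '140,143s/^/#/' configure-sqoop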



3. Verification
[hadoop@master bin]$ sqoop import --connect jdbc:mysql://10.0.1.98/ykt --table paper_detail --username root --password 123456 --direct -m 1
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.8.1/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-1.2.6/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
17/09/23 19:52:01 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
17/09/23 19:52:01 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/09/23 19:52:02 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/09/23 19:52:02 INFO tool.CodeGenTool: Beginning code generation
17/09/23 19:52:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `paper_detail` AS t LIMIT 1
17/09/23 19:52:02 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `paper_detail` AS t LIMIT 1
17/09/23 19:52:02 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /home/hadoop/hadoop-2.8.1/share/hadoop/mapreduce
Note: /tmp/sqoop-hadoop/compile/f2ada91047344c6af2723a1a8044f440/paper_detail.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/09/23 19:52:06 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/f2ada91047344c6af2723a1a8044f440/paper_detail.jar
17/09/23 19:52:06 INFO manager.DirectMySQLManager: Beginning mysqldump fast path import
17/09/23 19:52:06 INFO mapreduce.ImportJobBase: Beginning import of paper_detail
17/09/23 19:52:07 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/09/23 19:52:10 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/09/23 19:52:11 INFO client.RMProxy: Connecting to ResourceManager at master/10.0.1.118:18040
17/09/23 19:52:22 INFO db.DBInputFormat: Using read commited transaction isolation
17/09/23 19:52:23 INFO mapreduce.JobSubmitter: number of splits:1
17/09/23 19:52:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1506138213755_0004
17/09/23 19:52:28 INFO impl.YarnClientImpl: Submitted application application_1506138213755_0004
17/09/23 19:52:29 INFO mapreduce.Job: The url to track the job: http://master:18088/proxy/application_1506138213755_0004/
17/09/23 19:52:29 INFO mapreduce.Job: Running job: job_1506138213755_0004
17/09/23 19:52:55 INFO mapreduce.Job: Job job_1506138213755_0004 running in uber mode : false
17/09/23 19:52:55 INFO mapreduce.Job: map 0% reduce 0%
17/09/23 19:53:26 INFO mapreduce.Job: map 100% reduce 0%
17/09/23 19:53:27 INFO mapreduce.Job: Job job_1506138213755_0004 completed successfully
17/09/23 19:53:28 INFO mapreduce.Job: Counters: 30
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=160883
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=87
HDFS: Number of bytes written=27954
HDFS: Number of read operations=4
HDFS: Number of large read operations=0
HDFS: Number of write operations=2
Job Counters
Launched map tasks=1
Other local map tasks=1
Total time spent by all maps in occupied slots (ms)=27272
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=27272
Total vcore-seconds taken by all map tasks=27272
Total megabyte-seconds taken by all map tasks=27926528
Map-Reduce Framework
Map input records=1
Map output records=962
Input split bytes=87
Spilled Records=0
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=92
CPU time spent (ms)=1150
Physical memory (bytes) snapshot=105013248
Virtual memory (bytes) snapshot=2092593152
Total committed heap usage (bytes)=17776640
File Input Format Counters
Bytes Read=0
File Output Format Counters
Bytes Written=27954
17/09/23 19:53:28 INFO mapreduce.ImportJobBase: Transferred 27.2988 KB in 77.2979 seconds (361.6399 bytes/sec)
17/09/23 19:53:28 INFO mapreduce.ImportJobBase: Retrieved 962 records.
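The 19:52:01 warning above notes that passing the password on the command line is insecure; the same import can prompt for it interactively with -P, as the warning suggests:

[hadoop@master bin]$ sqoop import --connect jdbc:mysql://10.0.1.98/ykt --table paper_detail --username root -P --direct -m 1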

View the imported data in HDFS

If no HDFS directory is specified, the output is written to /user/<username>/paper_detail/ by default (see the --target-dir example after the data listing).
[hadoop@master bin]$ hadoop fs -cat /user/hadoop/paper_detail/part-m-00000 |more
1,1,填空,22,5,1,1
1,1,填空,23,5,2,2
1,1,填空,31,5,3,3
1,2,解答题,403,5,1,4
1,2,解答题,394,5,2,5
1,3,多选题,987,5,1,6
1,4,单选题,757,5,1,7
1,4,单选题,133,5,2,8
2,1,单项选择,19,1,1,1
2,2,数字选择,18,2,1,2
2,2,数字选择,21,2,2,3
2,2,数字选择,24,2,3,4
2,2,数字选择,25,2,4,5
2,3,填空,20,5,1,6
2,3,填空,23,10,2,7
2,4,233,395,12,1,8
3,1,单项选择,16,2,1,1
3,1,单项选择,17,2,2,2
3,1,单项选择,18,2,3,3
3,1,单项选择,21,2,4,4
3,1,单项选择,24,2,5,5
3,1,单项选择,25,2,6,6
3,2,解答,62,2,1,7
3,2,解答,63,2,2,8
4,1,选择题,717,2,1,1
4,1,选择题,718,2,2,2
......
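To land the data somewhere other than the default /user/<username>/<table>/ directory, add --target-dir; a sketch with an example path:

[hadoop@master bin]$ sqoop import --connect jdbc:mysql://10.0.1.98/ykt --table paper_detail --username root -P --direct -m 1 --target-dir /user/hadoop/ykt/paper_detail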

Official Sqoop user guide: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_controlling_the_import_process
Tags: hadoop sqoop