YARN & HDFS2: Installing and Configuring Kerberos
2013-09-11 16:28
Today I tried to configure Kerberos on a Hadoop 2.x development cluster and ran into a few problems. These are my notes.
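Configuring Hadoop security
core-site.xml
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
hadoop.security.authentication defaults to simple, which relies on the Linux operating system for authentication: the client runs whoami and passes the resulting username to the server over RPC, so a malicious user can easily impersonate someone by creating an account with the same name on another host. Here we change it to kerberos.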
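To confirm which authentication mode a node will actually pick up, it helps to read the value back from the file. The sketch below is only an illustration: get_hadoop_prop is a hypothetical helper name, and it assumes plain name/value pairs with no XML comments or <description> element between them.

```shell
# Hypothetical helper: print the <value> of one property from a Hadoop *-site.xml.
# Assumes <name> is directly followed by <value> (only whitespace in between).
get_hadoop_prop() {
    conf_file=$1
    prop_name=$2
    # Join the file into one line so a property split across lines still matches.
    tr -d '\n' < "$conf_file" \
        | grep -o "<name>${prop_name}</name>[[:space:]]*<value>[^<]*</value>" \
        | sed -e 's/.*<value>//' -e 's/<\/value>$//'
}
```

For example, get_hadoop_prop "$HADOOP_HOME/etc/hadoop/core-site.xml" hadoop.security.authentication should print kerberos once the change above is in place. On a node with a working Hadoop installation, hdfs getconf -confKey hadoop.security.authentication is the more robust way to ask the same question.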
Configuring HDFS security
hdfs-site.xml
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.https.enable</name>
  <value>false</value>
</property>
<property>
  <name>dfs.namenode.https-address</name>
  <value>dev80.hadoop:50470</value>
</property>
<property>
  <name>dfs.https.port</name>
  <value>50470</value>
</property>
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/etc/hadoop.keytab</value>
</property>
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hadoop/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.namenode.kerberos.https.principal</name>
  <value>host/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>dev80.hadoop:50090</value>
</property>
<property>
  <name>dfs.namenode.secondary.https-port</name>
  <value>50470</value>
</property>
<property>
  <name>dfs.namenode.secondary.keytab.file</name>
  <value>/etc/hadoop.keytab</value>
</property>
<property>
  <name>dfs.namenode.secondary.kerberos.principal</name>
  <value>hadoop/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.namenode.secondary.kerberos.https.principal</name>
  <value>host/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.datanode.data.dir.perm</name>
  <value>700</value>
</property>
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:1003</value>
</property>
<property>
  <name>dfs.datanode.http.address</name>
  <value>0.0.0.0:1007</value>
</property>
<property>
  <name>dfs.datanode.https.address</name>
  <value>0.0.0.0:1005</value>
</property>
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/etc/hadoop.keytab</value>
</property>
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>hadoop/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.datanode.kerberos.https.principal</name>
  <value>host/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>HTTP/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/etc/hadoop.keytab</value>
  <description>
    The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.
  </description>
</property>
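A few points to note in this configuration:
1. dfs.datanode.address is the hostname or IP address and port that the DataNode's data transfer server binds to. With security enabled, the port must be below 1024 (a privileged port); otherwise the DataNode fails at startup with "Cannot start secure cluster without privileged resources".
2. The instance part of a principal can use the '_HOST' placeholder, which Hadoop automatically replaces with the host's fully qualified domain name.
3. With security enabled, Hadoop runs a permission check on the HDFS block data directories (set by dfs.data.dir). This prevents user code from reading block data straight off the local disk instead of going through the HDFS API, which would bypass both Kerberos and the file permission checks. Administrators can change the permissions of the DataNode directories with dfs.datanode.data.dir.perm; here we set it to 700.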
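The first two points are easy to sanity-check before a DataNode is started. The functions below are a rough sketch rather than anything Hadoop ships: is_privileged_port and expand_host_principal are hypothetical names, and the second one only imitates the _HOST substitution with sed to show what the resulting principal looks like.

```shell
# Succeeds when the port in a host:port address is privileged (1 to 1023).
is_privileged_port() {
    port=${1##*:}                      # keep everything after the last colon
    [ "$port" -gt 0 ] 2>/dev/null && [ "$port" -lt 1024 ]
}

# Illustration of the _HOST placeholder: substitute a given FQDN into a principal.
expand_host_principal() {
    principal=$1
    fqdn=$2
    printf '%s\n' "$principal" | sed "s/_HOST/$fqdn/"
}
```

With the values from this post, is_privileged_port 0.0.0.0:1003 succeeds while an address on a high port such as 0.0.0.0:50010 does not, and expand_host_principal 'hadoop/_HOST@DIANPING.COM' dev80.hadoop prints hadoop/dev80.hadoop@DIANPING.COM, which matches the entries in the keytab.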
The NameNode and Secondary NameNode are both started as the hadoop user.
The DataNode must be started as root through jsvc. The jsvc bundled with Hadoop 2.x is a 32-bit build, so download it again from the jsvc site and compile it:
wget http://mirror.esocc.com/apache//commons/daemon/binaries/commons-daemon-1.0.15-bin.tar.gz
cd src/native/unix
./configure
make
This produces a 64-bit jsvc executable. Copy it to $HADOOP_HOME/libexec and point JSVC_HOME at that path in hadoop-env.sh; otherwise startup reports "It looks like you're trying to start a secure DN, but $JSVC_HOME isn't set. Falling back to starting insecure DN."
[hadoop@dev80 unix]$ file jsvc
jsvc: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
Run mvn package to build commons-daemon-1.0.15.jar, copy it to $HADOOP_HOME/share/hadoop/hdfs/lib, and delete the commons-daemon jar that ships with Hadoop.
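The 64-bit check done by eye with file can also be scripted, which is handy when the binary is pushed to many hosts. This is a minimal sketch: is_64bit_elf is a hypothetical name, and it only pattern-matches the description text that file prints for an x86-64 build.

```shell
# Succeeds when a `file` description names a 64-bit x86-64 ELF binary.
is_64bit_elf() {
    case $1 in
        *"ELF 64-bit"*x86-64*) return 0 ;;
        *) return 1 ;;
    esac
}
```

A typical call would be is_64bit_elf "$(file -b "$HADOOP_HOME/libexec/jsvc")" || echo "jsvc is not a 64-bit build". The same function works for the container-executor binary that is rebuilt for YARN.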
Changes in hadoop-env.sh
# The jsvc implementation to use. Jsvc is required to run secure datanodes.
export JSVC_HOME=/usr/local/hadoop/hadoop-2.1.0-beta/libexec
# On secure datanodes, user to run the datanode as after dropping privileges
export HADOOP_SECURE_DN_USER=hadoop
# The directory where pid files are stored. /tmp by default
export HADOOP_SECURE_DN_PID_DIR=/usr/local/hadoop
# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=/data/logs
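Because a missing JSVC_HOME makes the DataNode fall back to an insecure start instead of failing outright, a small pre-flight check can catch the mistake earlier. The sketch below is an assumption about how one might do that; check_jsvc_home is a hypothetical helper and is not part of the Hadoop scripts.

```shell
# Succeeds when the given directory is non-empty and contains an executable jsvc.
check_jsvc_home() {
    jsvc_home=$1
    if [ -z "$jsvc_home" ]; then
        echo "JSVC_HOME is not set" >&2
        return 1
    fi
    if [ ! -x "$jsvc_home/jsvc" ]; then
        echo "no executable jsvc in $jsvc_home" >&2
        return 1
    fi
    return 0
}
```

Calling check_jsvc_home "$JSVC_HOME" || exit 1 as root, after sourcing hadoop-env.sh and before starting the DataNode, stops the start instead of silently running an insecure DataNode.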
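Distribute the configuration and jars to the whole cluster.
Start the NameNode as the hadoop account, then switch to root and start the DataNode. The NameNode web page now shows "Security is ON".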
The command that starts the secure DataNode
exec "$JSVC" \
  -Dproc_$COMMAND -outfile "$JSVC_OUTFILE" \
  -errfile "$JSVC_ERRFILE" \
  -pidfile "$HADOOP_SECURE_DN_PID" \
  -nodetach \
  -user "$HADOOP_SECURE_DN_USER" \
  -cp "$CLASSPATH" \
  $JAVA_HEAP_MAX $HADOOP_OPTS \
  org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter "$@"
If anything goes wrong during startup, check $JSVC_OUTFILE (default $HADOOP_LOG_DIR/jsvc.out) and $JSVC_ERRFILE (default $HADOOP_LOG_DIR/jsvc.err) to troubleshoot.
Configuring YARN security
yarn-site.xml
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/etc/hadoop.keytab</value>
</property>
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>hadoop/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/etc/hadoop.keytab</value>
</property>
<property>
  <name>yarn.nodemanager.principal</name>
  <value>hadoop/_HOST@DIANPING.COM</value>
</property>
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.group</name>
  <value>hadoop</value>
</property>
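The default container executor is DefaultContainerExecutor, which launches containers as the user that started the NodeManager. Switching to LinuxContainerExecutor launches them as the user who submitted the application; it uses a setuid executable to launch and destroy containers.
That executable is bin/container-executor. The copy shipped with Hadoop is again a 32-bit build, so it has to be recompiled.
Download the Hadoop 2.x source code and build it:
mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/etc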
Note: container-executor.conf.dir must be specified explicitly. It is the directory of the configuration file (container-executor.cfg) that the setuid executable depends on. The default is $HADOOP_HOME/etc/hadoop, but the file's parent directory and every directory above it must be owned by root, otherwise the error below is raised. For convenience we set it to /etc.
Caused by: org.apache.hadoop.util.Shell$ExitCodeException: File /usr/local/hadoop/hadoop-2.1.0-beta/etc/hadoop must be owned by root, but is owned by 500
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:458)
    at org.apache.hadoop.util.Shell.run(Shell.java:373)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:578)
    at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.init(LinuxContainerExecutor.java:147)
The default configuration lookup path:
[root@dev80 bin]# strings container-executor | grep etc
../etc/hadoop/container-executor.cfg
This shows that by default it loads $HADOOP_HOME/etc/hadoop/container-executor.cfg.
After rebuilding with container-executor.conf.dir=/etc:
[hadoop@dev80 bin]$ strings container-executor | grep etc
/etc/container-executor.cfg
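Settings in container-executor.cfg:
yarn.nodemanager.linux-container-executor.group=hadoop
min.user.id=499
min.user.id is the lowest uid allowed to launch containers; a task launched by a user with a lower uid fails. On CentOS and RHEL, regular user accounts usually start at uid 500.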
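To see in advance whether a given account would be rejected by this rule, the threshold can be read from the file and compared with the account's uid. This is a sketch under the assumption that the file uses the plain key=value layout shown above; read_min_user_id and uid_allowed are hypothetical names.

```shell
# Print the min.user.id value from a container-executor.cfg-style file.
read_min_user_id() {
    sed -n 's/^min\.user\.id=\([0-9][0-9]*\).*$/\1/p' "$1" | head -n 1
}

# Succeeds when the uid is at or above the configured minimum.
uid_allowed() {
    cfg=$1
    uid=$2
    min=$(read_min_user_id "$cfg")
    [ -n "$min" ] && [ "$uid" -ge "$min" ]
}
```

For instance, uid_allowed /etc/container-executor.cfg "$(id -u hadoop)" succeeds for the hadoop account (uid 500) with min.user.id=499, while a system account with a uid below 499 would be refused.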
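Copy container-executor to $HADOOP_HOME/bin, then set its ownership and permissions:
chown root:hadoop container-executor /etc/container-executor.cfg
chmod 4750 container-executor
chmod 400 /etc/container-executor.cfg
Sync the configuration files to the whole cluster and start the ResourceManager and NodeManagers as the hadoop account.
Note: to make testing easier, do not push the configuration to every node at first. Push it to one host, start the RM and a single NM, and sync to all hosts only once everything works.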
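Before syncing to the remaining hosts, it is worth confirming that the modes set with chmod actually took effect, since copy tools can drop the setuid bit. The sketch below assumes GNU coreutils stat (the -c '%a' format); file_mode and has_mode are hypothetical helper names.

```shell
# Print a file's permission bits in octal, including setuid/setgid (GNU stat assumed).
file_mode() {
    stat -c '%a' "$1"
}

# Succeeds when the file has exactly the expected octal mode.
has_mode() {
    [ "$(file_mode "$1")" = "$2" ]
}
```

On a configured node, has_mode "$HADOOP_HOME/bin/container-executor" 4750 && has_mode /etc/container-executor.cfg 400 should succeed; if it does not, repeat the chown and chmod steps on that host.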
Configuring JobHistory Server security
mapred-site.xml
<property>
  <name>mapreduce.jobhistory.keytab</name>
  <value>/etc/hadoop.keytab</value>
</property>
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>hadoop/_HOST@DIANPING.COM</value>
</property>
Start the JobHistoryServer:
sbin/mr-jobhistory-daemon.sh start historyserver
Run kinit to obtain a TGT (ticket granting ticket):
[hadoop@dev80 hadoop]$ kinit -r 24h -k -t /home/hadoop/.keytab hadoop
[hadoop@dev80 hadoop]$ klist
Ticket cache: FILE:/tmp/krb5cc_500
Default principal: hadoop@DIANPING.COM

Valid starting     Expires            Service principal
09/11/13 15:25:34  09/12/13 15:25:34  krbtgt/DIANPING.COM@DIANPING.COM
        renew until 09/12/13 15:25:34
Here /tmp/krb5cc_500 is the Kerberos ticket cache. By default a file named "krb5cc_" plus the uid is created under /tmp; 500 is the uid of the hadoop account.
[hadoop@dev80 hadoop]$ getent passwd
hadoop:x:500:500::/home/hadoop:/bin/bash
You can also choose the ticket cache path by setting export KRB5CCNAME=/tmp/krb5cc_500 in the environment.
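The lookup rule described here (KRB5CCNAME if set, otherwise a per-uid file under /tmp) can be written down in a few lines. This is only a sketch of that rule as stated in this post: ticket_cache_for_uid is a hypothetical name, and installations whose krb5.conf sets a different default cache type will not follow it.

```shell
# Print the ticket cache a client would use for the given uid:
# KRB5CCNAME when it is set, otherwise the FILE:/tmp/krb5cc_<uid> default.
ticket_cache_for_uid() {
    if [ -n "${KRB5CCNAME:-}" ]; then
        printf '%s\n' "$KRB5CCNAME"
    else
        printf 'FILE:/tmp/krb5cc_%s\n' "$1"
    fi
}
```

With no override, ticket_cache_for_uid "$(id -u)" run as the hadoop account prints FILE:/tmp/krb5cc_500, the same location klist reports.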
When you are done, destroy the ticket cache with kdestroy:
[hadoop@dev80 hadoop]$ kdestroy
[hadoop@dev80 hadoop]$ klist
klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_500)
Without a local ticket cache, Hadoop commands fail with the following error:
13/09/11 16:21:35 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
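Scripts that call Hadoop can avoid this error by checking for a ticket first and running kinit only when needed. The sketch below relies on klist -s, which prints nothing and reports through its exit status whether the cache holds unexpired credentials. The function names and the KLIST_CMD and KINIT_CMD overrides are hypothetical, added here only so the logic can be exercised without a KDC.

```shell
# Succeeds when the current ticket cache holds unexpired credentials.
have_valid_ticket() {
    ${KLIST_CMD:-klist} -s
}

# Obtain a TGT from a keytab only when no valid ticket is present.
ensure_ticket() {
    keytab=$1
    principal=$2
    have_valid_ticket || ${KINIT_CMD:-kinit} -k -t "$keytab" "$principal"
}
```

A job wrapper could start with ensure_ticket /home/hadoop/.keytab hadoop || exit 1 and then call hadoop or hdfs as usual.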
For reference, these are the principals in /etc/hadoop.keytab, all of them service principals:
[hadoop@dev80 hadoop]$ klist -k -t /etc/hadoop.keytab
Keytab name: WRFILE:/etc/hadoop.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 06/17/12 22:01:24 hadoop/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 hadoop/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 hadoop/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 hadoop/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 hadoop/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 hadoop/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 host/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 host/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 host/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 host/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 host/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 host/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 HTTP/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 HTTP/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 HTTP/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 HTTP/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 HTTP/dev80.hadoop@DIANPING.COM
   1 06/17/12 22:01:24 HTTP/dev80.hadoop@DIANPING.COM
/home/hadoop/.keytab holds the user principal:
[hadoop@dev80 hadoop]$ klist -k -t /home/hadoop/.keytab
Keytab name: WRFILE:/home/hadoop/.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   1 04/11/12 13:56:29 hadoop@DIANPING.COM
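A keytab is effectively a permanent credential that needs no password (it stops working if the principal's password is changed in the KDC). Any other user who can read the file can therefore access Hadoop as the principals it contains, so keytab files must be readable by the owner only (0400).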
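A periodic check that no keytab has drifted to looser permissions is cheap to add. This sketch assumes GNU coreutils stat and treats any mode whose group and other digits are both zero (0400, 0600) as acceptable; keytab_is_owner_only is a hypothetical name, and stricter sites may want to require exactly 0400.

```shell
# Succeeds when group and other have no access to the file.
# Assumes GNU stat, whose '%a' format prints the octal mode (for example 400).
keytab_is_owner_only() {
    mode=$(stat -c '%a' "$1") || return 1
    case $mode in
        *00) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, keytab_is_owner_only /etc/hadoop.keytab || echo "keytab is readable by others" can run from a monitoring script on every node.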
Article link: /article/1385954.html. Please credit the source when reposting.