Deploying a Spark History Server on K8S - Part 2
1 Overview
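Our team has been running the Spark 2.2 on k8s fork in production. Deploying on K8S requires at least a Dockerfile, and we now plan to upgrade to the 3.0.0 snapshot branch, so I am taking the opportunity to write things down.
In the rest of this post, History Server is abbreviated as HS.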
2 Start
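Since 2.3.0, Spark has shipped an official Dockerfile, so you can build an image yourself to suit your production needs. Here I look into whether that Dockerfile can run an HS process directly.
Here is the Dockerfile (with some comments removed).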
FROM openjdk:8-alpine

ARG spark_uid=185

RUN set -ex && \
    apk upgrade --no-cache && \
    ln -s /lib /lib64 && \
    apk add --no-cache bash tini libc6-compat linux-pam krb5 krb5-libs nss && \
    mkdir -p /opt/spark && \
    mkdir -p /opt/spark/examples && \
    mkdir -p /opt/spark/work-dir && \
    touch /opt/spark/RELEASE && \
    rm /bin/sh && \
    ln -sv /bin/bash /bin/sh && \
    echo "auth required pam_wheel.so use_uid" >> /etc/pam.d/su && \
    chgrp root /etc/passwd && chmod ug+rw /etc/passwd

COPY jars /opt/spark/jars
COPY bin /opt/spark/bin
COPY sbin /opt/spark/sbin
COPY kubernetes/dockerfiles/spark/entrypoint.sh /opt/
COPY examples /opt/spark/examples
COPY kubernetes/tests /opt/spark/tests
COPY data /opt/spark/data

ENV SPARK_HOME /opt/spark

WORKDIR /opt/spark/work-dir
RUN chmod g+w /opt/spark/work-dir

ENTRYPOINT [ "/opt/entrypoint.sh" ]

# Specify the User that the actual main process will run as
USER ${spark_uid}
As you can see, whether the container runs a Driver, an Executor, or something else is decided by the script at the end, entrypoint.sh.
Here is the key part of entrypoint.sh.
case "$1" in
  driver)
    shift 1
    CMD=(
      "$SPARK_HOME/bin/spark-submit"
      --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS"
      --deploy-mode client
      "$@"
    )
    ;;
  executor)
    shift 1
    CMD=(
      ${JAVA_HOME}/bin/java
      "${SPARK_EXECUTOR_JAVA_OPTS[@]}"
      -Xms$SPARK_EXECUTOR_MEMORY
      -Xmx$SPARK_EXECUTOR_MEMORY
      -cp "$SPARK_CLASSPATH"
      org.apache.spark.executor.CoarseGrainedExecutorBackend
      --driver-url $SPARK_DRIVER_URL
      --executor-id $SPARK_EXECUTOR_ID
      --cores $SPARK_EXECUTOR_CORES
      --app-id $SPARK_APPLICATION_ID
      --hostname $SPARK_EXECUTOR_POD_IP
    )
    ;;
  *)
    echo "Non-spark-on-k8s command provided, proceeding in pass-through mode..."
    CMD=("$@")
    ;;
esac
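Note that a container built from this Dockerfile expects arguments when it runs: pass driver and it starts a Driver process; pass executor and it starts an Executor process.
So what do you do when you want to run a service process such as HS?
The last branch is clearly the catch-all: any other command is passed through unchanged, so you can run the start-history-server.sh script that ships with Spark.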
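To make the pass-through branch concrete, here is a stripped-down sketch of the same dispatch pattern. It is not the real entrypoint.sh: build_cmd is a hypothetical helper, the driver and executor commands are shortened for illustration, and it only prints the command it would run instead of exec-ing it.

```shell
#!/bin/sh
# Simplified sketch of the entrypoint dispatch (hypothetical helper, not Spark code).
# It rewrites its own argument list and prints the resulting command.
build_cmd() {
  case "$1" in
    driver)
      shift 1
      # Shortened stand-in for the real spark-submit invocation.
      set -- spark-submit --deploy-mode client "$@"
      ;;
    executor)
      shift 1
      # Shortened stand-in for the real executor java command.
      set -- java org.apache.spark.executor.CoarseGrainedExecutorBackend "$@"
      ;;
    *)
      # Pass-through: leave the arguments exactly as given.
      ;;
  esac
  printf '%s\n' "$*"
}

build_cmd driver --class Foo app.jar
build_cmd /opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer
```

The second call falls into the catch-all branch, so the command comes back untouched. That is the branch an HS command would take.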
So, after building the image the official way, we can give it a try.
./bin/docker-image-tool.sh -t v3.0.0 build
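The next step would be to run start-history-server.sh. A closer look at that script, though, shows that it starts HS as a daemon: it launches the process in the background and returns. A container only stays alive while its main process keeps running in the foreground, so a script that returns right away will not keep the container up (my wording here may not be exact, but it is a good enough working model). Under the hood, HS is just the launcher class org.apache.spark.deploy.history.HistoryServer, so run that class directly with the command below.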
docker run -it spark:v3.0.0 /opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer
And then you will see an error…
➜  spark git:(master) ✗ docker run -it spark:v3.0.0 /opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer
++ id -u
+ myuid=0
++ id -g
+ mygid=0
+ set +e
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ set -e
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sort -t_ -k4 -n
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_EXECUTOR_JAVA_OPTS
+ '[' -n '' ']'
+ '[' '' == 2 ']'
+ '[' '' == 3 ']'
+ '[' -z ']'
+ case "$1" in
+ echo 'Non-spark-on-k8s command provided, proceeding in pass-through mode...'
Non-spark-on-k8s command provided, proceeding in pass-through mode...
+ CMD=("$@")
+ exec /sbin/tini -s -- /opt/spark/bin/spark-class org.apache.spark.deploy.history.HistoryServer
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/07/09 03:59:22 INFO HistoryServer: Started daemon with process name: 14@df0f7b9fd0cf
19/07/09 03:59:22 INFO SignalUtils: Registered signal handler for TERM
19/07/09 03:59:22 INFO SignalUtils: Registered signal handler for HUP
19/07/09 03:59:22 INFO SignalUtils: Registered signal handler for INT
19/07/09 03:59:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/07/09 03:59:23 INFO SecurityManager: Changing view acls to: root
19/07/09 03:59:23 INFO SecurityManager: Changing modify acls to: root
19/07/09 03:59:23 INFO SecurityManager: Changing view acls groups to:
19/07/09 03:59:23 INFO SecurityManager: Changing modify acls groups to:
19/07/09 03:59:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
19/07/09 03:59:23 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: Log directory specified does not exist: file:/tmp/spark-events Did you configure the correct one through spark.history.fs.logDirectory?
    at org.apache.spark.deploy.history.FsHistoryProvider.startPolling(FsHistoryProvider.scala:259)
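This one is easy to fix: the default directory that HS reads Spark event logs from simply does not exist. Either create it, or point the event log path somewhere else in the Spark configuration (spark.history.fs.logDirectory, as the error message suggests). I won't go into the details here.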
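For reference, the two fixes could look roughly like the sketch below. Everything beyond the image tag and the launcher class is my assumption: the run wrapper and its DRY_RUN switch are hypothetical helpers, /data/spark-events is a made-up host path, and the published port 18080 is simply the default HS UI port. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only. DRY_RUN defaults to 1, so commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

HS_CLASS=org.apache.spark.deploy.history.HistoryServer

# Option 1: create the default directory inside the container, then start HS.
# The entrypoint passes the bash command through unchanged.
start_hs_with_empty_dir() {
  run docker run -it -p 18080:18080 spark:v3.0.0 /bin/bash -c \
    "mkdir -p /tmp/spark-events && exec /opt/spark/bin/spark-class $HS_CLASS"
}

# Option 2: bind-mount a host directory that already holds event logs.
# HOST_EVENT_DIR is a hypothetical path; change it for your machine.
HOST_EVENT_DIR="${HOST_EVENT_DIR:-/data/spark-events}"
start_hs_with_host_dir() {
  run docker run -it -p 18080:18080 \
    -v "$HOST_EVENT_DIR:/tmp/spark-events" \
    spark:v3.0.0 /opt/spark/bin/spark-class "$HS_CLASS"
}

start_hs_with_empty_dir
start_hs_with_host_dir
```

Option 1 gives HS an empty directory, which is enough to get past the exception but shows no applications. Option 2 is closer to real use, since HS can then read whatever logs are already on the host.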
3 Summary
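So running a Spark History Server in Docker is not a problem at all; it works pretty much out of the box. What matters is the configuration, and making sure the storage that holds the event logs is coordinated with the Spark App.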
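One way to keep the two sides coordinated is to derive both settings from a single variable, as in the sketch below. The helper names and the EVENT_LOG_DIR variable are hypothetical; the Spark property names and the SPARK_HISTORY_OPTS environment variable are the standard ones. The default value reuses the path from the error above, and in a real cluster it would point at storage that both the Spark App and HS can reach.

```shell
#!/bin/sh
# One shared location for event logs (hypothetical variable name).
EVENT_LOG_DIR="${EVENT_LOG_DIR:-file:/tmp/spark-events}"

# Flags for the Spark App side: append these to spark-submit.
app_event_log_conf() {
  printf '%s\n' "--conf spark.eventLog.enabled=true --conf spark.eventLog.dir=$EVENT_LOG_DIR"
}

# Value for the SPARK_HISTORY_OPTS environment variable on the HS side.
hs_history_opts() {
  printf '%s\n' "-Dspark.history.fs.logDirectory=$EVENT_LOG_DIR"
}

app_event_log_conf
hs_history_opts
```

Because both helpers read the same variable, the directory the App writes to and the directory HS polls cannot drift apart.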