
Enabling Spark dynamic resource allocation on Ambari

2020-02-13 06:41

I hit quite a few pitfalls over the past few days while working on resource allocation, so here is a summary.

1. Edit yarn-site.xml on every NodeManager:

<!-- Modify: append spark_shuffle to the existing value -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
</property>
<!-- Add -->
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>

On an Ambari-managed cluster, these should already be set by default.
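Since the two properties may or may not already be present, a quick check can help before restarting anything. The following is only a minimal sketch: it parses a Hadoop-style configuration passed in as a string and reports what is missing. The helper names `read_properties` and `check_spark_shuffle` and the `SAMPLE` text are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET

EXPECTED_CLASS = "org.apache.spark.network.yarn.YarnShuffleService"


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_spark_shuffle(props):
    """Return a list of problems; an empty list means both settings are present."""
    problems = []
    raw = props.get("yarn.nodemanager.aux-services", "")
    services = [s.strip() for s in raw.split(",")]
    if "spark_shuffle" not in services:
        problems.append("spark_shuffle missing from yarn.nodemanager.aux-services")
    key = "yarn.nodemanager.aux-services.spark_shuffle.class"
    if props.get(key) != EXPECTED_CLASS:
        problems.append(key + " is not " + EXPECTED_CLASS)
    return problems


# Hypothetical sample input that mirrors the two properties from step 1.
SAMPLE = """<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(check_spark_shuffle(read_properties(SAMPLE)))
```

To check a real node, read the contents of its yarn-site.xml into a string and pass that to `read_properties` in place of `SAMPLE`.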

2. In spark-defaults.conf, set:

# Enable the External Shuffle Service
spark.shuffle.service.enabled true
# Shuffle Service port; must match the port configured in yarn-site
spark.shuffle.service.port 7337
# Enable dynamic resource allocation
spark.dynamicAllocation.enabled true
# Minimum number of executors per application
spark.dynamicAllocation.minExecutors 1
# Maximum number of executors allocated concurrently per application
spark.dynamicAllocation.maxExecutors 30
spark.dynamicAllocation.schedulerBacklogTimeout 1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout 5s
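A mistyped key or a stray trailing comment in this file is easy to miss, because the setting is then simply not applied. As a rough sanity check, the sketch below parses spark-defaults-style text and flags the most common mistakes. It is a simplification that only handles whitespace-separated `key value` lines and `#` comment lines; `parse_spark_defaults` and `check_dynamic_allocation` are hypothetical helper names, not part of Spark.

```python
INT_MAX = 2**31 - 1  # stands in for Integer.MAX_VALUE


def parse_spark_defaults(text):
    """Parse 'key value' lines; blank lines and '#' comment lines are skipped."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def check_dynamic_allocation(conf):
    """Return a list of problems found in the dynamic allocation settings."""
    problems = []
    for key in ("spark.dynamicAllocation.enabled", "spark.shuffle.service.enabled"):
        # A value such as "true  // note" is not "true", so it is flagged too.
        if conf.get(key, "false").lower() != "true":
            problems.append(key + " is not set to true")
    try:
        low = int(conf.get("spark.dynamicAllocation.minExecutors", "0"))
        high = int(conf.get("spark.dynamicAllocation.maxExecutors", str(INT_MAX)))
        if low > high:
            problems.append("minExecutors is greater than maxExecutors")
    except ValueError:
        problems.append("minExecutors/maxExecutors must be plain integers")
    return problems


# Hypothetical sample mirroring the settings from step 2.
GOOD = """
spark.shuffle.service.enabled true
spark.shuffle.service.port 7337
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 30
"""

if __name__ == "__main__":
    print(check_dynamic_allocation(parse_spark_defaults(GOOD)))
```

An empty list means the basic prerequisites look consistent; it says nothing about whether the shuffle service is actually running on the NodeManagers.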

Reference notes, quoted from the source linked below:

Basic configuration for reference:
spark.shuffle.service.enabled true (enables the External Shuffle Service; this must be enabled)
spark.shuffle.service.port 7337
spark.dynamicAllocation.enabled true (enables dynamic resource scheduling)
spark.dynamicAllocation.minExecutors 3 (minimum number of executors per application)
spark.dynamicAllocation.maxExecutors 8 (maximum number of executors per application)
Optional parameters:
Option | Description | Default
spark.dynamicAllocation.minExecutors | Minimum number of executors | 0
spark.dynamicAllocation.initialExecutors | Initial number of executors | spark.dynamicAllocation.minExecutors
spark.dynamicAllocation.maxExecutors | Maximum number of executors | Integer.MAX_VALUE
spark.dynamicAllocation.schedulerBacklogTimeout | Backlog timeout before the first request | 1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout | Backlog timeout for the second and later requests | spark.dynamicAllocation.schedulerBacklogTimeout
spark.dynamicAllocation.executorIdleTimeout | Idle timeout for ordinary executors | 60s
spark.dynamicAllocation.cachedExecutorIdleTimeout | Idle timeout for executors holding cached blocks | 2 × spark.dynamicAllocation.executorIdleTimeout (as given in the quoted source; the current Spark documentation lists infinity)
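Notes:
1. Dynamic resource scheduling requires the External Shuffle Service. Without it, shuffle files are lost when an executor is killed.
2. Once dynamic resource scheduling is configured, the number of executors can no longer be set separately; doing so causes an error and the application exits. (This describes the Spark version the quoted source was written for; the current Spark configuration documentation describes spark.executor.instances as feeding into the initial executor count instead.)
3. Dynamic resource scheduling guarantees the minimum number of executors (spark.dynamicAllocation.minExecutors).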
Source: https://blog.csdn.net/dandykang/article/details/48160953

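Dynamic allocation policy:
With dynamic allocation enabled, an application requests additional resources when tasks are left pending for lack of resources, which means its current executors cannot run all tasks in parallel. Spark requests resources in rounds. The first request is made once tasks have been pending for spark.dynamicAllocation.schedulerBacklogTimeout (default 1s). After that, a new request is made every spark.dynamicAllocation.sustainedSchedulerBacklogTimeout (by default the same as schedulerBacklogTimeout, i.e. 1s; the configuration above sets it to 5s) until enough resources have been obtained. The number of executors requested grows exponentially from round to round: 1, 2, 4, 8, and so on.
There are two reasons for the exponential growth. First, starting small allows for the possibility that the application's needs are met almost immediately. Second, doubling the request each round means that an application that needs a lot of resources can still get them within a small number of rounds.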
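To make the round-by-round growth concrete, here is a small simulation. It is a simplified sketch, not Spark's actual allocation code: it assumes the backlog persists for the whole ramp-up, that every request is granted immediately, and that the caller already knows how many executors are needed. The function name `request_rounds` and its parameters are made up for this example.

```python
def request_rounds(current, needed, max_executors,
                   backlog_timeout=1, sustained_timeout=1):
    """Simulate exponential executor requests.

    Returns a list of (time_in_seconds, executors_added, total_executors).
    Assumes the backlog never clears and every request is granted at once.
    """
    target = min(needed, max_executors)
    rounds = []
    to_add = 1                 # first round asks for 1, then 2, 4, 8, ...
    now = backlog_timeout      # first request fires after the backlog timeout
    while current < target:
        added = min(to_add, target - current)
        current += added
        rounds.append((now, added, current))
        to_add *= 2
        now += sustained_timeout
    return rounds


if __name__ == "__main__":
    # Values from the spark-defaults.conf in step 2: min 1, max 30, 1s / 5s.
    for when, added, total in request_rounds(1, 30, 30, 1, 5):
        print(f"t={when}s  +{added}  total={total}")
```

With the settings from step 2, this reaches 30 executors in five rounds, adding 1, 2, 4, 8, and finally 14 (the last round is capped by maxExecutors).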

Resource reclamation policy:
When one of an application's executors has been idle for longer than spark.dynamicAllocation.executorIdleTimeout (default 60s), it is reclaimed.
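The reclamation rule can be sketched in the same spirit. The snippet below is illustrative only: `executors_to_remove` is a hypothetical helper, the idle times are invented, and the choice to release the longest-idle executors first is my own assumption. It also keeps at least minExecutors alive, in line with the note above that the minimum is guaranteed.

```python
def executors_to_remove(idle_seconds, idle_timeout=60, min_executors=1):
    """Pick executors whose idle time exceeds the timeout.

    idle_seconds maps an executor id to how long it has been idle.
    Never returns so many ids that fewer than min_executors would remain.
    """
    expired = [eid for eid, idle in idle_seconds.items() if idle > idle_timeout]
    # Assumption for this sketch: release the longest-idle executors first.
    expired.sort(key=lambda eid: idle_seconds[eid], reverse=True)
    removable = max(0, len(idle_seconds) - min_executors)
    return expired[:removable]


if __name__ == "__main__":
    idle = {"exec-1": 120, "exec-2": 61, "exec-3": 10}  # made-up idle times
    print(executors_to_remove(idle, idle_timeout=60, min_executors=1))
```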

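One reminder: these settings really do have to be added under spark-defaults.conf. The same options also exist under spark2-thrift-sparkconf, but that section configures the Spark2 Thrift Server, which is accessed through remote calls, and it has no effect on a local spark-shell or pyspark. It took me a long time to figure this out.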
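When you add the properties, a prompt warns that they are duplicates. Ignore it and add them anyway: the two sets of properties live in different configuration files and do not affect each other.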
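After finishing the configuration, I started spark-sql to test it (pyspark and spark-shell behave the same way). With no job submitted, the application occupies few cluster resources. Once a job is submitted, cluster resources are adjusted dynamically so that the cluster is used as fully as possible.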


Author: 不搬砖的程序员不是好程序员
