Coscheduling
2021-04-26 16:04
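Background
Kubernetes is now widely used to orchestrate online services. To improve cluster utilization and operating efficiency, we want to use Kubernetes as a unified platform that manages both online services and offline jobs. The default scheduler uses the Pod as its scheduling unit and schedules Pods one after another, without considering the relationships between them. Many offline data-computing jobs, however, need to be scheduled as a group: the job can only run once all of its subtasks have been created successfully. If only some of the subtasks start, they keep waiting for the remaining subtasks to be scheduled. This is exactly the scenario that Gang Scheduling addresses.

Depending on whether "fragments" are allowed, Coscheduling implementations can be divided into Explicit Coscheduling, Local Coscheduling, and Implicit Coscheduling. Explicit Coscheduling is what is commonly called Gang Scheduling. Gang Scheduling allows no fragments at all, that is, "All or Nothing".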
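The "All or Nothing" rule can be illustrated with a minimal shell sketch. The function names `gang_admit` and `per_pod_admit` are hypothetical and only model the decision; they are not part of ack-coscheduling or of the Kubernetes scheduler.

```shell
#!/bin/sh
# Hypothetical model of the admission decision, not the real scheduler logic.
# $1 = min_available (pods the group needs), $2 = schedulable (pods that fit right now)
gang_admit() {
    min_available=$1
    schedulable=$2
    if [ "$schedulable" -ge "$min_available" ]; then
        echo "$schedulable"   # the whole gang fits: start the pods
    else
        echo 0                # all or nothing: start none, leave no fragments
    fi
}

# Per-pod scheduling, for contrast: start whatever fits, even a partial job.
per_pod_admit() {
    echo "$2"
}

gang_admit 5 3      # prints 0
per_pod_admit 5 3   # prints 3
gang_admit 5 5      # prints 5
```

With a group of 5 pods and room for only 3, per-pod scheduling starts 3 pods that then wait indefinitely, while the gang rule starts none and leaves the resources free for other jobs.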
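Prerequisites
1. Kubernetes 1.16 or later.
2. A standard dedicated cluster of the kind ACK provides (an Alibaba Cloud Kubernetes cluster or a cluster installed with kubeadm). Kubernetes clusters installed with Rancher are not supported (tested).
3. The cluster nodes can access the public internet.
4. Helm v3 is installed on the master node.
5. CPU and memory limits are not supported.
6. nvidia.com/gpu is supported.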
Modify the configuration
    wget http://kubeflow.oss-cn-beijing.aliyuncs.com/ack-coscheduling.tar.gz  # download the chart package
    tar zxvf ack-coscheduling.tar.gz  # extract it
    cd ack-coscheduling
    vim values.yaml  # schedulerCount defaults to 3; change it to the number of master nodes in the cluster
    cd templates
    # Comment out the following section. Only Alibaba Cloud scheduler pods carry the
    # "component" label; on an Alibaba Cloud Kubernetes cluster, leave it uncommented.
    #affinity:
    #  podAffinity:
    #    requiredDuringSchedulingIgnoredDuringExecution:
    #    - labelSelector:
    #        matchExpressions:
    #        - key: component
    #          operator: In
    #          values:
    #          - kube-scheduler
    #      topologyKey: kubernetes.io/hostname
Install
    # Run from the directory that contains the ack-coscheduling chart directory.
    $ helm install ack-coscheduling -n kube-system ./ack-coscheduling
    $ kubectl get pods -n kube-system -w  # wait until the pods have completed
    scheduler-update-clusterrole-7g6pd   0/1   Completed   0   25m
    scheduler-update-lrgj8               0/1   Completed   0   25m
    $ kubectl logs scheduler-update-lrgj8 -n kube-system  # view the execution log
    # The following line indicates that the installation succeeded:
    DEBUG update /etc/kubernetes/manifests/kube-scheduler.yaml succeed
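The log check above could be scripted roughly as follows. This is a sketch: `install_succeeded` is a hypothetical helper that only looks for the success line quoted above in whatever log text it receives on standard input.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 only if the log text on stdin contains the
# success line shown in the installation log above.
install_succeeded() {
    grep -qF 'update /etc/kubernetes/manifests/kube-scheduler.yaml succeed'
}

# Usage against a live cluster (the pod name suffix differs per installation):
#   kubectl logs scheduler-update-lrgj8 -n kube-system | install_succeeded \
#       && echo "ack-coscheduling installed"
```

Reading from standard input keeps the helper independent of the pod name, which changes with every installation.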
Verification
1. Run the TFJob, and verify that scheduling is correct by commenting out and then enabling the pod-group labels.
2. Comment out pod-group.scheduling.sigs.k8s.io/name: tf-smoke-gpu and pod-group.scheduling.sigs.k8s.io/min-available: "5". The pods are scheduled individually: two workers occupy the 2 GPU cards and the remaining two workers stay Pending.
    [root@master ~]# kubectl get pods
    NAME                    READY   STATUS    RESTARTS   AGE
    tf-smoke-gpu-ps-0       1/1     Running   0          3s
    tf-smoke-gpu-worker-0   1/1     Running   0          4s
    tf-smoke-gpu-worker-1   1/1     Running   0          4s
    tf-smoke-gpu-worker-2   0/1     Pending   0          4s
    tf-smoke-gpu-worker-3   0/1     Pending   0          4s
3. Enable pod-group.scheduling.sigs.k8s.io/name: tf-smoke-gpu and pod-group.scheduling.sigs.k8s.io/min-available: "5". Because the 5 pods of the group cannot all be scheduled together, every pod stays Pending.
    [root@master ~]# kubectl get pods
    NAME                    READY   STATUS    RESTARTS   AGE
    tf-smoke-gpu-ps-0       0/1     Pending   0          29m
    tf-smoke-gpu-worker-0   0/1     Pending   0          29m
    tf-smoke-gpu-worker-1   0/1     Pending   0          29m
    tf-smoke-gpu-worker-2   0/1     Pending   0          29m
    tf-smoke-gpu-worker-3   0/1     Pending   0          29m
The TFJob manifest used for the test:
    apiVersion: "kubeflow.org/v1"
    kind: "TFJob"
    metadata:
      name: "tf-smoke-gpu"
    spec:
      tfReplicaSpecs:
        PS:
          replicas: 1
          template:
            metadata:
              creationTimestamp: null
              labels:
                pod-group.scheduling.sigs.k8s.io/name: tf-smoke-gpu   # PodGroup name
                pod-group.scheduling.sigs.k8s.io/min-available: "5"   # number of pods required
            spec:
              containers:
              - args:
                - python
                - tf_cnn_benchmarks.py
                - --batch_size=32
                - --model=resnet50
                - --variable_update=parameter_server
                - --flush_stdout=true
                - --num_gpus=1
                - --local_parameter_device=cpu
                - --device=cpu
                - --data_format=NHWC
                image: registry.cn-hangzhou.aliyuncs.com/kubeflow-images-public/tf-benchmarks-cpu:v20171202-bdab599-dirty-284af3
                name: tensorflow
                ports:
                - containerPort: 2222
                  name: tfjob-port
                resources:
                  requests:
                    cpu: '3'
                workingDir: /opt/tf-benchmarks/scripts/tf_cnn_benchmarks
              restartPolicy: OnFailure
        Worker:
          replicas: 4
          template:
            metadata:
              creationTimestamp: null
              labels:
                pod-group.scheduling.sigs.k8s.io/name: tf-smoke-gpu
                pod-group.scheduling.sigs.k8s.io/min-available: "5"
            spec:
              containers:
              - args:
                - python
                - tf_cnn_benchmarks.py
                - --batch_size=32
                - --model=resnet50
                - --variable_update=parameter_server
                - --flush_stdout=true
                - --num_gpus=1
                - --local_parameter_device=cpu
                - --device=gpu
                - --data_format=NHWC
                image: registry.cn-hangzhou.aliyuncs.com/kubeflow-images-public/tf-benchmarks-gpu:v20171202-bdab599-dirty-284af3
                name: tensorflow
                ports:
                - containerPort: 2222
                  name: tfjob-port
                resources:
                  requests:
                    cpu: 2
                    memory: "3Gi"
                  limits:
                    nvidia.com/gpu: 1
                workingDir: /opt/tf-benchmarks/scripts/tf_cnn_benchmarks
              restartPolicy: OnFailure
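The two `kubectl get pods` listings in this section can also be checked mechanically. The sketch below assumes, as in this manifest, that min-available equals the total number of pods in the group (1 PS + 4 workers = 5), so a correctly gang-scheduled job shows either every pod Running or none. `count_status` and `gang_consistent` are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical helper: count pods with a given STATUS in `kubectl get pods`
# output read from stdin. NR > 1 skips the header row.
count_status() {
    awk -v want="$1" 'NR > 1 && $3 == want { n++ } END { print n + 0 }'
}

# Hypothetical helper: exit 0 if the pods whose name starts with prefix $1 are
# either all Running or none Running. Any other status counts as not Running.
gang_consistent() {
    awk -v p="$1" '
        NR > 1 && index($1, p) == 1 { total++; if ($3 == "Running") run++ }
        END { exit !(run == 0 || run == total) }'
}

# Usage against a live cluster:
#   kubectl get pods | count_status Pending
#   kubectl get pods | gang_consistent tf-smoke-gpu || echo "partial start detected"
```

Applied to the listings above, the first one (labels commented out) is reported as a partial start, and the second one (labels enabled) is consistent.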
Uninstall
helm uninstall ack-coscheduling -n kube-system