k8s operation notes
2018-03-23 08:26
Version 1.9.0
Limiting GPUs per namespace

    [root@tensorflow1 gpu-namespace]# cat compute-resources2.yaml
    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-resources
    spec:
      hard:
        #pods: "4"
        #requests.cpu: "1"
        #requests.memory: 1Gi
        #limits.cpu: "2"
        #limits.memory: 2Gi
        limits.nvidia.com/gpu: "3"

    kubectl create -f compute-resources2.yaml

    kubectl get quota
    kubectl describe quota compute-resources
    kubectl delete quota compute-resources

Create the namespace first, then add the quota to that namespace. Here the quota is added to the default namespace.
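The "namespace first, then quota" order can be sketched as a small shell function that prints both objects as one manifest. This is only a minimal sketch, not the author's file: the function name gpu_quota_manifest and the namespace gpu-team are hypothetical, and the quota key is copied from compute-resources2.yaml (documentation for later releases describes extended-resource quotas with the requests. prefix, so the key may need adjusting on newer clusters).

```shell
# Print a Namespace followed by a ResourceQuota that caps GPUs inside it.
# $1 = namespace name (hypothetical example: gpu-team)
# $2 = maximum number of GPUs allowed in that namespace
gpu_quota_manifest() {
  ns="$1"
  gpus="$2"
  cat <<EOF
apiVersion: v1
kind: Namespace
metadata:
  name: ${ns}
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-resources
  namespace: ${ns}
spec:
  hard:
    limits.nvidia.com/gpu: "${gpus}"
EOF
}
```

Because the Namespace comes first in the stream, piping the output into kubectl create -f - (for example gpu_quota_manifest gpu-team 3 | kubectl create -f -) should create the namespace before the quota that refers to it.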
The Docker container has no basic tools such as vi

    echo "nameserver 192.168.1.254" > /etc/resolv.conf
    apt-get update
    apt install net-tools      # ifconfig
    apt install iputils-ping   # ping
    apt install vim            # vi
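Starting a GPU job
Scheduling failed with this event:

    Warning FailedScheduling 3s (x7 over 34s) default-scheduler 0/3 nodes are available: 1 PodToleratesNodeTaints, 3 Insufficient nvidia.com/gpu.

One of the three nodes carries a taint the pod does not tolerate, and none of the three has enough free nvidia.com/gpu to satisfy the request.
Reference: http://blog.csdn.net/u013042928/article/details/78751015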
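The FailedScheduling message packs several reasons into one line, each prefixed with the number of nodes it applies to. As a rough reading aid, the sketch below splits such a message into one "count reason" pair per line. The function name explain_failed_scheduling is hypothetical, and the parsing assumes the message layout shown in the event above.

```shell
# Split a FailedScheduling message into one "count reason" pair per line.
# $1 = the full event message
explain_failed_scheduling() {
  # Keep only the text after "nodes are available: ", drop the final period,
  # put each comma-separated reason on its own line, trim leading spaces.
  printf '%s\n' "$1" \
    | sed -e 's/.*nodes are available: //' -e 's/\.$//' \
    | tr ',' '\n' \
    | sed -e 's/^ *//'
}
```

Applied to the event above, the two resulting lines separate the taint problem (one node) from the GPU shortage (all three nodes).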
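Adjusting the replica count

    kubectl scale deploy/httpd-app --replicas=1

kubectl scale works on Deployments, ReplicaSets, ReplicationControllers and StatefulSets. A DaemonSet such as ds/kube-flannel-ds has no replica count (it runs one pod on each matching node), so it cannot be scaled this way.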
Starting a container on a specific node
Add the nodeName: xxxx field to the pod spec, e.g.:

    apiVersion: v1
    kind: Pod
    metadata:
      name: cuda-vector-add
    spec:
      restartPolicy: OnFailure
      containers:
        - name: cuda-vector-add
          # https://github.com/kubernetes/kubernetes/blob/v1.7.11/test/images/nvidia-cuda/Dockerfile
          image: "nfs:5000/tensorflow/tensorflow:nightly"
          #resources:
            #limits:
              #nvidia.com/gpu: 1 # requesting 1 GPU
      nodeName: tensorflow1
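To pin different pods to different nodes without editing the YAML each time, the manifest can be generated from parameters. The following is a minimal sketch under that assumption; the function name pinned_pod_manifest is hypothetical, and the fields are the same ones used in the example pod.

```shell
# Print a Pod manifest that is bound to one node through spec.nodeName.
# $1 = pod (and container) name, $2 = node name, $3 = image
pinned_pod_manifest() {
  pod_name="$1"
  node_name="$2"
  image="$3"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
spec:
  restartPolicy: OnFailure
  nodeName: ${node_name}
  containers:
    - name: ${pod_name}
      image: "${image}"
EOF
}
```

For example, pinned_pod_manifest cuda-vector-add tensorflow1 nfs:5000/tensorflow/tensorflow:nightly | kubectl create -f - would submit the pod. With nodeName set, the scheduler is bypassed and the kubelet on that node runs the pod directly.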
Isolating and restoring a node

    kubectl cordon {hostname}     # isolate: mark the node unschedulable
    kubectl uncordon {hostname}   # restore: make the node schedulable again
Creating and deleting an application

    kubectl run httpd-app --image=httpd --replicas=2

    kubectl get all --all-namespaces
    kubectl get deployments

Delete the deployment:

    kubectl delete deployment xxxxx
    kubectl delete deploy/httpd-app

Verify:

    [root@tensorflow1 k8s_images]# kubectl get pods -o wide
    NAME                         READY   STATUS    RESTARTS   AGE   IP           NODE
    httpd-app-5fbccd7c6c-5sx5z   1/1     Running   0          25m   10.244.2.2   tensorflow0
    httpd-app-5fbccd7c6c-87jvp   1/1     Running   0          17m   10.244.2.3   tensorflow0
    [root@tensorflow1 k8s_images]# curl 10.244.2.2
    <html><body><h1>It works!</h1></body></html>
    [root@tensorflow1 k8s_images]# curl 10.244.2.3
    <html><body><h1>It works!</h1></body></html>
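REST API not reachable
Version 1.5 served the API on an unencrypted port (8080 by default; 4194 is the kubelet's cAdvisor port). Version 1.9 serves it on the encrypted port 6443, which needs extra setup before it can be accessed.
On the master, run:

    curl "https://localhost:6443/healthz" -k

-k skips certificate verification.

    kubectl get clusterrole/cluster-admin -o yaml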
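Edit the basic auth file:

    vi /etc/kubernetes/pki/basic_auth_file

with the content

    admin,admin,2

Then edit the API server manifest:

    vi /etc/kubernetes/manifests/kube-apiserver.yaml

and add

    - --basic_auth_file=/etc/kubernetes/pki/basic_auth_file

Note: in this setup the flag only worked when spelled with underscores (basic_auth_file). The hyphenated spelling found online, which is also the documented one (--basic-auth-file), did not work.
The change takes effect automatically, because the kubelet restarts the static pod when its manifest changes.
The file probably has to live under /etc/kubernetes/pki/ because the apiserver container mounts that path. This is only a guess and has not been tested.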
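Each line of the basic auth file is comma-separated, with the password first, then the user name, then the user ID, so admin,admin,2 means password admin, user admin, uid 2. The sketch below makes the field order explicit; the helper names basic_auth_line and basic_auth_user, and the sample values s3cret and alice, are hypothetical.

```shell
# Build one line of the static password file: password,user,uid
# $1 = password, $2 = user name, $3 = user ID
basic_auth_line() {
  printf '%s,%s,%s\n' "$1" "$2" "$3"
}

# Extract the user name (second field) from such a line.
# $1 = one line of the file
basic_auth_user() {
  printf '%s\n' "$1" | cut -d, -f2
}
```

Running basic_auth_line admin admin 2 reproduces the line used in these notes, and basic_auth_user 's3cret,alice,7' shows that the second field, not the first, is the login name.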
Access: on the master, run

    curl -u admin:admin "https://localhost:6443/api/v1" -k
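For clients that cannot pass -u, the same credentials can be sent as an explicit header: HTTP basic authentication is the string user:password encoded in base64. A minimal sketch follows, assuming the base64 utility from coreutils is available; the function name basic_auth_header is hypothetical.

```shell
# Print the Authorization header that corresponds to "curl -u user:password".
# $1 = user name, $2 = password
basic_auth_header() {
  printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}
```

It could then be used as curl -k -H "$(basic_auth_header admin admin)" "https://localhost:6443/api/v1", which should be equivalent to the -u form.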
Permission issues:
API documentation: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/
Fixing the Kubernetes 1.6.4 Dashboard access problem: https://tonybai.com/2017/07/20/fix-cannot-access-dashboard-in-k8s-1-6-4/
Learning the new features of Kubernetes 1.6, RBAC authorization: https://blog.frognew.com/2017/04/kubernetes-1.6-rbac.html
Installing and configuring the Kubernetes dashboard 1.8.0 WebUI: http://blog.csdn.net/a632189007/article/details/78840971
User "system:anonymous" cannot get path "/": https://stackoverflow.com/questions/45094665/user-systemanonymous-cannot-get-path
Official RBAC documentation: https://kubernetes.io/docs/admin/authorization/rbac/
Deploying heapster on Kubernetes 1.8: http://blog.csdn.net/liukuan73/article/details/78704395
Dashboard access fails
The endpoints look normal:

    [root@tensorflow1 k8s_images]# kubectl get endpoints --all-namespaces
    NAMESPACE     NAME                      ENDPOINTS                     AGE
    default       kubernetes                192.168.1.138:6443            18h
    kube-system   kube-controller-manager   <none>                        18h
    kube-system   kube-dns                  172.17.0.2:53,172.17.0.2:53   18h
    kube-system   kube-scheduler            <none>                        18h
    kube-system   kubernetes-dashboard      172.17.0.3:8443               29m

The pod endpoint itself returns the dashboard page:

    [root@tensorflow1 k8s_images]# curl "https://172.17.0.3:8443" -k
    <!doctype html>
    <html ng-app="kubernetesDashboard">
    <head>
      <meta charset="utf-8">
      <title ng-controller="kdTitle as $ctrl" ng-bind="$ctrl.title()"></title>
      <link rel="icon" type="image/png" href="assets/images/kubernetes-logo.png">
      <meta name="viewport" content="width=device-width">
      <link rel="stylesheet" href="static/vendor.93db0a0d.css">
      <link rel="stylesheet" href="static/app.ffb1366f.css">
    </head>
    <body ng-controller="kdMain as $ctrl">
      <!--[if lt IE 10]> <p class="browsehappy">You are using an <strong>outdated</strong> browser. Please <a href="http://browsehappy.com/">upgrade your browser</a> to improve your experience.</p> <![endif]-->
      <kd-login layout="column" layout-fill ng-if="$ctrl.isLoginState()"> </kd-login>
      <kd-chrome layout="column" layout-fill ng-if="!$ctrl.isLoginState()"> </kd-chrome>
      <script src="static/vendor.9a600e6f.js"></script>
      <script src="api/appConfig.json"></script>
      <script src="static/app.fe2776ce.js"></script>
    </body>
    </html>
Service details:

    [root@tensorflow1 k8s_images]# kubectl describe svc/kubernetes-dashboard -n kube-system
    Name:                     kubernetes-dashboard
    Namespace:                kube-system
    Labels:                   k8s-app=kubernetes-dashboard
    Annotations:              <none>
    Selector:                 k8s-app=kubernetes-dashboard
    Type:                     NodePort
    IP:                       10.100.2.162
    Port:                     <unset>  443/TCP
    TargetPort:               8443/TCP
    NodePort:                 <unset>  32666/TCP
    Endpoints:                172.17.0.3:8443
    Session Affinity:         None
    External Traffic Policy:  Cluster
    Events:                   <none>

The cluster IP works:

    curl "https://10.100.2.162:443" -k

The node port, tried on the master, does not:

    curl "https://localhost:32666" -k
Tagging images: use a custom tag from now on. Otherwise every start checks the registry online and pulls the image again.

    docker tag httpd:latest httpd:20180322
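The reason the custom tag helps is the default pull policy: when a pod spec omits imagePullPolicy, an image tagged :latest (or with no tag at all) defaults to Always, while any other tag defaults to IfNotPresent. The sketch below encodes that rule for tag-based image names only; the function name default_pull_policy is hypothetical.

```shell
# Print the imagePullPolicy Kubernetes applies by default to an image name.
# $1 = image reference such as httpd:20180322 or nfs:5000/httpd
default_pull_policy() {
  # Drop the registry and repository path, so that a registry port such as
  # nfs:5000/... is not mistaken for a tag.
  last="${1##*/}"
  case "$last" in
    *:latest) echo Always ;;
    *:*)      echo IfNotPresent ;;
    *)        echo Always ;;   # no tag is treated as :latest
  esac
}
```

So httpd and httpd:latest are checked against the registry on every start, while httpd:20180322 is reused from the node's local image cache once it is present.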
How the various ports relate: http://blog.csdn.net/liyingke112/article/details/76022267
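For a NodePort service, port is exposed on the cluster IP, targetPort is the port the pod listens on, and nodePort is opened on every node's own address. The sketch below prints the three resulting URLs side by side. The function name service_urls is hypothetical, and using 192.168.1.138 as the node address is an assumption taken from the kubernetes endpoint in the listing.

```shell
# Print the three addresses at which a NodePort service can be reached.
# $1 = cluster IP, $2 = port
# $3 = pod IP,     $4 = targetPort
# $5 = node IP,    $6 = nodePort
service_urls() {
  printf 'ClusterIP  https://%s:%s\n' "$1" "$2"
  printf 'Pod        https://%s:%s\n' "$3" "$4"
  printf 'NodePort   https://%s:%s\n' "$5" "$6"
}
```

With the dashboard values, service_urls 10.100.2.162 443 172.17.0.3 8443 192.168.1.138 32666 lists the two addresses that already answered in these notes and the node port address that did not.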
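Dashboard does not open over HTTPS
Neither IE nor Chrome could open it. Use Firefox, choose to allow the connection and add a security exception, and the page then opens.
Dashboard can no longer be viewed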