
[K8S] Building a Kubernetes Cluster with TLS Authentication

Note: this article follows https://github.com/opsnull/follow-me-install-kubernetes-cluster step by step.

If there are any copyright concerns, please leave a comment. Thanks!

A number of changes were made along the way to fit the actual environment.

The IPs used in the original guide map to the actual IPs as follows (in a few places the original IPs may not have been replaced):

10.64.3.7  → 192.168.1.206 (etcd-host0)
10.64.3.8  → 192.168.1.207 (etcd-host1)
10.64.3.86 → 192.168.1.208 (etcd-host2)
01 - Component Versions and Cluster Environment

Cluster components and versions

Kubernetes 1.6.2
Docker 17.04.0-ce
Etcd 3.1.6
Flanneld 0.7.1 (vxlan network)
TLS-encrypted communication (all components: etcd, kubernetes master and nodes)
RBAC authorization
kubelet TLS bootstrapping
kubedns, dashboard, heapster (influxdb, grafana) and EFK (elasticsearch, fluentd, kibana) add-ons
Private docker registry backed by ceph rgw storage, with TLS + HTTP Basic authentication

Cluster machines

192.168.1.206  master, registry
192.168.1.207  node01
192.168.1.208  node02

Since this is a test setup, the etcd cluster, the kubernetes master and the kubernetes nodes all share these three machines.

Initialize the systems and disable firewalld and SELinux.
Distribute the cluster environment-variable script
Copy the global variable definition script to /root/local/bin on every machine and source it from /etc/profile:
$ cp environment.sh /root/local/bin
$ vi /etc/profile

Append the following line, then save with :wq:

source /root/local/bin/environment.sh
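A minimal sketch for pushing the script to the other two nodes in one go, assuming root SSH access from 192.168.1.206 (the node list matches the NODE_IPS value defined below):

$ for ip in 192.168.1.207 192.168.1.208; do
    # make sure the target directory exists, then copy the script
    ssh root@${ip} "mkdir -p /root/local/bin"
    scp environment.sh root@${ip}:/root/local/bin/
  done

Remember that NODE_NAME and NODE_IP still have to be edited per node after copying.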

environment.sh on 192.168.1.206:

#!/usr/bin/bash

BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"

# Preferably pick subnets that the hosts do not already use for the Service and Pod networks.

# Service network (Service CIDR): not routable before deployment; reachable inside the cluster via IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"

# Pod network (Cluster CIDR): not routable before deployment; routable afterwards (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"

# NodePort range
export NODE_PORT_RANGE="8400-9000"

# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"

# flanneld network configuration prefix in etcd
export FLANNEL_ETCD_PREFIX="/kubernetes/network"

# kubernetes service IP (usually the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"

# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"

# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."

export NODE_NAME=etcd-host0 # name of the machine being deployed (any value, as long as it distinguishes the machines)

export NODE_IP=192.168.1.206 # IP of the machine being deployed

export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208" # IPs of all etcd cluster machines

# etcd peer communication IPs and ports
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380

# other global variables used later: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
export PATH=/root/local/bin:$PATH

export MASTER_IP=192.168.1.206 # replace with the IP of any kubernetes master machine
export KUBE_APISERVER="https://${MASTER_IP}:6443"

The environment.sh files on 192.168.1.207 and 192.168.1.208 are identical to the one above, except for the per-node values:

192.168.1.207: NODE_NAME=etcd-host1, NODE_IP=192.168.1.207
192.168.1.208: NODE_NAME=etcd-host2, NODE_IP=192.168.1.208
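BOOTSTRAP_TOKEN can be any random hex string shared by all nodes; it just has to match the token.csv created later on the master. A minimal sketch for generating one (assuming /dev/urandom is available):

$ head -c 16 /dev/urandom | od -An -t x | tr -d ' '

This produces a 32-character hex string like the BOOTSTRAP_TOKEN above.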

02 - Create the CA Certificate and Key

Create the CA certificate and key
The kubernetes components encrypt their communication with TLS certificates. This document uses CloudFlare's
PKI toolkit cfssl to generate the Certificate Authority (CA) certificate and key. The CA certificate is self-signed and is used to sign all other TLS certificates created later.

Install CFSSL
$ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64

$ chmod +x cfssl_linux-amd64

$ sudo mv cfssl_linux-amd64 /root/local/bin/cfssl
$ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64

$ chmod +x cfssljson_linux-amd64

$ sudo mv cfssljson_linux-amd64 /root/local/bin/cfssljson
$ wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64

$ chmod +x cfssl-certinfo_linux-amd64

$ sudo mv cfssl-certinfo_linux-amd64 /root/local/bin/cfssl-certinfo
$ export PATH=/root/local/bin:$PATH

$ mkdir ssl

$ cd ssl

$ cfssl print-defaults config >config.json

$ cfssl print-defaults csr > csr.json

The tools above must be installed on every node.
Create the CA (Certificate Authority)
Create the CA configuration file:
$ cat ca-config.json
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}

ca-config.json: multiple profiles can be defined with different expiry times, usage scenarios and other parameters; a specific profile is selected later when signing certificates;
signing: the certificate can be used to sign other certificates; the generated ca.pem has CA=TRUE;
server auth: a client can use this CA to verify certificates presented by servers;
client auth: a server can use this CA to verify certificates presented by clients;

Create the CA certificate signing request (CSR):
$ cat ca-csr.json
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

"CN":Common Name,kube-apiserver 从证书中提取该字段作为请求的用户名 (User Name);浏览器使用该字段验证网站是否合法;
"O":Organization,kube-apiserver 从证书中提取该字段作为请求用户所属的组 (Group);

Generate the CA certificate and private key:
$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca

$ ls ca*

ca-config.json ca.csr ca-csr.json ca-key.pem ca.pem

$

Distribute the certificates
Copy the generated CA certificate, key and configuration file to /etc/kubernetes/ssl on all machines:
$ sudo mkdir -p /etc/kubernetes/ssl

$ sudo cp ca* /etc/kubernetes/ssl

$

Verify a certificate (example)
Taking the kubernetes certificate (generated later when deploying the master node) as an example:
Using the openssl command
$ openssl x509 -noout -text -in kubernetes.pem
...
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=Kubernetes
        Validity
            Not Before: Apr  5 05:36:00 2017 GMT
            Not After : Apr  5 05:36:00 2018 GMT
        Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes
...
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier:
                DD:52:04:43:10:13:A9:29:24:17:3A:0E:D7:14:DB:36:F8:6C:E0:E0
            X509v3 Authority Key Identifier:
                keyid:44:04:3B:60:BD:69:78:14:68:AF:A0:41:13:F6:17:07:13:63:58:CD
            X509v3 Subject Alternative Name:
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:10.64.3.7, IP Address:10.254.0.1
...

Confirm that the Issuer field matches ca-csr.json;
Confirm that the Subject field matches kubernetes-csr.json;
Confirm that the X509v3 Subject Alternative Name field matches kubernetes-csr.json;
Confirm that the X509v3 Key Usage and Extended Key Usage fields match the kubernetes profile in ca-config.json;
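As a quicker sanity check than reading the full dump, the certificate can also be verified directly against the CA (a minimal sketch, using the paths from this document):

$ openssl verify -CAfile /etc/kubernetes/ssl/ca.pem kubernetes.pem
kubernetes.pem: OK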

Using the cfssl-certinfo command
$ cfssl-certinfo -cert kubernetes.pem

...
{
  "subject": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "issuer": {
    "common_name": "Kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "Kubernetes"
    ]
  },
  "serial_number": "174360492872423263473151971632292895707129022309",
  "sans": [
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local",
    "127.0.0.1",
    "192.168.1.206",
    "192.168.1.207",
    "192.168.1.208",
    "10.254.0.1"
  ],
  "not_before": "2017-04-05T05:36:00Z",
  "not_after": "2018-04-05T05:36:00Z",
  "sigalg": "SHA256WithRSA",
...

03 - Deploy a Highly Available Etcd Cluster

Deploy a highly available etcd cluster
kubernetes stores all of its data in etcd. This document describes how to deploy a three-node highly available etcd cluster. The three nodes are shared with the kubernetes master machines and are named etcd-host0, etcd-host1 and etcd-host2:

etcd-host0: 192.168.1.206
etcd-host1: 192.168.1.207
etcd-host2: 192.168.1.208

Variables used
The variables used in this document are defined as follows (they were already added to environment.sh earlier):
$ export NODE_NAME=etcd-host0 # name of the machine being deployed (any value that distinguishes the machines)

$ export NODE_IP=192.168.1.206 # IP of the machine being deployed

$ export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208" # IPs of all etcd cluster machines

$ # etcd peer communication IPs and ports

$ export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380

$ # other global variables used later: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR

$ source /root/local/bin/environment.sh

$
Download the binaries
Download the latest release from https://github.com/coreos/etcd/releases :
$ wget https://github.com/coreos/etcd/releases/download/v3.1.6/etcd-v3.1.6-linux-amd64.tar.gz

$ tar -xvf etcd-v3.1.6-linux-amd64.tar.gz

$ sudo mv etcd-v3.1.6-linux-amd64/etcd* /root/local/bin

$
Create TLS keys and certificates
To secure communication, clients (such as etcdctl) talking to the etcd cluster, and the etcd members talking to each other, use TLS. This section creates the certificate and private key needed by etcd.
Create the etcd certificate signing request:
$ cat > etcd-csr.json <<EOF

{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "${NODE_IP}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

EOF

The hosts field lists the etcd node IPs that are authorized to use this certificate;

Generate the etcd certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes etcd-csr.json | cfssljson -bare etcd

$ ls etcd*
etcd.csr etcd-csr.json etcd-key.pem etcd.pem

$ sudo mkdir -p /etc/etcd/ssl
$ sudo mv etcd*.pem /etc/etcd/ssl
$ rm etcd.csr etcd-csr.json
Create the etcd systemd unit file
$ sudo mkdir -p /var/lib/etcd # the working directory must be created first

$ cat > etcd.service <<EOF

[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/root/local/bin/etcd \\
  --name=${NODE_NAME} \\
  --cert-file=/etc/etcd/ssl/etcd.pem \\
  --key-file=/etc/etcd/ssl/etcd-key.pem \\
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \\
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \\
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --initial-advertise-peer-urls=https://${NODE_IP}:2380 \\
  --listen-peer-urls=https://${NODE_IP}:2380 \\
  --listen-client-urls=https://${NODE_IP}:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls=https://${NODE_IP}:2379 \\
  --initial-cluster-token=etcd-cluster-0 \\
  --initial-cluster=${ETCD_NODES} \\
  --initial-cluster-state=new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

etcd's working directory and data directory are both /var/lib/etcd; the directory must exist before the service starts;
to secure communication, the unit specifies etcd's own key pair (cert-file and key-file), the peer-communication key pair and CA certificate (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the CA certificate used to verify clients (trusted-ca-file);
when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;

See etcd.service for the complete unit file.
Start the etcd service
$ sudo mv etcd.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable etcd
$ sudo systemctl start etcd
$ systemctl status etcd
$
The first etcd process to start will appear to hang for a while; it is waiting for the etcd processes on the other nodes to join the cluster. This is normal.
Repeat the steps above on all etcd nodes until the etcd service is running on every machine.
Verify the service
After the etcd cluster has been deployed, run the following on any etcd node:
$ for ip in ${NODE_IPS}; do
    ETCDCTL_API=3 /root/local/bin/etcdctl \
      --endpoints=https://${ip}:2379 \
      --cacert=/etc/kubernetes/ssl/ca.pem \
      --cert=/etc/etcd/ssl/etcd.pem \
      --key=/etc/etcd/ssl/etcd-key.pem \
      endpoint health; done
Expected output:
2017-07-05 17:11:58.103401 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.206:2379 is healthy: successfully committed proposal: took = 81.247077ms
2017-07-05 17:11:58.356539 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.207:2379 is healthy: successfully committed proposal: took = 12.073555ms
2017-07-05 17:11:58.523829 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.208:2379 is healthy: successfully committed proposal: took = 5.413361ms
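Optionally, the cluster membership can also be listed (a minimal sketch, reusing the same certificates as the health check):

$ ETCDCTL_API=3 /root/local/bin/etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/etcd/ssl/etcd.pem \
    --key=/etc/etcd/ssl/etcd-key.pem \
    member list

All three members (etcd-host0, etcd-host1, etcd-host2) should be listed as started.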

04 - Deploy the kubectl Command-Line Tool

Deploy the kubectl command-line tool
kubectl reads the kube-apiserver address, certificates, user name and other settings from the ~/.kube/config file by default. Without that file, commands fail with:
$ kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?
This document covers downloading and configuring kubectl for the cluster.
The downloaded kubectl binary and the generated ~/.kube/config file need to be copied to every machine that will run kubectl.
Variables used
The variables used in this document are defined as follows (they were already added to environment.sh earlier):
$ export MASTER_IP=192.168.1.206 # run this on the master node (206)

$ export KUBE_APISERVER="https://${MASTER_IP}:6443"

$

KUBE_APISERVER is the kube-apiserver address; it is written into the ~/.kube/config file later;

Download kubectl
$ wget https://dl.k8s.io/v1.6.2/kubernetes-client-linux-amd64.tar.gz
$ tar -xzvf kubernetes-client-linux-amd64.tar.gz
$ sudo cp kubernetes/client/bin/kube* /root/local/bin/
$ chmod a+x /root/local/bin/kube*
$ export PATH=/root/local/bin:$PATH
$
Create the admin certificate
kubectl talks to kube-apiserver over its secure port, which requires a TLS certificate and key.
Create the admin certificate signing request:
$ cat admin-csr.json

{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}

kube-apiserver uses RBAC to authorize client requests (from kubelet, kube-proxy, Pods, etc.);
kube-apiserver predefines a number of RoleBindings for RBAC; for example, cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call all kube-apiserver APIs;
O sets the certificate's Group to system:masters. When this certificate is used against kube-apiserver, authentication succeeds because the certificate is signed by the CA, and because the certificate's group is the pre-authorized system:masters, the client is granted access to all APIs;
the hosts attribute is an empty list;

Generate the admin certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes admin-csr.json | cfssljson -bare admin

$ ls admin*
admin.csr admin-csr.json admin-key.pem admin.pem

$ sudo mv admin*.pem /etc/kubernetes/ssl/
$ rm admin.csr admin-csr.json
$
Create the kubectl kubeconfig file
$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER}

$ # set client credentials
$ kubectl config set-credentials admin \
  --client-certificate=/etc/kubernetes/ssl/admin.pem \
  --embed-certs=true \
  --client-key=/etc/kubernetes/ssl/admin-key.pem

$ # set the context
$ kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin

$ # make it the default context
$ kubectl config use-context kubernetes

The O field of admin.pem is system:masters; the predefined RoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call the kube-apiserver APIs;
the generated kubeconfig is saved to ~/.kube/config;

Distribute the kubeconfig file
Copy ~/.kube/config into the ~/.kube/ directory of every machine that will run kubectl.
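A quick way to confirm that the kubeconfig is in place (a minimal sketch; the second command only succeeds once the master from section 06 is running):

$ kubectl config view    # the kubernetes cluster, admin user and context set above should be listed
$ kubectl cluster-info   # once kube-apiserver is running, this reports the master URL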

05 - Deploy the Flannel Network

Deploy the Flannel network
kubernetes requires the Pod network segments on all nodes to be routable to each other. This document uses Flannel to create an interconnected Pod network on all machines (masters and nodes).
Variables used
The variables used in this document are defined as follows:
$ export NODE_IP=192.168.1.206 # IP of the node being deployed

$ # other global variables used later: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR

$ source /root/local/bin/environment.sh

$
Create TLS keys and certificates
The etcd cluster has mutual TLS authentication enabled, so flanneld needs the CA plus its own certificate and key to talk to etcd.
Create the flanneld certificate signing request:
$ cat > flanneld-csr.json <<EOF

{
  "CN": "flanneld",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

EOF

the hosts field is empty;

Generate the flanneld certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld

$ ls flanneld*
flanneld.csr flanneld-csr.json flanneld-key.pem flanneld.pem

$ sudo mkdir -p /etc/flanneld/ssl
$ sudo mv flanneld*.pem /etc/flanneld/ssl
$ rm flanneld.csr flanneld-csr.json
Write the cluster Pod network configuration to etcd
Note: this step is only needed the first time the Flannel network is deployed. Do not write it again when deploying Flannel on the other nodes!
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  set ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'

The current flanneld version (v0.7.1) does not support etcd v3, so the configuration key and subnet data are written with the etcd v2 API;
the Pod network written here (${CLUSTER_CIDR}, 172.30.0.0/16) must match the --cluster-cidr option of kube-controller-manager;

Install and configure flanneld
Download flanneld

$ mkdir flannel
$ wget https://github.com/coreos/flannel/releases/download/v0.7.1/flannel-v0.7.1-linux-amd64.tar.gz
$ tar -xzvf flannel-v0.7.1-linux-amd64.tar.gz -C flannel
$ sudo cp flannel/{flanneld,mk-docker-opts.sh} /root/local/bin
$
Create the flanneld systemd unit file
$ cat > flanneld.service <<EOF
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
ExecStart=/root/local/bin/flanneld \\
  -etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  -etcd-certfile=/etc/flanneld/ssl/flanneld.pem \\
  -etcd-keyfile=/etc/flanneld/ssl/flanneld-key.pem \\
  -etcd-endpoints=${ETCD_ENDPOINTS} \\
  -etcd-prefix=${FLANNEL_ETCD_PREFIX}
ExecStartPost=/root/local/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
EOF

The mk-docker-opts.sh script writes the Pod subnet allocated to flanneld into /run/flannel/docker; docker later uses the values in this file to configure the docker0 bridge;
flanneld uses the interface of the system default route to communicate with the other nodes; on machines with several network interfaces (e.g. internal and public), the -iface option can select the interface (the unit file above does not set it);

See flanneld.service for the complete unit file.
Start flanneld
$ sudo cp flanneld.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable flanneld
$ sudo systemctl start flanneld
$ systemctl status flanneld
$
Check the flanneld service
$ journalctl -u flanneld | grep 'Lease acquired'
$ ifconfig flannel.1
$
Check the Pod subnets assigned to each flanneld
$ # view the cluster Pod network (/16)
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  get ${FLANNEL_ETCD_PREFIX}/config
{ "Network": "172.30.0.0/16", "SubnetLen": 24, "Backend": {"Type": "vxlan"} }

$ # list the allocated Pod subnets (/24)
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  ls ${FLANNEL_ETCD_PREFIX}/subnets
2017-07-05 17:27:46.007743 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/kubernetes/network/subnets/172.30.43.0-24
/kubernetes/network/subnets/172.30.44.0-24
/kubernetes/network/subnets/172.30.45.0-24

$ # view the IP and network parameters of the flanneld that owns a given Pod subnet
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.43.0-24
2017-07-05 17:28:34.116874 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
{"PublicIP":"192.168.1.207","BackendType":"vxlan","BackendData":{"VtepMAC":"52:73:8c:2f:ae:3c"}}
Make sure the Pod subnets of all nodes can reach each other
After deploying Flannel on every node, list the allocated Pod subnets (/24):
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  ls ${FLANNEL_ETCD_PREFIX}/subnets
/kubernetes/network/subnets/172.30.43.0-24
/kubernetes/network/subnets/172.30.44.0-24
/kubernetes/network/subnets/172.30.45.0-24
The three nodes were assigned the Pod subnets 172.30.43.0/24, 172.30.44.0/24 and 172.30.45.0/24.
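A minimal sketch for checking connectivity from each node, assuming the flannel.1 interface on every node is assigned the .0 address of its /24 (as is typical for the vxlan backend); run it on all three nodes:

$ for ip in 172.30.43.0 172.30.44.0 172.30.45.0; do
    ping -c 1 -W 2 ${ip} > /dev/null && echo "${ip} reachable" || echo "${ip} NOT reachable"
  done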

06 - Deploy the Master Node

Deploy the master node
A kubernetes master node runs the following components:

kube-apiserver
kube-scheduler
kube-controller-manager

For now these three components are deployed on the same machine:

kube-scheduler, kube-controller-manager and kube-apiserver are tightly coupled;
only one kube-scheduler and one kube-controller-manager process may be active at a time; when several are run, a leader has to be elected;

This document deploys a single-machine kubernetes master; it does not set up a highly available master cluster.
A later article is planned to cover deploying an LB whose VIP is used by the clients (kubectl, kubelet, kube-proxy) to reach kube-apiserver, giving a highly available master cluster.
The master node communicates with the Pods on the worker nodes over the Pod network, so Flannel also has to be deployed on the master node.
Variables used
The variables used in this document are defined as follows:
$ export MASTER_IP=192.168.1.206 # replace with the IP of the master machine being deployed

$ # other global variables used later: SERVICE_CIDR, CLUSTER_CIDR, NODE_PORT_RANGE, ETCD_ENDPOINTS, BOOTSTRAP_TOKEN

$ source /root/local/bin/environment.sh

$
Download the latest binaries
There are two ways to download:

1. Download the release tarball from the github release page, extract it, and run the download script:

$ wget https://github.com/kubernetes/kubernetes/releases/download/v1.6.2/kubernetes.tar.gz
$ tar -xzvf kubernetes.tar.gz
...
$ cd kubernetes
$ ./cluster/get-kube-binaries.sh
...

2. Download the client or server tarball from the CHANGELOG page.

The server tarball kubernetes-server-linux-amd64.tar.gz already contains the client binary (kubectl), so kubernetes-client-linux-amd64.tar.gz does not need to be downloaded separately;

$ # wget https://dl.k8s.io/v1.6.2/kubernetes-client-linux-amd64.tar.gz
$ wget https://dl.k8s.io/v1.6.2/kubernetes-server-linux-amd64.tar.gz
$ tar -xzvf kubernetes-server-linux-amd64.tar.gz
...
$ cd kubernetes
$ tar -xzvf kubernetes-src.tar.gz

Copy the binaries to the target path:

$ sudo cp -r server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /root/local/bin/
$
Install and configure flanneld
See 05 - Deploy the Flannel Network.
Create the kubernetes certificate
Create the kubernetes certificate signing request:
$ cat > kubernetes-csr.json <<EOF

{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "${MASTER_IP}",
    "${CLUSTER_KUBERNETES_SVC_IP}",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

EOF

When the hosts field is not empty, it must list the IPs or domain names authorized to use the certificate, so the IP of the master node being deployed is included above;

it must also include the Service Cluster IP of the kubernetes service registered by kube-apiserver, which is normally the first IP of the network given by the kube-apiserver --service-cluster-ip-range option, e.g. "10.254.0.1";

$ kubectl get svc kubernetes
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   10.254.0.1   <none>        443/TCP   1d

Generate the kubernetes certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes

$ ls kubernetes*
kubernetes.csr kubernetes-csr.json kubernetes-key.pem kubernetes.pem

$ sudo mkdir -p /etc/kubernetes/ssl/
$ sudo mv kubernetes*.pem /etc/kubernetes/ssl/
$ rm kubernetes.csr kubernetes-csr.json
Configure and start kube-apiserver
Create the client token file used by kube-apiserver
When kubelet starts for the first time it sends a TLS bootstrapping request to kube-apiserver. kube-apiserver checks that the token in the request matches its configured token.csv; if it does, it automatically issues a certificate and key for the kubelet. (This token file only needs to be created once, on the master.)
$ # the sourced environment.sh defines the BOOTSTRAP_TOKEN variable
$ cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
$ mv token.csv /etc/kubernetes/
$
Create the kube-apiserver systemd unit file
$ cat > kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
ExecStart=/root/local/bin/kube-apiserver \\
  --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
  --advertise-address=${MASTER_IP} \\
  --bind-address=${MASTER_IP} \\
  --insecure-bind-address=${MASTER_IP} \\
  --authorization-mode=RBAC \\
  --runtime-config=rbac.authorization.k8s.io/v1alpha1 \\
  --kubelet-https=true \\
  --experimental-bootstrap-token-auth \\
  --token-auth-file=/etc/kubernetes/token.csv \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --service-node-port-range=${NODE_PORT_RANGE} \\
  --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --etcd-servers=${ETCD_ENDPOINTS} \\
  --enable-swagger-ui=true \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/lib/audit.log \\
  --event-ttl=1h \\
  --v=2
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

kube-apiserver 1.6 and later use the etcd v3 API and storage format;
--authorization-mode=RBAC enables RBAC authorization on the secure port and rejects unauthorized requests;
kube-scheduler and kube-controller-manager are normally deployed on the same machine as kube-apiserver and talk to it over the insecure port;
kubelet, kube-proxy and kubectl run on other nodes; when they access kube-apiserver over the secure port they must first authenticate with TLS certificates and then pass RBAC authorization;
kube-proxy and kubectl pass RBAC authorization by embedding the appropriate User and Group in their certificates;
when kubelet TLS bootstrapping is used, the --kubelet-certificate-authority, --kubelet-client-certificate and --kubelet-client-key options must not be set, otherwise kube-apiserver later fails to verify kubelet certificates with "x509: certificate signed by unknown authority";
--admission-control must include ServiceAccount, otherwise deploying the cluster add-ons fails;
--bind-address must not be 127.0.0.1;
--service-cluster-ip-range specifies the Service Cluster IP range, which must not be routable;
--service-node-port-range=${NODE_PORT_RANGE} specifies the NodePort range;
by default kubernetes objects are stored under the /registry path in etcd; this can be changed with the --etcd-prefix option;
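A minimal sketch for peeking at that prefix once kube-apiserver is running, reusing the etcd certificates from section 03 (the v3 API is needed because kube-apiserver 1.6 stores its data via etcd v3):

$ ETCDCTL_API=3 /root/local/bin/etcdctl \
    --endpoints=${ETCD_ENDPOINTS} \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/etcd/ssl/etcd.pem \
    --key=/etc/etcd/ssl/etcd-key.pem \
    get /registry --prefix --keys-only | head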

See kube-apiserver.service for the complete unit file.
Start kube-apiserver
$ sudo cp kube-apiserver.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-apiserver
$ sudo systemctl start kube-apiserver
$ sudo systemctl status kube-apiserver
$
Configure and start kube-controller-manager
Create the kube-controller-manager systemd unit file
$ cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/root/local/bin/kube-controller-manager \\
  --address=127.0.0.1 \\
  --master=http://${MASTER_IP}:8080 \\
  --allocate-node-cidrs=true \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --cluster-cidr=${CLUSTER_CIDR} \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \\
  --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --root-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

--address must be 127.0.0.1, because the current kube-apiserver expects scheduler and controller-manager to run on the same machine; otherwise:

$ kubectl get componentstatuses
NAME                 STATUS      MESSAGE                                                                                        ERROR
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: getsockopt: connection refused
scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused

See: https://github.com/kubernetes-incubator/bootkube/issues/64

--master=http://{MASTER_IP}:8080 talks to kube-apiserver over the insecure 8080 port;

--cluster-cidr is the CIDR range of the cluster Pods; it must be routable between the Nodes (guaranteed by flanneld);

--service-cluster-ip-range is the CIDR range of the cluster Services; it must not be routable between Nodes and must match the corresponding kube-apiserver option;

--cluster-signing-* are the certificate and key used to sign the certificates created for TLS bootstrapping;

--root-ca-file is used to verify the kube-apiserver certificate; only when it is set is this CA certificate placed into the ServiceAccount of Pod containers;

--leader-elect=true elects a single active kube-controller-manager process when a multi-machine master cluster is deployed;
See kube-controller-manager.service for the complete unit file.
Start kube-controller-manager
$ sudo cp kube-controller-manager.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-controller-manager
$ sudo systemctl start kube-controller-manager
$
Configure and start kube-scheduler
Create the kube-scheduler systemd unit file
$ cat > kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/root/local/bin/kube-scheduler \\
  --address=127.0.0.1 \\
  --master=http://${MASTER_IP}:8080 \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

--address must be 127.0.0.1, because the current kube-apiserver expects scheduler and controller-manager to run on the same machine;
--master=http://{MASTER_IP}:8080 talks to kube-apiserver over the insecure 8080 port;
--leader-elect=true elects a single active kube-scheduler process when a multi-machine master cluster is deployed;

See kube-scheduler.service for the complete unit file.
Start kube-scheduler
$ sudo cp kube-scheduler.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-scheduler
$ sudo systemctl start kube-scheduler
$
Verify the master node
$ kubectl get componentstatuses
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}
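The secure port can be checked as well, which also exercises the certificates created earlier (a minimal sketch using the admin client certificate from section 04):

$ curl --cacert /etc/kubernetes/ssl/ca.pem \
       --cert /etc/kubernetes/ssl/admin.pem \
       --key /etc/kubernetes/ssl/admin-key.pem \
       https://${MASTER_IP}:6443/version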

07 - Deploy the Nodes

Deploy a Node
A kubernetes Node runs the following components:

flanneld
docker
kubelet
kube-proxy

Variables used
The variables used in this document are defined as follows:
$ # replace with the IP of any kubernetes master machine
$ export MASTER_IP=192.168.1.206
$ export KUBE_APISERVER="https://${MASTER_IP}:6443"

$ # IP of the node being deployed
$ export NODE_IP=192.168.1.206

$ # other global variables used later: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR, CLUSTER_DNS_SVC_IP, CLUSTER_DNS_DOMAIN, SERVICE_CIDR
$ source /root/local/bin/environment.sh
$

Install and configure flanneld
See 05 - Deploy the Flannel Network.

Install and configure docker
Download the latest docker binaries
$ wget https://get.docker.com/builds/Linux/x86_64/docker-17.04.0-ce.tgz
$ tar -xvf docker-17.04.0-ce.tgz
$ cp docker/docker* /root/local/bin
$ cp docker/completion/bash/docker /etc/bash_completion.d/
$

Create the docker systemd unit file
$ cat docker.service

[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
Environment="PATH=/root/local/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/root/local/bin/dockerd --log-level=error $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target

dockerd invokes other docker binaries at runtime (such as docker-proxy), so the directory containing the docker commands must be added to PATH;
when flanneld starts it writes the network configuration into the DOCKER_NETWORK_OPTIONS variable in /run/flannel/docker; passing that variable on the dockerd command line configures the docker0 bridge;
if several EnvironmentFile entries are specified, /run/flannel/docker must come last (so that docker0 uses the bip generated by flanneld);
the --iptables and --ip-masq options, which are enabled by default, must not be disabled;
on reasonably new kernels the overlay storage driver is recommended;
starting with version 1.13 docker may set the default policy of the iptables FORWARD chain to DROP, which makes pinging Pod IPs on other Nodes fail; if that happens, set the policy to ACCEPT manually:

$ sudo iptables -P FORWARD ACCEPT
$
To speed up image pulls, a domestic registry mirror can be used and the download concurrency increased. (If dockerd is already running, it must be restarted for this to take effect.)

$ cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn", "hub-mirror.c.163.com"],
  "max-concurrent-downloads": 10
}

See docker.service for the complete unit file.

Start dockerd
$ sudo cp docker.service /etc/systemd/system/docker.service
$ sudo systemctl daemon-reload
$ sudo systemctl stop firewalld
$ sudo systemctl disable firewalld
$ sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
$ sudo systemctl enable docker
$ sudo systemctl start docker
$

firewalld must be disabled, otherwise duplicate iptables rules may be created;
it is best to clean up old iptables rules and chains;

Check the docker service
$ docker version
$
Install and configure kubelet
When kubelet starts for the first time it sends a TLS bootstrapping request to kube-apiserver. Before that can work, the kubelet-bootstrap user from the bootstrap token file must be bound to the system:node-bootstrapper role so that kubelet is allowed to create certificate signing requests: (this only has to be done once, on the first node)
$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
$

--user=kubelet-bootstrap is the user name specified in /etc/kubernetes/token.csv; the same user name is written into /etc/kubernetes/bootstrap.kubeconfig;
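The binding can be inspected afterwards (a quick check, nothing more):

$ kubectl get clusterrolebinding kubelet-bootstrap -o yaml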

Download the latest kubelet and kube-proxy binaries
$ wget https://dl.k8s.io/v1.6.2/kubernetes-server-linux-amd64.tar.gz

$ tar -xzvf kubernetes-server-linux-amd64.tar.gz

$ cd kubernetes

$ tar -xzvf kubernetes-src.tar.gz

$ sudo cp -r ./server/bin/{kube-proxy,kubelet} /root/local/bin/

$

Create the kubelet bootstrapping kubeconfig file
$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig

$ # set client credentials
$ kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=bootstrap.kubeconfig

$ # set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig

$ # make it the default context
$ kubectl config use-context default --kubeconfig=bootstrap.kubeconfig

$ mv bootstrap.kubeconfig /etc/kubernetes/

with --embed-certs=true the certificate-authority certificate is embedded into the generated bootstrap.kubeconfig;
no key or certificate is specified in the client credentials; they are generated by kube-apiserver later;

Create the kubelet systemd unit file
$ sudo mkdir /var/lib/kubelet # the working directory must be created first
$ cat > kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/root/local/bin/kubelet \\
  --address=${NODE_IP} \\
  --hostname-override=${NODE_IP} \\
  --pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest \\
  --experimental-bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
  --require-kubeconfig \\
  --cert-dir=/etc/kubernetes/ssl \\
  --cluster-dns=${CLUSTER_DNS_SVC_IP} \\
  --cluster-domain=${CLUSTER_DNS_DOMAIN} \\
  --hairpin-mode promiscuous-bridge \\
  --allow-privileged=true \\
  --serialize-image-pulls=false \\
  --logtostderr=true \\
  --v=2
ExecStopPost=/sbin/iptables -A INPUT -s 10.0.0.0/8 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -s 172.16.0.0/12 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -s 192.168.0.0/16 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -p tcp --dport 4194 -j DROP
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

--address must not be 127.0.0.1, otherwise Pods calling the kubelet API later will fail, because 127.0.0.1 inside a Pod points to the Pod itself rather than to the kubelet;
if --hostname-override is set, kube-proxy must be given the same value, otherwise the Node will not be found;
--experimental-bootstrap-kubeconfig points to the bootstrap kubeconfig; kubelet uses the user name and token in that file to send the TLS bootstrapping request to kube-apiserver;
once the administrator approves the CSR, kubelet automatically writes the certificate and key (kubelet-client.crt and kubelet-client.key) into the --cert-dir directory and creates the file specified by --kubeconfig;
it is recommended to put the kube-apiserver address in the --kubeconfig file; if --api-servers is not given, --require-kubeconfig must be set so that the kube-apiserver address is read from the kubeconfig file, otherwise kubelet starts without finding an API Server (the log says so) and kubectl get nodes will not list the Node;
--cluster-dns specifies the kubedns Service IP (it can be allocated in advance and supplied later when the kubedns service is created); --cluster-domain specifies the domain suffix; both parameters must be set for them to take effect;
kubelet's cAdvisor listens on port 4194 on all interfaces by default, which is unsafe for machines exposed to the internet; the iptables rules in the ExecStopPost entries only allow machines on the internal networks to reach port 4194;

See kubelet.service for the complete unit file.

Start kubelet
$ sudo cp kubelet.service /etc/systemd/system/kubelet.service
$ sudo systemctl daemon-reload
$ sudo systemctl enable kubelet
$ sudo systemctl start kubelet
$ systemctl status kubelet
$

Approve the kubelet TLS certificate request
When kubelet starts for the first time it sends a certificate signing request to kube-apiserver; the request must be approved before kubernetes adds the Node to the cluster.
List the pending CSR requests:
$ kubectl get csr
NAME        AGE       REQUESTOR           CONDITION
csr-2b308   4m        kubelet-bootstrap   Pending

$ kubectl get nodes
No resources found.
Approve the CSR:
$ kubectl certificate approve csr-2b308
certificatesigningrequest "csr-2b308" approved

$ kubectl get nodes
NAME        STATUS    AGE       VERSION
10.64.3.7   Ready     49m       v1.6.2

The kubelet kubeconfig file and key pair are generated automatically:
$ ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2284 Apr  7 02:07 /etc/kubernetes/kubelet.kubeconfig

$ ls -l /etc/kubernetes/ssl/kubelet*
-rw-r--r-- 1 root root 1046 Apr  7 02:07 /etc/kubernetes/ssl/kubelet-client.crt
-rw------- 1 root root  227 Apr  7 02:04 /etc/kubernetes/ssl/kubelet-client.key
-rw-r--r-- 1 root root 1103 Apr  7 02:07 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1675 Apr  7 02:07 /etc/kubernetes/ssl/kubelet.key

Configure kube-proxy
Create the kube-proxy certificate
Create the kube-proxy certificate signing request:
$ cat kube-proxy-csr.json

{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

CN sets the certificate's User to system:kube-proxy;
the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the Proxy-related kube-apiserver APIs;
the hosts attribute is an empty list;

Generate the kube-proxy client certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

$ ls kube-proxy*
kube-proxy.csr kube-proxy-csr.json kube-proxy-key.pem kube-proxy.pem

$ sudo mv kube-proxy*.pem /etc/kubernetes/ssl/
$ rm kube-proxy.csr kube-proxy-csr.json
$

Create the kube-proxy kubeconfig file
$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig

$ # set client credentials
$ kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
  --client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig

$ # set the context
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig

$ # make it the default context
$ kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

$ mv kube-proxy.kubeconfig /etc/kubernetes/

--embed-certs=true is used for both the cluster parameters and the client credentials, so the contents of the certificate-authority, client-certificate and client-key files are embedded into the generated kube-proxy.kubeconfig;
the CN of kube-proxy.pem is system:kube-proxy; the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the Proxy-related kube-apiserver APIs;

Create the kube-proxy systemd unit file
$ sudo mkdir -p /var/lib/kube-proxy # the working directory must be created first
$ cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/root/local/bin/kube-proxy \\
  --bind-address=${NODE_IP} \\
  --hostname-override=${NODE_IP} \\
  --cluster-cidr=${SERVICE_CIDR} \\
  --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig \\
  --logtostderr=true \\
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

--hostname-override must match the value given to kubelet, otherwise kube-proxy will not find the Node and will not create any iptables rules;
--cluster-cidr must match the --service-cluster-ip-range option of kube-apiserver;
kube-proxy uses --cluster-cidr to tell cluster-internal traffic from external traffic; only when --cluster-cidr or --masquerade-all is set does kube-proxy SNAT requests to Service IPs;
the file given by --kubeconfig embeds the kube-apiserver address, user name, certificate and key used for requests and authentication;
the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the Proxy-related kube-apiserver APIs;
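Once kube-proxy is running and at least one Service exists, the rules it programs can be inspected (a minimal sketch; chain names such as KUBE-SERVICES are created by kube-proxy in iptables mode):

$ sudo iptables -t nat -S | grep KUBE | head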

See kube-proxy.service for the complete unit file.

Start kube-proxy
$ sudo cp kube-proxy.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-proxy
$ sudo systemctl start kube-proxy
$ systemctl status kube-proxy
$

Verify cluster functionality
Definition file:
$ cat nginx-ds.yml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ds
  labels:
    app: nginx-ds
spec:
  type: NodePort
  selector:
    app: nginx-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nginx-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        app: nginx-ds
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Create the Pods and the Service:
$ kubectl create -f nginx-ds.yml
service "nginx-ds" created
daemonset "nginx-ds" created

Check the node status
$ kubectl get nodes
NAME            STATUS    AGE       VERSION
192.168.1.206   Ready     1d        v1.6.2
192.168.1.207   Ready     1d        v1.6.2
192.168.1.208   Ready     1d        v1.6.2

Everything is fine when all nodes are Ready.
Check Pod IP connectivity across Nodes
$ kubectl get pods -o wide | grep nginx-ds
nginx-ds-6ktz8   1/1   Running   0   5m   172.30.43.19   192.168.1.206
nginx-ds-6ktz9   1/1   Running   0   5m   172.30.44.20   192.168.1.207

The nginx-ds Pod IPs are 172.30.43.19 and 172.30.44.20; ping both IPs from every Node to confirm connectivity.
Check Service IP and port reachability
$ kubectl get svc | grep nginx-ds
nginx-ds   10.254.136.178   <nodes>   80:8744/TCP   11m

This shows:

Service IP: 10.254.136.178
Service port: 80
NodePort: 8744

On every Node run:
$ curl 10.254.136.178 # the Service IP from `kubectl get svc | grep nginx-ds`
$

The nginx welcome page is expected.
Check the Service's NodePort reachability
On every Node run:
$ export NODE_IP=192.168.1.207 # IP of the current Node
$ export NODE_PORT=8744 # the NodePort mapped to port 80 in the `kubectl get svc | grep nginx-ds` output
$ curl ${NODE_IP}:${NODE_PORT}
$

The nginx welcome page is expected.

08 - Deploy the DNS Add-on

Deploy the kubedns add-on
Upstream directory: kubernetes/cluster/addons/dns
Files used:
$ ls *.yaml *.base
kubedns-cm.yaml kubedns-sa.yaml kubedns-controller.yaml.base kubedns-svc.yaml.base
The already-modified yaml files can be found in: dns
Predefined RoleBinding
The predefined RoleBinding system:kube-dns binds the kube-dns ServiceAccount in the kube-system namespace to the system:kube-dns Role, which grants access to the DNS-related kube-apiserver APIs;
$ kubectl get clusterrolebindings system:kube-dns -o yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-06T17:40:47Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-dns
  resourceVersion: "56"
  selfLink: /apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindings/system%3Akube-dns
  uid: 2b55cdbe-1af0-11e7-af35-8cdcd4b3be48
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-dns
subjects:
- kind: ServiceAccount
  name: kube-dns
  namespace: kube-system

The Pods defined in kubedns-controller.yaml use the kube-dns ServiceAccount defined in kubedns-sa.yaml, so they have access to the DNS-related kube-apiserver APIs;
Configure the kube-dns ServiceAccount
No changes needed;
Configure the kube-dns Service
$ diff kubedns-svc.yaml.base kubedns-svc.yaml
30c30
<   clusterIP: __PILLAR__DNS__SERVER__
---
>   clusterIP: 10.254.0.2

spec.clusterIP must be set to the value of the cluster variable CLUSTER_DNS_SVC_IP; this IP has to match the kubelet --cluster-dns parameter;

Configure the kube-dns Deployment
$ diff kubedns-controller.yaml.base kubedns-controller.yaml
58c58
<         image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
---
>         image: xuejipeng/k8s-dns-kube-dns-amd64:v1.14.1
88c88
<         - --domain=__PILLAR__DNS__DOMAIN__.
---
>         - --domain=cluster.local.
92c92
<         __PILLAR__FEDERATIONS__DOMAIN__MAP__
---
>         #__PILLAR__FEDERATIONS__DOMAIN__MAP__
110c110
<         image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
---
>         image: xuejipeng/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
129c129
<         - --server=/__PILLAR__DNS__DOMAIN__/127.0.0.1#10053
---
>         - --server=/cluster.local./127.0.0.1#10053
148c148
<         image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1
---
>         image: xuejipeng/k8s-dns-sidecar-amd64:v1.14.1
161,162c161,162
<         - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,A
<         - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,A
---
>         - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
>         - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A

--domain is the value of the cluster variable CLUSTER_DNS_DOMAIN;
the kube-dns ServiceAccount, which already has the system RoleBinding, is used; it has access to the DNS-related kube-apiserver APIs;

Apply all definition files
$ pwd
/root/kubernetes-git/cluster/addons/dns

$ ls *.yaml
kubedns-cm.yaml kubedns-controller.yaml kubedns-sa.yaml kubedns-svc.yaml

$ kubectl create -f .
$
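Before testing resolution, it is worth confirming that the kube-dns Pod is up (a quick check using the labels from the manifests below):

$ kubectl get pods -n kube-system -l k8s-app=kube-dns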
Check kubedns functionality
Create a new Deployment

$ cat my-nginx.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

$ kubectl create -f my-nginx.yaml
$
Expose the Deployment to create the my-nginx Service
$ kubectl expose deploy my-nginx

$ kubectl get services --all-namespaces | grep my-nginx
default   my-nginx   10.254.86.48   <none>   80/TCP   1d

Create another Pod and check whether its /etc/resolv.conf contains the --cluster-dns and --cluster-domain configured for kubelet, and whether the my-nginx Service resolves to the ClusterIP 10.254.86.48 shown above.
$ cat pod-nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80

$ kubectl create -f pod-nginx.yaml
$ kubectl exec nginx -i -t -- /bin/bash

root@nginx:/# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local tjwq01.ksyun.com
options ndots:5

root@nginx:/# ping my-nginx
PING my-nginx.default.svc.cluster.local (10.254.86.48): 48 data bytes
^C--- my-nginx.default.svc.cluster.local ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

root@nginx:/# ping kubernetes
PING kubernetes.default.svc.cluster.local (10.254.0.1): 48 data bytes
^C--- kubernetes.default.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

root@nginx:/# ping kube-dns.kube-system.svc.cluster.local
PING kube-dns.kube-system.svc.cluster.local (10.254.0.2): 48 data bytes
^C--- kube-dns.kube-system.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
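The 100% packet loss is expected: Service IPs are virtual and generally do not answer ICMP. What matters is that each name resolves to the expected ClusterIP in the PING line. A minimal extra check that avoids ping entirely (assuming the glibc-based nginx image, which ships getent):

root@nginx:/# getent hosts my-nginx kubernetes

The ClusterIPs 10.254.86.48 and 10.254.0.1 shown above should be printed.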

Attachments:
kubedns-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
kubedns-controller.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      containers:
      - name: kubedns
        image: xuejipeng/k8s-dns-kube-dns-amd64:v1.14.1
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        #__PILLAR__FEDERATIONS__DOMAIN__MAP__
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
      - name: dnsmasq
        image: xuejipeng/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --log-facility=-
        - --server=/cluster.local./127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
      - name: sidecar
        image: xuejipeng/k8s-dns-sidecar-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
      dnsPolicy: Default  # Don't use cluster DNS.
      serviceAccountName: kube-dns
kubedns-sa.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

kubedns-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.254.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
09 - Deploy the Dashboard Add-on

Deploy the dashboard add-on
Upstream directory: kubernetes/cluster/addons/dashboard
Files used:
$ ls *.yaml
dashboard-controller.yaml dashboard-rbac.yaml dashboard-service.yaml

dashboard-rbac.yaml is a new file that defines the RoleBinding used by the dashboard.

Because kube-apiserver has RBAC authorization enabled and the upstream dashboard-controller.yaml does not define an authorized ServiceAccount, later calls to the kube-apiserver APIs are rejected and the web UI shows an error prompt.

The solution is to define a ServiceAccount named dashboard and bind it to the Cluster Role view; see the dashboard-rbac.yaml file for the details.
The already-modified yaml files can be found in: dashboard
Configure dashboard-service
$ diff dashboard-service.yaml.orig dashboard-service.yaml
10a11
>   type: NodePort

Setting the port type to NodePort lets the outside world reach the dashboard at nodeIP:nodePort;

Configure dashboard-controller
20a21
>       serviceAccountName: dashboard
23c24
<         image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
---
>         image: cokabug/kubernetes-dashboard-amd64:v1.6.0

The custom ServiceAccount named dashboard is used;

Apply all definition files
$ pwd
/root/kubernetes/cluster/addons/dashboard

$ ls *.yaml
dashboard-controller.yaml dashboard-rbac.yaml dashboard-service.yaml

$ kubectl create -f .
$
Check the results
Check the allocated NodePort
$ kubectl get services kubernetes-dashboard -n kube-system
NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.254.224.130   <nodes>       80:30312/TCP   25s

NodePort 30312 is mapped to port 80 of the dashboard pod;
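A quick reachability check from any machine that can reach the nodes (using the NodePort allocated above; the port number will differ per cluster):

$ curl -s http://192.168.1.206:30312/ | head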

Check the controller
$ kubectl get deployment kubernetes-dashboard -n kube-system
NAME                   DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
kubernetes-dashboard   1         1         1            1           3m

$ kubectl get pods -n kube-system | grep dashboard
kubernetes-dashboard-1339745653-pmn6z   1/1   Running   0   4m
Access the dashboard

The kubernetes-dashboard service exposes a NodePort, so the dashboard can be reached at http://NodeIP:nodePort;
through kube-apiserver;
through kubectl proxy:

Access the dashboard through kubectl proxy
Start the proxy
$ kubectl proxy --address='192.168.1.206' --port=8086 --accept-hosts='^*$'
Starting to serve on 192.168.1.206:8086

The --accept-hosts option is required, otherwise the browser gets "Unauthorized" when opening the dashboard page;

Browse to http://192.168.1.206:8086/ui which redirects to: http://192.168.1.206:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default
Access the dashboard through kube-apiserver
Get the cluster service URLs
$ kubectl cluster-info
Kubernetes master is running at https://192.168.1.206:6443
KubeDNS is running at https://192.168.1.206:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://192.168.1.206:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

Because kube-apiserver has RBAC authorization enabled and the browser accesses it with an anonymous certificate, requests to the secure port fail authorization. Use the insecure port instead:
Browse to: http://192.168.1.206:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

Without the Heapster add-on, the dashboard cannot yet show CPU, memory and other metric graphs for Pods and Nodes.

Attachments:

dashboard-controller.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  selector:
    matchLabels:
      k8s-app: kubernetes-dashboard
  template:
    metadata:
      labels:
        k8s-app: kubernetes-dashboard
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      serviceAccountName: dashboard
      containers:
      - name: kubernetes-dashboard
        image: cokabug/kubernetes-dashboard-amd64:v1.6.0
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        ports:
        - containerPort: 9090
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 30
          timeoutSeconds: 30
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
dashboard-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: dashboard
  namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
  name: dashboard
subjects:
- kind: ServiceAccount
  name: dashboard
  namespace: kube-system
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
dashboard-service.yaml

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-dashboard
  namespace: kube-system
  labels:
    k8s-app: kubernetes-dashboard
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  type: NodePort
  selector:
    k8s-app: kubernetes-dashboard
  ports:
  - port: 80
    targetPort: 9090
10 - Deploy the Heapster Add-on

Deploy the heapster add-on
Download the latest heapster release from the heapster release page:

$ wget https://github.com/kubernetes/heapster/archive/v1.3.0.zip
$ unzip v1.3.0.zip
$ mv v1.3.0.zip heapster-1.3.0
$
Upstream directory: heapster-1.3.0/deploy/kube-config/influxdb
$ cd heapster-1.3.0/deploy/kube-config/influxdb
$ ls *.yaml
grafana-deployment.yaml heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml
grafana-service.yaml heapster-rbac.yaml influxdb-cm.yaml influxdb-service.yaml

heapster-rbac.yaml and influxdb-cm.yaml are new files, defining the RoleBinding and the influxdb configuration respectively;

The already-modified yaml files can be found in: heapster
Configure grafana-deployment
$ diff grafana-deployment.yaml.orig grafana-deployment.yaml
16c16
<         image: gcr.io/google_containers/heapster-grafana-amd64:v4.0.2
---
>         image: lvanneo/heapster-grafana-amd64:v4.0.2
40,41c40,41
<           # value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
<           value: /
---
>           value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
>           # value: /

If the grafana dashboard will be accessed through kube-apiserver or kubectl proxy, GF_SERVER_ROOT_URL must be set to /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/, otherwise grafana later complains that the page http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/api/dashboards/home cannot be found;

Configure heapster-deployment
$ diff heapster-deployment.yaml.orig heapster-deployment.yaml
13a14
>       serviceAccountName: heapster
16c17
<         image: gcr.io/google_containers/heapster-amd64:v1.3.0-beta.1
---
>         image: lvanneo/heapster-amd64:v1.3.0-beta.1

The custom ServiceAccount named heapster is used;
配置 influxdb-deployment
influxdb 官方建议使用命令行或 HTTP API 接口来查询数据库,从 v1.1.0版本开始默认关闭 admin UI,将在后续版本中移除 admin UI 插件。
开启镜像中 admin UI的办法如下:先导出镜像中的 influxdb 配置文件,开启admin 插件后,再将配置文件内容写入 ConfigMap,最后挂载到镜像中,达到覆盖原始配置的目的。相关步骤如下:
注意:无需自己导出、修改和创建ConfigMap,可以直接使用放在 manifests 目录下的 ConfigMap文件
$# 导出镜像中的 influxdb 配置文件

$ docker run --rm --entrypoint 'cat' -ti lvanneo/heapster-influxdb-amd64:v1.1.1/etc/config.toml>config.toml.orig

$ cp config.toml.orig config.toml

$ # 修改:启用 admin 接口

$ vim config.toml

$ diff config.toml.orig config.toml

35c35

< enabled =false

---

> enabled =true

$ # 将修改后的配置写入到 ConfigMap对象中

$ kubectl create configmap influxdb-config --from-file=config.toml -n kube-system

configmap "influxdb-config" created

$ # mount the configuration file from the ConfigMap into the Pod to override the original configuration

$ diff influxdb-deployment.yaml.orig influxdb-deployment.yaml

16c16

< image:gcr.io/google_containers/heapster-influxdb-amd64:v1.1.1

---

> image:lvanneo/heapster-influxdb-amd64:v1.1.1

19a20,21

> - mountPath: /etc/

> name:influxdb-config

22a25,27

> - name: influxdb-config

> configMap:

> name: influxdb-config
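Before rolling out the deployment, you can confirm that the ConfigMap landed in the cluster (a small check using only standard kubectl commands; the grep pattern is just an illustration to show the [admin] section of the embedded config.toml):

$ kubectl get configmap influxdb-config -n kube-system
$ kubectl get configmap influxdb-config -n kube-system -o yaml | grep -A 2 '\[admin\]'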
Configure the monitoring-influxdb Service
$ diff influxdb-service.yaml.orig influxdb-service.yaml

12a13

> type: NodePort

15a17,20

> name: http

> - port: 8083

> targetPort: 8083

> name: admin

The Service type is set to NodePort, and an extra admin port mapping is added so that the influxdb admin UI can later be reached from a browser;
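Once the Service has been created, the randomly assigned NodePorts can be read back with kubectl (a small sketch using standard jsonpath; the port names http and admin come from the Service definition above):

$ kubectl get svc monitoring-influxdb -n kube-system -o jsonpath='{.spec.ports[?(@.name=="admin")].nodePort}{"\n"}'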

Apply all the definition files
$ pwd

/root/heapster-1.3.0/deploy/kube-config/influxdb

$ ls *.yaml

grafana-deployment.yaml heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml

grafana-service.yaml heapster-rbac.yaml influxdb-cm.yaml influxdb-service.yaml

$ kubectl create -f .

$
Check the results
Check the Deployments
$ kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'

heapster 1 1 1 1 1m

monitoring-grafana 1 1 1 1 1m

monitoring-influxdb 1 1 1 1 1m
Check the Pods
$ kubectl get pods -n kube-system | grep -E 'heapster|monitoring'

heapster-3273315324-tmxbg 1/1 Running 0 11m

monitoring-grafana-2255110352-94lpn 1/1 Running 0 11m

monitoring-influxdb-884893134-3vb6n 1/1 Running 0 11m
Check the kubernetes dashboard UI; it should now show CPU, memory and load utilization graphs for the Nodes and Pods;



Access grafana

Access through kube-apiserver:

Get the monitoring-grafana service URL

$ kubectl cluster-info

Kubernetes master is running at https://10.64.3.7:6443

Heapster is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/heapster

KubeDNS is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns

kubernetes-dashboard is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

monitoring-grafana is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

monitoring-influxdb is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb

$

Because kube-apiserver has RBAC authorization enabled and the browser presents an anonymous certificate, requests to the secure port fail authorization. Access kube-apiserver through the insecure port instead:

Open this URL in a browser: http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

Access through kubectl proxy:

Create the proxy

$ kubectl proxy --address='10.64.3.7' --port=8086 --accept-hosts='^*$'

Starting to serve on 10.64.3.7:8086

Open this URL in a browser: http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana



Access the influxdb admin UI
Get the NodePort mapped to influxdb's http port 8086
$ kubectl get svc -n kube-system | grep influxdb

monitoring-influxdb 10.254.255.183 <nodes> 8086:8670/TCP,8083:8595/TCP 21m
Access the influxdb admin UI through kube-apiserver's insecure port: http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/
In the page's "Connection Settings", enter the node IP as Host and the nodePort mapped to 8086 (8670 in the output above) as Port, then click "Save":
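The same NodePort can also be exercised from the command line with influxdb's standard HTTP query API (a minimal sketch; the nodePort 8670 is taken from the example output above and will differ in your cluster, and auth is disabled in the ConfigMap so no credentials are needed):

$ curl -G 'http://10.64.3.7:8670/query' --data-urlencode 'q=SHOW DATABASES'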



Attachments:
grafana-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: monitoring-grafana
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
task: monitoring
k8s-app: grafana
spec:
containers:
- name: grafana
image: lvanneo/heapster-grafana-amd64:v4.0.2
ports:
- containerPort: 3000
protocol: TCP
volumeMounts:
- mountPath: /var
name: grafana-storage
env:
- name: INFLUXDB_HOST
value: monitoring-influxdb
- name: GRAFANA_PORT
value: "3000"
# The following env variables are required to make Grafana accessible via
# the kubernetes api-server proxy. On production clusters, we recommend
# removing these env variables, setup auth for grafana, and expose the grafana
# service using a LoadBalancer or a public IP.
- name: GF_AUTH_BASIC_ENABLED
value: "false"
- name: GF_AUTH_ANONYMOUS_ENABLED
value: "true"
- name: GF_AUTH_ANONYMOUS_ORG_ROLE
value: Admin
- name: GF_SERVER_ROOT_URL
# If you're only using the API Server proxy, set this value instead:
value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/
#value: /
volumes:
- name: grafana-storage
emptyDir: {}
grafana-service.yaml

apiVersion: v1
kind: Service
metadata:
labels:
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: monitoring-grafana
name: monitoring-grafana
namespace: kube-system
spec:
# In a production setup, we recommend accessing Grafana through an external Loadbalancer
# or through a public IP.
# type: LoadBalancer
# You could also use NodePort to expose the service at a randomly-generated port
ports:
- port : 80
targetPort: 3000
selector:
k8s-app: grafana
heapster-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: heapster
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
task: monitoring
k8s-app: heapster
spec:
serviceAccountName: heapster
containers:
- name: heapster
image: lvanneo/heapster-amd64:v1.3.0-beta.1
imagePullPolicy: IfNotPresent
command:
- /heapster
- --source=kubernetes:https://kubernetes.default
- --sink=influxdb:http://monitoring-influxdb:8086
heapster-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
name: heapster
namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
name: heapster
subjects:
- kind: ServiceAccount
name: heapster
namespace: kube-system
roleRef:
kind: ClusterRole
name: system:heapster
apiGroup: rbac.authorization.k8s.io
heapster-service.yaml

apiVersion: v1
kind: Service
metadata:
labels:
task: monitoring
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: Heapster
name: heapster
namespace: kube-system
spec:
ports:
- port: 80
targetPort: 8082
selector:
k8s-app: heapster
influxdb-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
name: influxdb-config
namespace: kube-system
data:
config.toml: |
reporting-disabled = true
bind-address = ":8088"
[meta]
dir = "/data/meta"
retention-autocreate = true
logging-enabled = true
[data]
dir = "/data/data"
wal-dir = "/data/wal"
query-log-enabled = true
cache-max-memory-size = 1073741824
cache-snapshot-memory-size = 26214400
cache-snapshot-write-cold-duration = "10m0s"
compact-full-write-cold-duration = "4h0m0s"
max-series-per-database = 1000000
max-values-per-tag = 100000
trace-logging-enabled = false
[coordinator]
write-timeout = "10s"
max-concurrent-queries = 0
query-timeout = "0s"
log-queries-after = "0s"
max-select-point = 0
max-select-series = 0
max-select-buckets = 0
[retention]
enabled = true
check-interval = "30m0s"
[admin]
enabled = true
bind-address = ":8083"
https-enabled = false
https-certificate = "/etc/ssl/influxdb.pem"
[shard-precreation]
enabled = true
check-interval = "10m0s"
advance-period = "30m0s"
[monitor]
store-enabled = true
store-database = "_internal"
store-interval = "10s"
[subscriber]
enabled = true
http-timeout = "30s"
insecure-skip-verify = false
ca-certs = ""
write-concurrency = 40
write-buffer-size = 1000
[http]
enabled = true
bind-address = ":8086"
auth-enabled = false
log-enabled = true
write-tracing = false
pprof-enabled = false
https-enabled = false
https-certificate = "/etc/ssl/influxdb.pem"
https-private-key = ""
max-row-limit = 10000
max-connection-limit = 0
shared-secret = ""
realm = "InfluxDB"
unix-socket-enabled = false
bind-socket = "/var/run/influxdb.sock"
[[graphite]]
enabled = false
bind-address = ":2003"
database = "graphite"
retention-policy = ""
protocol = "tcp"
batch-size = 5000
batch-pending = 10
batch-timeout = "1s"
consistency-level = "one"
separator = "."
udp-read-buffer = 0
[[collectd]]
enabled = false
bind-address = ":25826"
database = "collectd"
retention-policy = ""
batch-size = 5000
batch-pending = 10
batch-timeout = "10s"
read-buffer = 0
typesdb = "/usr/share/collectd/types.db"
[[opentsdb]]
enabled = false
bind-address = ":4242"
database = "opentsdb"
retention-policy = ""
consistency-level = "one"
tls-enabled = false
certificate = "/etc/ssl/influxdb.pem"
batch-size = 1000
batch-pending = 5
batch-timeout = "1s"
log-point-errors = true
[[udp]]
enabled = false
bind-address = ":8089"
database = "udp"
retention-policy = ""
batch-size = 5000
batch-pending = 10
read-buffer = 0
batch-timeout = "1s"
precision = ""
[continuous_queries]
log-enabled = true
enabled = true
run-interval = "1s"
influxdb-deployment.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: monitoring-influxdb
namespace: kube-system
spec:
replicas: 1
template:
metadata:
labels:
task: monitoring
k8s-app: influxdb
spec:
containers:
- name: influxdb
image: lvanneo/heapster-influxdb-amd64:v1.1.1
volumeMounts:
- mountPath: /data
name: influxdb-storage
- mountPath: /etc/
name: influxdb-config
volumes:
- name: influxdb-storage
emptyDir: {}
- name: influxdb-config
configMap:
name: influxdb-config
influxdb-service.yaml

apiVersion: v1
kind: Service
metadata:
labels:
task: monitoring
# For use as a Cluster add-on (https://github.com/kubernetes/kubernetes/tree/master/cluster/addons)
# If you are NOT using this as an addon, you should comment out this line.
kubernetes.io/cluster-service: 'true'
kubernetes.io/name: monitoring-influxdb
name: monitoring-influxdb
namespace: kube-system
spec:
type: NodePort
ports:
- port: 8086
targetPort: 8086
name: http
- port: 8083
targetPort: 8083
name: admin
selector:
k8s-app: influxdb
11 - Deploy the EFK Add-on

Deploy the EFK add-on
Official manifests directory: kubernetes/cluster/addons/fluentd-elasticsearch
$ ls *.yaml

es-controller.yaml es-rbac.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml fluentd-es-rbac.yaml

es-rbac.yaml and fluentd-es-rbac.yaml are newly added files that define the
ServiceAccounts and ClusterRoleBindings used by elasticsearch and fluentd;

The already-modified yaml files are available under: EFK
Configure es-controller.yaml
$ diff es-controller.yaml.orig es-controller.yaml

22a23

> serviceAccountName: elasticsearch

24c25

< - image: gcr.io/google_containers/elasticsearch:v2.4.1-2

---

> - image: onlyerich/elasticsearch:v2.4.1-2
Configure es-service.yaml
No changes needed;
Configure fluentd-es-ds.yaml
$ diff fluentd-es-ds.yaml.orig fluentd-es-ds.yaml

23a24

> serviceAccountName: fluentd

26c27

< image:gcr.io/google_containers/fluentd-elasticsearch:1.22

---

> image:onlyerich/fluentd-elasticsearch:1.22
Configure kibana-controller.yaml
$ diff kibana-controller.yaml.orig kibana-controller.yaml

22c22

< image:gcr.io/google_containers/kibana:v4.6.1-1

---

> image: onlyerich/kibana:v4.6.1-1
Label the Nodes
The DaemonSet fluentd-es-v1.22 is only scheduled onto Nodes that carry the label beta.kubernetes.io/fluentd-ds-ready=true, so set this label on every Node that should run fluentd;

$ kubectl get nodes

NAME STATUS AGE VERSION

10.64.3.7 Ready 1d v1.6.2
$ kubectl label nodes 10.64.3.7 beta.kubernetes.io/fluentd-ds-ready=true

node "10.64.3.7" labeled
Apply the definition files
$ pwd

/root/kubernetes/cluster/addons/fluentd-elasticsearch

$ ls *.yaml

es-controller.yaml es-rbac.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml fluentd-es-rbac.yaml

$ kubectl create -f .

$
Check the results
$ kubectl get deployment -n kube-system | grep kibana

kibana-logging 1 1 1 1 2m
$ kubectl get pods -n kube-system | grep -E 'elasticsearch|fluentd|kibana'

elasticsearch-logging-v1-kwc9w 1/1 Running 0 4m

elasticsearch-logging-v1-ws9mk 1/1 Running 0 4m

fluentd-es-v1.22-g76x0 1/1 Running 0 4m

kibana-logging-324921636-ph7sn 1/1 Running 0 4m
$ kubectl get service -n kube-system | grep -E 'elasticsearch|kibana'

elasticsearch-logging 10.254.128.156 <none> 9200/TCP 3m

kibana-logging 10.254.88.109 <none> 5601/TCP 3m
The first time the kibana Pod starts it takes a fairly long time (10-20 minutes) to optimize and cache the status page bundles; you can tail the Pod's log to watch the progress:
$ kubectl logs kibana-logging-324921636-ph7sn -n kube-system -f

ELASTICSEARCH_URL=http://elasticsearch-logging:9200

server.basePath:/api/v1/proxy/namespaces/kube-system/services/kibana-logging

{"type":"log","@timestamp":"2017-04-08T09:30:30Z","tags":["info","optimize"],"pid":7,"message":"Optimizingand
caching bundles for kibana and statusPage. This may take a fewminutes"}

{"type":"log","@timestamp":"2017-04-08T09:44:01Z","tags":["info","optimize"],"pid":7,"message":"Optimizationof
bundles for kibana and statusPage complete in 811.00 seconds"}

{"type":"log","@timestamp":"2017-04-08T09:44:02Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":7,"state":"green","message":"Statuschanged
from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}
Access kibana

Access through kube-apiserver:

Get the kibana-logging service URL

$ kubectl cluster-info

Kubernetes master is running at https://10.64.3.7:6443

Elasticsearch is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging

Heapster is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/heapster

Kibana is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging

KubeDNS is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns

kubernetes-dashboard is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

monitoring-grafana is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

monitoring-influxdb is running at https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb

Because kube-apiserver has RBAC authorization enabled and the browser presents an anonymous certificate, requests to the secure port fail authorization. Access kube-apiserver through the insecure port instead:

Open this URL in a browser: http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/kibana-logging

Access through kubectl proxy:

Create the proxy

$ kubectl proxy --address='10.64.3.7' --port=8086 --accept-hosts='^*$'

Starting to serve on 10.64.3.7:8086

Open this URL in a browser: http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/kibana-logging

On the Settings -> Indices page, create an index (roughly a database in mysql terms): check "Index contains time-based events", keep the default logstash-* pattern, and click Create;



A few minutes after the index is created, the logs aggregated in elasticsearch-logging show up under the Discover menu;
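You can also confirm that fluentd is shipping logs by asking elasticsearch for its indices through the apiserver proxy (a minimal sketch; it reuses the insecure port 8080 and the elasticsearch-logging proxy URL shown by kubectl cluster-info, and _cat/indices is the standard Elasticsearch API):

$ curl -s 'http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging/_cat/indices?v'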



Attachments:
es-controller.yaml

apiVersion: v1
kind: ReplicationController
metadata:
name: elasticsearch-logging-v1
namespace: kube-system
labels:
k8s-app: elasticsearch-logging
version: v1
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
replicas: 2
selector:
k8s-app: elasticsearch-logging
version: v1
template:
metadata:
labels:
k8s-app: elasticsearch-logging
version: v1
kubernetes.io/cluster-service: "true"
spec:
serviceAccountName: elasticsearch
containers:
- image: onlyerich/elasticsearch:v2.4.1-2
name: elasticsearch-logging
resources:
# need more cpu upon initialization, therefore burstable class
limits:
cpu: 1000m
requests:
cpu: 100m
ports:
- containerPort: 9200
name: db
protocol: TCP
- containerPort: 9300
name: transport
protocol: TCP
volumeMounts:
- name: es-persistent-storage
mountPath: /data
env:
- name: "NAMESPACE"
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumes:
- name: es-persistent-storage
emptyDir: {}
es-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
name: elasticsearch
namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
name: elasticsearch
subjects:
- kind: ServiceAccount
name: elasticsearch
namespace: kube-system
roleRef:
kind: ClusterRole
name: view
apiGroup: rbac.authorization.k8s.io
es-service.yaml

apiVersion: v1
kind: Service
metadata:
name: elasticsearch-logging
namespace: kube-system
labels:
k8s-app: elasticsearch-logging
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "Elasticsearch"
spec:
ports:
- port: 9200
protocol: TCP
targetPort: db
selector:
k8s-app: elasticsearch-logging
fluentd-es-ds.yaml

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: fluentd-es-v1.22
namespace: kube-system
labels:
k8s-app: fluentd-es
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
version: v1.22
spec:
template:
metadata:
labels:
k8s-app: fluentd-es
kubernetes.io/cluster-service: "true"
version: v1.22
# This annotation ensures that fluentd does not get evicted if the node
# supports critical pod annotation based priority scheme.
# Note that this does not guarantee admission on the nodes (#40573).
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
serviceAccountName: fluentd
containers:
- name: fluentd-es
image: onlyerich/fluentd-elasticsearch:1.22
command:
- '/bin/sh'
- '-c'
- '/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log'
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
volumeMounts:
- name: varlog
mountPath: /var/log
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
nodeSelector:
beta.kubernetes.io/fluentd-ds-ready: "true"
tolerations:
- key : "node.alpha.kubernetes.io/ismaster"
effect: "NoSchedule"
terminationGracePeriodSeconds: 30
volumes:
- name: varlog
hostPath:
path: /var/log
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
fluentd-es-rbac.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
name: fluentd
namespace: kube-system

---

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1alpha1
metadata:
name: fluentd
subjects:
- kind: ServiceAccount
name: fluentd
namespace: kube-system
roleRef:
kind: ClusterRole
name: view
apiGroup: rbac.authorization.k8s.io
kibana-controller.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: kibana-logging
namespace: kube-system
labels:
k8s-app: kibana-logging
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
spec:
replicas: 1
selector:
matchLabels:
k8s-app: kibana-logging
template:
metadata:
labels:
k8s-app: kibana-logging
spec:
containers:
- name: kibana-logging
image: onlyerich/kibana:v4.6.1-1
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
requests:
cpu: 100m
env:
- name: "ELASTICSEARCH_URL"
value: "http://elasticsearch-logging:9200"
- name: "KIBANA_BASE_URL"
value: "/api/v1/proxy/namespaces/kube-system/services/kibana-logging"
ports:
- containerPort: 5601
name: ui
protocol: TCP
kibana-service.yaml

apiVersion: v1
kind: Service
metadata:
name: kibana-logging
namespace: kube-system
labels:
k8s-app: kibana-logging
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "Kibana"
spec:
ports:
- port: 5601
protocol: TCP
targetPort: ui
selector:
k8s-app: kibana-logging
12 - Deploy a Docker Registry

Deploy a private docker registry
Note: this section describes how to deploy a private registry with the official docker registry v2 image; you can also deploy a Harbor private registry instead (see "Deploy the Harbor private registry").

This section walks through deploying a private docker registry with TLS encryption, HTTP Basic authentication and ceph rgw as the backend storage. If you use a different storage backend, you can start from the "Create the docker registry" step;
The two example machines are:

ceph rgw: 10.64.3.9
docker registry: 10.64.3.7

Deploy the ceph RGW node
$ ceph-deploy rgw create 10.64.3.9   # rgw listens on port 7480 by default

$

Create the test account demo
$ radosgw-admin user create --uid=demo --display-name="cephrgw demo user"

$

Create the swift subuser of the demo account
The registry currently only supports accessing ceph rgw storage through the swift protocol; the s3 protocol is not supported yet;
$ radosgw-admin subuser create --uid demo --subuser=demo:swift --access=full --secret=secretkey --key-type=swift

$
Create the secret key for the demo:swift subuser
$ radosgw-admin key create --subuser=demo:swift --key-type=swift --gen-secret

{

"user_id":"demo",

"display_name":"cephrgw demo user",

"email":"",

"suspended": 0,

"max_buckets": 1000,

"auid": 0,

"subusers": [

{

"id":"demo:swift",

"permissions":"full-control"

}

],

"keys": [

{

"user":"demo",

"access_key":"5Y1B1SIJ2YHKEHO5U36B",

"secret_key":"nrIvtPqUj7pUlccLYPuR3ntVzIa50DToIpe7xFjT"

}

],

"swift_keys": [

{

"user":"demo:swift",

"secret_key":"aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"

}

],

"caps": [],

"op_mask":"read,write, delete",

"default_placement":"",

"placement_tags": [],

"bucket_quota": {

"enabled": false,

"max_size_kb": -1,

"max_objects": -1

},

"user_quota": {

"enabled": false,

"max_size_kb": -1,

"max_objects": -1

},

"temp_url_keys": []

}

aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh is the secret key of the demo:swift subuser;
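Before wiring this key into the registry, you can check that the subuser authenticates against rgw's swift v1 endpoint (a hedged sketch, not part of the original guide; the rgw address 10.64.3.9:7480 and the key are the example values from this section, and /auth/v1.0 is rgw's default swift auth entry point):

$ curl -i http://10.64.3.9:7480/auth/v1.0 -H 'X-Auth-User: demo:swift' -H 'X-Auth-Key: aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh'
$ # a successful response carries X-Storage-Url and X-Auth-Token headers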

Create the docker registry
Create the TLS certificate used by the registry
$ mkdir -p registry/{auth,certs}

$ cat registry-csr.json

{

"CN":"registry",

"hosts": [

"127.0.0.1",

"10.64.3.7"

],

"key": {

"algo":"rsa",

"size": 2048

},

"names": [

{

"C":"CN",

"ST":"BeiJing",

"L":"BeiJing",

"O":"k8s",

"OU":"System"

}

]

}

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \

-ca-key=/etc/kubernetes/ssl/ca-key.pem \

-config=/etc/kubernetes/ssl/ca-config.json \

-profile=kubernetes registry-csr.json | cfssljson -bare registry

$ cp registry.pem registry-key.pem registry/certs

$

This reuses the CA certificate and key created earlier;
The hosts field specifies the registry's Node IP;

Create the HTTP Basic authentication file
$ docker run --entrypoint htpasswd registry:2 -Bbn foo foo123 > auth/htpasswd

$ cat auth/htpasswd

foo:$2y$05$I60z69MdluAQ8i1Ka3x3Neb332yz1ioow2C4oroZSOE0fqPogAmZm

Configure the registry parameters
$ export RGW_AUTH_URL="http://10.64.3.9:7480/auth/v1"

$ export RGW_USER="demo:swift"

$ export RGW_SECRET_KEY="aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"

$ cat > config.yml << EOF

# https://docs.docker.com/registry/configuration/#list-of-configuration-options

version: 0.1

log:

level: info

formatter: text

fields:

service: registry

storage:

cache:

blobdescriptor: inmemory

delete:

enabled: true

swift:

authurl: ${RGW_AUTH_URL}

username: ${RGW_USER}

password: ${RGW_SECRET_KEY}

container: registry

auth:

htpasswd:

realm: basic-realm

path: /auth/htpasswd
http:

addr: 0.0.0.0:8000

headers:

X-Content-Type-Options:[nosniff]

tls:

certificate:/certs/registry.pem

key: /certs/registry-key.pem
health:

storagedriver:

enabled: true

interval: 10s

threshold: 3

EOF

storage.swift selects a backend that speaks the swift protocol; here it is configured with the ceph rgw storage parameters;
auth.htpasswd sets the path of the HTTP Basic authentication token file;
http.tls sets the certificate and key file paths for the registry's http server;
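Before starting the container it is worth confirming that the certificate really covers the registry address, since docker clients will validate it (a small sketch using a standard openssl invocation; the file path matches the certs directory created above):

$ openssl x509 -in registry/certs/registry.pem -noout -text | grep -A 1 'Subject Alternative Name'
$ # the output should list IP Address:127.0.0.1 and IP Address:10.64.3.7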

Create the docker registry
$ docker run -d -p 8000:8000 \
  -v $(pwd)/registry/auth/:/auth \
  -v $(pwd)/registry/certs:/certs \
  -v $(pwd)/config.yml:/etc/docker/registry/config.yml \
  --name registry registry:2

The machine running this docker run command has IP 10.64.3.7;

Push an image to the registry
Copy the CA certificate that signed the registry certificate into the /etc/docker/certs.d/10.64.3.7:8000 directory
$ sudo mkdir -p /etc/docker/certs.d/10.64.3.7:8000

$ sudo cp /etc/kubernetes/ssl/ca.pem /etc/docker/certs.d/10.64.3.7:8000/ca.crt

$
Log in to the private registry
$ docker login 10.64.3.7:8000

Username: foo

Password:

Login Succeeded
The login credentials are written to the ~/.docker/config.json file
$ cat ~/.docker/config.json

{

"auths": {

"10.64.3.7:8000": {

"auth":"Zm9vOmZvbzEyMw=="

}

}

}

Tag a local image for the private registry
$ docker tag docker.io/kubernetes/pause 10.64.3.7:8000/zhangjun3/pause

$ docker images | grep pause

docker.io/kubernetes/pause latest f9d5de079539 2 years ago 239.8kB

10.64.3.7:8000/zhangjun3/pause latest f9d5de079539 2 years ago 239.8 kB

Push the image to the private registry
$ docker push 10.64.3.7:8000/zhangjun3/pause

The push refers to a repository[10.64.3.7:8000/zhangjun3/pause]

5f70bf18a086: Pushed

e16a89738269: Pushed

latest: digest:sha256:9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359 size:916
Check whether the pushed pause image files are now present in ceph
$ rados lspools

rbd

.rgw.root

default.rgw.control

default.rgw.data.root

default.rgw.gc

default.rgw.log

default.rgw.users.uid

default.rgw.users.keys

default.rgw.users.swift

default.rgw.buckets.index

default.rgw.buckets.data
$ rados --pool default.rgw.buckets.data ls | grep pause

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/f9d5de0795395db6c50cb1ac82ebed1bd8eb3eefcebb1aa724e01239594e937b/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/f72a00a23f01987b42cb26f259582bb33502bdb0fcf5011e03c60577c4284845/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/tags/latest/current/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/tags/latest/index/sha256/9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/revisions/sha256/9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359/link

Day-to-day operations on the private registry
List the images in the private registry
$ curl --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/_catalog

{"repositories":["library/redis","zhangjun3/busybox","zhangjun3/pause","zhangjun3/pause2"]}

List the tags of a given image
$ curl --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/tags/list

{"name":"zhangjun3/busybox","tags":["latest"]}

Get the digest of an image or layer
Send a GET request to v2/<repoName>/manifests/<tagName>; the image digest comes from the Docker-Content-Digest response header, and the layer digests come from the layers listed in the response body;
Note that the request must include the header Accept: application/vnd.docker.distribution.manifest.v2+json:
$ curl -v -H "Accept: application/vnd.docker.distribution.manifest.v2+json" --user zhangjun3:xxx \
--cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/latest

> GET /v2/zhangjun3/busybox/manifests/latest HTTP/1.1

> User-Agent: curl/7.29.0

> Host: 10.64.3.7:8000

> Accept:application/vnd.docker.distribution.manifest.v2+json

>

< HTTP/1.1 200 OK

< Content-Length: 527

< Content-Type:application/vnd.docker.distribution.manifest.v2+json

< Docker-Content-Digest:sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5

< Docker-Distribution-Api-Version:registry/2.0

< Etag:"sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5"

< X-Content-Type-Options: nosniff

< Date: Tue, 21 Mar 2017 15:19:42GMT

<

{

"schemaVersion": 2,

"mediaType":"application/vnd.docker.distribution.manifest.v2+json",

"config": {

"mediaType":"application/vnd.docker.container.image.v1+json",

"size": 1465,

"digest":"sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff"

},

"layers": [

{

"mediaType":"application/vnd.docker.image.rootfs.diff.tar.gzip",

"size": 701102,

"digest":"sha256:04176c8b224aa0eb9942af765f66dae866f436e75acef028fe44b8a98e045515"

}

]

}
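For scripting the delete step below, the digest can be captured from the response headers in one go (a hypothetical helper, not part of the original guide; it only reuses the curl flags shown above plus -D/-o to read the headers instead of the body):

$ # capture the manifest digest into a variable for the DELETE request below
$ DIGEST=$(curl -sS -o /dev/null -D - -H "Accept: application/vnd.docker.distribution.manifest.v2+json" --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/latest | awk 'tolower($1)=="docker-content-digest:" {print $2}' | tr -d '\r')
$ echo "$DIGEST"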

Delete an image
Send a DELETE request to /v2/<name>/manifests/<reference>, where reference is the Docker-Content-Digest value returned in the previous step:
$ curl -X DELETE --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5

$
Delete a layer
Send a DELETE request to /v2/<name>/blobs/<digest>, where digest is a layer digest from the manifest retrieved above:
$ curl -X DELETE --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4

$ curl -X DELETE --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/blobs/sha256:04176c8b224aa0eb9942af765f66dae866f436e75acef028fe44b8a98e045515
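Deleting manifests and blobs through the API only removes the references; to actually reclaim space in the backend storage, the registry's built-in garbage collector has to be run (a hedged sketch; registry:2 ships a garbage-collect subcommand, and the config path below is the one mounted into the container in this setup):

$ # preview what would be removed, then collect unreferenced blobs
$ docker exec registry registry garbage-collect --dry-run /etc/docker/registry/config.yml
$ docker exec registry registry garbage-collect /etc/docker/registry/config.yml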

Attachments:
config.yml (the local file mounted into the container)

version: 0.1

log:

level: info

formatter: text

fields:

service: registry

storage:

cache:

blobdescriptor: inmemory

delete:

enabled: true

filesystem:

rootdirectory: /var/lib/registry

auth:

htpasswd:

realm: basic-realm

path: /auth/htpasswd

http:

addr: 0.0.0.0:8000

headers:

X-Content-Type-Options: [nosniff]

tls:

certificate: /certs/registry.pem

key: /certs/registry-key.pem

health:

storagedriver:

enabled: true

interval: 10s

threshold: 3

13 - Deploy the Harbor Private Registry

Deploy the harbor private registry
This section describes how to deploy a harbor private registry with docker-compose; you can also deploy a private registry with the official docker registry image instead (see "Deploy a Docker Registry").
Variables used
The variables used in this section are defined as follows:
$ export NODE_IP=10.64.3.7   # IP of the node where harbor is being deployed

$

$
Download the files
Download the latest docker-compose binary from the docker compose releases page
$ wget https://github.com/docker/compose/releases/download/1.12.0/docker-compose-Linux-x86_64

$ mv ~/docker-compose-Linux-x86_64 /root/local/bin/docker-compose

$ chmod a+x /root/local/bin/docker-compose

$ export PATH=/root/local/bin:$PATH

$
Download the latest harbor offline installer from the harbor releases page
$ wget --continue https://github.com/vmware/harbor/releases/download/v1.1.0/harbor-offline-installer-v1.1.0.tgz

$ tar -xzvf harbor-offline-installer-v1.1.0.tgz

$ cd harbor

$
Import the docker images
Import the harbor docker images from the offline installer:

$ docker load -i harbor.v1.1.0.tar.gz

$
Create the TLS certificate used by harbor's nginx server
Create the harbor certificate signing request:
$ cat > harbor-csr.json << EOF

{

"CN": "harbor",

"hosts": [

"127.0.0.1",

"$NODE_IP"

],

"key": {

"algo": "rsa",

"size": 2048

},

"names": [

{

"C": "CN",

"ST": "BeiJing",

"L": "BeiJing",

"O": "k8s",

"OU": "System"

}

]

}

EOF

The hosts field lists the IP of the node this certificate is issued for; if you later access harbor through a domain name, add that domain name here as well;

Generate the harbor certificate and private key:
$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \

-ca-key=/etc/kubernetes/ssl/ca-key.pem \

-config=/etc/kubernetes/ssl/ca-config.json \

-profile=kubernetes harbor-csr.json | cfssljson -bare harbor

$ ls harbor*

harbor.csr harbor-csr.json harbor-key.pem harbor.pem

$ sudo mkdir -p /etc/harbor/ssl

$ sudo mv harbor*.pem /etc/harbor/ssl

$ rm harbor.csr harbor-csr.json
Modify the harbor.cfg file
$ diff harbor.cfg.orig harbor.cfg

5c5

< hostname =reg.mydomain.com

---

> hostname = 10.64.3.7

9c9

< ui_url_protocol = http

---

> ui_url_protocol =https

24,25c24,25

< ssl_cert =/data/cert/server.crt

< ssl_cert_key =/data/cert/server.key

---

> ssl_cert =/etc/harbor/ssl/harbor.pem

> ssl_cert_key =/etc/harbor/ssl/harbor-key.pem
Load and start the harbor images

$ mkdir -p /data

$ ./install.sh

[Step 0]: checking installation environment ...

Note: docker version: 17.04.0

Note: docker-compose version: 1.12.0

[Step 1]: loading Harbor images ...

Loaded image: vmware/harbor-adminserver:v1.1.0

Loaded image: vmware/harbor-ui:v1.1.0

Loaded image: vmware/harbor-log:v1.1.0

Loaded image: vmware/harbor-jobservice:v1.1.0

Loaded image: vmware/registry:photon-2.6.0

Loaded image: vmware/harbor-notary-db:mariadb-10.1.10

Loaded image: vmware/harbor-db:v1.1.0

Loaded image: vmware/nginx:1.11.5-patched

Loaded image: photon:1.0

Loaded image: vmware/notary-photon:server-0.5.0

Loaded image: vmware/notary-photon:signer-0.5.0

[Step 2]: preparing environment ...

Generated and saved secret to file: /data/secretkey

Generated configuration file: ./common/config/nginx/nginx.conf

Generated configuration file: ./common/config/adminserver/env

Generated configuration file: ./common/config/ui/env

Generated configuration file:./common/config/registry/config.yml

Generated configuration file: ./common/config/db/env

Generated configuration file: ./common/config/jobservice/env

Generated configuration file:./common/config/jobservice/app.conf

Generated configuration file: ./common/config/ui/app.conf

Generated certificate, key file: ./common/config/ui/private_key.pem, cert file:./common/config/registry/root.crt

The configuration files are ready, please use docker-compose to start the service.

[Step 3]: checking existing instance of Harbor ...
[Step 4]: starting Harbor...

Creating network "harbor_harbor" with the default driver

Creating harbor-log

Creating registry

Creating harbor-adminserver

Creating harbor-db

Creating harbor-ui

Creating harbor-jobservice

Creating nginx

✔ ----Harbor has been installed and started successfully.----
Now you should be able to visit the admin portal at https://10.64.3.7.

For more details, please visit https://github.com/vmware/harbor.
Access the management UI
Open https://${NODE_IP} in a browser, in this example https://10.64.3.7
Log in with the account admin and the default password Harbor12345 from the harbor.cfg configuration file:



Files and directories created by harbor at runtime
$ # log directory

$ ls /var/log/harbor/2017-04-19/

adminserver.log jobservice.log mysql.log proxy.log registry.log ui.log

$ # data directory, including the database and the image store

$ ls /data/

ca_download config database job_logs registry secretkey
docker client login
Copy the CA certificate that signed the harbor certificate into the /etc/docker/certs.d/10.64.3.7 directory

$ sudo mkdir -p /etc/docker/certs.d/10.64.3.7

$ sudo cp /etc/kubernetes/ssl/ca.pem /etc/docker/certs.d/10.64.3.7/ca.crt

$
Log in to harbor

$ docker login 10.64.3.7

Username: admin

Password:
The credentials are automatically saved to the ~/.docker/config.json file.
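After logging in you can push images to harbor like to any other registry (a minimal sketch; it assumes the default public project named library that harbor creates out of the box, and reuses the pause image tagged in the previous section):

$ docker tag docker.io/kubernetes/pause 10.64.3.7/library/pause
$ docker push 10.64.3.7/library/pause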
Other operations
All of the following commands are run from the harbor directory created by unpacking the offline installer.
$ # stop harbor

$ docker-compose down -v

$ # modify the configuration

$ vim harbor.cfg

$ # propagate the modified configuration to the docker-compose.yml file

[root@tjwq01-sys-bs003007 harbor]# ./prepare

Clearing the configuration file: ./common/config/ui/app.conf

Clearing the configuration file: ./common/config/ui/env

Clearing the configuration file:./common/config/ui/private_key.pem

Clearing the configuration file: ./common/config/db/env

Clearing the configuration file:./common/config/registry/root.crt

Clearing the configuration file:./common/config/registry/config.yml

Clearing the configuration file:./common/config/jobservice/app.conf

Clearing the configuration file: ./common/config/jobservice/env

Clearing the configuration file:./common/config/nginx/cert/admin.pem

Clearing the configuration file:./common/config/nginx/cert/admin-key.pem

Clearing the configuration file: ./common/config/nginx/nginx.conf

Clearing the configuration file: ./common/config/adminserver/env

loaded secret from file: /data/secretkey

Generated configuration file: ./common/config/nginx/nginx.conf

Generated configuration file: ./common/config/adminserver/env

Generated configuration file: ./common/config/ui/env

Generated configuration file:./common/config/registry/config.yml

Generated configuration file: ./common/config/db/env

Generated configuration file: ./common/config/jobservice/env

Generated configuration file:./common/config/jobservice/app.conf

Generated configuration file: ./common/config/ui/app.conf

Generated certificate, key file: ./common/config/ui/private_key.pem, cert file:./common/config/registry/root.crt

The configuration files are ready, please use docker-compose to start the service.

$ # start harbor

[root@tjwq01-sys-bs003007 harbor]# docker-compose up -d

Attachments:
harbor.cfg

hostname = 192.168.1.206
ui_url_protocol = https
db_password = root123
max_job_workers = 3
customize_crt = on
ssl_cert = /etc/harbor/ssl/harbor.pem
ssl_cert_key = /etc/harbor/ssl/harbor-key.pem
secretkey_path = /data
admiral_url = NA
email_identity =
email_server = smtp.mydomain.com
email_server_port = 25
email_username = sample_admin@mydomain.com
email_password = abc
email_from = admin <sample_admin@mydomain.com>
email_ssl = false
harbor_admin_password = Harbor12345
auth_mode = db_auth
ldap_url = ldaps://ldap.mydomain.com
ldap_basedn = ou=people,dc=mydomain,dc=com
ldap_uid = uid
ldap_scope = 3
ldap_timeout = 5
self_registration = on
token_expiration = 30
project_creation_restriction = everyone
verify_remote_cert = on
14 - Clean Up the Cluster

Clean up the cluster

Clean up the Node machines

Stop the related services:

$ sudo systemctl stop kubelet kube-proxy flanneld docker

$
Clean up the files:
$ # umount the directories mounted by kubelet

$ mount | grep '/var/lib/kubelet' | awk '{print $3}' | xargs sudo umount

$ # remove the kubelet working directory

$ sudo rm -rf /var/lib/kubelet

$ # remove the docker working directory

$ sudo rm -rf /var/lib/docker

$ # remove the network configuration files written by flanneld

$ sudo rm -rf /var/run/flannel/

$ # remove docker runtime files

$ sudo rm -rf /var/run/docker/

$ # remove the systemd unit files

$ sudo rm -rf /etc/systemd/system/{kubelet,docker,flanneld}.service

$ # remove the binaries

$ sudo rm -rf /root/local/bin/{kubelet,docker,flanneld}

$ # remove the certificate files

$ sudo rm -rf /etc/flanneld/ssl /etc/kubernetes/ssl

$
Clean up the iptables rules created by kube-proxy and docker:
$ sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat

$
Delete the network interfaces created by flanneld and docker:

$ ip link del flannel.1

$ ip link del docker0

$
Clean up the Master machines
Stop the related services:

$ sudo systemctl stop kube-apiserver kube-controller-manager kube-scheduler

$
Clean up the files:
$ # remove the kube-apiserver working directory

$ sudo rm -rf /var/run/kubernetes

$ # remove the systemd unit files

$ sudo rm -rf /etc/systemd/system/{kube-apiserver,kube-controller-manager,kube-scheduler}.service

$ # remove the binaries

$ sudo rm -rf /root/local/bin/{kube-apiserver,kube-controller-manager,kube-scheduler}

$ # remove the certificate files

$ sudo rm -rf /etc/flanneld/ssl /etc/kubernetes/ssl

$
Clean up the etcd cluster
Stop the related services:

$ sudo systemctl stop etcd

$
Clean up the files:
$ # remove the etcd working and data directories

$ sudo rm -rf /var/lib/etcd

$ # remove the systemd unit file

$ sudo rm -rf /etc/systemd/system/etcd.service

$ # remove the binary

$ sudo rm -rf /root/local/bin/etcd

$ # remove the TLS certificate files

$ sudo rm -rf /etc/etcd/ssl/*