In the official Kubernetes documentation, the oldest still-maintained release (1.13) no longer contains the "Creating a Custom Cluster from Scratch" guide that I followed back in 1.9.
The trend is clearly toward bootstrapping clusters with kubeadm. A binary installation is still perfectly feasible, though, and in my opinion it deepens the operator's understanding of the cluster internals; working through the problems that come up during installation also builds real Kubernetes administration knowledge.
For documentation, the overall flow and the key steps can be cross-checked against the kubeadm deployment guide.
items | version |
---|---|
OS | centos7 |
kubernetes | 1.17 |
docker | 19.03.5-ce |
etcd | v3.3.18 |
Role | IP address | Services | Comment |
---|---|---|---|
master01,etcd01 | 192.168.33.101 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd,docker | master node 01, etcd node 01 |
master02,etcd02 | 192.168.33.102 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd,docker | master node 02, etcd node 02 |
master03,etcd03 | 192.168.33.103 | kube-apiserver,kube-controller-manager,kube-scheduler,etcd,docker | master node 03, etcd node 03 |
node01 | 192.168.33.104 | kubelet,kube-proxy,docker | worker node 01 |
node02 | 192.168.33.105 | kubelet,kube-proxy,docker | worker node 02 |
node03 | 192.168.33.106 | kubelet,kube-proxy,docker | worker node 03 |
Name | CIDR |
---|---|
pod | 10.5.0.0/16 |
service | 10.254.0.0/16 |
host network | 192.168.33.0/24 |
# Install required utility packages
yum install -y wget vim
# Make sure the MAC address and product_uuid are unique on every node
# Check the MAC address
ip link
# Check the product_uuid
cat /sys/class/dmi/id/product_uuid
# A firewall managed through the nftables backend is not compatible with kube-proxy, so switch back to the legacy iptables tooling
systemctl stop firewalld
systemctl disable firewalld
systemctl mask firewalld
yum install -y iptables iptables-services
systemctl disable iptables
systemctl stop iptables
# The firewall stays off during installation; for production, open the ports required by the apiserver and the other services instead
# Disable SELinux
sed -i "s/SELINUX=enforcing/SELINUX=disabled/g" /etc/selinux/config
setenforce 0
# Add the hostnames to /etc/hosts
cat << EOF >> /etc/hosts
192.168.33.101 master01
192.168.33.102 master02
192.168.33.103 master03
192.168.33.101 etcd01
192.168.33.102 etcd02
192.168.33.103 etcd03
192.168.33.104 node01
192.168.33.105 node02
192.168.33.106 node03
EOF
# Turn off system swap
swapoff -a
# Comment out the swap entry in /etc/fstab so it stays off after reboot
sed -ri "s|^([^#].*swap.*)$|#\1|g" /etc/fstab
# Swap is disabled so that CPU and memory limits are enforced strictly: the scheduler then never places pods into swap, which is a performance concern.
# Load the br_netfilter kernel module
lsmod | grep br_netfilter
[ $? -eq 0 ] || modprobe br_netfilter
# Kernel parameters required by Kubernetes networking
cat << EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
Install Docker following the docker installation on centos 7 documentation.
Key points:
- use the overlay2 storage driver
- use systemd instead of cgroupfs as the cgroup manager
A minimal daemon.json covering both points is sketched below.
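The following is a minimal /etc/docker/daemon.json sketch covering those two points (the log-driver settings are my own addition, adjust as you see fit), applied on every machine that runs docker:
cat << EOF > /etc/docker/daemon.json
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "storage-driver": "overlay2",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  }
}
EOF
systemctl daemon-reload
systemctl restart docker
# Verify both settings took effect
docker info | grep -iE 'storage driver|cgroup driver'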
# IP address map for each role
declare -A IP_LIST
IP_LIST=(
[master01]="192.168.33.101" \
[master02]="192.168.33.102" \
[master03]="192.168.33.103" \
[etcd01]="192.168.33.101" \
[etcd02]="192.168.33.102" \
[etcd03]="192.168.33.103" \
[node01]="192.168.33.104" \
[node02]="192.168.33.105" \
[node03]="192.168.33.106")
KUBE_API_PROXY_IP=192.168.33.101
# Installation environment variables
DEPLOY_DIR=/root/k8s
K8S_VER=v1.17.0
ETCD_VER=v3.3.18
# Certificate-related variables
K8S_PKI_DIR=/etc/kubernetes/pki
ETCD_PKI_DIR=/etc/etcd/pki
ADMIN_KUBECONFIG_DIR=/root/.kube
KUBECONFIG_DIR=/etc/kubernetes/kubeconfig
# Network configuration variables
SERVICE_CLUSTER_IP_RANGE=10.254.0.0/16
SERVICE_NODE_PORT_RANGE=30000-32767
POD_CLUSTER_IP_RANGE=10.5.0.0/16
Here I chose master01 to host the HA load balancer; you can of course pick any machine you prefer.
DOCKER_YML_DIR=/data/docker/yml
DOCKER_RUNTIME_DIR=/data/docker/runtime
mkdir -p ${DOCKER_YML_DIR}
cat << EOF > ${DOCKER_YML_DIR}/docker-compose-haproxy.yml
version: '3'
services:
haproxy:
container_name: haproxy-kube-apiserver
image: haproxy
ports:
- 443:6443
volumes:
- /data/docker/runtime/haproxy/etc/haproxy.cfg:/usr/local/etc/haproxy/haproxy.cfg
EOF
mkdir -p ${DOCKER_RUNTIME_DIR}/haproxy/etc
cat << EOF > ${DOCKER_RUNTIME_DIR}/haproxy/etc/haproxy.cfg
frontend k8s-api
bind 0.0.0.0:6443
mode tcp
option tcplog
timeout client 1h
default_backend k8s-api
backend k8s-api
mode tcp
timeout server 1h
option tcplog
option tcp-check
balance roundrobin
default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
server k8s-api-1 ${IP_LIST["master01"]}:6443 check
server k8s-api-2 ${IP_LIST["master02"]}:6443 check
server k8s-api-3 ${IP_LIST["master03"]}:6443 check
EOF
docker-compose -f /data/docker/yml/docker-compose-haproxy.yml up -d
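A quick sanity check that the proxy container came up and is listening; the apiserver backends will of course stay DOWN until they are started later:
docker ps --filter name=haproxy-kube-apiserver
docker logs haproxy-kube-apiserver
ss -lnt | grep ':443'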
The binaries are prepared on master01 and pushed out to every other machine, so set up SSH key trust from master01 to all nodes first (one way to do this is sketched below).
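A minimal sketch of the SSH trust setup, assuming root password login is still enabled on the target machines at this point:
# Generate a key pair on master01 if one does not exist yet
[ -f /root/.ssh/id_rsa ] || ssh-keygen -t rsa -b 4096 -N "" -f /root/.ssh/id_rsa
# Push the public key to every host listed in /etc/hosts above
for host in master01 master02 master03 node01 node02 node03;do
ssh-copy-id root@${host}
done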
mkdir -p ${DEPLOY_DIR}/{node,master,etcd,cni}/bin
cd ${DEPLOY_DIR}
# Download the kubernetes server binaries
wget https://dl.k8s.io/${K8S_VER}/kubernetes-server-linux-amd64.tar.gz -O ${DEPLOY_DIR}/kubernetes-server-linux-amd64.tar.gz
tar zxvf kubernetes-server-linux-amd64.tar.gz
cp ${DEPLOY_DIR}/kubernetes/server/bin/{kube-apiserver,kube-scheduler,kube-controller-manager,kubectl} ${DEPLOY_DIR}/master/bin
cp ${DEPLOY_DIR}/kubernetes/server/bin/{kubelet,kube-proxy} ${DEPLOY_DIR}/node/bin
# Download etcd
curl -L https://github.com/coreos/etcd/releases/download/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz \
-o etcd-${ETCD_VER}-linux-amd64.tar.gz
tar xzvf etcd-${ETCD_VER}-linux-amd64.tar.gz
cp etcd-${ETCD_VER}-linux-amd64/{etcd,etcdctl} ${DEPLOY_DIR}/etcd/bin/
# Download the CNI plugins
wget https://github.com/containernetworking/plugins/releases/download/v0.8.4/cni-plugins-linux-amd64-v0.8.4.tgz
tar zxvf cni-plugins-linux-amd64-v0.8.4.tgz -C ${DEPLOY_DIR}/cni/bin/
chmod +x ${DEPLOY_DIR}/etcd/bin/*
chmod +x ${DEPLOY_DIR}/node/bin/*
chmod +x ${DEPLOY_DIR}/master/bin/*
# Distribute the master binaries
for master in {master01,master02,master03};do
rsync -av ${DEPLOY_DIR}/master/bin/* ${master}:/usr/local/bin/
done
# Distribute the node binaries
for node in {node01,node02,node03};do
rsync -av ${DEPLOY_DIR}/node/bin/* ${node}:/usr/local/bin/
done
# Distribute the etcd binaries
for etcd in {etcd01,etcd02,etcd03};do
rsync -av ${DEPLOY_DIR}/etcd/bin/* ${etcd}:/usr/local/bin/
done
# Distribute the CNI binaries
for node in {node01,node02,node03};do
ssh root@${node} "mkdir -p /opt/cni/bin"
rsync -av ${DEPLOY_DIR}/cni/bin/* ${node}:/opt/cni/bin
done
Before creating the certificates, I strongly recommend reading the apiserver authentication documentation.
The key point there is that for client certificates the CN (Common Name) is used as the user name and the O (Organization) entries as the user's groups.
Here is an example CSR showing where CN and O are set:
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 2048
},
"names":[{
"C": "<country>",
"ST": "<state>",
"L": "<city>",
"O": "<organization>",
"OU": "<organization unit>"
}]
}
curl -s -L -o /usr/local/bin/cfssl https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl_1.4.1_linux_amd64
curl -s -L -o /usr/local/bin/cfssljson https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssljson_1.4.1_linux_amd64
curl -s -L -o /usr/local/bin/cfssl-certinfo https://github.com/cloudflare/cfssl/releases/download/v1.4.1/cfssl-certinfo_1.4.1_linux_amd64
chmod +x /usr/local/bin/*
export PATH=$PATH:/usr/local/bin
# Create the working directory for certificates
mkdir -p ${DEPLOY_DIR}/pki/{etcd,kubernetes}
# This directory only holds the generated CA files temporarily; feel free to put it somewhere else
Because of cfssl issue 717 (an error complaining about missing hosts), this deviates from the official docs: cfssl is upgraded from 1.2 to 1.4.1.
cd ${DEPLOY_DIR}/pki/etcd
# step 1. Create the root CA
# Create the etcd CA certificate signing request file
cat > ca-csr.json << EOF
{
"CN": "etcd.local",
"key": {
"algo": "rsa",
"size": 4096
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "kubernetes",
"OU": "System"
}
]
}
EOF
# step 2. CA signing configuration
# Create the CA signing config file
# [issue]: because etcd runs with --client-cert-auth, the server profile also needs the "client auth" usage
# [issue-url]: https://github.com/etcd-io/etcd/issues/9785
cat > ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"server": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
},
"client": {
"usages": [
"signing",
"key encipherment",
"client auth"
],
"expiry": "87600h"
},
"peer": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
}
}
}
}
EOF
# step 3. Create the certificate signing request (CSR) files
# server: restricted to the listen IPs of all etcd nodes
cat > server-csr.json << EOF
{
"CN": "server",
"hosts": [
"127.0.0.1",
"${IP_LIST['etcd01']}",
"${IP_LIST['etcd02']}",
"${IP_LIST['etcd03']}"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "kubernetes",
"OU": "System"
}
]
}
EOF
# client: not restricted to any IPs
cat > client-csr.json << EOF
{
"CN": "client",
"hosts": [""],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF
# peer: restricted to the peer-communication IPs of all etcd nodes
cat > peer-csr.json << EOF
{
"CN": "peer",
"hosts": [
"${IP_LIST['etcd01']}",
"${IP_LIST['etcd02']}",
"${IP_LIST['etcd03']}"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "kubernetes",
"OU": "System"
}
]
}
EOF
# step 1. Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# Generated files: ca-key.pem ca.csr ca.pem
# step 2. Generate CA-signed certificates from the CSR files
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server-csr.json | cfssljson -bare server
# Generated files: server-key.pem server.csr server.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client client-csr.json | cfssljson -bare client
# Generated files: client-key.pem client.csr client.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=peer peer-csr.json | cfssljson -bare peer
# Generated files: peer-key.pem peer.csr peer.pem
cd ${DEPLOY_DIR}/pki/kubernetes
# step 1. Create the root CA
# Create the Kubernetes CA certificate signing request file
cat > ${DEPLOY_DIR}/pki/kubernetes/ca-csr.json << EOF
{
"CN": "kubernetes",
"key": {
"algo": "rsa",
"size": 4096
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "kubernetes",
"OU": "System"
}
]
}
EOF
# step 2. CA signing configuration
# Create the CA signing config file
cat > ${DEPLOY_DIR}/pki/kubernetes/ca-config.json <<EOF
{
"signing": {
"default": {
"expiry": "87600h"
},
"profiles": {
"server": {
"usages": [
"signing",
"key encipherment",
"server auth",
"client auth"
],
"expiry": "87600h"
},
"client": {
"usages": [
"signing",
"key encipherment",
"client auth"
],
"expiry": "87600h"
}
}
}
}
EOF
# step 3. Create the certificate signing request (CSR) files
# kube-apiserver
# hosts must contain:
# - all HA listen IPs and VIPs
# - the IP passed to --apiserver-advertise-address
# - the first IP of the service CIDR
# - the in-cluster Kubernetes DNS names
# - the master node names
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-apiserver-csr.json << EOF
{
"CN": "kubernetes",
"hosts": [
"127.0.0.1",
"${IP_LIST['master01']}",
"${IP_LIST['master02']}",
"${IP_LIST['master03']}",
"${SERVICE_CLUSTER_IP_RANGE%.*}.1",
"kubernetes",
"kubernetes.default",
"kubernetes.default.svc",
"kubernetes.default.svc.cluster",
"kubernetes.default.svc.cluster.local"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "kubernetes",
"OU": "System"
}
]
}
EOF
# api-kubelet-client
cat > ${DEPLOY_DIR}/pki/kubernetes/api-kubelet-client.json << EOF
{
"CN": "system:kubelet-api-admin",
"hosts": [
"127.0.0.1",
"node01",
"node02",
"node03",
"${IP_LIST['master01']}",
"${IP_LIST['master02']}",
"${IP_LIST['master03']}",
"${IP_LIST['node01']}",
"${IP_LIST['node02']}",
"${IP_LIST['node03']}"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:kubelet-api-admin",
"OU": "System"
}
]
}
EOF
# kube-controller-manager
# Note:
# - the CN must be exactly: system:kube-controller-manager
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-controller-manager-csr.json << EOF
{
"CN": "system:kube-controller-manager",
"hosts": [
"127.0.0.1",
"${IP_LIST['master01']}",
"${IP_LIST['master02']}",
"${IP_LIST['master03']}"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:kube-controller-manager",
"OU": "System"
}
]
}
EOF
# kube-scheduler
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-scheduler-csr.json << EOF
{
"CN": "system:kube-scheduler",
"hosts": [
"127.0.0.1",
"${IP_LIST['master01']}",
"${IP_LIST['master02']}",
"${IP_LIST['master03']}"
],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:kube-scheduler",
"OU": "System"
}
]
}
EOF
# kube-proxy
cat > ${DEPLOY_DIR}/pki/kubernetes/kube-proxy-csr.json << EOF
{
"CN": "system:kube-proxy",
"hosts": [""],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:node-proxier",
"OU": "System"
}
]
}
EOF
# step 1. Generate the CA certificate and private key
cfssl gencert -initca ca-csr.json | cfssljson -bare ca
# Generated files: ca-key.pem ca.csr ca.pem
# step 2. Generate CA-signed certificates from the CSR files
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server kube-apiserver-csr.json | cfssljson -bare kube-apiserver
# Generated files: kube-apiserver-key.pem kube-apiserver.csr kube-apiserver.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server api-kubelet-client.json | cfssljson -bare api-kubelet-client
# Generated files: api-kubelet-client-key.pem api-kubelet-client.csr api-kubelet-client.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager
# Generated files: kube-controller-manager-key.pem kube-controller-manager.csr kube-controller-manager.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-scheduler-csr.json | cfssljson -bare kube-scheduler
# Generated files: kube-scheduler-key.pem kube-scheduler.csr kube-scheduler.pem
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client kube-proxy-csr.json | cfssljson -bare kube-proxy
# Generated files: kube-proxy-key.pem kube-proxy.csr kube-proxy.pem
cd ${DEPLOY_DIR}/pki/kubernetes
# step 1. Create the certificate signing request file
cat > admin-csr.json << EOF
{
"CN": "admin",
"hosts": [""],
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"C": "CN",
"ST": "BeiJing",
"L": "BeiJing",
"O": "system:masters",
"OU": "System"
}
]
}
EOF
# step 2. Generate the CA-signed certificate from the CSR file
cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=client admin-csr.json | cfssljson -bare admin
# Generated files: admin-key.pem admin.csr admin.pem
# Any generated certificate can be inspected with cfssl-certinfo, e.g.:
cfssl-certinfo -cert server.pem
# Distribute certificates to the masters
# Certificates involved:
# - root CAs
# - admin certificate
# - apiserver certificate
# - apiserver-etcd-client certificate
for master in {master01,master02,master03};do
ssh root@${master} "mkdir -p ${K8S_PKI_DIR}"
ssh root@${master} "mkdir -p ${ETCD_PKI_DIR}"
scp ${DEPLOY_DIR}/pki/kubernetes/{ca.pem,ca-key.pem,admin.pem,admin-key.pem} ${master}:${K8S_PKI_DIR}
scp ${DEPLOY_DIR}/pki/kubernetes/kube-apiserver.pem ${master}:${K8S_PKI_DIR}/kube-apiserver.pem
scp ${DEPLOY_DIR}/pki/kubernetes/kube-apiserver-key.pem ${master}:${K8S_PKI_DIR}/kube-apiserver-key.pem
scp ${DEPLOY_DIR}/pki/kubernetes/api-kubelet-client.pem ${master}:${K8S_PKI_DIR}/api-kubelet-client.pem
scp ${DEPLOY_DIR}/pki/kubernetes/api-kubelet-client-key.pem ${master}:${K8S_PKI_DIR}/api-kubelet-client-key.pem
scp ${DEPLOY_DIR}/pki/etcd/ca.pem ${master}:${ETCD_PKI_DIR}
scp ${DEPLOY_DIR}/pki/etcd/client.pem ${master}:${K8S_PKI_DIR}/apiserver-etcd-client.pem
scp ${DEPLOY_DIR}/pki/etcd/client-key.pem ${master}:${K8S_PKI_DIR}/apiserver-etcd-client-key.pem
done
# Distribute certificates to the nodes
for node in {node01,node02,node03};do
ssh root@${node} "mkdir -p ${K8S_PKI_DIR}"
scp ${DEPLOY_DIR}/pki/kubernetes/{ca.pem,api-kubelet-client.pem,api-kubelet-client-key.pem} ${node}:${K8S_PKI_DIR}
done
# These secure master-to-node traffic; the kubelet needs ca.pem to authenticate the apiserver's client certificate, mainly for kubectl logs/exec
# Distribute certificates to the etcd nodes
for etcd in {etcd01,etcd02,etcd03};do
ssh root@${etcd} "mkdir -p ${ETCD_PKI_DIR}"
scp ${DEPLOY_DIR}/pki/etcd/{ca.pem,server.pem,server-key.pem,peer.pem,peer-key.pem} ${etcd}:${ETCD_PKI_DIR}
done
See the kubeconfig section of Kubernetes the Hard Way for reference.
# Create the kubeconfig working directory
mkdir -p ${DEPLOY_DIR}/kubeconfig
cd ${DEPLOY_DIR}/kubeconfig
export KUBE_APISERVER="https://${KUBE_API_PROXY_IP}:443"
# step 1. Create the bootstrap token file
# The token can be any string containing 128 bits of entropy; generate it with a secure random number generator.
export BOOTSTRAP_TOKEN=$(head -c 16 /dev/urandom | od -An -t x | tr -d ' ')
cat > ${DEPLOY_DIR}/kubeconfig/token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:bootstrappers"
EOF
# Note: before moving on, check the token.csv file and confirm the ${BOOTSTRAP_TOKEN} variable was replaced with a real value.
cat ${DEPLOY_DIR}/kubeconfig/token.csv
# The output looks like: 31c5af9c14a8f8ddbed6564234b2644f,kubelet-bootstrap,10001,"system:bootstrappers"
# step 2. Generate the kubeconfig files and set the current context
# Note: the credential name must be the group system:node joined with the lower-cased hostname, i.e. system:node:<hostname>
for node in {node01,node02,node03};do
kubectl config set-cluster kubernetes \
--certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=bootstrap-kubelet-${node}.conf
kubectl config set-credentials system:node:${node} \
--token=${BOOTSTRAP_TOKEN} \
--kubeconfig=bootstrap-kubelet-${node}.conf
kubectl config set-context default \
--cluster=kubernetes \
--user=system:node:${node} \
--kubeconfig=bootstrap-kubelet-${node}.conf
kubectl config use-context default --kubeconfig=bootstrap-kubelet-${node}.conf
done
# The commands above perform, in order:
# create bootstrap-kubelet-${node}.conf with the cluster information (CA cert, apiserver address, cluster name)
# update bootstrap-kubelet-${node}.conf with the token and other credential information
# update bootstrap-kubelet-${node}.conf with the context information
# update bootstrap-kubelet-${node}.conf to set the current context to default
export KUBE_APISERVER="https://${KUBE_API_PROXY_IP}:443"
# step 1. Generate the kubeconfig
kubectl config set-cluster kubernetes \
--certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=kube-proxy.conf
# Creates kube-proxy.conf with the cluster information (CA cert, apiserver address, cluster name)
kubectl config set-credentials system:kube-proxy \
--client-certificate=${DEPLOY_DIR}/pki/kubernetes/kube-proxy.pem \
--client-key=${DEPLOY_DIR}/pki/kubernetes/kube-proxy-key.pem \
--embed-certs=true \
--kubeconfig=kube-proxy.conf
# Updates kube-proxy.conf with the credential user and its certificates
kubectl config set-context default \
--cluster=kubernetes \
--user=system:kube-proxy \
--kubeconfig=kube-proxy.conf
# Updates kube-proxy.conf with the context information
# Set the current context
kubectl config use-context default --kubeconfig=kube-proxy.conf
# Updates kube-proxy.conf to set the current context to default
# step 1. Generate the kubeconfig
kubectl config set-cluster kubernetes \
--certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
--embed-certs=true \
--server=${KUBE_APISERVER} \
--kubeconfig=admin.conf
# Creates admin.conf with the cluster information (CA cert, apiserver address, cluster name)
kubectl config set-credentials admin \
--client-certificate=${DEPLOY_DIR}/pki/kubernetes/admin.pem \
--embed-certs=true \
--client-key=${DEPLOY_DIR}/pki/kubernetes/admin-key.pem \
--kubeconfig=admin.conf
# Updates admin.conf with the credential user and its certificates
kubectl config set-context kubernetes \
--cluster=kubernetes \
--user=admin \
--kubeconfig=admin.conf
# Updates admin.conf with the context information
# Set the context
kubectl config use-context kubernetes --kubeconfig=admin.conf
# Updates admin.conf to set the current context to kubernetes
for master in {master01,master02,master03};do
kubectl config set-cluster kubernetes \
--certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
--embed-certs=true \
--server=https://${IP_LIST[${master}]}:6443 \
--kubeconfig=kube-controller-manager-${master}.conf
# Note: the credential must be system:kube-controller-manager
kubectl config set-credentials system:kube-controller-manager \
--client-certificate=${DEPLOY_DIR}/pki/kubernetes/kube-controller-manager.pem \
--client-key=${DEPLOY_DIR}/pki/kubernetes/kube-controller-manager-key.pem \
--embed-certs=true \
--kubeconfig=kube-controller-manager-${master}.conf
kubectl config set-context default \
--cluster=kubernetes \
--user=system:kube-controller-manager \
--kubeconfig=kube-controller-manager-${master}.conf
kubectl config use-context default --kubeconfig=kube-controller-manager-${master}.conf
done
Notes:
- The --server here must point to the kube-apiserver on the same node, otherwise leader election gets stuck and fails; see issue 49000.
- Also, --master in the systemd unit file would override this setting, so it is omitted there.
for master in {master01,master02,master03};do
kubectl config set-cluster kubernetes \
--certificate-authority=${DEPLOY_DIR}/pki/kubernetes/ca.pem \
--embed-certs=true \
--server=https://${IP_LIST[${master}]}:6443 \
--kubeconfig=kube-scheduler-${master}.conf
# Note: the credential must be system:kube-scheduler
kubectl config set-credentials system:kube-scheduler \
--client-certificate=${DEPLOY_DIR}/pki/kubernetes/kube-scheduler.pem \
--client-key=${DEPLOY_DIR}/pki/kubernetes/kube-scheduler-key.pem \
--embed-certs=true \
--kubeconfig=kube-scheduler-${master}.conf
kubectl config set-context default \
--cluster=kubernetes \
--user=system:kube-scheduler \
--kubeconfig=kube-scheduler-${master}.conf
kubectl config use-context default --kubeconfig=kube-scheduler-${master}.conf
done
Notes:
- The --server here must point to the kube-apiserver on the same node, otherwise leader election gets stuck and fails; see issue 49000.
- Also, --master in the systemd unit file would override this setting, so it is omitted there.
cd ${DEPLOY_DIR}/kubeconfig
# Distribute bootstrap-kubelet-<node>.conf and kube-proxy.conf to the node machines
for node in {node01,node02,node03};do
ssh root@${node} "mkdir -p ${KUBECONFIG_DIR}"
scp ${DEPLOY_DIR}/kubeconfig/bootstrap-kubelet-${node}.conf ${node}:${KUBECONFIG_DIR}/bootstrap-kubelet.conf
scp ${DEPLOY_DIR}/kubeconfig/kube-proxy.conf ${node}:${KUBECONFIG_DIR}
done
# Distribute the master kubeconfig files to all masters
for master in {master01,master02,master03};do
ssh root@${master} "mkdir -p ${ADMIN_KUBECONFIG_DIR}"
ssh root@${master} "mkdir -p ${KUBECONFIG_DIR}"
scp ${DEPLOY_DIR}/kubeconfig/admin.conf $master:${ADMIN_KUBECONFIG_DIR}/config
scp ${DEPLOY_DIR}/kubeconfig/token.csv $master:${KUBECONFIG_DIR}
scp ${DEPLOY_DIR}/kubeconfig/kube-controller-manager-${master}.conf $master:${KUBECONFIG_DIR}/kube-controller-manager.conf
scp ${DEPLOY_DIR}/kubeconfig/kube-scheduler-${master}.conf $master:${KUBECONFIG_DIR}/kube-scheduler.conf
done
mkdir -p ${DEPLOY_DIR}/{node,master,etcd}/systemd-unit-files
Each master gets a unit file rendered with its own IP addresses.
# kube-apiserver.service
for master in {master01,master02,master03};do
cat << EOF > ${DEPLOY_DIR}/master/systemd-unit-files/kube-apiserver-${master}.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
--advertise-address=${IP_LIST[${master}]} \\
--bind-address=${IP_LIST[${master}]} \\
--secure-port=6443 \\
--insecure-port=0 \\
--authorization-mode=Node,RBAC \\
--enable-admission-plugins=NodeRestriction \\
--enable-bootstrap-token-auth=true \\
--token-auth-file=${KUBECONFIG_DIR}/token.csv \\
--service-cluster-ip-range=${SERVICE_CLUSTER_IP_RANGE} \\
--service-node-port-range=${SERVICE_NODE_PORT_RANGE} \\
--client-ca-file=${K8S_PKI_DIR}/ca.pem \\
--tls-cert-file=${K8S_PKI_DIR}/kube-apiserver.pem \\
--tls-private-key-file=${K8S_PKI_DIR}/kube-apiserver-key.pem \\
--service-account-key-file=${K8S_PKI_DIR}/ca-key.pem \\
--etcd-cafile=${ETCD_PKI_DIR}/ca.pem \\
--etcd-certfile=${K8S_PKI_DIR}/apiserver-etcd-client.pem \\
--etcd-keyfile=${K8S_PKI_DIR}/apiserver-etcd-client-key.pem \\
--etcd-servers=https://${IP_LIST["etcd01"]}:2379,https://${IP_LIST["etcd02"]}:2379,https://${IP_LIST["etcd03"]}:2379 \\
--kubelet-certificate-authority=${K8S_PKI_DIR}/ca.pem \\
--kubelet-client-certificate=${K8S_PKI_DIR}/api-kubelet-client.pem \\
--kubelet-client-key=${K8S_PKI_DIR}/api-kubelet-client-key.pem \\
--allow-privileged=true \\
--apiserver-count=3 \\
--audit-log-maxage=30 \\
--audit-log-maxbackup=3 \\
--audit-log-maxsize=100 \\
--audit-log-path=/var/lib/audit.log \\
--event-ttl=1h \\
--v=2
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
done
# --admission-control: admission plugins are enabled by default since 1.10
# Default enabled plugins: "NamespaceLifecycle, LimitRanger, ServiceAccount, TaintNodesByCondition, Priority,
# DefaultTolerationSeconds, DefaultStorageClass, StorageObjectInUseProtection, PersistentVolumeClaimResize,
# MutatingAdmissionWebhook, ValidatingAdmissionWebhook, RuntimeClass, ResourceQuota"
# To enable additional admission plugins, see the official Kubernetes admission controller documentation
# --enable-bootstrap-token-auth: enables bootstrap-token authentication, see the [official documentation](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/)
# According to the issue and PR below, the following three flags secure the master-to-node connection, mainly for kubectl logs/exec.
# [issue: 14700](https://github.com/kubernetes/kubernetes/pull/14700)
# [PR: 31562](https://github.com/kubernetes/kubernetes/pull/31562)
# [kubelet-authentication-authorization]
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/)
# --kubelet-certificate-authority
# --kubelet-client-certificate
# --kubelet-client-key
# kube-controller-manager.service
for master in {master01,master02,master03};do
cat << EOF > ${DEPLOY_DIR}/master/systemd-unit-files/kube-controller-manager-${master}.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes
[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
--bind-address=127.0.0.1 \\
--controllers=*,bootstrapsigner,tokencleaner \\
--allocate-node-cidrs=true \\
--service-cluster-ip-range=${SERVICE_CLUSTER_IP_RANGE} \\
--cluster-cidr=${POD_CLUSTER_IP_RANGE} \\
--cluster-name=kubernetes \\
--kubeconfig=${KUBECONFIG_DIR}/kube-controller-manager.conf \\
--root-ca-file=${K8S_PKI_DIR}/ca.pem \\
--cluster-signing-cert-file=${K8S_PKI_DIR}/ca.pem \\
--cluster-signing-key-file=${K8S_PKI_DIR}/ca-key.pem \\
--use-service-account-credentials=true \\
--service-account-private-key-file=${K8S_PKI_DIR}/ca-key.pem \\
--leader-elect=true \\
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
done
# kube-scheduler.service
for master in {master01,master02,master03};do
cat << EOF > ${DEPLOY_DIR}/master/systemd-unit-files/kube-scheduler-${master}.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes
[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
--bind-address=127.0.0.1 \\
--kubeconfig=${KUBECONFIG_DIR}/kube-scheduler.conf \\
--leader-elect=true \\
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
done
If you change --service-cluster-ip-range (on kube-apiserver and kube-controller-manager) after the cluster is already running, you may hit the following error:
"message": "Cluster IP *.*.*.* is not within the service CIDR *.*.*.*/**; please recreate service"
Workaround: run kubectl get services --all-namespaces to find the kubernetes service that was created when the cluster was first bootstrapped, then delete it with kubectl delete service kubernetes.
The apiserver recreates it automatically inside the new CIDR. But be warned: this is NOT recommended on a production cluster that already serves application services! A sketch of the commands follows below.
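A hedged sketch of that workaround, only for clusters where losing the default kubernetes service for a moment is acceptable:
# Locate the built-in service created at bootstrap
kubectl get services --all-namespaces
# Delete it; the apiserver recreates it inside the new service CIDR
kubectl delete service kubernetes -n default
# Confirm it came back with a ClusterIP from the new range
kubectl get service kubernetes -n default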
for etcd in {etcd01,etcd02,etcd03};do
cat << EOF > ${DEPLOY_DIR}/etcd/systemd-unit-files/etcd-${etcd}.service
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \\
--name=${etcd} \\
--client-cert-auth=true \\
--trusted-ca-file=${ETCD_PKI_DIR}/ca.pem \\
--cert-file=${ETCD_PKI_DIR}/server.pem \\
--key-file=${ETCD_PKI_DIR}/server-key.pem \\
--peer-client-cert-auth=true \\
--peer-trusted-ca-file=${ETCD_PKI_DIR}/ca.pem \\
--peer-cert-file=${ETCD_PKI_DIR}/peer.pem \\
--peer-key-file=${ETCD_PKI_DIR}/peer-key.pem \\
--initial-advertise-peer-urls=https://${IP_LIST[${etcd}]}:2380 \\
--listen-peer-urls=https://${IP_LIST[${etcd}]}:2380 \\
--listen-client-urls=https://${IP_LIST[${etcd}]}:2379,https://127.0.0.1:2379 \\
--advertise-client-urls=https://${IP_LIST[${etcd}]}:2379 \\
--initial-cluster-token=etcd-cluster-0 \\
--initial-cluster=etcd01=https://${IP_LIST["etcd01"]}:2380,etcd02=https://${IP_LIST["etcd02"]}:2380,etcd03=https://${IP_LIST["etcd03"]}:2380 \\
--initial-cluster-state=new \\
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
done
# kubelet.service
for node in {node01,node02,node03};do
cat << EOF > ${DEPLOY_DIR}/node/systemd-unit-files/kubelet-${node}.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=docker.service
Requires=docker.service
[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/usr/local/bin/kubelet \\
--address=${IP_LIST[${node}]} \\
--hostname-override=${node} \\
--network-plugin=cni \\
--pod-infra-container-image=k8s.gcr.io/pause-amd64:3.0 \\
--bootstrap-kubeconfig=${KUBECONFIG_DIR}/bootstrap-kubelet.conf \\
--kubeconfig=${KUBECONFIG_DIR}/kubelet.conf \\
--client-ca-file=${K8S_PKI_DIR}/ca.pem \\
--cert-dir=${K8S_PKI_DIR} \\
--tls-cert-file=${K8S_PKI_DIR}/api-kubelet-client.pem \\
--tls-private-key-file=${K8S_PKI_DIR}/api-kubelet-client-key.pem \\
--anonymous-auth=false \\
--hairpin-mode promiscuous-bridge \\
--serialize-image-pulls=false \\
--cgroup-driver=systemd \\
--cluster-dns=${SERVICE_CLUSTER_IP_RANGE%.*}.2 \\
--cluster-domain=cluster.local \\
--v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
done
# The cgroup driver must match docker's, i.e. systemd
# --cert-dir is where the kubelet stores the certificates it obtains (signed) from the master
# According to the issue and PR below, the following flag secures the master-to-node connection, mainly for kubectl logs/exec.
# [issue: 14700](https://github.com/kubernetes/kubernetes/pull/14700)
# [PR: 31562](https://github.com/kubernetes/kubernetes/pull/31562)
# [kubelet-authentication-authorization]
# (https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-authentication-authorization/)
# --client-ca-file
# According to [issue: 63164](https://github.com/kubernetes/kubernetes/issues/63164)
# the key that TLS bootstrapping generates automatically under --cert-dir is only used for kubelet -> apiserver
# for apiserver -> kubelet authentication the following flags must be set explicitly; otherwise the kubelet generates its own CA
# --tls-cert-file
# --tls-private-key-file
for node in {node01,node02,node03};do
cat << EOF > ${DEPLOY_DIR}/node/systemd-unit-files/kube-proxy-${node}.service
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target
[Service]
ExecStart=/usr/local/bin/kube-proxy \\
--logtostderr=true \\
--v=0 \\
--master=https://${KUBE_API_PROXY_IP}:443 \\
--bind-address=${IP_LIST[${node}]} \\
--hostname-override=${node} \\
--kubeconfig=${KUBECONFIG_DIR}/kube-proxy.conf \\
--cluster-cidr=${POD_CLUSTER_IP_RANGE}
Restart=on-failure
LimitNOFILE=65536
[Install]
WantedBy=multi-user.target
EOF
done
# Distribute the master unit files
for master in {master01,master02,master03};do
rsync -av ${DEPLOY_DIR}/master/systemd-unit-files/kube-apiserver-${master}.service \
${master}:/usr/lib/systemd/system/kube-apiserver.service
rsync -av ${DEPLOY_DIR}/master/systemd-unit-files/kube-controller-manager-${master}.service \
${master}:/usr/lib/systemd/system/kube-controller-manager.service
rsync -av ${DEPLOY_DIR}/master/systemd-unit-files/kube-scheduler-${master}.service \
${master}:/usr/lib/systemd/system/kube-scheduler.service
done
# Distribute the node unit files
for node in {node01,node02,node03};do
rsync -av ${DEPLOY_DIR}/node/systemd-unit-files/kubelet-${node}.service \
${node}:/usr/lib/systemd/system/kubelet.service
rsync -av ${DEPLOY_DIR}/node/systemd-unit-files/kube-proxy-${node}.service \
${node}:/usr/lib/systemd/system/kube-proxy.service
done
# Distribute the etcd unit files
for etcd in {etcd01,etcd02,etcd03};do
rsync -av ${DEPLOY_DIR}/etcd/systemd-unit-files/etcd-${etcd}.service ${etcd}:/usr/lib/systemd/system/etcd.service
done
for etcd in {etcd01,etcd02,etcd03};do
ssh root@${etcd} "mkdir -p /var/lib/etcd"
ssh root@${etcd} "systemctl daemon-reload"
ssh root@${etcd} "systemctl enable etcd"
done
# Start etcd manually on each etcd machine; all members must be started at roughly the same time for the cluster to come up
systemctl start etcd
etcdctl \
--endpoints https://${IP_LIST["etcd01"]}:2379,https://${IP_LIST["etcd02"]}:2379,https://${IP_LIST["etcd03"]}:2379 \
--ca-file=${DEPLOY_DIR}/pki/etcd/ca.pem \
--cert-file=${DEPLOY_DIR}/pki/etcd/peer.pem \
--key-file=${DEPLOY_DIR}/pki/etcd/peer-key.pem \
cluster-health
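The etcdctl shipped with etcd 3.3 defaults to the v2 API, which is what the flags above target. The same health check through the v3 API would look roughly like this (note the different flag names):
ETCDCTL_API=3 etcdctl \
--endpoints=https://${IP_LIST["etcd01"]}:2379,https://${IP_LIST["etcd02"]}:2379,https://${IP_LIST["etcd03"]}:2379 \
--cacert=${DEPLOY_DIR}/pki/etcd/ca.pem \
--cert=${DEPLOY_DIR}/pki/etcd/peer.pem \
--key=${DEPLOY_DIR}/pki/etcd/peer-key.pem \
endpoint health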
for master in {master01,master02,master03};do
ssh root@${master} "systemctl daemon-reload"
ssh root@${master} "systemctl start kube-apiserver kube-controller-manager kube-scheduler"
ssh root@${master} "systemctl enable kube-apiserver kube-controller-manager kube-scheduler"
done
The steps below are part of kubelet TLS bootstrapping; see the official documentation.
Create a ClusterRoleBinding that allows kubelets to create CSRs (certificate signing requests).
# enable bootstrapping nodes to create CSR
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: create-csrs-for-bootstrapping
subjects:
- kind: Group
name: system:bootstrappers
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:node-bootstrapper
apiGroup: rbac.authorization.k8s.io
EOF
Create a ClusterRoleBinding that allows kubelets to request and receive certificates.
# Approve all CSRs for the group "system:bootstrappers"
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: auto-approve-csrs-for-group
subjects:
- kind: Group
name: system:bootstrappers
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
apiGroup: rbac.authorization.k8s.io
EOF
Create a ClusterRoleBinding that allows kubelets to renew their certificates.
# Approve renewal CSRs for the group "system:nodes"
cat << EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: auto-approve-renewals-for-nodes
subjects:
- kind: Group
name: system:nodes
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
apiGroup: rbac.authorization.k8s.io
EOF
for node in {node01,node02,node03};do
ssh root@${node} "mkdir -p /var/lib/kubelet"
ssh root@${node} "systemctl daemon-reload && systemctl start kubelet kube-proxy && systemctl enable kubelet kube-proxy"
done
# List the CSR requests
kubectl get csr
# If automatic approval fails and there are pending requests, approve them manually:
# kubectl get csr | awk '/Pending/ {print $1}' | xargs kubectl certificate approve
kubectl get componentstatuses
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
etcd-2 Healthy {"health":"true"}
etcd-1 Healthy {"health":"true"}
kubectl get nodes
NAME STATUS ROLES AGE VERSION
node01 Ready <none> 40m v1.9.1
node02 Ready <none> 3m v1.9.1
node03 Ready <none> 3m v1.9.1
For the choice of CNI network plugin, search the documentation yourself; here I use flannel.
The official kubeadm bootstrap guide uses this yaml file: https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml It defaults to 10.244.0.0/16 as the pod network; with kubeadm you would customize that via --pod-network-cidr at kubeadm init.
Since we install from binaries, we download that yaml, change the pod network parameter by hand, keep only the amd64 DaemonSet and drop the DaemonSets for the other architectures.
mkdir -p ${DEPLOY_DIR}/kube-addon
# customize kube-flannel.yml
# !!!!!!!!!!!!!!!!
# !!! Occasionally the heredoc below produces garbled characters in the file (I never figured out why), so check the result afterwards
# !!! If it looks wrong, paste the content in manually and remember to replace ${POD_CLUSTER_IP_RANGE} with its actual value
# !!!!!!!!!!!!!!!!
cat << EOF > ${DEPLOY_DIR}/kube-addon/kube-flannel.yml
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: psp.flannel.unprivileged
annotations:
seccomp.security.alpha.kubernetes.io/allowedProfileNames: docker/default
seccomp.security.alpha.kubernetes.io/defaultProfileName: docker/default
apparmor.security.beta.kubernetes.io/allowedProfileNames: runtime/default
apparmor.security.beta.kubernetes.io/defaultProfileName: runtime/default
spec:
privileged: false
volumes:
- configMap
- secret
- emptyDir
- hostPath
allowedHostPaths:
- pathPrefix: "/etc/cni/net.d"
- pathPrefix: "/etc/kube-flannel"
- pathPrefix: "/run/flannel"
readOnlyRootFilesystem: false
# Users and groups
runAsUser:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
fsGroup:
rule: RunAsAny
# Privilege Escalation
allowPrivilegeEscalation: false
defaultAllowPrivilegeEscalation: false
# Capabilities
allowedCapabilities: ['NET_ADMIN']
defaultAddCapabilities: []
requiredDropCapabilities: []
# Host namespaces
hostPID: false
hostIPC: false
hostNetwork: true
hostPorts:
- min: 0
max: 65535
# SELinux
seLinux:
# SELinux is unused in CaaSP
rule: 'RunAsAny'
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
rules:
- apiGroups: ['extensions']
resources: ['podsecuritypolicies']
verbs: ['use']
resourceNames: ['psp.flannel.unprivileged']
- apiGroups:
- ""
resources:
- pods
verbs:
- get
- apiGroups:
- ""
resources:
- nodes
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes/status
verbs:
- patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
name: flannel
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: flannel
subjects:
- kind: ServiceAccount
name: flannel
namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: flannel
namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
name: kube-flannel-cfg
namespace: kube-system
labels:
tier: node
app: flannel
data:
cni-conf.json: |
{
"cniVersion": "0.2.0",
"name": "cbr0",
"plugins": [
{
"type": "flannel",
"delegate": {
"hairpinMode": true,
"isDefaultGateway": true
}
},
{
"type": "portmap",
"capabilities": {
"portMappings": true
}
}
]
}
net-conf.json: |
{
"Network": "${POD_CLUSTER_IP_RANGE}",
"Backend": {
"Type": "vxlan"
}
}
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: kube-flannel-ds-amd64
namespace: kube-system
labels:
tier: node
app: flannel
spec:
selector:
matchLabels:
app: flannel
template:
metadata:
labels:
tier: node
app: flannel
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: beta.kubernetes.io/os
operator: In
values:
- linux
- key: beta.kubernetes.io/arch
operator: In
values:
- amd64
hostNetwork: true
tolerations:
- operator: Exists
effect: NoSchedule
serviceAccountName: flannel
initContainers:
- name: install-cni
image: quay.io/coreos/flannel:v0.11.0-amd64
command:
- cp
args:
- -f
- /etc/kube-flannel/cni-conf.json
- /etc/cni/net.d/10-flannel.conflist
volumeMounts:
- name: cni
mountPath: /etc/cni/net.d
- name: flannel-cfg
mountPath: /etc/kube-flannel/
containers:
- name: kube-flannel
image: quay.io/coreos/flannel:v0.11.0-amd64
command:
- /opt/bin/flanneld
args:
- --ip-masq
- --kube-subnet-mgr
resources:
requests:
cpu: "100m"
memory: "50Mi"
limits:
cpu: "100m"
memory: "50Mi"
securityContext:
privileged: false
capabilities:
add: ["NET_ADMIN"]
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
volumeMounts:
- name: run
mountPath: /run/flannel
- name: flannel-cfg
mountPath: /etc/kube-flannel/
volumes:
- name: run
hostPath:
path: /run/flannel
- name: cni
hostPath:
path: /etc/cni/net.d
- name: flannel-cfg
configMap:
name: kube-flannel-cfg
EOF
# The manifest contains the following resources
# - PodSecurityPolicy
# - ClusterRole
# - ClusterRoleBinding
# - ServiceAccount
# - ConfigMap
# - DaemonSet(for amd64 platform)
# apply cni addon: flannel
kubectl apply -f ${DEPLOY_DIR}/kube-addon/kube-flannel.yml
With --kube-subnet-mgr, flannel does not read its network configuration from etcd; it reads it from /etc/kube-flannel/net-conf.json (populated from the ConfigMap above) instead.
kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
kube-flannel-ds-amd64-g2snw 1/1 Running 0 62s
kube-flannel-ds-amd64-pzmf5 1/1 Running 0 62s
kube-flannel-ds-amd64-sxtdz 1/1 Running 0 62s
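As an extra check that the CIDR configuration above actually reached the nodes, you can compare the pod CIDR the controller-manager allocated to each node with the subnet flannel configured locally:
# Pod CIDR allocated to each node by kube-controller-manager
kubectl get nodes -o custom-columns=NAME:.metadata.name,PODCIDR:.spec.podCIDR
# On any node: the subnet flannel actually configured
cat /run/flannel/subnet.env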
Recent releases recommend CoreDNS as the cluster DNS; see the K8S-COREDNS and COREDNS-GITHUB documentation.
Prepare the CoreDNS deployment script.
# The CoreDNS deploy script requires the jq command
yum install epel-release -y
yum install jq -y
# Download the files needed to deploy CoreDNS
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/deploy.sh -O ${DEPLOY_DIR}/kube-addon/deploy.sh
wget https://raw.githubusercontent.com/coredns/deployment/master/kubernetes/coredns.yaml.sed -O ${DEPLOY_DIR}/kube-addon/coredns.yaml.sed
# Believe it or not, the function names in the official script are not valid shell identifiers (they contain hyphens)
sed -i "s/translate-kube-dns-configmap/translate_kube_dns_configmap/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
sed -i "s/kube-dns-federation-to-coredns/kube_dns_federation_to_coredns/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
sed -i "s/kube-dns-upstreamnameserver-to-coredns/kube_dns_upstreamnameserver_to_coredns/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
sed -i "s/kube-dns-stubdomains-to-coredns/kube_dns_stubdomains_to_coredns/g" ${DEPLOY_DIR}/kube-addon/deploy.sh
# Deploy CoreDNS
# -i sets the cluster DNS IP
sh ${DEPLOY_DIR}/kube-addon/deploy.sh -i ${SERVICE_CLUSTER_IP_RANGE%.*}.2 | kubectl apply -f -
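To verify the deployment, check the pods and service (the upstream manifest keeps the kube-dns service name and labels for compatibility) and run a one-off lookup from inside the cluster; busybox:1.28 is used because its nslookup is known to behave:
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get svc -n kube-system kube-dns
# One-off DNS test pod
kubectl run dns-test --rm -it --image=busybox:1.28 --restart=Never -- nslookup kubernetes.default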
After reading the Kubernetes PV documentation: since this cluster runs on bare metal, the cloud-provider options are out, which leaves a choice between GlusterFS and Ceph. A quick search suggests GlusterFS performs slightly better than Ceph, and it comes with heketi, which can manage GlusterFS volumes through a RESTful API. I have not benchmarked them side by side, but I lean toward GlusterFS.
Also, I did not realize beforehand that the storage backend has to be installed before Kubernetes persistent storage can be used, so I took a small detour there. The nodes also need the mount.glusterfs command installed (see below).
# Check the raw disk environment
fdisk -l
# Check the output for disks that are not yet in use
# Check and load the required kernel modules
lsmod | grep dm_snapshot || modprobe dm_snapshot
lsmod | grep dm_mirror || modprobe dm_mirror
lsmod | grep dm_thin_pool || modprobe dm_thin_pool
# Verify the modules loaded successfully
lsmod | egrep '^(dm_snapshot|dm_mirror|dm_thin_pool)'
# Expected output
# dm_thin_pool 66358 0
# dm_snapshot 39103 0
# dm_mirror 22289 0
# Install the mount.glusterfs command
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-libs-7.1-1.el7.x86_64.rpm
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-7.1-1.el7.x86_64.rpm
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-client-xlators-7.1-1.el7.x86_64.rpm
yum install -y https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-7/glusterfs-fuse-7.1-1.el7.x86_64.rpm
# The default repo installs glusterfs 3.12; install 7.1 manually to match the version the gk-deploy script installs below
# Check the glusterfs version
glusterfs --version
glusterfs 7.1
mount.glusterfs -V
glusterfs 7.1
# step 1. Download the installer
# The test below was done against the latest commit at the time:
# Latest commit
# 7246eb4
# on Jul 19, 2019
# That revision is not compatible with Kubernetes 1.17 and needs quite a few changes.
# I made a number of modifications and did some debugging myself; this should eventually be fixed upstream,
# so judge for yourself whether the steps below are still necessary.
git clone https://github.com/gluster/gluster-kubernetes.git
cd gluster-kubernetes/deploy
# step 2. Prepare the topology file
# ***************************
# Pay attention to:
# - hostnames.manage: the node's hostname
# - hostnames.storage: the node's IP
# - devices: the disk device names
# ***************************
cat << EOF > topology.json
{
"clusters": [
{
"nodes": [
{
"node": {
"hostnames": {
"manage": [
"node01"
],
"storage": [
"${IP_LIST['node01']}"
]
},
"zone": 1
},
"devices": [
"/dev/sdb"
]
},
{
"node": {
"hostnames": {
"manage": [
"node02"
],
"storage": [
"${IP_LIST['node02']}"
]
},
"zone": 1
},
"devices": [
"/dev/sdb"
]
},
{
"node": {
"hostnames": {
"manage": [
"node03"
],
"storage": [
"${IP_LIST['node03']}"
]
},
"zone": 1
},
"devices": [
"/dev/sdb"
]
}
]
}
]
}
EOF
# Before installing, confirm all nodes are in Ready state
kubectl get nodes
# step 3. Fix known issues (PRs for these already exist upstream, but at the time of writing the changes still had to be made by hand)
# 1) Kubernetes 1.17 moved these resources to a new API version
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/deploy-heketi-deployment.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/gluster-s3-template.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/glusterfs-daemonset.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" kube-templates/heketi-deployment.yaml
sed -ir "s|apiVersion: extensions/v1beta1|apiVersion: apps/v1|g" ocp-templates/glusterfs-template.yaml
# 2) error: error validating "STDIN": error validating data: ValidationError(DaemonSet.spec): missing required
# field "selector" in io.k8s.api.apps.v1.DaemonSetSpec; if you choose to ignore these errors, turn validation off with --validate=false
# Kubernetes 1.17 requires an explicit pod selector
# Confirm the following is present; add it by hand if it is missing
vim kube-templates/glusterfs-daemonset.yaml
spec:
selector:
matchLabels:
name: glusterfs
template:
metadata:
labels:
name: glusterfs
vim kube-templates/deploy-heketi-deployment.yaml
spec:
selector:
matchLabels:
name: deploy-heketi
template:
metadata:
labels:
name: deploy-heketi
vim kube-templates/gluster-s3-template.yaml
- kind: Deployment
spec:
selector:
matchLabels:
name: gluster-s3
template:
metadata:
labels:
name: gluster-s3
vim kube-templates/heketi-deployment.yaml
spec:
selector:
matchLabels:
name: heketi
template:
metadata:
labels:
name: heketi
# 3) Determining heketi service URL ... Error: unknown flag: --show-all
# See 'kubectl get --help' for usage.
# Failed to communicate with heketi service.
# kubectl v1.17 no longer has the --show-all flag
vim gk-deploy
# Change the following line
# heketi_pod=$(${CLI} get pod --no-headers --show-all --selector="heketi" | awk '{print $1}')
# to
# heketi_pod=$(${CLI} get pod --no-headers --selector="heketi" | awk '{print $1}')
# step 4. Deploy heketi and GlusterFS
ADMIN_KEY=adminkey
USER_KEY=userkey
./gk-deploy -g -y -v --admin-key ${ADMIN_KEY} --user-key ${USER_KEY}
# If the first install fails and you need to reinstall, clean up the previous resources with the commands below
# Remove the resources and services
# ./gk-deploy -g --abort --admin-key adminkey --user-key userkey
# List the LV names
# lvs
# Remove the LV
# lvremove /dev/vg
# Wipe the disk (run on the node machine)
# wipefs -a /dev/sdc
# step 5. Check that heketi and GlusterFS are running
export HEKETI_CLI_SERVER=$(kubectl get svc/heketi --template 'http://{{.spec.clusterIP}}:{{(index .spec.ports 0).port}}')
echo $HEKETI_CLI_SERVER
curl $HEKETI_CLI_SERVER/hello
# Hello from Heketi
# If this times out, check whether the master was never joined as a node, i.e. kube-proxy is not running on it
# Once you have the address, you can also run the curl from one of the node machines
# step 6. Create a StorageClass so PVs are provisioned automatically for PVCs
SECRET_KEY=`echo -n "${ADMIN_KEY}" | base64`
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Secret
metadata:
name: heketi-secret
namespace: default
data:
# base64 encoded password. E.g.: echo -n "mypassword" | base64
key: ${SECRET_KEY}
type: kubernetes.io/glusterfs
EOF
cat << EOF | kubectl apply -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: glusterfs-storage
provisioner: kubernetes.io/glusterfs
parameters:
resturl: "${HEKETI_CLI_SERVER}"
restuser: "admin"
secretNamespace: "default"
secretName: "heketi-secret"
volumetype: "replicate:3"
EOF
# Note: when creating PVC resources, set the storageClassName to the "glusterfs-storage" class configured above; a minimal sketch follows
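For reference, a minimal PVC bound to this StorageClass might look like the following; the name and size are placeholders:
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: gluster-pvc-demo
spec:
  storageClassName: glusterfs-storage
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
# The PVC should reach Bound state once heketi provisions the volume
kubectl get pvc gluster-pvc-demo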
kubectl get nodes,pods
NAME STATUS ROLES AGE VERSION
node/node01 Ready <none> 5d3h v1.17.0
node/node02 Ready <none> 5d3h v1.17.0
node/node03 Ready <none> 5d3h v1.17.0
NAME READY STATUS RESTARTS AGE
pod/glusterfs-bhprz 1/1 Running 0 45m
pod/glusterfs-jt64n 1/1 Running 0 45m
pod/glusterfs-vkfp5 1/1 Running 0 45m
pod/heketi-779bc95979-272qk 1/1 Running 0 38m