
Manually Setting Up a Kubernetes Cluster from Scratch

This post walks through building a Kubernetes cluster by hand, from scratch, step by step: environment preparation, software version selection, and the detailed procedures and configuration for each stage.

Server Requirements

Kubernetes Version

  • Version: v1.22.15

Node Requirements

  • Nodes: at least 3
  • CPUs: at least 2 per node
  • Memory: at least 2 GB per node

Set the Time Zone

Some systems ship with a mismatched time zone; fix it with:

timedatectl set-timezone Asia/Shanghai

环境说明

系统类型IP地址节点角色CPU内存主机名
CentOS-7.9192.168.200.11master>=2>=2Gcluster1
CentOS-7.9192.168.200.22master,worker>=2>=2Gcluster2
CentOS-7.9192.168.200.33worker>=2>=2Gcluster3

Creating the VM Nodes with Vagrant

  • Vagrant: latest version
  • VirtualBox: 7.0
  • vagrant-vbguest: 0.21 (mounts the synced folder between host and guest)

vagrant plugin install vagrant-vbguest --plugin-version 0.21

Vagrantfile Configuration

# -*- mode: ruby -*-
# vi: set ft=ruby :

nodes = [
  {
    :name => "cluster1",
    :eth1 => "192.168.200.11",
    :mem => "4096",
    :cpu => "2"
  },
  {
    :name => "cluster2",
    :eth1 => "192.168.200.22",
    :mem => "4096",
    :cpu => "2"
  },
  {
    :name => "cluster3",
    :eth1 => "192.168.200.33",
    :mem => "4096",
    :cpu => "2"
  },
]

Vagrant.configure("2") do |config|
  config.vm.box = "centos/7"

  nodes.each do |opts|
    config.vm.define opts[:name] do |config|
      config.vm.hostname = opts[:name]

      config.vm.provider "virtualbox" do |v|
        v.customize ["modifyvm", :id, "--memory", opts[:mem]]
        v.customize ["modifyvm", :id, "--cpus", opts[:cpu]]
      end

      config.vm.synced_folder "../share", "/vagrant_data"
      config.vm.network :public_network, ip: opts[:eth1]
    end
  end
end
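
To bring the nodes up, run the following from the directory containing the Vagrantfile (a ../share directory must exist for the synced folder):

# Create and boot all three VMs defined above
vagrant up

# Open a shell on one of them, e.g. cluster1
vagrant ssh cluster1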

System Setup (All Nodes)

Root privileges are required.

  1. Set the hostname on each node (the /etc/hosts mappings come in step 6)
  2. Install dependency packages
yum update -y
yum install -y socat conntrack ipvsadm ipset jq sysstat curl iptables libseccomp yum-utils
  3. Disable the firewall, SELinux, and swap, and reset iptables
# Disable SELinux
setenforce 0
sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config

# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld

# Reset the iptables rules
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT

# Disable swap
vi /etc/fstab
# To disable permanently, comment out the swap entry:
#/swapfile none swap defaults 0 0
# To disable temporarily:
swapoff -a

# Stop dnsmasq (if present)
service dnsmasq stop && systemctl disable dnsmasq
  4. Kernel parameters for Kubernetes
cat > /etc/sysctl.d/kubernetes.conf <<EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
vm.overcommit_memory = 1
EOF

sysctl -p /etc/sysctl.d/kubernetes.conf
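
Note: the net.bridge.* settings require the br_netfilter kernel module. If sysctl -p complains about unknown keys on a stock CentOS 7 kernel, load the module first (the modules-load.d file name below is my own choice):

# Load the module now
modprobe br_netfilter
# Load it automatically on boot
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
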
  5. Configure passwordless SSH login
# Generate a key pair if you do not already have one
ssh-keygen -t rsa

# Show the public key
cat ~/.ssh/id_rsa.pub

# Append the public key on every node
echo "<pubkey content>" >> ~/.ssh/authorized_keys
  6. Configure IP/hostname mappings (every node)
# Append so the existing localhost entries are preserved
cat >> /etc/hosts <<EOF
192.168.200.11 cluster1
192.168.200.22 cluster2
192.168.200.33 cluster3
EOF
  7. Download the Kubernetes component binaries
export VERSION=v1.22.15

# Download the master-node components
wget https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/amd64/kube-apiserver
wget https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/amd64/kube-controller-manager
wget https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/amd64/kube-scheduler
wget https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/amd64/kubectl

# Download the worker-node components
wget https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/amd64/kube-proxy
wget https://storage.googleapis.com/kubernetes-release/release/${VERSION}/bin/linux/amd64/kubelet

# The downloaded binaries are not executable by default
chmod +x kube-apiserver kube-controller-manager kube-scheduler kubectl kube-proxy kubelet

# Download etcd
wget https://github.com/etcd-io/etcd/releases/download/v3.4.10/etcd-v3.4.10-linux-amd64.tar.gz
tar -xvf etcd-v3.4.10-linux-amd64.tar.gz
mv etcd-v3.4.10-linux-amd64/etcd* .
rm -fr etcd-v3.4.10-linux-amd64*
  8. Distribute the binaries
# Copy the master components to the master nodes
MASTERS=(cluster1 cluster2)
for instance in ${MASTERS[@]}; do
  scp kube-apiserver kube-controller-manager kube-scheduler kubectl root@${instance}:/usr/local/bin/
done

# Copy the worker components to the worker nodes
WORKERS=(cluster2 cluster3)
for instance in ${WORKERS[@]}; do
  scp kubelet kube-proxy root@${instance}:/usr/local/bin/
done

# Copy etcd to the etcd nodes
ETCDS=(cluster1 cluster2 cluster3)
for instance in ${ETCDS[@]}; do
  scp etcd etcdctl root@${instance}:/usr/local/bin/
done
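
A quick sanity check that the binaries arrived and are executable (the node/binary pairs probed here are just examples):

ssh root@cluster1 'kube-apiserver --version'
ssh root@cluster3 'kubelet --version && etcd --version'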

Generating Certificates

Preparation

Install cfssl:

wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -O /usr/local/bin/cfssl
wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -O /usr/local/bin/cfssljson
chmod +x /usr/local/bin/cfssl*

Generate the Root CA

Create the CA configuration files:

cat > ca-config.json <<EOF
{
  "signing": {
    "default": {
      "expiry": "876000h"
    },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "876000h"
      }
    }
  }
}
EOF

cat > ca-csr.json <<EOF
{
  "CN": "Kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "Portland",
      "O": "Kubernetes",
      "OU": "CA",
      "ST": "Oregon"
    }
  ]
}
EOF

Generate the certificate and private key:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca

Output files:

  • ca.pem
  • ca.csr
  • ca-key.pem

Generate the Remaining Certificates

Follow similar steps to generate the admin, kubelet, kube-controller-manager, kube-proxy, kube-scheduler, kube-apiserver, Service Account, and proxy-client certificates.
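
As a concrete sketch, the admin client certificate can be generated like this (the CSR fields mirror the CA above; O=system:masters is the group RBAC treats as cluster admin):

cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "US",
      "L": "Portland",
      "O": "system:masters",
      "OU": "Kubernetes",
      "ST": "Oregon"
    }
  ]
}
EOF

# Sign the CSR with the root CA, using the "kubernetes" profile from ca-config.json
cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  admin-csr.json | cfssljson -bare admin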

Deploying the etcd Cluster

Install the etcd certificate files:

mkdir -p /etc/etcd /var/lib/etcd
chmod 700 /var/lib/etcd
cp ca.pem kubernetes-key.pem kubernetes.pem /etc/etcd/

Create the etcd.service unit file:

ETCD_NAME=$(hostname -s)
# ETCD_IP is the current node's own IP; set it per node
ETCD_IP=192.168.200.11
ETCD_NAMES=(cluster1 cluster2 cluster3)
ETCD_IPS=(192.168.200.11 192.168.200.22 192.168.200.33)
cat <<EOF > /etc/systemd/system/etcd.service
[Unit]
Description=etcd
Documentation=https://github.com/coreos

[Service]
Type=notify
ExecStart=/usr/local/bin/etcd \\
  --name ${ETCD_NAME} \\
  --cert-file=/etc/etcd/kubernetes.pem \\
  --key-file=/etc/etcd/kubernetes-key.pem \\
  --peer-cert-file=/etc/etcd/kubernetes.pem \\
  --peer-key-file=/etc/etcd/kubernetes-key.pem \\
  --trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-client-cert-auth \\
  --client-cert-auth \\
  --initial-advertise-peer-urls https://${ETCD_IP}:2380 \\
  --listen-peer-urls https://${ETCD_IP}:2380 \\
  --listen-client-urls https://${ETCD_IP}:2379,https://127.0.0.1:2379 \\
  --advertise-client-urls https://${ETCD_IP}:2379 \\
  --initial-cluster-token etcd-cluster-0 \\
  --initial-cluster ${ETCD_NAMES[0]}=https://${ETCD_IPS[0]}:2380,${ETCD_NAMES[1]}=https://${ETCD_IPS[1]}:2380,${ETCD_NAMES[2]}=https://${ETCD_IPS[2]}:2380 \\
  --initial-cluster-state new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start etcd (run this on all three nodes; the first instance blocks until a quorum of peers has joined):

systemctl daemon-reload && systemctl enable etcd && systemctl start etcd

Verify the etcd cluster:

ETCDCTL_API=3 etcdctl member list \
--endpoints=https://127.0.0.1:2379 \
--cacert=/etc/etcd/ca.pem \
--cert=/etc/etcd/kubernetes.pem \
--key=/etc/etcd/kubernetes-key.pem
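
As an additional liveness check, the endpoint health subcommand can be run with the same TLS flags:

ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/etcd/ca.pem \
  --cert=/etc/etcd/kubernetes.pem \
  --key=/etc/etcd/kubernetes-key.pem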

Deploying the Kubernetes Control Plane

Deploy kube-apiserver, kube-controller-manager, and kube-scheduler on cluster1 and cluster2.

Configure the API Server

mkdir -p /etc/kubernetes/ssl

mv ca.pem ca-key.pem kubernetes-key.pem kubernetes.pem \
  service-account-key.pem service-account.pem \
  proxy-client.pem proxy-client-key.pem \
  /etc/kubernetes/ssl

# IP is the current master node's own IP
IP=192.168.200.11
APISERVER_COUNT=2
ETCD_ENDPOINTS=(192.168.200.11 192.168.200.22 192.168.200.33)

cat <<EOF > /etc/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
  --advertise-address=${IP} \\
  --allow-privileged=true \\
  --apiserver-count=${APISERVER_COUNT} \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/audit.log \\
  --authorization-mode=Node,RBAC \\
  --bind-address=0.0.0.0 \\
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
  --etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --etcd-servers=https://${ETCD_ENDPOINTS[0]}:2379,https://${ETCD_ENDPOINTS[1]}:2379,https://${ETCD_ENDPOINTS[2]}:2379 \\
  --event-ttl=1h \\
  --kubelet-certificate-authority=/etc/kubernetes/ssl/ca.pem \\
  --kubelet-client-certificate=/etc/kubernetes/ssl/kubernetes.pem \\
  --kubelet-client-key=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --service-account-issuer=api \\
  --service-account-key-file=/etc/kubernetes/ssl/service-account.pem \\
  --service-account-signing-key-file=/etc/kubernetes/ssl/service-account-key.pem \\
  --api-audiences=api,vault,factors \\
  --service-cluster-ip-range=10.233.0.0/16 \\
  --service-node-port-range=30000-32767 \\
  --proxy-client-cert-file=/etc/kubernetes/ssl/proxy-client.pem \\
  --proxy-client-key-file=/etc/kubernetes/ssl/proxy-client-key.pem \\
  --runtime-config=api/all=true \\
  --requestheader-client-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --requestheader-allowed-names=aggregator \\
  --requestheader-extra-headers-prefix=X-Remote-Extra- \\
  --requestheader-group-headers=X-Remote-Group \\
  --requestheader-username-headers=X-Remote-User \\
  --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --v=1
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Configure kube-controller-manager

mv kube-controller-manager.kubeconfig /etc/kubernetes/

cat <<EOF > /etc/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
  --bind-address=0.0.0.0 \\
  --cluster-cidr=10.200.0.0/16 \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \\
  --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --cluster-signing-duration=876000h0m0s \\
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \\
  --leader-elect=true \\
  --root-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --service-account-private-key-file=/etc/kubernetes/ssl/service-account-key.pem \\
  --service-cluster-ip-range=10.233.0.0/16 \\
  --use-service-account-credentials=true \\
  --v=1
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Configure kube-scheduler

mv kube-scheduler.kubeconfig /etc/kubernetes

cat <<EOF > /etc/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
  --authentication-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
  --authorization-kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
  --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \\
  --leader-elect=true \\
  --bind-address=0.0.0.0 \\
  --port=0 \\
  --v=1
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start the Services

systemctl daemon-reload
systemctl enable kube-apiserver
systemctl enable kube-controller-manager
systemctl enable kube-scheduler
systemctl start kube-apiserver
systemctl start kube-controller-manager
systemctl start kube-scheduler

Verify the Services

netstat -ntlp

Expected output looks like this:

tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      18829/sshd
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN      853/master
tcp        0      0 192.168.200.11:2379     0.0.0.0:*               LISTEN      30516/etcd
tcp        0      0 127.0.0.1:2379          0.0.0.0:*               LISTEN      30516/etcd
tcp        0      0 192.168.200.11:2380     0.0.0.0:*               LISTEN      30516/etcd
tcp6       0      0 ::1:25                  :::*                    LISTEN      853/master
tcp6       0      0 :::6443                 :::*                    LISTEN      30651/kube-apiserve
tcp6       0      0 :::10257                :::*                    LISTEN      30666/kube-controll
tcp6       0      0 :::10259                :::*                    LISTEN      30679/kube-schedule

Configure kubectl

kubectl is the command-line client used to manage the Kubernetes cluster.

mkdir -p ~/.kube/
mv ~/admin.kubeconfig ~/.kube/config
kubectl get nodes

Authorize the apiserver to call the kubelet API; when you run kubectl exec/run/logs, the apiserver forwards the request to the kubelet:

kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user kubernetes
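
To confirm kubectl can reach a healthy control plane, two standard checks (run wherever the admin kubeconfig was installed):

kubectl cluster-info
# Detailed readiness of the apiserver
kubectl get --raw='/readyz?verbose'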

Deploying the Kubernetes Worker Nodes

Deploy the following components on each worker node:

  • kubelet
  • kube-proxy
  • container runtime
  • cni
  • nginx-proxy

Container Runtime (containerd)

VERSION=1.4.3
wget https://github.com/containerd/containerd/releases/download/v${VERSION}/cri-containerd-cni-${VERSION}-linux-amd64.tar.gz
tar -xvf cri-containerd-cni-${VERSION}-linux-amd64.tar.gz
cp etc/crictl.yaml /etc/
cp etc/systemd/system/containerd.service /etc/systemd/system/
cp -r usr /

containerd Configuration File

mkdir -p /etc/containerd
containerd config default > /etc/containerd/config.toml
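
Optional: if the nodes cannot pull from k8s.gcr.io, the sandbox (pause) image in the generated config.toml can be repointed at a reachable mirror, e.g. the Aliyun image used later in this guide. A sed sketch against the default config:

sed -i 's#^\(\s*sandbox_image = \).*#\1"registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/pause:3.2"#' /etc/containerd/config.toml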

Start containerd

systemctl enable containerd
systemctl start containerd
systemctl status containerd

Configure kubelet

Prepare the configuration files:

mkdir -p /etc/kubernetes/ssl/

mv ${HOSTNAME}-key.pem ${HOSTNAME}.pem ca.pem ca-key.pem /etc/kubernetes/ssl/
mv ${HOSTNAME}.kubeconfig /etc/kubernetes/kubeconfig
# IP is the current worker node's own IP
IP=192.168.200.22

cat <<EOF > /etc/kubernetes/kubelet-config.yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/ssl/ca.pem"
authorization:
  mode: Webhook
clusterDomain: "cluster.local"
# Must match the CoreDNS cluster IP configured later in this guide
clusterDNS:
  - "10.233.0.10"
podCIDR: "10.200.0.0/16"
address: ${IP}
readOnlyPort: 0
staticPodPath: /etc/kubernetes/manifests
healthzPort: 10248
healthzBindAddress: 127.0.0.1
kubeletCgroups: /systemd/system.slice
resolvConf: "/etc/resolv.conf"
runtimeRequestTimeout: "15m"
kubeReserved:
  cpu: 200m
  memory: 512M
tlsCertFile: "/etc/kubernetes/ssl/${HOSTNAME}.pem"
tlsPrivateKeyFile: "/etc/kubernetes/ssl/${HOSTNAME}-key.pem"
EOF

Create the kubelet service unit:

cat <<EOF > /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service

[Service]
ExecStart=/usr/local/bin/kubelet \\
  --config=/etc/kubernetes/kubelet-config.yaml \\
  --container-runtime=remote \\
  --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \\
  --image-pull-progress-deadline=2m \\
  --kubeconfig=/etc/kubernetes/kubeconfig \\
  --network-plugin=cni \\
  --node-ip=${IP} \\
  --register-node=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Configure nginx-proxy

nginx-proxy is a local proxy through which the worker nodes reach the apiservers; it is part of the apiserver high-availability setup.

mkdir -p /etc/nginx
# The master-node IPs (cluster1 and cluster2 in this environment)
MASTER_IPS=(192.168.200.11 192.168.200.22)

cat <<EOF > /etc/nginx/nginx.conf
error_log stderr notice;

worker_processes 2;
worker_rlimit_nofile 130048;
worker_shutdown_timeout 10s;

events {
  multi_accept on;
  use epoll;
  worker_connections 16384;
}

stream {
  upstream kube_apiserver {
    least_conn;
    server ${MASTER_IPS[0]}:6443;
    server ${MASTER_IPS[1]}:6443;
    # add one "server" line per additional master node
  }

  server {
    listen 127.0.0.1:6443;
    proxy_pass kube_apiserver;
    proxy_timeout 10m;
    proxy_connect_timeout 1s;
  }
}

http {
  aio threads;
  aio_write on;
  tcp_nopush on;
  tcp_nodelay on;

  keepalive_timeout 5m;
  keepalive_requests 100;
  reset_timedout_connection on;
  server_tokens off;
  autoindex off;

  server {
    listen 8081;
    location /healthz {
      access_log off;
      return 200;
    }
    location /stub_status {
      stub_status on;
      access_log off;
    }
  }
}
EOF

Nginx manifest

mkdir -p /etc/kubernetes/manifests
cat <<EOF > /etc/kubernetes/manifests/nginx-proxy.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-proxy
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: kube-nginx
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
  nodeSelector:
    kubernetes.io/os: linux
  priorityClassName: system-node-critical
  containers:
    - name: nginx-proxy
      image: docker.io/library/nginx:1.19
      imagePullPolicy: IfNotPresent
      resources:
        requests:
          cpu: 25m
          memory: 32M
      securityContext:
        privileged: true
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8081
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8081
      volumeMounts:
        - mountPath: /etc/nginx
          name: etc-nginx
          readOnly: true
  volumes:
    - name: etc-nginx
      hostPath:
        path: /etc/nginx
EOF

Configure kube-proxy

Configuration file:

mv kube-proxy.kubeconfig /etc/kubernetes/
cat <<EOF > /etc/kubernetes/kube-proxy-config.yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
bindAddress: 0.0.0.0
clientConnection:
  kubeconfig: "/etc/kubernetes/kube-proxy.kubeconfig"
clusterCIDR: "10.200.0.0/16"
mode: ipvs
EOF

The kube-proxy service unit:

cat <<EOF > /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube Proxy
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-proxy \\
  --config=/etc/kubernetes/kube-proxy-config.yaml
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start the Worker Services

systemctl daemon-reload
systemctl enable kubelet kube-proxy
systemctl start kubelet kube-proxy
journalctl -f -u kubelet
journalctl -f -u kube-proxy
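
At this point each worker should have registered itself with the apiserver; the nodes will report NotReady until the Calico CNI plugin is installed below. From a master node:

kubectl get nodes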

Manually Pull the pause Image

Pull it on every worker node:

crictl pull registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/pause:3.2
ctr -n k8s.io i tag registry.cn-hangzhou.aliyuncs.com/kubernetes-kubespray/pause:3.2 k8s.gcr.io/pause:3.2

Pre-pull the nginx Image

crictl pull docker.io/library/nginx:1.19

Network Plugin: Calico

Download the Calico manifest and modify it:

curl https://raw.githubusercontent.com/projectcalico/calico/v3.24.5/manifests/calico.yaml -O

Change the IP autodetection method:

- name: IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP

Change the pod CIDR:

- name: CALICO_IPV4POOL_CIDR
  value: "10.200.0.0/16"

Deploy Calico:

kubectl apply -f calico.yaml
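
Once the calico-node pods (labeled k8s-app=calico-node in the stock manifest) are running, the nodes should turn Ready:

kubectl -n kube-system get pods -l k8s-app=calico-node
kubectl get nodes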

DNS Plugin: CoreDNS

Set the CoreDNS cluster IP (it must fall inside the service CIDR, 10.233.0.0/16; it appears as a ${COREDNS_CLUSTER_IP} placeholder in coredns.yaml):

COREDNS_CLUSTER_IP=10.233.0.10

Create the CoreDNS resources; the ConfigMap part of coredns.yaml looks like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            prefer_udp
        }
        cache 30
        loop
        reload
        loadbalance
    }

Substitute the cluster IP and apply:

sed -i "s/\${COREDNS_CLUSTER_IP}/${COREDNS_CLUSTER_IP}/g" coredns.yaml
kubectl apply -f coredns.yaml
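
A quick end-to-end DNS check, assuming a busybox image is pullable from the nodes:

kubectl run dns-test --image=busybox:1.28 --restart=Never --rm -it -- nslookup kubernetes.default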
