I had always heard that building your own Kubernetes cluster was quite difficult, and I finally had some time to test it at home. Fortunately there are plenty of tutorials online to refer to, and in the end it worked. I have recorded the installation process below (see the Setup Log section).
Here are some lessons learned during the installation:
- Host environment: Windows 11 + VMware, with only 16 GB RAM and an i7-3770S CPU
- What K8S used to call the master node is now called the control-plane node
- Do not give the control-plane node too little RAM: 2 GB fails with errors, so 4 GB is recommended (I guess 3 GB might just work). 2 GB RAM is fine for a worker node
- Use bridged networking for the VMs, and reserve a static IP for each VM on the router
- After powering on a CentOS 7 VM, if the network was not enabled during installation, set ONBOOT=Yes manually after boot; also check that the timezone is set correctly
- This time I wanted to start with the older CentOS 7, but the latest stable release, such as CentOS 9, would be the better choice
- The articles I referenced were somewhat outdated, and the better ones are based on Ubuntu, so things did not go entirely smoothly. The main reference article is: Kubernetes (K8S) 自建地端伺服器 (on-premise) 建置實錄 - 清新下午茶 (jks.coffee)
- The biggest problem I hit was the Flannel and CoreDNS pods failing to start. It turned out to be a CNI issue: I had to manually edit the /etc/systemd/system/cri-docker.service unit file, use a matching pause image version, and run kubeadm reset to start over before it worked
- Because RAM is limited I can only add one worker node for now; later I can add another worker node on the NAS for testing
- Because of the cluster API, port 443/tcp must be opened in the firewall on every node. Since this is only a test environment, simply disabling firewalld also works.
- When running cluster init (kubeadm init) and configuring the other plugins, the IP ranges must match up; many problems turn out to be caused by conflicting IP ranges (see the sketch below).
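A sketch of how those ranges can be sanity-checked once the cluster is up (run on the control-plane node; the expected values are the ones used later in this log):

# Pod CIDR the control plane was started with (expect 10.244.0.0/16, matching Flannel)
kubectl cluster-info dump | grep -m1 -- --cluster-cidr
# Service CIDR (kubeadm default is 10.96.0.0/12)
kubectl cluster-info dump | grep -m1 -- --service-cluster-ip-range
# Docker bridge subnet (set in daemon.json below; must not overlap the two above)
docker network inspect -f '{{(index .IPAM.Config 0).Subnet}}' bridge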
Setup log
Install CentOS on VMware
- Download the ISO; I used CentOS-7-x86_64-Minimal-2009.iso
- We can also use CentOS 9 minimal
- Create VM and install
- Name: CentOS 7-2009 Min
- RAM: 2GB
- Processor: 1x2 Core
- HDD: 20GB
- Network: Bridged
- Guest OS: Linux, CentOS 7 64Bit
- Start VM, login as root
- If the network is not set during installation:
  - Edit /etc/sysconfig/network-scripts/ifcfg-<interface> and change ONBOOT to Yes:
    ONBOOT=Yes
  - Or use the nmtui command, see: How to setup network after RHEL/CentOS 7 minimal installation | LinTut
  - (a sample ifcfg file is shown after this list)
- If the timezone is not set during installation:
  - Print the current timezone: timedatectl
  - Set it to HK: timedatectl set-timezone Asia/Hong_Kong
  - Print the timezone again to verify: timedatectl
- Perform yum update to bring all packages to the latest version
- Poweroff
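For reference (the network step above), a minimal DHCP-based ifcfg file could look like the following sketch; ens33 is a typical VMware interface name and is an assumption here:

TYPE=Ethernet
BOOTPROTO=dhcp
DEVICE=ens33
NAME=ens33
ONBOOT=Yes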
- Create CentOS VM for K8S Node base template
- Select VM: CentOS 7-2009 Min
- <Right-click> -> Manage -> Clone
- Clone from: Current state in the VM
- Clone method: Create a linked clone
- Name: K8S-Node-base
- Power on
- Install basic tools
yum install -y nc git net-tools
Preparation for k8s installation: disable swap
# Check swap status
free -h
# Turn off swap temporarily
swapoff -a
# Turn off swap permanently
vi /etc/fstab
# Edit and comment out: /dev/mapper/centos-swap swap...
# Also set vm.swappiness
sysctl -w vm.swappiness=0
# Reboot
reboot
# Check swap again
free -h
sysctl vm.swappiness
- NOTE: I don't know why vm.swappiness cannot be kept at 0; it reverts to the default value after a reboot.
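One likely explanation: sysctl -w only changes the running kernel, so the value is lost on reboot unless it is also persisted under /etc/sysctl.d (the k8s.conf written later in this log does exactly that). A minimal sketch:

# Persist the setting so it survives a reboot
# (assumption: no other tool, e.g. tuned, overrides it afterwards)
cat <<EOF > /etc/sysctl.d/99-swappiness.conf
vm.swappiness = 0
EOF
sysctl --system
sysctl vm.swappiness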
Install Docker
Ref: How To Install and Use Docker on CentOS 7 | DigitalOcean
# Install docker
curl -fsSL https://get.docker.com/ | sh
# Start docker and check
systemctl start docker
systemctl status docker
docker ps
# Enable docker at boot
systemctl enable docker
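Optionally, a quick sanity check that the engine can actually pull and run containers:

# Pulls the hello-world image, runs it once and removes the container afterwards
docker run --rm hello-world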
Update the daemon.json (avoid network conflict)
vi /etc/docker/daemon.json
{ "exec-opts": ["native.cgroupdriver=systemd"], "log-driver": "json-file", "log-opts": { "tag": "{{.Name}}", "max-size": "2m", "max-file": "2" }, "default-address-pools": [ { "base": "172.31.0.0/16", "size": 24 } ], "bip": "172.7.0.1/16" }
Note: change the default-address-pools value if needed.
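To pick a pool that will not clash, it can help to first list the subnets the host already routes (a small sketch):

# Any subnet printed here must not overlap the pools configured above
ip route show | awk '{print $1}' | sort -u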
Reboot and check if docker is started and working
Check if docker is using systemd for cgroupdriver
# Restart docker if needed
systemctl daemon-reload
systemctl restart docker
docker info | grep -i cgroup
# Note: it is cgroup version 1, not version 2
Now let's install the k8s components: kubelet, kubeadm and kubectl (Ref: Installing kubeadm | Kubernetes)
Log in as root and execute the following commands
# Set SELinux in permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

# This overwrites any existing configuration in /etc/yum.repos.d/kubernetes.repo
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF

# Reset the iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT

# Set firewall (NOTE: 443/tcp is very important!)
firewall-cmd --add-port=443/tcp --add-port=6443/tcp --add-port=10250/tcp --permanent
# Reload so the permanent rules take effect
firewall-cmd --reload
firewall-cmd --list-all

# Enable overlay and br_netfilter
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
modprobe overlay
modprobe br_netfilter

# Set sysctl.d for k8s
cat <<EOF > /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
# Apply it:
sysctl --system

# Check if overlay and br_netfilter are loaded:
lsmod | grep br_netfilter
lsmod | grep overlay

# Check the values are set to 1 properly
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward

# Install kubelet, kubeadm and kubectl
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

# Create kubelet config
# NOTE: --cgroup-driver is deprecated and no longer required:
#cat <<EOF > /etc/sysconfig/kubelet
#KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
#EOF
echo -n "" > /etc/sysconfig/kubelet

# Enable kubelet
systemctl enable --now kubelet
# NOTE: kubelet service start failure...
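The start failure at the end is expected: until kubeadm init creates a node configuration, kubelet has nothing to run with. A quick way to confirm this is the only problem:

# kubelet crash-loops complaining about a missing /var/lib/kubelet/config.yaml
# until "kubeadm init" has been run
journalctl -u kubelet --no-pager -n 20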
Install Container Runtime Interface (CRI) – cri-dockerd
Ref:
- Kubernetes (K8S) 自建地端伺服器 (on-premise) 建置實錄 - 清新下午茶 (jks.coffee)
- GitHub - Mirantis/cri-dockerd: dockerd as a compliant Container Runtime Interface for Kubernetes
- Centos 7 部署Kubernetes集群 (基于cri-dockerd) - 西瓜君~ - 博客园 (cnblogs.com)
# Download cri-dockerd
curl -O -L https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.9/cri-dockerd-0.3.9.amd64.tgz
# Extract
tar xzvf cri-dockerd-0.3.9.amd64.tgz
# Install
install -o root -g root -m 0755 cri-dockerd/cri-dockerd /usr/bin/cri-dockerd
rm -Rf cri-dockerd
git clone https://github.com/Mirantis/cri-dockerd.git
install cri-dockerd/packaging/systemd/* /etc/systemd/system

# IMPORTANT - Update /etc/systemd/system/cri-docker.service
vi /etc/systemd/system/cri-docker.service
# Change from:
#   ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://
# To:
#   ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.k8s.io/pause:3.9
# NOTE: if "kubeadm init" complains that pause:3.9 is outdated, change the version above to the suggested one.

# Enable the systemd units
systemctl daemon-reload
systemctl enable --now cri-docker.service
systemctl enable --now cri-docker.socket

# Clean up
rm -Rf cri-dockerd cri-dockerd-0.3.9.amd64.tgz
reboot
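After the reboot, it is worth double-checking that the edited ExecStart line actually took effect; a quick sketch:

# Should print the ExecStart line with --network-plugin=cni and the pause image
systemctl cat cri-docker.service | grep ExecStart
# Both units should report "active"
systemctl is-active cri-docker.service cri-docker.socket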
Take a VM snapshot for cloning later
Setup the control-plane node: k8s-ctrl (a.k.a. master)
Clone the VM for the master node from the previous base VM.
Note:
- The control-plane node needs 4 GB RAM; 2 GB does not work
- Reserve a static IP for this VM (important!)
Start the VM, login as root then:
# Reset the machine ID
rm /etc/machine-id
systemd-machine-id-setup

# Change hostname to: k8s-ctrl
vi /etc/hostname

# Re-generate the openssh host keys
rm /etc/ssh/ssh_host_* -f && systemctl restart sshd

# Reboot and connect via ssh again
reboot

# Get the mac address of the physical interface
ip link show $(arp | sed -n '2p' | awk '{print $NF}') | sed -n '2p' | awk '{print $2}'
# -OR-
ip link show $(ls -l /sys/class/net/ | grep -v virtual | awk 'NR==2 {print $9}') | sed -n '2p' | awk '{print $2}'
# Reserve a static IP (in the router's DHCP) for this VM so the IP stays unchanged

# Prepare the variables:
IPADDR=$(ip route get 8.8.8.8 | head -1 | awk '{print $NF}')
NODENAME=$(hostname -s)
POD_CIDR="10.244.0.0/16"

# Perform kubeadm init:
# Note: the service CIDR can be set if needed: --service-cidr=10.96.0.0/16
kubeadm init \
  --apiserver-advertise-address=$IPADDR \
  --control-plane-endpoint=$IPADDR \
  --node-name $NODENAME \
  --pod-network-cidr=$POD_CIDR \
  --cri-socket unix:///var/run/cri-dockerd.sock \
  --ignore-preflight-errors=all
Wait until the initialization is finished. Jot down the join-command:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of control-plane nodes by copying certificate authorities
and service account keys on each node and then running the following as root:

  kubeadm join 192.168.11.44:6443 --token xxx \
    --discovery-token-ca-cert-hash sha256:xxxx \
    --control-plane

Then you can join any number of worker nodes by running the following on each as root:

  kubeadm join 192.168.11.44:6443 --token xxxx \
    --discovery-token-ca-cert-hash sha256:xxxx
We can print the join-string anytime if needed:
kubeadm token create --print-join-command
Update the .bash_profile
vi ~/.bash_profile
# Add:
#   export KUBECONFIG=/etc/kubernetes/admin.conf
# Reboot and check
Since the CNI (Flannel will be used) has not been installed yet, kubelet reports an error:
systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Wed 2024-01-24 00:51:13 EST; 4min 48s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 1032 (kubelet)
    Tasks: 11
   Memory: 113.7M
   CGroup: /system.slice/kubelet.service
           └─1032 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime-endpoint=unix:///var/run/cri-dockerd.sock --pod-infra-contai...

Jan 24 00:55:13 k8s-ctrl kubelet[1032]: E0124 00:55:13.138304    1032 kubelet.go:2855] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized"
Install helm
# Install via script
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
# Check installation
helm version
# Clean up
rm ./get_helm.sh
Install Flannel CNI
# NOTE: POD_CIDR is the same as in the previous step
POD_CIDR="10.244.0.0/16"
# Create namespace "kube-flannel"
kubectl create ns kube-flannel
# Assign privileged pod security for namespace "kube-flannel"
kubectl label --overwrite ns kube-flannel pod-security.kubernetes.io/enforce=privileged
# Add flannel to the helm repos
helm repo add flannel https://flannel-io.github.io/flannel/
# Install flannel via helm
helm install flannel --set podCidr="$POD_CIDR" --namespace kube-flannel flannel/flannel
# Check if the container is started:
docker ps --filter name=^.*flannel.*$
# If ok, a running flannel container will be found
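In addition to docker ps, the same can be verified from the Kubernetes side; once the CNI is up, the Flannel and CoreDNS pods should be Running and the node should turn Ready:

kubectl get pods -n kube-flannel
kubectl get pods -n kube-system -l k8s-app=kube-dns
kubectl get nodes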
Setup the Worker nodes: 1
Clone the VM for the worker node from the previous base VM.
Note:
- A worker node can be set to 2 GB RAM
- Reserve a static IP for this VM (important!)
Start the VM, login as root then:
# Reset the machine ID
rm /etc/machine-id
systemd-machine-id-setup

# Change hostname to: k8s-node-1
vi /etc/hostname

# Re-generate the openssh host keys
rm /etc/ssh/ssh_host_* -f && systemctl restart sshd

# Reboot and connect via ssh again
reboot

# Get the mac address of the physical interface
ip link show $(arp | sed -n '2p' | awk '{print $NF}') | sed -n '2p' | awk '{print $2}'
# -OR-
ip link show $(ls -l /sys/class/net/ | grep -v virtual | awk 'NR==2 {print $9}') | sed -n '2p' | awk '{print $2}'
# Reserve a static IP (in the router's DHCP) for this VM so the IP stays unchanged

# Prepare the variables:
IPADDR=$(ip route get 8.8.8.8 | head -1 | awk '{print $NF}')
NODENAME=$(hostname -s)
POD_CIDR="10.244.0.0/16"
Join the cluster
# In case you need to get the join-command, run this on the control-plane node
# (don't forget to append the --cri-socket option when joining):
kubeadm token create --print-join-command

# Join the worker node to the cluster
kubeadm join 192.168.11.44:6443 --token xxx \
  --discovery-token-ca-cert-hash sha256:xxxx \
  --cri-socket unix:///var/run/cri-dockerd.sock
Check that the node was added successfully. Run the following commands on the control-plane (master) node:
kubectl get nodes
# NAME         STATUS   ROLES           AGE     VERSION
# k8s-ctrl     Ready    control-plane   3h55m   v1.28.6
# k8s-node-1   Ready    <none>          2m45s   v1.28.6
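The <none> role shown for the worker is purely cosmetic; if preferred, a role label can be added (optional):

kubectl label node k8s-node-1 node-role.kubernetes.io/worker=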
Setup the Worker nodes: n-th
Repeat the above steps for additional nodes
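Once all nodes are joined, a throwaway deployment makes a good end-to-end smoke test (a sketch; the nginx image and the names used here are arbitrary):

# Schedule two pods and check they go Running on the worker(s)
kubectl create deployment hello --image=nginx --replicas=2
kubectl get pods -o wide
# Expose the deployment and probe it through cluster DNS
kubectl expose deployment hello --port=80
kubectl run -it --rm probe --image=busybox --restart=Never -- wget -qO- hello
# Clean up
kubectl delete service/hello deployment/hello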
Reset / Start-over
In case something goes wrong and you want to reset the cluster and start over:
# On EACH control-plane and worker node:
kubeadm reset -f --cri-socket unix:///var/run/cri-dockerd.sock
rm -rf /etc/cni/net.d

# Reset the iptables
iptables -F && iptables -X && iptables -F -t nat && iptables -X -t nat && iptables -P FORWARD ACCEPT

# Optionally
rm $HOME/.kube/config