Deploy Kubernetes cluster via kubeadm

Overview

This walkthrough installs Docker and the kubeadm toolchain on Ubuntu, bootstraps a single-node control plane with kubeadm, deploys Calico as the pod network, then resets the node and re-deploys the same host with RKE.

1. Install kubeadm, kubelet, kubectl, and Docker

https://docs.docker.com/engine/install/ubuntu/

Set up docker apt repository:

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

Install the latest version of Docker:

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
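
The kubeadm configuration used later in this guide sets the kubelet cgroup driver to systemd, so Docker should use the same cgroup driver to avoid a kubelet/runtime mismatch. A minimal sketch, assuming /etc/docker/daemon.json does not exist yet (merge by hand if it does):

# Make Docker use the systemd cgroup driver (matches the kubelet setting below)
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
sudo systemctl restart docker
# Verify: should print "systemd"
sudo docker info --format '{{.CgroupDriver}}'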

Download the public signing key for the Kubernetes package repositories:

# If the directory `/etc/apt/keyrings` does not exist, create it before running curl:
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

Add the Kubernetes apt repository:

# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

Install kubeadm, kubelet, and kubectl. Note that the repository above tracks the v1.30 release line, while the kubeadm.yaml below targets Kubernetes v1.22.4 and the dockershim CRI socket (removed in v1.24); kubeadm can only deploy its own minor version or one minor version older, so install a kubeadm/kubelet that matches the Kubernetes version you actually intend to run:

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl

# Enable kubelet service
sudo systemctl enable --now kubelet
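
To confirm what was installed and pinned (the held packages should match the Kubernetes version you intend to deploy):

kubeadm version -o short
kubectl version --client
kubelet --version
apt-mark showhold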

Check kubelet status:

systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Wed 2024-09-04 23:35:20 CST; 4s ago
       Docs: https://kubernetes.io/docs/home/
   Main PID: 51001 (kubelet)
      Tasks: 7 (limit: 4613)
     Memory: 71.2M
     CGroup: /system.slice/kubelet.service
             └─51001 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml

Sep 04 23:35:20 VM-24-14-ubuntu systemd[1]: Started kubelet: The Kubernetes Node Agent.
Sep 04 23:35:26 VM-24-14-ubuntu kubelet[51001]: E0904 23:35:26.374002   51001 server.go:206] "Failed to load kubelet config file" err="failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/ku>
Sep 04 23:35:26 VM-24-14-ubuntu systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Sep 04 23:35:26 VM-24-14-ubuntu systemd[1]: kubelet.service: Failed with result 'exit-code'.

Because there is no Kubernetes cluster yet, the kubelet has no /var/lib/kubelet/config.yaml to load and keeps restarting. This is expected until kubeadm init runs.
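
You can confirm that the failure is just the missing config file (which kubeadm init writes later) by tailing the kubelet logs:

# The crash loop stops once `kubeadm init` generates /var/lib/kubelet/config.yaml
sudo journalctl -u kubelet --no-pager -n 20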

2. Deploy Master Node

Create the kubeadm configuration file kubeadm.yaml:

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  criSocket: "/var/run/dockershim.sock"
  kubeletExtraArgs:
    cgroup-driver: "systemd"
  ignorePreflightErrors:
    - IsPrivilegedUser

---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: "v1.22.4"
apiServer:
  extraArgs:
    runtime-config: "api/all=true"
controllerManager:
  extraArgs:
    "node-cidr-mask-size": "24"  # 24 is more common for Kubernetes networks, adjusting from 20
imageRepository: "registry.cn-hangzhou.aliyuncs.com/google_containers"
clusterName: "example-cluster"
networking:
  # Specify the pod network CIDR. For Calico, the default is 192.168.0.0/16.
  podSubnet: "192.168.0.0/16"
  # You can also specify the service subnet if needed, but it's optional.
  serviceSubnet: "10.96.0.0/12"
  dnsDomain: "cluster.local"
  # NOTE: dnsIP is not a valid v1beta3 networking field; kubeadm warns about it and ignores it (see the init output below).
  dnsIP: "10.96.0.10"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Adjust the kubelet cgroup driver to systemd for better compatibility with the control plane
cgroupDriver: "systemd"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options
mode: "ipvs"  # Optionally switch to IPVS for better performance in larger clusters
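
Optionally, pre-pull the control-plane images before running init; kubeadm reads the same config file, and the preflight output below also suggests this step:

sudo kubeadm config images pull --config kubeadm.yaml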

Initialize the cluster:

sudo kubeadm init --config kubeadm.yaml

What kubeadm init does, step by step:

  1. Pre-flight checks: The process begins by running several pre-flight checks, such as verifying system compatibility and pulling the required container images for the Kubernetes control plane. This ensures that the environment is ready to initialize a cluster.

  2. Certificate generation: Several certificates are generated for secure communication between the various Kubernetes components. For example:

    • The ca certificate is created to serve as the root certificate authority.
    • The apiserver certificate is generated and signed, allowing the API server to serve secure traffic for both DNS names (like kubernetes.default) and specific IPs (10.96.0.1 and 10.0.24.14).
    • Certificates for etcd are generated (etcd/server, etcd/peer) for secure communication within etcd, including peer communication and client access.
  3. Kubeconfig creation: The configuration files (kubeconfig) for various components are written into /etc/kubernetes. These files contain the necessary credentials and context for the components like admin, controller-manager, and scheduler to interact securely with the cluster.

  4. Kubelet setup: The kubelet configuration is written into /var/lib/kubelet/config.yaml, and the kubelet service is started. The kubelet is responsible for managing the lifecycle of pods and communicating with the control plane.

  5. Control plane pod manifests: Static pod manifests are created for core control plane components (API server, controller manager, scheduler, and etcd) in /etc/kubernetes/manifests. These files define how these core components should run as static pods on the node.

  6. Waiting for control plane: The kubelet boots the control plane components as static pods, and kubeadm waits for the API server, scheduler, and controller-manager to become healthy. This takes about 8.5 seconds in this case (see the [apiclient] line in the output below).

  7. Cluster configuration: Kubeadm then uploads the configuration into the cluster, such as storing the kubeadm-config in a ConfigMap in the kube-system namespace. It also configures kubelet using another ConfigMap (kubelet-config-1.22) for the cluster-wide configuration of kubelets.

  8. Marking control plane node: The node (vm-24-14-ubuntu) is labeled and tainted as a control-plane node. This includes marking it with labels like node-role.kubernetes.io/control-plane and applying a taint (NoSchedule), ensuring that this node will not schedule regular workloads.

  9. Bootstrap token setup: A bootstrap token is generated and configured to allow worker nodes to join the cluster. This token is used in the kubeadm join command that is provided later.

  10. RBAC setup for bootstrap: Kubernetes configures RBAC rules for the bootstrap token, enabling it to allow new nodes to authenticate and request certificate signing requests (CSRs) for joining the cluster. It also configures certificate rotation for all nodes.

  11. Core add-ons: Essential add-ons like CoreDNS (for service discovery within the cluster) and kube-proxy (for managing network rules) are applied to the cluster.

  12. Final instructions: After the control plane is successfully initialized, the user is instructed to configure kubectl by copying the admin.conf file and setting up the pod network. Additionally, instructions are provided for adding worker nodes to the cluster using the kubeadm join command, which includes the control plane IP, bootstrap token, and certificate hash.
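
You can see the artifacts these phases leave on disk (kubeconfig files, static pod manifests, and the certificate directory) by listing the paths mentioned above:

sudo ls /etc/kubernetes            # admin.conf, kubelet.conf, controller-manager.conf, scheduler.conf
sudo ls /etc/kubernetes/manifests  # kube-apiserver, kube-controller-manager, kube-scheduler, etcd static pods
sudo ls /etc/kubernetes/pki        # ca, apiserver, etcd, front-proxy certificates
sudo ls /var/lib/kubelet/config.yaml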

If the environment is set up correctly, you should see output like the following (the warning on the first line is kubeadm rejecting the invalid dnsIP field noted above; it is ignored):

W0905 02:46:58.306339  199271 strict.go:55] error unmarshaling configuration schema.GroupVersionKind{Group:"kubeadm.k8s.io", Version:"v1beta3", Kind:"ClusterConfiguration"}: error unmarshaling JSON: while decoding JSON: json: unknown field "dnsIP"
[init] Using Kubernetes version: v1.22.4
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local vm-24-14-ubuntu] and IPs [10.96.0.1 10.0.24.14]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost vm-24-14-ubuntu] and IPs [10.0.24.14 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost vm-24-14-ubuntu] and IPs [10.0.24.14 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 8.503051 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node vm-24-14-ubuntu as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node vm-24-14-ubuntu as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: 2qah04.0eb5es8dlcv2ouvj
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 10.0.24.14:6443 --token 2qah04.0eb5es8dlcv2ouvj \
  --discovery-token-ca-cert-hash sha256:4f83a591d2e11e7a18fc7402b86e2618c74dd9267abd4894ba72dc4918b1c8db
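
Before moving on, copy the admin kubeconfig as the output above instructs and verify that the API server answers. The join token printed above expires after 24 hours by default, but a fresh join command can be generated at any time:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# The node shows NotReady until a pod network is installed (next section)
kubectl get nodes
kubectl get pods -n kube-system

# Print a new join command for additional worker nodes if the token has expired
kubeadm token create --print-join-command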

3. Deploy Worker Node

kubectl get nodes
NAME              STATUS     ROLES                  AGE   VERSION
vm-24-14-ubuntu   NotReady   control-plane,master   61m   v1.22.4

kubectl describe nodes
...
  Ready            False   Thu, 05 Sep 2024 01:40:02 +0800   Thu, 05 Sep 2024 00:39:31 +0800   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
...

The node is NotReady because no pod network (CNI plugin) has been deployed yet. Deploy Calico, via the Tigera operator, as the pod network:

kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.1/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.28.1/manifests/custom-resources.yaml
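
The Tigera operator creates the tigera-operator and calico-system namespaces and brings up the Calico pods; this can take a minute or two. A quick way to watch the rollout:

kubectl get pods -n tigera-operator
kubectl get pods -n calico-system --watch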

After a while, the node becomes Ready:

kubectl get nodes
NAME              STATUS   ROLES                  AGE     VERSION
vm-24-14-ubuntu   Ready    control-plane,master   2m11s   v1.22.4

Calico's default network walkthrough

  • Calico is running in VXLAN CrossSubnet mode, meaning VXLAN tunneling is only used when nodes are in different subnets.
  • BGP is used for intra-subnet communication, where nodes are within the same subnet, and VXLAN is bypassed.
  • Pod-to-external traffic (and Pod-to-other-node traffic) leaves through the node's physical interface; traffic bound for destinations outside the cluster is NATed to the node IP (natOutgoing). The scenarios below summarize the common paths.
Scenario breakdown (traffic path, route, interface used, and explanation):

• Pod A to Pod B (same node)
  Path: Pod A -> veth pair -> node network namespace -> veth pair -> Pod B
  Route: none needed; Interface: veth interfaces (caliX)
  Pod-to-Pod traffic within the same node is handled entirely inside the node's network namespace using veth pairs; it does not go through physical or tunnel routing.

• Pod A to Pod B (different node, same subnet)
  Path: Pod A -> Node A (BGP routing) -> Node B -> Pod B
  Route: 192.168.x.0/24 via 10.0.0.3 dev eth0; Interface: eth0 (physical interface)
  Nodes in the same subnet exchange routes over BGP, and the traffic is routed directly through the physical network (eth0) without any VXLAN encapsulation.

• Pod A to Pod B (different node, different subnet)
  Path: Pod A -> VXLAN encapsulation (Node A) -> network -> VXLAN decapsulation (Node B) -> Pod B
  Route: 192.168.y.0/24 via 10.0.0.3 dev vxlan.calico; Interface: vxlan.calico
  When nodes are in different subnets, Calico encapsulates the traffic in VXLAN; the route points at vxlan.calico, and traffic is encapsulated on Node A and decapsulated on Node B.

• Pod A to Node A
  Path: Pod A -> Node A (handled locally, no encapsulation)
  Route: none needed; Interface: eth0 or caliX
  Traffic from a Pod to the node it runs on is handled locally and does not traverse the physical network or any other routing.

• Pod A to Node B (same subnet)
  Path: Pod A -> Node A (BGP routing) -> Node B
  Route: 10.0.0.3 dev eth0; Interface: eth0 (physical interface)
  Traffic from a Pod on Node A to Node B in the same subnet follows BGP routes and flows directly through the physical interface (eth0), without encapsulation.

• Pod A to external network (e.g., 8.8.8.8)
  Path: Pod A -> Node A (NAT) -> Node A external interface -> internet
  Route: default via 10.0.24.1 dev eth0 src 10.0.24.14; Interface: eth0 (NAT-enabled)
  Traffic from Pod A to the internet is routed via Node A's external interface after NAT (the Pod IP is replaced with the Node IP), using the node's default route to reach external destinations.
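
The routes in these scenarios can be observed directly on a node: Calico's BGP-learned routes are tagged with proto bird, and VXLAN routes point at the vxlan.calico device (example commands; the exact output depends on your topology):

# Per-node pod CIDR blocks learned over BGP are tagged "proto bird"
ip route show proto bird
# Routes that use VXLAN encapsulation point at the vxlan.calico device
ip route | grep vxlan.calico
# Details of the VXLAN device itself (VNI, destination port)
ip -d link show vxlan.calico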

4. Reset the kubeadm cluster and re-deploy via RKE

Reset the cluster with sudo kubeadm reset:

[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
The 'update-cluster-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
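
As the output notes, a few things have to be cleaned up by hand. A sketch of the follow-up cleanup (destructive: it flushes this host's iptables rules and removes your local kubeconfig, so adjust to taste):

# CNI configuration left behind by Calico
sudo rm -rf /etc/cni/net.d
# Flush iptables rules created by kube-proxy and Calico
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
# Clear IPVS tables (kube-proxy ran in ipvs mode in this walkthrough)
sudo ipvsadm --clear
# Remove the old admin kubeconfig
rm -f $HOME/.kube/config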

Install a Kubernetes cluster via RKE: https://rke.docs.rancher.com/installation

RKE (Rancher Kubernetes Engine) is an open-source, lightweight Kubernetes installer developed by Rancher. It allows users to quickly and easily deploy and manage Kubernetes clusters in any environment, including bare-metal servers, cloud providers, and virtual machines.

wget https://github.com/rancher/rke/releases/download/v1.5.12/rke_linux-amd64
mv rke_linux-amd64 rke
chmod +x rke
./rke --version

rke version v1.5.12

Create a cluster configuration file cluster.yml using the interactive method:

./rke config --name cluster.yml

[+] Cluster Level SSH Private Key Path [~/.ssh/id_rsa]:
[+] Number of Hosts [1]:
[+] SSH Address of host (1) [none]: 10.0.24.14
[+] SSH Port of host (1) [22]:
[+] SSH Private Key Path of host (10.0.24.14) [none]: ~/.ssh/id_rsa
[+] SSH User of host (10.0.24.14) [ubuntu]:
[+] Is host (10.0.24.14) a Control Plane host (y/n)? [y]: y
[+] Is host (10.0.24.14) a Worker host (y/n)? [n]: y
[+] Is host (10.0.24.14) an etcd host (y/n)? [n]: y
[+] Override Hostname of host (10.0.24.14) [none]:
[+] Internal IP of host (10.0.24.14) [none]:
[+] Docker socket path on host (10.0.24.14) [/var/run/docker.sock]:
[+] Network Plugin Type (flannel, calico, weave, canal, aci) [canal]:
[+] Authentication Strategy [x509]:
[+] Authorization Mode (rbac, none) [rbac]:
[+] Kubernetes Docker image [rancher/hyperkube:v1.28.12-rancher1]:
[+] Cluster domain [cluster.local]:
[+] Service Cluster IP Range [10.43.0.0/16]:
[+] Enable PodSecurityPolicy [n]:
[+] Cluster Network CIDR [10.42.0.0/16]:
[+] Cluster DNS Service IP [10.43.0.10]:
[+] Add addon manifest URLs or YAML files [no]:
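
The answers above end up in cluster.yml. Equivalently, you could skip the prompts and write a minimal file by hand; a sketch matching those answers (field names follow RKE's cluster.yml schema, everything else falls back to defaults):

cat > cluster.yml <<'EOF'
nodes:
  - address: 10.0.24.14
    user: ubuntu
    role: [controlplane, worker, etcd]
    ssh_key_path: ~/.ssh/id_rsa
network:
  plugin: canal
EOF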

Start the cluster and check its status:

./rke up

kubectl get nodes
NAME         STATUS   ROLES                      AGE     VERSION
10.0.24.14   Ready    controlplane,etcd,worker   5m43s   v1.28.12

kubectl get pods -A
NAMESPACE       NAME                                      READY   STATUS      RESTARTS   AGE
ingress-nginx   ingress-nginx-admission-create-vbzdm      0/1     Completed   0          5m4s
ingress-nginx   ingress-nginx-admission-patch-x7shj       0/1     Completed   1          5m4s
ingress-nginx   nginx-ingress-controller-pk94t            1/1     Running     0          5m4s
kube-system     calico-kube-controllers-5b564d9b7-5lbrn   1/1     Running     0          5m34s
kube-system     canal-lxk8w                               2/2     Running     0          5m34s
kube-system     coredns-54cc789d79-mrbpb                  1/1     Running     0          5m23s
kube-system     coredns-autoscaler-6ff6bf758-hrxmh        1/1     Running     0          5m23s
kube-system     metrics-server-657c74b5d8-jjxzd           1/1     Running     0          5m14s
kube-system     rke-coredns-addon-deploy-job-rmw2r        0/1     Completed   0          5m27s
kube-system     rke-ingress-controller-deploy-job-2z4d6   0/1     Completed   0          5m7s
kube-system     rke-metrics-addon-deploy-job-9bs7c        0/1     Completed   0          5m17s
kube-system     rke-network-plugin-deploy-job-2pzvs       0/1     Completed   0          5m37s
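
Note that ./rke up writes the admin kubeconfig as kube_config_cluster.yml next to cluster.yml; the kubectl commands above assume it has been exported:

export KUBECONFIG=$(pwd)/kube_config_cluster.yml
kubectl get nodes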

Note that canal's Calico IP pool here has vxlanMode: Never (and ipipMode: Never). This is the most significant change from the earlier setup: cross-subnet traffic now uses direct routing with no VXLAN encapsulation, so the physical network interface (eth0) carries that traffic just as it does for traffic between nodes in the same subnet.

kubectl get ippools.crd.projectcalico.org default-ipv4-ippool -o yaml

apiVersion: crd.projectcalico.org/v1
kind: IPPool
metadata:
  annotations:
    projectcalico.org/metadata: '{"uid":"d5446eb8-75fb-4979-9a7f-c41818d7d80a","creationTimestamp":"2024-09-05T15:32:38Z"}'
  creationTimestamp: "2024-09-05T15:32:38Z"
  generation: 1
  name: default-ipv4-ippool
  resourceVersion: "575"
  uid: a9945c61-3c31-42a4-a8d6-b835da17b59b
spec:
  allowedUses:
  - Workload
  - Tunnel
  blockSize: 26
  cidr: 172.16.0.0/16
  ipipMode: Never
  natOutgoing: true
  nodeSelector: all()
  vxlanMode: Never
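
To confirm on the node that no encapsulation is in use with this pool, look at the routes canal/Calico programs; with both ipipMode and vxlanMode set to Never there should be no tunl0 or vxlan.calico routes, only direct routes via eth0 and the caliX interfaces (on a single-node cluster the BGP route list may be empty):

# BGP-learned pod routes are tagged "proto bird"
ip route show proto bird
# With encapsulation disabled, this should print nothing
ip route | grep -E 'tunl0|vxlan.calico' || echo "no tunnel routes"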