Below is the process to set up a 3-node cluster, with 1 master and 2 slaves.
master at 192.168.0.13
slaves at 192.168.0.14 and 192.168.0.15
================ set static ip ================
Set up a static IP for each k8s node, so the IP doesn't change over time.
Edit /etc/sysconfig/network-scripts/ifcfg-xxx, where xxx is the interface name (it can differ between machines).
Change dhcp to static and set IPADDR, NETMASK and GATEWAY:
#BOOTPROTO="dhcp"
BOOTPROTO="static"
IPADDR=192.168.0.13
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
Restart the network for the change to take effect (a reboot also works).
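One way to apply it without a reboot, assuming NetworkManager on CentOS 8 (the connection name below is an example; check yours with 'nmcli connection show'):
sudo nmcli connection reload
sudo nmcli connection up ens33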
================ Install Docker ================
Add the official docker repo for docker-ce to yum repos.
sudo curl https://download.docker.com/linux/centos/docker-ce.repo > /home/ben/docker-ce.repo
sudo mv /home/ben/docker-ce.repo /etc/yum.repos.d/docker-ce.repo
Import GPG key
sudo curl https://download.docker.com/linux/centos/gpg > /home/ben/docker-key
sudo rpm --import /home/ben/docker-key
On CentOS 8, it seems containerd.io needs to be installed first, before docker-ce; otherwise the install fails.
yum install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
Install docker-ce
sudo yum install docker-ce
Check the run status:
sudo systemctl status docker
If it is disabled, enable it and start it
sudo systemctl enable docker
sudo systemctl start docker
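Optionally, a quick sanity check that docker can run containers (this pulls a tiny test image from Docker Hub):
sudo docker run hello-world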
================ close firewall, swap and selinux ================
Disable the firewall, as it tends to cause trouble within the cluster. This doesn't mean the setup is unsafe; the entry points into the cluster still need to be secured.
sudo systemctl disable firewalld
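Disabling only prevents it from starting at the next boot; to stop the currently running service as well:
sudo systemctl stop firewalld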
Disable swap, as memory swapping impacts the cluster's performance and stability.
Comment out the swap line in /etc/fstab
#
# /etc/fstab
# Created by anaconda on Wed Jul 15 01:48:12 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/cl-root / xfs defaults 0 0
UUID=a60d7d0b-2d55-4b1e-8c64-928952b3285b /boot ext4 defaults 1 2
#/dev/mapper/cl-swap swap swap defaults 0 0
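Editing fstab only takes effect at the next boot; to turn swap off immediately:
sudo swapoff -a
Verify with 'free -m'; the Swap row should show 0.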
Disable SELinux. It is a security component, but it has little awareness of namespaces or containers.
The setenforce command sets SELinux to permissive mode for the current session (effectively disabling it), and the sed command makes the change persistent across reboots.
In the sed expression, 's' means substitution and '/' is the delimiter: the first pattern is substituted with the second. Note /etc/selinux/config may not exist on some systems.
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
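To verify the current mode (it should print Permissive):
getenforce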
================ install k8s components================
Add the official Kubernetes repository as /etc/yum.repos.d/kubernetes.repo.
You may need to check the official website to make sure the content below is up to date.
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
Install k8s components
sudo yum install -y kubelet kubeadm kubectl
The kubelet is the primary "node agent" that runs on each node. It can register the node with the apiserver. It works in terms of a PodSpec which is a YAML or JSON object that describes a pod.
The kubeadm tool performs the actions necessary to get a minimum viable cluster up and running. By design, it cares only about bootstrapping, not about provisioning machines. Nice-to-have addons, like the Kubernetes Dashboard, monitoring solutions, and cloud-specific addons, are not in scope.
The kubectl command line tool lets you control Kubernetes clusters.
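For reference (not needed for the setup itself), a minimal pod manifest, i.e. the kind of object a PodSpec describes, looks roughly like:
apiVersion: v1
kind: Pod
metadata:
  name: nginx          # pod name
spec:
  containers:
  - name: nginx        # container name
    image: nginx:1.19  # image to run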
Try to start the kubelet. However, it won't be able to start... "code=exited, status=255".
sudo systemctl enable kubelet
sudo systemctl start kubelet
The kubelet is now restarting every few seconds, as it waits in a crashloop for kubeadm to tell it what to do.
Check systemd logs:
journalctl -u kubelet.service
It says "1752 server.go:199] failed to load Kubelet config file /var/lib/kubelet/config.yaml".
The config.yaml is not there. If you copy or manually create the config.yaml file, it will then complain about another config file, e.g. "unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf".
This will be solved when "kubeadm init" is run, which sets up all the config files.
================ configure cgroup manager ================
Docker and the kubelet both run as systemd services. When systemd initializes processes, it uses a root control group (cgroup) and acts as a cgroup manager. A control group (cgroup) is used to constrain the resources allocated to processes, and a cgroup manager (systemd in this case) has a view of the allocated resources. If there are two cgroup managers, there are two different views of resources, which can cause k8s nodes to become unstable under resource pressure.
So it's good practice to configure both docker and the kubelet to use the same cgroup manager.
By default the cgroup manager is 'systemd' for kubelet, and 'cgroupfs' for docker.
We can either set docker's cgroup manager to systemd, or the other way around, set kubelet's cgroup manager to cgroupfs.
The official website suggests using systemd.
To view docker's cgroup manager:
sudo docker info |grep -i cgroup
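With a default docker install, the output typically contains a line like:
 Cgroup Driver: cgroupfs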
The cgroup config for kubelet is in /var/lib/kubelet/config.yaml. But this file is not created until 'kubeadm init' is run.
To change docker's cgroup driver, update docker's daemon.json file. If it doesn't exist, create it.
/etc/docker/daemon.json
Add the below (if the file already has content, merge it in so the file stays valid JSON):
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
Then restart docker. Check cgroup again using docker info.
sudo systemctl daemon-reload
sudo systemctl restart docker
Note, this step might not be needed anymore. When using Docker, kubeadm will automatically detect the cgroup driver and set it in the /var/lib/kubelet/config.yaml file during runtime.
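After 'kubeadm init' has run (later below), one way to see which driver ended up configured (depending on the kubeadm version it may land in config.yaml or in kubeadm-flags.env):
sudo grep -i cgroup /var/lib/kubelet/config.yaml /var/lib/kubelet/kubeadm-flags.env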
================ pull k8s docker images ================
Before running 'kubeadm init', which pulls images and sets up a lot of things, you may pull the images separately to test connectivity.
This is not mandatory, as 'kubeadm init' would pull the images anyway.
sudo kubeadm config images pull
Check the images are pulled.
sudo docker image ls
Should see the below or similar.
REPOSITORY                            TAG       IMAGE ID       CREATED        SIZE
k8s.gcr.io/kube-proxy                 v1.18.6   c3d62d6fe412   5 days ago     117MB
k8s.gcr.io/kube-apiserver             v1.18.6   56acd67ea15a   5 days ago     173MB
k8s.gcr.io/kube-controller-manager    v1.18.6   ffce5e64d915   5 days ago     162MB
k8s.gcr.io/kube-scheduler             v1.18.6   0e0972b2b5d1   5 days ago     95.3MB
k8s.gcr.io/pause                      3.2       80d28bedfe5d   5 months ago   683kB
k8s.gcr.io/coredns                    1.6.7     67da37a9a360   5 months ago   43.8MB
k8s.gcr.io/etcd                       3.4.3-0   303ce5db0e90   8 months ago   288MB
When running 'kubeadm init' later, it will find out the images are already there.
================ replicate (virtual) machines ================
repeat the above process for other machines/nodes
if it is virtual machine, simply clone the image. make sure it generate new MAC addresses for all network adapters.
Make sure the master node has at least 2 cpus, and the slave nodes have at least 1 cpu.
if cloning virtual machines, also update the static ip address in /etc/sysconfig/network-scripts/ifcfg-xxx accordingly.
Set the hostname in the /etc/sysconfig/network file on each machine separately
sudo vi /etc/sysconfig/network
on master node:
HOSTNAME=k8smaster
on slave1:
HOSTNAME=k8sslave1
on slave2:
HOSTNAME=k8sslave2
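Alternatively, hostnamectl also works on CentOS 7/8 and takes effect immediately (run the matching name on each node):
sudo hostnamectl set-hostname k8smaster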
Set /etc/hosts to map the names to the IPs. The format is 'IP hostname':
sudo vi /etc/hosts
192.168.0.13 k8smaster
192.168.0.14 k8sslave1
192.168.0.15 k8sslave2
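A quick check that name resolution works (assuming the other nodes are already up):
ping -c 1 k8sslave1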
================ init master node ================
The control-plane node (i.e. master node) is the machine where the control plane components run, including etcd (the cluster database) and the API Server (which the kubectl command line tool communicates with).
'kubeadm init' first runs a series of prechecks to ensure that the machine is ready to run Kubernetes. These prechecks expose warnings and exit on errors. kubeadm init then downloads and installs the cluster control plane components. This may take several minutes. Also it installs certificates and writes config files, e.g. /var/lib/kubelet/config.yaml.
sudo kubeadm init
There are many parameter options for 'kubeadm init'; check the official documentation for details.
If successful, the message is like:
...
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.13:6443 --token vxc0bx.cjm8anog3tecx4al \
--discovery-token-ca-cert-hash sha256:d22e29b852c07e3ae42d72f6104a4b37fccf0bba2c3cbfe0b6441307a255c58f
So, as a non-root user, run the commands shown in the message.
This copies the cluster admin credentials from admin.conf to a local kubeconfig file in the home folder.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Remember to make a record of the 'kubeadm join' command in the output message. You will need that token to join nodes to the cluster.
If you forget it, 'kubeadm token list' will show the token again. The token expires in 24 hours, so you may need to run 'kubeadm token create' for a new one.
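Also, 'kubeadm token create --print-join-command' prints a complete join command, including the discovery token CA cert hash:
sudo kubeadm token create --print-join-command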
If 'kubeadm init' fails, or anything requires running 'kubeadm init' again, you must run 'kubeadm reset' first.
Note, in case kubeadm fails and you need to delete all stopped containers / preloaded images, use the following commands.
Remove all stopped docker containers. 'sudo docker container ls -a -q' returns all container ids (-q shows ids only). 'docker rm' doesn't delete running containers, so only stopped containers are removed.
sudo docker container rm $(sudo docker container ls -a -q)
or
sudo docker rm $(docker ps -a -q)
Similarly, to stop all containers
sudo docker container stop $(sudo docker container ls -a -q)
Remove all images
sudo docker image rm $(sudo docker image ls -q)
================ setup pod network ================
Now first check the pod status. Everything is Pending.
kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467f8-f6d8b            0/1     Pending   0          7m29s
kube-system   coredns-66bff467f8-qzzrh            0/1     Pending   0          7m29s
kube-system   etcd-k8smaster                      0/1     Pending   0          7m38s
kube-system   kube-apiserver-k8smaster            0/1     Pending   0          7m38s
kube-system   kube-controller-manager-k8smaster   0/1     Pending   0          7m38s
kube-system   kube-proxy-c4ssf                    0/1     Pending   0          7m29s
kube-system   kube-scheduler-k8smaster            0/1     Pending   0          7m38s
There are several pod network plugins: Calico, Cilium, Kube-router, etc.
As at 2020-07-21, Calico is the only Container Network Interface (CNI) plugin that the kubeadm project performs e2e tests against.
Install the Calico pod network. Check the official k8s website to confirm the url & version to be used.
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
Once done, wait for about a minute and check the pod status again. CoreDNS must be Running before you can proceed.
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-76d4774d89-dznlv   1/1     Running   0          2m26s
kube-system   calico-node-vw6xv                          1/1     Running   0          2m27s
kube-system   coredns-66bff467f8-f6d8b                   1/1     Running   0          11m
kube-system   coredns-66bff467f8-qzzrh                   1/1     Running   0          11m
kube-system   etcd-k8smaster                             1/1     Running   0          11m
kube-system   kube-apiserver-k8smaster                   1/1     Running   0          11m
kube-system   kube-controller-manager-k8smaster          1/1     Running   0          11m
kube-system   kube-proxy-c4ssf                           1/1     Running   0          11m
kube-system   kube-scheduler-k8smaster                   1/1     Running   0          11m
All good! And Calico is also in the list now.
================ control plane node (master) isolation ================
By default, your cluster will not schedule Pods on the control-plane node, i.e. the master node, for security reasons.
That is a bit of a waste of resources for a development setup, so let's allow pods to run on the master node.
If this is for production, keep the restriction and do not allow pods to run on the master node.
kubectl taint nodes --all node-role.kubernetes.io/master-
This will remove the node-role.kubernetes.io/master taint from any nodes that have it, including the control-plane node, meaning that the scheduler will then be able to schedule Pods everywhere.
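To confirm the taint is gone on the master (assuming no other taints, it should show 'Taints: <none>'):
kubectl describe node k8smaster | grep -i taints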
================ join slave nodes to the cluster ================
Go to the slave nodes, and run the command from master node's 'kubeadm init' output.
sudo kubeadm join 192.168.0.13:6443 --token vxc0bx.cjm8anog3tecx4al \
--discovery-token-ca-cert-hash sha256:d22e29b852c07e3ae42d72f6104a4b37fccf0bba2c3cbfe0b6441307a255c58f
If successful, the message is like:
...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Run 'kubectl get nodes' from the master. You should see the slave nodes in Ready status.
kubectl get nodes
NAME        STATUS   ROLES    AGE     VERSION
k8smaster   Ready    master   4h46m   v1.18.6
k8sslave1   Ready    <none>   4m50s   v1.18.6
k8sslave2   Ready    <none>   3m53s   v1.18.6
Finally, check the pod status again. It should be all good, with more calico-node and kube-proxy pods in the list.
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-76d4774d89-dznlv   1/1     Running   0          4h38m
kube-system   calico-node-7ss9l                          1/1     Running   0          5m34s
kube-system   calico-node-8rkkz                          1/1     Running   0          6m32s
kube-system   calico-node-vw6xv                          1/1     Running   0          4h38m
kube-system   coredns-66bff467f8-f6d8b                   1/1     Running   0          4h48m
kube-system   coredns-66bff467f8-qzzrh                   1/1     Running   0          4h48m
kube-system   etcd-k8smaster                             1/1     Running   0          4h48m
kube-system   kube-apiserver-k8smaster                   1/1     Running   0          4h48m
kube-system   kube-controller-manager-k8smaster          1/1     Running   0          4h48m
kube-system   kube-proxy-c4ssf                           1/1     Running   0          4h48m
kube-system   kube-proxy-f9c9f                           1/1     Running   0          6m32s
kube-system   kube-proxy-gt6jb                           1/1     Running   0          5m34s
kube-system   kube-scheduler-k8smaster                   1/1     Running   0          4h48m
================ tear down a cluster and start over again ================
Drain each node, slaves first and then the master
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
Run reset on the master; it seems it needs to be run on the slaves too. The reset unlinks and removes config files, stops containers, releases ports, etc. Otherwise you won't be able to run init again.
sudo kubeadm reset
Delete each node, slaves first and then the master
kubectl delete node <node name>
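Note that 'kubeadm reset' does not clean up CNI configuration or iptables rules. If you want a fully clean slate, the official teardown docs suggest clearing them manually (adjust to your setup):
sudo rm -rf /etc/cni/net.d
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X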
Take a break and have a tea.
Next one is setting up dashboard.