Below is the process to set up a 3-node cluster, with 1 master and 2 slaves.
master at 192.168.0.13
slaves at 192.168.0.14 and 192.168.0.15
================ set static ip ================
Set up a static IP for each k8s node, so the IP doesn't change over time.
Edit /etc/sysconfig/network-scripts/ifcfg-xxx, where xxx is the interface name (it can differ between machines).
Change dhcp to static and set IPADDR, NETMASK and GATEWAY:
#BOOTPROTO="dhcp"
BOOTPROTO="static"
IPADDR=192.168.0.13
NETMASK=255.255.255.0
GATEWAY=192.168.0.1
Restart the network for the change to take effect (a reboot also works).
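One way to apply it without a reboot, assuming NetworkManager on CentOS 8 (the connection name below is an example; check yours with 'nmcli connection show'):
sudo nmcli connection reload
sudo nmcli connection up ens33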
================ Install Docker ================
Add the official docker repo for docker-ce to yum repos.
sudo curl https://download.docker.com/linux/centos/docker-ce.repo > /home/ben/docker-ce.repo
sudo mv /home/ben/docker-ce.repo /etc/yum.repos.d/docker-ce.repo
Import GPG key
sudo curl https://download.docker.com/linux/centos/gpg > /home/ben/docker-key
sudo rpm --import /home/ben/docker-key
On CentOS 8, it seems containerd.io needs to be installed first, before docker-ce; otherwise the install fails.
yum install -y https://download.docker.com/linux/centos/7/x86_64/stable/Packages/containerd.io-1.2.6-3.3.el7.x86_64.rpm
Install docker-ce
sudo yum install docker-ce
Check the run status:
sudo systemctl status docker
If it is disabled, enable it and start it
sudo systemctl enable docker
sudo systemctl start docker
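Optionally, a quick sanity check that docker can run containers (this pulls a tiny test image from Docker Hub):
sudo docker run hello-world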
================ close firewall, swap and selinux ================
Disable the firewall, as it tends to cause trouble within the cluster. This doesn't mean the setup is unsafe; the entry points into the cluster still need to be secured.
sudo systemctl disable firewalld
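Disabling only prevents it from starting at the next boot; to stop the currently running service as well:
sudo systemctl stop firewalld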
Disable swap, as memory swapping impacts the cluster's performance and stability.
Comment out the swap line in /etc/fstab
#
# /etc/fstab
# Created by anaconda on Wed Jul 15 01:48:12 2020
#
# Accessible filesystems, by reference, are maintained under '/dev/disk/'.
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info.
#
# After editing this file, run 'systemctl daemon-reload' to update systemd
# units generated from this file.
#
/dev/mapper/cl-root / xfs defaults 0 0
UUID=a60d7d0b-2d55-4b1e-8c64-928952b3285b /boot ext4 defaults 1 2
#/dev/mapper/cl-swap swap swap defaults 0 0
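Editing fstab only takes effect at the next boot; to turn swap off immediately:
sudo swapoff -a
Verify with 'free -m'; the Swap row should show 0.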
Disable SELinux. It is a security component, but it has little awareness of namespaces or containers.
The setenforce command sets SELinux to permissive mode for the current session (effectively disabling it), and the sed command makes the change persistent across reboots.
In the sed expression, 's' means substitution and '/' is the delimiter: the first pattern is substituted with the second. Note /etc/selinux/config may not exist on some systems.
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
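To verify the current mode (it should print Permissive):
getenforce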
================ install k8s components================
Add the official Kubernetes repository as /etc/yum.repos.d/kubernetes.repo.
You may need to check the official website to make sure the content below is up to date.
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
Install k8s components
sudo yum install -y kubelet kubeadm kubectl
The kubelet is the primary "node agent" that runs on each node. It can register the node with the apiserver. It works in terms of a PodSpec which is a YAML or JSON object that describes a pod.
The kubeadm tool performs the actions necessary to get a minimum viable cluster up and running. By design, it cares only about bootstrapping, not about provisioning machines. Nice-to-have addons, like the Kubernetes Dashboard, monitoring solutions, and cloud-specific addons, are not in scope.
The kubectl command line tool lets you control Kubernetes clusters.
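For reference (not needed for the setup itself), a minimal pod manifest, i.e. the kind of object a PodSpec describes, looks roughly like:
apiVersion: v1
kind: Pod
metadata:
  name: nginx          # pod name
spec:
  containers:
  - name: nginx        # container name
    image: nginx:1.19  # image to run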
Try to start the kubelet. However, it won't be able to start... "code=exited, status=255".
sudo systemctl enable kubelet
sudo systemctl start kubelet
The kubelet is now restarting every few seconds, as it waits in a crashloop for kubeadm to tell it what to do.
Check systemd logs:
journalctl -u kubelet.service
It says "1752 server.go:199] failed to load Kubelet config file /var/lib/kubelet/config.yaml".
The config.yaml is not there. If you copy or manually create the config.yaml file, it will then complain about another config file, e.g. "unable to load bootstrap kubeconfig: stat /etc/kubernetes/bootstrap-kubelet.conf".
This will be solved when "kubeadm init" is run, which sets up all the config files.
================ configure cgroup manager ================
Docker and the kubelet both run as systemd services. When systemd initializes processes, it uses a root control group (cgroup) and acts as a cgroup manager. A control group (cgroup) is used to constrain the resources allocated to processes, and a cgroup manager (systemd in this case) has a view of the allocated resources. If there are two cgroup managers, there are two different views of resources, which can cause k8s nodes to become unstable under resource pressure.
So it's good practice to configure both docker and the kubelet to use the same cgroup manager.
By default the cgroup manager is 'systemd' for kubelet, and 'cgroupfs' for docker.
We can either set docker's cgroup manager to systemd, or the other way around, set kubelet's cgroup manager to cgroupfs.
The official website suggests using systemd.
To view docker's cgroup manager:
sudo docker info |grep -i cgroup
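With a default docker install, the output typically contains a line like:
 Cgroup Driver: cgroupfs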
The cgroup config for kubelet is in /var/lib/kubelet/config.yaml. But this file is not created until 'kubeadm init' is run.
To change docker's cgroup driver, update docker's daemon.json file. If it doesn't exist, create it.
/etc/docker/daemon.json
Add the below (if the file already has content, merge it in so the file stays valid JSON):
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
Then restart docker. Check cgroup again using docker info.
sudo systemctl daemon-reload
sudo systemctl restart docker
Note, this step might not be needed anymore. When using Docker, kubeadm will automatically detect the cgroup driver and set it in the /var/lib/kubelet/config.yaml file during runtime.
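After 'kubeadm init' has run (later below), one way to see which driver ended up configured (depending on the kubeadm version it may land in config.yaml or in kubeadm-flags.env):
sudo grep -i cgroup /var/lib/kubelet/config.yaml /var/lib/kubelet/kubeadm-flags.env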
================ pull k8s docker images ================
Before running 'kubeadm init', which pulls images and sets up a lot of things, you may pull the images separately to test connectivity.
This is not mandatory, as 'kubeadm init' would pull the images anyway.
sudo kubeadm config images pull
Check the images are pulled.
sudo docker image ls
Should see the below or similar.
REPOSITORY                            TAG       IMAGE ID       CREATED        SIZE
k8s.gcr.io/kube-proxy                 v1.18.6   c3d62d6fe412   5 days ago     117MB
k8s.gcr.io/kube-apiserver             v1.18.6   56acd67ea15a   5 days ago     173MB
k8s.gcr.io/kube-controller-manager    v1.18.6   ffce5e64d915   5 days ago     162MB
k8s.gcr.io/kube-scheduler             v1.18.6   0e0972b2b5d1   5 days ago     95.3MB
k8s.gcr.io/pause                      3.2       80d28bedfe5d   5 months ago   683kB
k8s.gcr.io/coredns                    1.6.7     67da37a9a360   5 months ago   43.8MB
k8s.gcr.io/etcd                       3.4.3-0   303ce5db0e90   8 months ago   288MB
When running 'kubeadm init' later, it will find out the images are already there.
================ replicate (virtual) machines ================
repeat the above process for other machines/nodes
if it is virtual machine, simply clone the image. make sure it generate new MAC addresses for all network adapters.
Make sure the master node has at least 2 cpus, and the slave nodes have at least 1 cpu.
if cloning virtual machines, also update the static ip address in /etc/sysconfig/network-scripts/ifcfg-xxx accordingly.
Set the hostname in the /etc/sysconfig/network file on each machine separately
sudo vi /etc/sysconfig/network
on master node:
HOSTNAME=k8smaster
on slave1:
HOSTNAME=k8sslave1
on slave2:
HOSTNAME=k8sslave2
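Alternatively, hostnamectl also works on CentOS 7/8 and takes effect immediately (run the matching name on each node):
sudo hostnamectl set-hostname k8smaster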
Set /etc/hosts to map the names to the IPs. The format is 'IP hostname':
sudo vi /etc/hosts
192.168.0.13 k8smaster
192.168.0.14 k8sslave1
192.168.0.15 k8sslave2
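A quick check that name resolution works (assuming the other nodes are already up):
ping -c 1 k8sslave1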
================ init master node ================
The control-plane node (i.e. master node) is the machine where the control plane components run, including etcd (the cluster database) and the API Server (which the kubectl command line tool communicates with).
'kubeadm init' first runs a series of prechecks to ensure that the machine is ready to run Kubernetes. These prechecks expose warnings and exit on errors. kubeadm init then downloads and installs the cluster control plane components. This may take several minutes. Also it installs certificates and writes config files, e.g. /var/lib/kubelet/config.yaml.
sudo kubeadm init
There are many parameter options for 'kubeadm init'; check the official documentation for details.
If successful, the message is like:
...
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.0.13:6443 --token vxc0bx.cjm8anog3tecx4al \
--discovery-token-ca-cert-hash sha256:d22e29b852c07e3ae42d72f6104a4b37fccf0bba2c3cbfe0b6441307a255c58f
So, as a non-root user, run the commands shown in the message.
This copies the cluster admin credentials from admin.conf to a local kubeconfig file in the home folder.
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Remember to make a record of the 'kubeadm join' command in the output message. You will need that token to join nodes to the cluster.
If you forget it, 'kubeadm token list' will show the token again. The token expires in 24 hours, so you may need to run 'kubeadm token create' for a new one.
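Also, 'kubeadm token create --print-join-command' prints a complete join command, including the discovery token CA cert hash:
sudo kubeadm token create --print-join-command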
If 'kubeadm init' fails, or anything requires running 'kubeadm init' again, you must run 'kubeadm reset' first.
Note, in case kubeadm fails and you need to delete all stopped containers / preloaded images, use the following commands.
Remove all stopped docker containers. 'sudo docker container ls -a -q' returns all container ids (-q shows ids only). 'docker rm' doesn't delete running containers, so only stopped containers are removed.
sudo docker container rm $(sudo docker container ls -a -q)
or
sudo docker rm $(docker ps -a -q)
Similarly, to stop all containers
sudo docker container stop $(sudo docker container ls -a -q)
Remove all images
sudo docker image rm $(sudo docker image ls -q)
================ setup pod network ================
Now first check the pod status. Everything is Pending.
kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY   STATUS    RESTARTS   AGE
kube-system   coredns-66bff467f8-f6d8b            0/1     Pending   0          7m29s
kube-system   coredns-66bff467f8-qzzrh            0/1     Pending   0          7m29s
kube-system   etcd-k8smaster                      0/1     Pending   0          7m38s
kube-system   kube-apiserver-k8smaster            0/1     Pending   0          7m38s
kube-system   kube-controller-manager-k8smaster   0/1     Pending   0          7m38s
kube-system   kube-proxy-c4ssf                    0/1     Pending   0          7m29s
kube-system   kube-scheduler-k8smaster            0/1     Pending   0          7m38s
There are several pod network plugins: Calico, Cilium, Kube-router, etc.
As at 2020-07-21, Calico is the only Container Network Interface (CNI) plugin that the kubeadm project performs e2e tests against.
Install the Calico pod network. Check the official k8s website to confirm the url & version to be used.
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
Once done, wait for about a minute and check the pod status again. CoreDNS must be Running before you can proceed.
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-76d4774d89-dznlv   1/1     Running   0          2m26s
kube-system   calico-node-vw6xv                          1/1     Running   0          2m27s
kube-system   coredns-66bff467f8-f6d8b                   1/1     Running   0          11m
kube-system   coredns-66bff467f8-qzzrh                   1/1     Running   0          11m
kube-system   etcd-k8smaster                             1/1     Running   0          11m
kube-system   kube-apiserver-k8smaster                   1/1     Running   0          11m
kube-system   kube-controller-manager-k8smaster          1/1     Running   0          11m
kube-system   kube-proxy-c4ssf                           1/1     Running   0          11m
kube-system   kube-scheduler-k8smaster                   1/1     Running   0          11m
All good! And Calico is also in the list now.
================ control plane node (master) isolation ================
By default, your cluster will not schedule Pods on the control-plane node, i.e. the master node, for security reasons.
That is a bit of a waste of resources for a development setup, so let's allow pods to run on the master node.
If this is for production, keep the restriction and do not allow pods to run on the master node.
kubectl taint nodes --all node-role.kubernetes.io/master-
This will remove the node-role.kubernetes.io/master taint from any nodes that have it, including the control-plane node, meaning that the scheduler will then be able to schedule Pods everywhere.
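To confirm the taint is gone on the master (assuming no other taints, it should show 'Taints: <none>'):
kubectl describe node k8smaster | grep -i taints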
================ join slave nodes to the cluster ================
Go to the slave nodes, and run the command from master node's 'kubeadm init' output.
sudo kubeadm join 192.168.0.13:6443 --token vxc0bx.cjm8anog3tecx4al \
--discovery-token-ca-cert-hash sha256:d22e29b852c07e3ae42d72f6104a4b37fccf0bba2c3cbfe0b6441307a255c58f
If successful, the message is like:
...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Run 'kubectl get nodes' from the master. You should see the slave nodes in Ready status.
kubectl get nodes
NAME        STATUS   ROLES    AGE     VERSION
k8smaster   Ready    master   4h46m   v1.18.6
k8sslave1   Ready    <none>   4m50s   v1.18.6
k8sslave2   Ready    <none>   3m53s   v1.18.6
Finally, check the pod status again. It should be all good, with more calico-node and kube-proxy pods in the list.
kubectl get pods --all-namespaces
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-76d4774d89-dznlv   1/1     Running   0          4h38m
kube-system   calico-node-7ss9l                          1/1     Running   0          5m34s
kube-system   calico-node-8rkkz                          1/1     Running   0          6m32s
kube-system   calico-node-vw6xv                          1/1     Running   0          4h38m
kube-system   coredns-66bff467f8-f6d8b                   1/1     Running   0          4h48m
kube-system   coredns-66bff467f8-qzzrh                   1/1     Running   0          4h48m
kube-system   etcd-k8smaster                             1/1     Running   0          4h48m
kube-system   kube-apiserver-k8smaster                   1/1     Running   0          4h48m
kube-system   kube-controller-manager-k8smaster          1/1     Running   0          4h48m
kube-system   kube-proxy-c4ssf                           1/1     Running   0          4h48m
kube-system   kube-proxy-f9c9f                           1/1     Running   0          6m32s
kube-system   kube-proxy-gt6jb                           1/1     Running   0          5m34s
kube-system   kube-scheduler-k8smaster                   1/1     Running   0          4h48m
================ tear down a cluster and start over again ================
Drain each node, slaves first and then the master
kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
Run reset on the master; it seems it needs to be run on the slaves too. The reset unlinks and removes config files, stops containers, releases ports, etc. Otherwise you won't be able to run init again.
sudo kubeadm reset
Delete each node, slaves first and then the master
kubectl delete node <node name>
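Note that 'kubeadm reset' does not clean up CNI configuration or iptables rules. If you want a fully clean slate, the official teardown docs suggest clearing them manually (adjust to your setup):
sudo rm -rf /etc/cni/net.d
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X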
Take a break and have a tea.
Next one is setting up dashboard.