Monitoring is a highly important activity in a production environment.
A CNCF app can aid SRE/DevOps teams by providing data that helps them understand the health of the app and trigger action whenever needed.
K8s provides the following knobs for this:
Event
Status
Event and Status appear to offer similar functionality. This document aims to:
Identify the fine differences between the two
Understand usability (which of the two should be used under which conditions).
Status tells the current state of a K8s resource/object, e.g. whether a Node is "Ready" or not, whether a Pod is "Running" or not, etc.
Status is meant for communication between multiple K8s entities. For example,
A K8s Deployment uses Pod status.
The K8s Deployment controller monitors it to see if any action is needed.
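As a rough illustration of how a controller can consume Pod status, the sketch below checks the Ready condition of each Pod and compares the count with a desired replica number. The Pods are plain dicts shaped like the API's JSON (status.phase, status.conditions). The helper names is_pod_ready and count_ready, the Pod names, and the replica count are assumptions made for this example; this is not the actual Deployment controller logic.

```python
def is_pod_ready(pod: dict) -> bool:
    """Return True if the Pod's status carries a Ready condition set to "True"."""
    conditions = pod.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True"
        for c in conditions
    )


def count_ready(pods: list) -> int:
    """Count the Pods whose Ready condition is True."""
    return sum(1 for pod in pods if is_pod_ready(pod))


# Hypothetical Pods, reduced to the fields this sketch reads.
pods = [
    {"metadata": {"name": "apache-a"},
     "status": {"phase": "Running",
                "conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "apache-b"},
     "status": {"phase": "Pending",
                "conditions": [{"type": "Ready", "status": "False"}]}},
]

desired_replicas = 2  # hypothetical desired state
ready = count_ready(pods)
needs_action = ready < desired_replicas  # True means the controller has work to do
print(ready, needs_action)
```

The point is only that the reader of status makes its decision from the current state, without needing to know how the Pod got there.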
Almost every Kubernetes object includes status. This follows from the declarative nature of K8s objects.
The status describes the current state of the object, supplied and updated by Kubernetes and its components.
The Kubernetes control plane continually and actively manages every object’s actual state to match the desired state you supplied.
The status object does not provide incremental updates; each write overwrites the previous data. For example, the LB IP in a Service object is overwritten by a new patch command.
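The overwrite behaviour can be sketched with a small JSON merge patch function (RFC 7386 semantics), in which a list such as the load balancer ingress list is replaced as a whole rather than appended to. This assumes the patch is sent as a JSON merge patch; the function name json_merge_patch and the IP addresses are made up for illustration.

```python
import copy


def json_merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7386 semantics) and return the result."""
    if not isinstance(patch, dict):
        # Scalars and lists replace the target value completely.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the key
        else:
            result[key] = json_merge_patch(result.get(key), value)
    return result


# Hypothetical Service, reduced to its status, with an illustrative LB IP.
service = {"status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.1"}]}}}
patch = {"status": {"loadBalancer": {"ingress": [{"ip": "10.0.0.2"}]}}}

updated = json_merge_patch(service, patch)
print(updated["status"]["loadBalancer"]["ingress"])
```

After the patch, only the new IP remains in the status. The old value is not kept anywhere in the object, which is why status alone cannot describe history.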
An Event records state transitions. For example, a Pod cannot reach the "Running" state directly; it starts from the "Pending" or "ContainerCreating" state. These transitions are logged in K8s as Events (see the example below).
Kubernetes events are objects that provide insight into what is happening inside a cluster, such as what decisions were made by scheduler or why some pods were evicted from the node.
K8s events are stored on the master's disk (in etcd).
A K8s event contains an involved object (involvedObject). This field provides the link between the event and the associated object (a Pod, for example).
K8s events are cleared after a certain period of time (1 hour by default).
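To make the involved-object link and the retention period concrete, the sketch below models events as plain dicts with a few fields of the core/v1 Event object (type, reason, message, involvedObject, lastTimestamp). The helpers events_for and is_expired, the second Pod name, and all timestamps are assumptions for this example. The expiry check is only an approximation based on lastTimestamp, not the API server's actual cleanup mechanism.

```python
from datetime import datetime, timedelta, timezone

EVENT_TTL = timedelta(hours=1)  # default retention mentioned in the text


def events_for(events, kind, name, namespace="default"):
    """Select the events whose involvedObject points at the given object."""
    return [
        e for e in events
        if e["involvedObject"]["kind"] == kind
        and e["involvedObject"]["name"] == name
        and e["involvedObject"].get("namespace", "default") == namespace
    ]


def is_expired(event, now, ttl=EVENT_TTL):
    """Approximate check: was the event last seen longer ago than the TTL?"""
    last_seen = datetime.fromisoformat(event["lastTimestamp"])
    return now - last_seen > ttl


# Illustrative events; timestamps are invented for the example.
events = [
    {"type": "Normal", "reason": "Scheduled",
     "message": "Successfully assigned apache-f5c9d84dc-7z2b4 to ubuntu-231",
     "involvedObject": {"kind": "Pod", "name": "apache-f5c9d84dc-7z2b4",
                        "namespace": "default"},
     "lastTimestamp": "2020-03-23T23:46:17+00:00"},
    {"type": "Warning", "reason": "FailedMount",
     "message": "MountVolume.SetUp failed",
     "involvedObject": {"kind": "Pod", "name": "other-pod",
                        "namespace": "default"},
     "lastTimestamp": "2020-03-23T22:30:00+00:00"},
]

now = datetime(2020, 3, 24, 0, 0, tzinfo=timezone.utc)
for e in events_for(events, "Pod", "apache-f5c9d84dc-7z2b4"):
    print(e["reason"], "expired" if is_expired(e, now) else "retained")
```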
In the regular Pod example below, you can see that:
The current state of the Pod is Running, which matches the desired state of the Pod.
The Events section describes how the Pod reached the Running state. It also records the outcome of each transition (for example, image pull error/success).
root@ubuntu-232:~/deepak# kubectl describe pod apache-f5c9d84dc-7z2b4
Name: apache-f5c9d84dc-7z2b4
Namespace: default
Node: ubuntu-231/10.106.73.231
Start Time: Tue, 24 Mar 2020 05:16:17 +0530
Labels: app=apache
pod-template-hash=917584087
Annotations: <none>
Status: Running → Provides the current state
IP: 10.244.1.233
Controlled By: ReplicaSet/apache-f5c9d84dc
Containers:
apache:
Container ID: docker://618857e5275da8709f4b7b1682c4cb26642e603b92f75c16f2d65a5378f7c515
Image: httpd:latest
Image ID: docker-pullable://httpd@sha256:946c54069130dbf136903fe658fe7d113bd8db8004de31282e20b262a3e106fb
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Tue, 24 Mar 2020 05:16:40 +0530
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-58pzw (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-58pzw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-58pzw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: → Provides the state transitions and whether any error happened during them (successes are listed too)
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 37s default-scheduler Successfully assigned apache-f5c9d84dc-7z2b4 to ubuntu-231
Normal SuccessfulMountVolume 36s kubelet, ubuntu-231 MountVolume.SetUp succeeded for volume "default-token-58pzw"
Normal Pulling 18s kubelet, ubuntu-231 pulling image "httpd:latest"
Normal Pulled 14s kubelet, ubuntu-231 Successfully pulled image "httpd:latest"
Normal Created 13s kubelet, ubuntu-231 Created container
Normal Started 12s kubelet, ubuntu-231 Started container
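The same two pieces of information can also be read programmatically. The sketch below assumes the official Kubernetes Python client (the kubernetes package) is installed and a kubeconfig is available; show_status_and_events and event_field_selector are hypothetical helper names. It reads the Pod phase from status and lists the events whose involvedObject matches the Pod.

```python
def event_field_selector(kind: str, name: str, namespace: str) -> str:
    """Build a field selector that matches events for one involved object."""
    return ",".join([
        f"involvedObject.kind={kind}",
        f"involvedObject.name={name}",
        f"involvedObject.namespace={namespace}",
    ])


def show_status_and_events(name: str, namespace: str = "default") -> None:
    """Print the Pod phase (status) and its events (transitions)."""
    # Imported lazily so the selector helper works without the client installed.
    from kubernetes import client, config

    config.load_kube_config()  # assumes a local kubeconfig
    v1 = client.CoreV1Api()

    pod = v1.read_namespaced_pod(name=name, namespace=namespace)
    print("Status:", pod.status.phase)

    events = v1.list_namespaced_event(
        namespace=namespace,
        field_selector=event_field_selector("Pod", name, namespace),
    )
    for e in events.items:
        print(e.type, e.reason, e.message)


print(event_field_selector("Pod", "apache-f5c9d84dc-7z2b4", "default"))
```

Calling show_status_and_events("apache-f5c9d84dc-7z2b4") against a live cluster would be expected to show roughly the Status line and the Events rows from the output above, provided the events have not yet been cleared.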
In summary, both status and event are needed, depending on the use case (example cases below).
For use cases where a CNCF container acts as a controller to meet the desired state, status should be used to report the current state of the object.
For example, CIC acts as a controller that applies the desired state of a CRD/Ingress object. In this case, status is updated to reflect the current state of the configuration.
To report a state transition, or an error during a state transition, an event is useful.
For example,
If the CIC container has any issue while booting up, it can be reported in an event.
Useful CIC container state transitions can be reported in events.
Any error in configuring a Service/Ingress/ConfigMap/CRD can be reported via a combination of status and event.
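The division of labour described in this summary can be sketched as follows: status is a single slot that is overwritten with the latest state, while events form an append-only trail of transitions and errors. Everything in this block (TrackedObject, report, the state names, reasons, and messages) is hypothetical and only models the idea; it does not call the Kubernetes API and is not how CIC is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedObject:
    name: str
    status: dict = field(default_factory=dict)  # current state, overwritten
    events: list = field(default_factory=list)  # transitions, appended


def report(obj, state, reason, message, warning=False):
    """Overwrite the status with the new state and append an event about it."""
    obj.status = {"state": state}
    obj.events.append({
        "type": "Warning" if warning else "Normal",
        "reason": reason,
        "message": message,
        "involvedObject": obj.name,
    })


ingress = TrackedObject("web-ingress")  # hypothetical Ingress being configured
report(ingress, "Pending", "ConfigStarted", "applying ingress configuration")
report(ingress, "Failed", "ConfigError", "backend service not found", warning=True)
report(ingress, "Configured", "ConfigApplied", "ingress configuration applied")

print(ingress.status)
print([e["reason"] for e in ingress.events])
```

After the three calls, the status holds only the final state, while the event list still shows the intermediate failure. That is the combination the last item above refers to.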
Please refer to https://www.appdynamics.com/blog/product/monitoring-kubernetes-events/
It watches for specific Kubernetes events and pushes notifications on these events to various endpoints such as Slack and PagerDuty.
Quote "One of the most useful events we monitor at Blue Matador happens when a node begins evicting pods. The Evicted reason is actually not included in the source file"
It uses the reason field in the Pod status.
Quote "Many workloads rely on network-attached storage to provide persistent storage to pods. Running Kubernetes in a cloud such as AWS or GCP can ease the pain of managing this storage, but what do you do if it does not work? The FailedMount and FailedAttachVolume events can help you debug issues with storage."
It uses K8s events (here, for mount issues in AWS).
https://www.bluematador.com/blog/kubernetes-events-explained
https://medium.com/@copyconstruct/monitoring-and-observability-8417d1952e1c
https://kubernetes.io/docs/concepts/overview/working-with-objects/kubernetes-objects/
https://books.google.co.in/books?id=6VKjDwAAQBAJ&dq=Almost+every+Kubernetes+object+includes+status
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#containerstate-v1-core
https://stackoverflow.com/questions/46419163/what-will-happen-to-evicted-pods-in-kubernetes
https://kubernetes.io/docs/tasks/debug-application-cluster/events-stackdriver/