OpenShift: Liveness Probe & Readiness Probe

Chetan_Tiwary_ · 08-30-2023

Within Kubernetes, and therefore OpenShift, there are three primary techniques for checking the health of your applications through probes:

HTTP Get : The probe sends an HTTP GET request to the container. The probe succeeds if the HTTP response code falls within the range of 200 to 399.

Container Command : The probe executes a specific command inside the container itself. The probe succeeds if the command exits with a status of 0.

TCP Socket : The probe attempts to open a TCP socket connection to the container. The container is considered healthy only if the connection can be established.
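
The later examples in this post cover the TCP socket and HTTP GET checks, so here is a minimal sketch of the remaining one, the container command check. The file path /tmp/healthy and the timing values are assumptions for illustration only; substitute whatever command reflects your application's health.

```yaml
# Hypothetical fragment of a container spec: probe by running a command.
livenessProbe:
  exec:
    command:            # the probe succeeds only if this command exits with status 0
      - cat
      - /tmp/healthy    # assumed marker file maintained by the application
  initialDelaySeconds: 5
  periodSeconds: 5
```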

How do we configure these probes? Use a liveness probe or a readiness probe, depending on what the application needs:

Liveness Probe : Why do we need a liveness probe?

Many applications have an HTTP web component as well as an asynchronous "jobs" component. The jobs component needs a liveness check because the process running the jobs can die. If this happens, the container is useless and jobs accumulate in the queue. In such cases it is generally prudent to restart the container, which makes a liveness probe well suited to the purpose.
If the jobs process hits a deadlock, it might still appear to be alive because the process is running, but it is clearly in a failed state and should be restarted.
If the application takes more than a few seconds to start, we should add a liveness probe, with an initialDelaySeconds long enough to cover startup, to confirm that the container initializes without error rather than crashing.

CONCEPT : A liveness probe is a Kubernetes feature that checks whether a container is still healthy, not merely running. If the liveness probe fails (for example, because of a deadlock), the kubelet kills the container. What happens next depends on the pod's restart policy.

For example, on a pod with a restartPolicy of Always, the kubelet restarts the container after killing it. With a restartPolicy of OnFailure, the container is restarted when it terminates with a non-zero exit code, which is normally the case for a container killed by a failed probe. With a restartPolicy of Never, the container is killed and not restarted. In every case the kubelet acts only after the probe has failed failureThreshold times in a row.
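
To make the relationship between the probe and the restart policy concrete, here is a minimal sketch of a pod manifest. The pod name, image, path, and port are placeholders, not values from a real deployment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: jobs-worker                 # hypothetical name
spec:
  restartPolicy: Always             # Always (the default), OnFailure, or Never
  containers:
    - name: worker
      image: registry.example.com/jobs-worker:latest   # placeholder image
      livenessProbe:
        httpGet:
          path: /healthz            # assumed health endpoint
          port: 8080                # assumed port
        periodSeconds: 10
        failureThreshold: 3         # kill the container after 3 consecutive failures
```

Note that restartPolicy is set once for the whole pod and applies to all of its containers, while each container carries its own probes.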

In the context of OpenShift Virtualization, if the liveness probe detects an unhealthy state, the system deletes the Virtual Machine Instance (VMI) resource and then deploys a fresh instance.

Liveness probes play a vital role in identifying unresponsive applications that cannot recover without a full reboot of the virtual machine.

Add a liveness probe that tests the MariaDB database running inside the VM by opening a connection to its default TCP port, 3306. Use oc edit to add the liveness probe to the VM resource under spec.template.spec, or use the OpenShift web console: go to Virtualization → VirtualMachines → select the VM → YAML tab.

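livenessProbe:
  tcpSocket:
    port: 3306               # default MariaDB port
  initialDelaySeconds: 10    # wait 10 seconds after start before the first probe
  periodSeconds: 5           # probe every 5 seconds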

You must delete the corresponding VMI resource before Red Hat OpenShift Virtualization recognizes the new probe. OpenShift Virtualization restarts a new instance from the VM resource that you modified.
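
For orientation, this is roughly where the probe sits in a VirtualMachine resource. It is a trimmed sketch, not a complete manifest: the VM name is hypothetical, and the domain, volume, and network sections that a real VM needs are left out.

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: mariadb-vm            # hypothetical VM name
spec:
  template:
    spec:
      # domain, networks, and volumes omitted for brevity
      livenessProbe:          # sibling of domain under spec.template.spec
        tcpSocket:
          port: 3306
        initialDelaySeconds: 10
        periodSeconds: 5
```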

Readiness Probe : If your application handles incoming network requests, it may be sent traffic before we know whether it is actually up. It might then keep dropping requests indefinitely, even though it has not crashed. Surely this is not an ideal condition for our application. Enter the readiness probe, our saviour!

CONCEPT : A readiness probe is a Kubernetes feature that determines whether a container is ready to accept service requests. If the readiness probe fails for a container, the kubelet marks the pod as not ready and the pod is removed from the list of available service endpoints. No new service requests are sent to the pod until the readiness probe passes.

After a failure, the readiness probe continues to examine the pod. If the pod becomes available, it is marked ready and added back to the list of available service endpoints, so service requests are sent to it again.

OpenShift Virtualization behaves the same way: if the readiness probe fails, OpenShift prevents client traffic from reaching the application by removing the VM's IP address from the service resource. When the application is available again, OpenShift adds the VM's IP address back to the service so that the application can receive client traffic.
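
As a sketch of the service side, assume the VM template carries a label such as app: web-vm (a made-up label) under spec.template.metadata.labels. A service that selects on that label forwards traffic only to instances that currently pass their readiness probe. The service name and ports here are placeholders too.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-vm                # hypothetical service name
spec:
  selector:
    app: web-vm               # assumed label on the VM template
  ports:
    - protocol: TCP
      port: 8443              # port exposed by the service
      targetPort: 8443        # port the application listens on
```

You can watch the endpoint list change with oc get endpoints while the probe fails and recovers.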

Configure Readiness Probe : You can add probes by editing the VM resource. Use the oc edit command to add a new readinessProbe to the VM resource under spec.template.spec, or use the OpenShift web console: go to Virtualization → VirtualMachines → select the VM → YAML tab.

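readinessProbe:
  httpGet:
    scheme: HTTPS
    path: /healthz
    port: 8443
  initialDelaySeconds: 10    # wait 10 seconds after start before the first probe
  periodSeconds: 5           # probe every 5 seconds
  timeoutSeconds: 3          # must be lower than periodSeconds
  failureThreshold: 3        # mark unready after 3 consecutive failures
  successThreshold: 3        # mark ready again after 3 consecutive successes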

You must delete the corresponding VMI resource before Red Hat OpenShift Virtualization recognizes the new probe. OpenShift Virtualization restarts a new instance from the VM resource that you modified.

What do these thresholds and timing parameters control?

initialDelaySeconds: The time, in seconds, after the container starts before the probe can be scheduled. The default is 0.

periodSeconds: The interval, in seconds, between probes. The default is 10. This value must be greater than timeoutSeconds.

timeoutSeconds: The number of seconds of inactivity after which the probe times out and that probe attempt is counted as a failure. The default is 1. This value must be lower than periodSeconds.

successThreshold: The number of consecutive times that the probe must report success after a failure to reset the container status to successful. The value must be 1 for a liveness probe. The default is 1.

failureThreshold: The number of consecutive times that the probe is allowed to fail. The default is 3. After the specified attempts:

for a liveness probe --> the container is restarted

for a readiness probe --> the pod is marked Unready

for a startup probe --> the container is killed and is subject to the pod’s restartPolicy
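
Since the startup probe is mentioned above but not shown, here is a minimal sketch for an ordinary pod container, with the timing arithmetic spelled out in comments. The path and port are assumptions. The startup probe holds off the liveness probe until it succeeds, which suits applications that start slowly.

```yaml
# Fragment of a container spec (path and port are placeholders).
startupProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  failureThreshold: 30      # 30 x 10 s = up to 300 s allowed for startup
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 5
  failureThreshold: 3       # 3 x 5 s = roughly 15 s of failures before a restart
  successThreshold: 1       # must be 1 for a liveness probe
```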

Reference : https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes...