Liveness probe test in Google Cloud clustered Kubernetes environment - kubernetes

I want to test a liveness probe in a Google Cloud clustered Kubernetes environment. How can I bring a pod or container down to test liveness probes?
The problem is that the ReplicaSets will automatically bring the pods back up if I delete any of them.

On Kubernetes, pods are mortal, and the number of live pods at any given time is guaranteed by the ReplicaSets (which are wrapped by Deployments). So, to take your pods down, you can scale the deployment down to the number you need, or even to zero, like this:
kubectl scale deployment your-deployment-name --replicas=0
However, if you are trying to verify that the Kubernetes Service resource does not send traffic to a non-live or non-ready pod, here is what you can do: create another pod with the same labels as your real application pods, so that the label selector in the Service matches this new pod as well. Configure that pod with a liveness/readiness probe that always fails, so it will never be considered live/ready. Then hit your Service with requests and verify that it never reaches the new pod.
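Such a decoy pod might look like the following sketch; the label `app: my-app`, the image, and the probe path are assumptions and need to be matched to your real Service selector and application:

```yaml
# Hypothetical decoy pod: same labels as the real backends, but a readiness
# probe that can never succeed, so the Service should never route to it.
apiVersion: v1
kind: Pod
metadata:
  name: decoy
  labels:
    app: my-app               # must match the Service's label selector
spec:
  containers:
  - name: decoy
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:
      httpGet:
        path: /does-not-exist # guaranteed 404 -> pod never becomes Ready
        port: 80
      periodSeconds: 5
```

With this applied, kubectl get endpoints <service-name> should never list the decoy pod's IP, because pods that are not Ready are excluded from Service endpoints.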

The question is (quote) "...How can I bring a pod or container down to test liveness probes?". The type of probe isn't specified, but I'll assume it is HTTP GET or TCP Socket.
Assuming you have proper access to the node/host on which the pod is running:
Start a single pod.
Verify that the liveness probe is passing - that is, that it is working.
Find out on which node the pod is running. This, for example, will return the IP address:
kubectl -n <namespace> get pod <pod-name> -o jsonpath={.status.hostIP}
Log onto the node.
Find the PID of the application process. For example, list all processes (ps aux) and look for the specific process or grep by (part of the) name: ps aux | grep -i <name>. Take the number in the second column. For example, the PID in this ps aux partial output is 13314:
nobody 13314 0.0 0.6 145856 38644 ? Ssl 13:24 0:00 /bin/prometheus --storage....
While on the node, suspend (pause/stop) the process by executing kill -STOP <PID>. For example, for the PID from above:
kill -STOP 13314
At this point:
If there is no liveness probe defined, the pod should still be in Running status and will not be restarted, even though it won't respond to connection attempts. To resume the stopped process, execute kill -CONT <PID>.
A properly configured HTTP GET or TCP Socket liveness probe should fail because connection with the application can't be established.
Notice that this method may also work for "exec.command" probes, depending on what those commands do.
Note also that the main application usually runs as PID 1 in a (Docker) container. As the Docker docs explain, "...A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so". That is probably the reason why this approach won't work from inside the container.
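The STOP/CONT mechanics above can be tried locally with any ordinary process, outside Kubernetes entirely. A minimal sketch, using sleep as a stand-in for the application process found via ps aux:

```shell
# Start a stand-in "application" process in the background.
sleep 300 &
PID=$!

# Suspend it, as done on the node: the process stays alive
# but stops responding.
kill -STOP "$PID"
ps -o state= -p "$PID"   # "T" (stopped) on Linux

# Resume it, as kill -CONT <PID> does on the node.
kill -CONT "$PID"
ps -o state= -p "$PID"   # back to "S" (sleeping)

kill "$PID"
```

While the process is in state T, an HTTP GET or TCP Socket probe pointed at it would time out, which is exactly the failure mode the steps above provoke.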

Related

EKS: kubectl exec does not respect streamingConnectionIdleTimeout

Using EKS with Kubernetes 1.21 and managed node groups in a private subnet. I'm trying to set the cluster up so that kubectl exec times out after inactivity, regardless of the workload being exec'd into, and without any client-side configuration.
I'm aware of https://github.com/containerd/containerd/issues/5563, except we're on 1.21 with Docker runtime, not containerd yet.
I set streamingConnectionIdleTimeout: 3600s on the kubelet in the launch template:
jq '.streamingConnectionIdleTimeout = "3600s"' /etc/kubernetes/kubelet/kubelet-config.json > /tmp/kubelet-config.json && mv /tmp/kubelet-config.json /etc/kubernetes/kubelet/kubelet-config.json
/etc/eks/bootstrap.sh {{CLUSTER_NAME}}
And confirmed with curl -sSL "http://localhost:8001/api/v1/nodes/(node name)/proxy/configz".
However, kubectl exec still does not time out.
I confirmed /proc/sys/net/ipv4/tcp_keepalive_time = 7200 on both the client and the node, so we should be hitting the streaming connection idle timeout before Linux starts sending keepalive probes.
Reading through How kubectl exec Works, it seems possible that the EKS managed control plane is keeping the connection alive. There are people online who have the opposite problem - their connection times out regardless of streamingConnectionIdleTimeout - and they solve it by adjusting the timeout on the load balancer in front of their k8s API server. However, there are no knobs (that I know of) to tweak in that regard on the EKS managed control plane.
I would appreciate any input on this topic.

Job still running even when deleting nodes

I created a two-node cluster and a new job using the busybox image that sleeps for 300 seconds. I checked which node this job is running on using
kubectl get pods -o wide
I deleted the node, but surprisingly the job was still running to completion on the same node. Is this normal behavior? If not, how can I fix it?
Jobs themselves aren't scheduled onto nodes; only their pods are. The role of a Job is to define a policy: make sure that a pod with certain specifications exists and runs until the task completes, whether it completes successfully or not.
When you create a Job, you declare that policy; the built-in job-controller sees it and creates a pod for it. The built-in kube-scheduler then sees this pod without a node and binds it to one. The kubelet sees a pod bound to its own node's identity, and hence a container is started. As long as the container is running, the control plane knows that the node and the pod still exist.
There are two ways of taking a node down: with a drain and without one. Taking a node down without draining is equivalent to a network cut or a server crash: the api-server keeps the Node resource around for a while, but it ceases to be Ready, and its pods are then slowly terminated. When you drain a node, by contrast, you first prevent new pods from being scheduled onto it and then evict the existing pods, much as kubectl delete pod would.
Either way, the pods are deleted, and you are left with a Job that hasn't run to completion and has no pod. The job-controller therefore creates a new pod for the Job, the Job's failed-attempts counter is increased by 1, and the loop starts over.
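The recreate-and-count behavior can be observed with a small Job like the following sketch (the name and sleep duration are arbitrary):

```yaml
# Minimal Job: the job-controller keeps (re)creating a pod until the task
# completes, up to backoffLimit failed attempts.
apiVersion: batch/v1
kind: Job
metadata:
  name: sleeper
spec:
  backoffLimit: 4          # max failed attempts before the Job is marked Failed
  template:
    spec:
      restartPolicy: Never # let the job-controller create fresh pods
      containers:
      - name: sleeper
        image: busybox
        command: ["sleep", "300"]
```

Applying this and then draining the pod's node (kubectl drain <node-name> --ignore-daemonsets) should show a replacement pod appearing under kubectl get pods -o wide --watch.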

Kubernetes liveness probe logging recovery

I am trying to test a liveness probe while learning kubernetes.
I have set up a minikube and configured a pod with a liveness probe.
Testing the script (E.g. via docker exec) it seems to report success and failure as required.
The probe produces failure events, which I can view via kubectl describe pod <pod-name>,
but it does not report recovery from failures.
This answer says that liveness probe successes are not reported by default.
I have been trying to increase the log level with no success by running variations like:
minikube start --extra-config=apiserver.v=4
minikube start --extra-config=kubectl.v=4
minikube start --v=4
As suggested here & here.
What is the proper way to configure the logging level for a kubelet?
can it be modified without restarting the pod or minikube?
An event will be reported if a failure causes the pod to be restarted.
For kubernetes itself I understand that using it to decide whether to restart the pod is sufficient.
Why aren't events recorded for recovery from a failure which does not require a restart?
This is how I would expect probes to work in a health monitoring system.
How would recovery be visible if the same probe was used in prometheus or similar?
For an expensive probe I would not want it to be run multiple times.
(granted one probe could cache the output to a file making the second probe cheaper)
I have been trying to increase the log level with no success by running variations like:
minikube start --extra-config=apiserver.v=4
minikube start --extra-config=kubectl.v=4
minikube start --v=4
@Bruce, none of the options you mentioned will work, as they relate to other components of the Kubernetes cluster, and the answer you referred to says it clearly:
The output of successful probes isn't recorded anywhere unless your
Kubelet has a log level of at least --v=4, in which case it'll be in
the Kubelet's logs.
So you need to set -v=4 specifically for the kubelet. In the official docs you can see that it can be started with specific flags, including the one that changes the default verbosity level of its logs:
-v, --v Level
number for the log level verbosity
The kubelet runs as a system service on each node, so you can check its status by simply issuing:
systemctl status kubelet.service
and if you want to see its logs, issue the command:
journalctl -xeu kubelet.service
Try:
minikube start --extra-config=kubelet.v=4
however, I'm not 100% sure that Minikube is able to pass this parameter, so you'll need to verify it on your own. If it doesn't work, you should still be able to add it in the kubelet configuration file, which specifies the parameters the kubelet is started with (don't forget to restart kubelet.service after submitting the changes; simply run systemctl restart kubelet.service).
Let me know if it helps and don't hesitate to ask additional questions if something is not completely clear.
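If the flag does take effect, successful probe results should then appear in the kubelet's logs. On minikube, assuming the kubelet runs under systemd inside the minikube VM/container (which depends on the driver), something like this might surface them; the grep pattern is a guess at the wording and may need adjusting:

```shell
# Read the kubelet's journal from outside the minikube node and filter
# for probe-related lines (wording of log messages is version-dependent).
minikube ssh -- sudo journalctl -u kubelet.service --no-pager | grep -i probe
```
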

Suspending a container in a kubernetes pod

I would like to suspend the main process in a docker container running in a kubernetes pod. I have attempted to do this by running
kubectl exec <pod-name> -c <container-name> -- kill -STOP 1
but the signal will not stop the container. Investigating other approaches, it looks like docker stop --signal=SIGSTOP or docker pause might work. However, as far as I know, kubectl exec always runs in the context of a container, and these commands would need to be run in the pod outside the context of the container. Does kubectl's interface allow for anything like this? Might I achieve this behavior through a call to the underlying kubernetes API?
You could scale the Deployment down to 0 replicas, which stops all of its pods. This isn't quite a pause, but it does stop the workload until you set the replica count back above 0.
kubectl scale --replicas=0 deployment/<deployment-name> --namespace=<namespace>
Kubernetes does not support suspending pods, because that is VM-like behavior; since starting a new pod is cheaper, it just schedules a new pod in case of failure. In effect, your pods should be stateless, and any application that needs to store state should have a persistent volume mounted inside the pod.
The simple mechanics (and general behavior) of Kubernetes are: if the process inside the container fails, Kubernetes will restart it by creating a new pod.
If you also comment on what you are trying to achieve as an end goal, I think I can help you better.
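For completeness: with direct access to the node (and a Docker runtime, as in the question), the pause can be done from outside the container's PID namespace, where PID 1's special signal handling no longer applies. A hedged sketch, run on the node itself rather than via kubectl exec:

```shell
# Find the container; names created by Kubernetes (dockershim) follow
# the pattern k8s_<container>_<pod>_<namespace>_<uid>_<attempt>.
docker ps --filter "name=<pod-name>" --format "{{.ID}}\t{{.Names}}"

# Freeze all processes in the container (uses the cgroup freezer,
# not signals), and later resume it:
docker pause <container-id>
docker unpause <container-id>
```

Be aware that a paused container also stops answering its liveness probe, so the kubelet may restart it once the probe's failure threshold is reached.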

kubernetes pods are restarting with new ID

The pods I am working with are managed by Kubernetes. When I use the docker restart command to restart a pod, sometimes the pod gets a new ID and sometimes keeps the old one. When the pod gets a new ID, its state first goes from Running -> Error -> CrashLoopBackOff. Can anyone please tell me why this is happening? Also, how frequently does Kubernetes run the health check?
Kubernetes currently does not use the docker restart command for many reasons (e.g., preserving the logs of older containers). Kubelet, the daemon on the node, creates a new container if the existing container terminated. In any case, users should not perform container lifecycle operations (e.g., stop, restart) on kubernetes-managed containers directly using docker, as it could cause unexpected behaviors.
EDIT: If you want kubernetes to restart your container automatically, set RestartPolicy in your pod spec to "Always" or "OnFailure". For more details, see http://kubernetes.io/docs/user-guide/pod-states/
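For reference, restartPolicy sits at the pod spec level, and the health-check frequency asked about is controlled per probe by periodSeconds (which defaults to 10 seconds). A minimal sketch with assumed names:

```yaml
# restartPolicy applies to all containers in the pod; "Always" is the
# default for pods created via Deployments/ReplicaSets.
apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  restartPolicy: Always     # or OnFailure / Never
  containers:
  - name: app
    image: nginx
    livenessProbe:
      httpGet:
        path: /             # assumed health endpoint
        port: 80
      periodSeconds: 10     # how often the kubelet runs the check
```
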