Cilium pods stuck in Terminating state when running helm delete - kubernetes

I have Cilium installed in my test cluster (AWS, with the AWS CNI deleted because we use the Cilium CNI plugin), and whenever I delete the cilium namespace (or run helm delete), the hubble-ui pod gets stuck in the Terminating state. The pod has a couple of containers, but I notice that one container named backend exits with code 137 when the namespace is deleted, leaving the hubble-ui pod, and the namespace the pod is in, stuck in Terminating state. From what I am reading online, containers exit with 137 when they attempt to use more memory than they have been allocated. In my test cluster, no resource limits have been defined (spec.containers[*].resources = {}) on the pod or namespace, and there is no error message displayed as the reason for the failure. I am using the Cilium Helm chart v1.12.3, but this issue has been going on since before we updated the chart version.
I would like to know what is causing this issue as it is breaking my CI pipeline.
How can I ensure a graceful exit of the backend container (as opposed to clearing finalizers)?

So it appears that there is a bug in the backend application/container for the hubble-ui service. Kubernetes sends a SIGTERM signal to the container and it fails to respond. I verified this by getting a shell into the container and sending SIGTERM and SIGINT, which is what the application seems to listen for in order to exit, and it just doesn’t respond to either signal.
Next, I added a preStop hook that looks like the one below, and the pod behaved itself:
...
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "kill -SIGILL 1; true"]

Related

Kubelet + prometheus: how to query if a pod is crashing?

I want to set up alerts for when any pod in my Kubernetes cluster is in a CrashLoopBackOff state. I'm relying on kubelet metrics on Azure Kubernetes Service (AKS) and have set up the Prometheus Operator, which exposes the kubelet's /metrics/cadvisor endpoint.
Other similar questions on this topic, such as this and this, are not relevant to kubelet-only setups. The recommended kube_pod_container_status_waiting_reason{} / kube_pod_status_phase{phase="Pending|Unknown|Failed"} and similar queries are not available to me with kubelet metrics on AKS.
The kubelet exposes somewhat limited metrics; here is what I have tried:
Container state:
container_tasks_state{container='my_container', kubernetes_azure_com_cluster='my_cluster'}
This seems like it should be the right solution, but the state is always 0, whether the container is Running or in CrashLoopBackOff. This appears to be a known bug.
Time from start:
time() - container_start_time_seconds{kubernetes_azure_com_cluster='my_cluster', container='my_container'}
Here we can alert when a container's uptime is low; any pod that triggers this alert repeatedly is crashing. It's inelegant, though, because healthy containers also alert until they've been up long enough, and my alert channel becomes very noisy.
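As an illustration, a minimal sketch of this approach expressed as a Prometheus Operator alert rule, assuming the PrometheusRule CRD is available; the rule name, threshold, and labels are placeholders based on the query above:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: container-uptime-alert          # hypothetical name
spec:
  groups:
    - name: crash-detection
      rules:
        - alert: ContainerRecentlyStarted
          # Fires while a container has been up for less than 2 minutes;
          # repeated firing for the same pod suggests a crash loop.
          expr: "time() - container_start_time_seconds{kubernetes_azure_com_cluster='my_cluster', container='my_container'} < 120"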
Detect exited containers:
kubelet_running_containers{kubernetes_azure_com_cluster='my_cluster', container_state='exited'}
Can detect a crashing container, but containers may also exit gracefully, so a notification on container exits is not very useful. We essentially get a 'container exited' alert and then need to manually check whether it was a crash or graceful exit.
Number of running pods:
kubelet_running_pods{kubernetes_azure_com_cluster='my_cluster'}
Does not change on a container crash.
Scrape error:
container_scrape_error{kubernetes_azure_com_cluster='my_cluster'}
Again, does not change on a container crash.
Which query will allow me to discover whether a pod has entered the CrashLoopBackOff state?

How to rollout without killing processes in K8s?

I'm using:
kubectl rollout restart deployment my_cool_workers
This terminates the workers and starts new ones.
However, I want to roll out in a way that lets a task that is already running on a specific worker finish; I don't want to kill the tasks (the worker should finish its current tasks but not accept new ones).
Meaning: roll out new workers -> old workers no longer accept new work -> once an old worker is no longer running anything, terminate it.
How can this be done?
If a Pod gets killed, whether manually via kubectl or by a Kubernetes controller (for example during a Deployment rollout), it immediately changes from the Running to the Terminating state. At the same time, the SIGTERM signal is sent to all containers inside that Pod.
While a Pod is in the Terminating state, its containers are not restarted if they end. By contrast, whenever a container inside a Pod stops while the Pod is in the Running state, the container is restarted, because a Pod is expected to keep running unless an error occurred.
Starting from Kubernetes 1.19 you can also debug running pods using Ephemeral Containers and the kubectl debug command.
For more information refer to this document.
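To get the draining behavior asked about above, the usual pattern is a worker image that finishes in-flight tasks when it receives SIGTERM, combined with a terminationGracePeriodSeconds long enough for those tasks to complete (optionally via a preStop hook that blocks until the worker has drained). A minimal sketch with hypothetical names and a hypothetical drain script:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-cool-workers              # adapted from the question's deployment name
spec:
  template:
    spec:
      # Allow in-flight tasks up to one hour after SIGTERM before SIGKILL.
      terminationGracePeriodSeconds: 3600
      containers:
        - name: worker               # hypothetical container name
          image: my-worker:latest    # hypothetical image
          lifecycle:
            preStop:
              exec:
                # Hypothetical script: stop accepting new tasks, then
                # block until the tasks already running have finished.
                command: ["/bin/sh", "-c", "/app/drain-and-wait.sh"]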

Kubernetes: view logs of crashed Airflow worker pod

Pods on our k8s cluster are scheduled with Airflow's KubernetesExecutor, which runs all Tasks in a new pod.
I have such a Task for which the pod instantly (after 1 or 2 seconds) crashes, and for which of course I want to see the logs.
This seems hard. As soon as the pod crashes, it gets deleted, along with the ability to retrieve crash logs. I already tried all of:
kubectl logs -f <pod> -p: cannot be used since these pods are named uniquely (courtesy of KubernetesExecutor).
kubectl logs -l label_name=label_value: I struggle to apply the labels to the pod (if this is a known/used way of working, I'm happy to try further).
A shared NFS is mounted on all pods at a fixed log directory. The failing pod, however, does not log to this folder.
When I am really quick I run kubectl logs -f -l dag_id=sample_dag --all-containers (the dag_id label is added by Airflow) between running and crashing and see Error from server (BadRequest): container "base" in pod "my_pod" is waiting to start: ContainerCreating. This might give me some clue but:
these are only the last log lines
this is really backwards
I'm basically looking for the canonical way of retrieving logs from transient pods.
You need to enable remote logging. The code sample below is for S3. In airflow.cfg, set the following:
remote_logging = True
remote_log_conn_id = my_s3_conn
remote_base_log_folder = s3://airflow/logs
The my_s3_conn connection can be created in the Airflow UI under Admin > Connections. In the Conn Type dropdown, select S3.
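If you would rather not bake these settings into airflow.cfg, the same options can also be supplied as environment variables on the worker pods; this is a sketch assuming Airflow 2.x, where these keys live in the [logging] section:

env:
  - name: AIRFLOW__LOGGING__REMOTE_LOGGING
    value: "True"
  - name: AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID
    value: "my_s3_conn"
  - name: AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER
    value: "s3://airflow/logs"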

Suspending a container in a kubernetes pod

I would like to suspend the main process in a docker container running in a kubernetes pod. I have attempted to do this by running
kubectl exec <pod-name> -c <container-name> -- kill -STOP 1
but the signal will not stop the container. Investigating other approaches, it looks like docker stop --signal=SIGSTOP or docker pause might work. However, as far as I know, kubectl exec always runs in the context of a container, and these commands would need to be run in the pod outside the context of the container. Does kubectl's interface allow for anything like this? Might I achieve this behavior through a call to the underlying kubernetes API?
You could scale the Deployment down to 0 replicas, which stops all of its pods. This isn't quite a pause, but it does stop the workload until you scale the replica count back above 0.
kubectl scale --replicas=0 deployment/<deployment-name> --namespace=<namespace>
Kubernetes does not support suspending pods: that is VM-style behavior, and since starting a new pod is cheap, Kubernetes simply schedules a new one in case of failure. In effect, your pods should be stateless, and any application that needs to store state should have a persistent volume mounted inside the pod.
The simple mechanics (and general behavior) of Kubernetes is that if the process inside the container fails, Kubernetes will restart it by creating a new pod.
If you also comment what you are trying to achieve as an end goal, I think I can help you better.
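To illustrate the point about state belonging on a persistent volume, here is a minimal sketch of a pod mounting a pre-created PersistentVolumeClaim (all names are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: stateful-app
spec:
  containers:
    - name: app
      image: my-app:latest           # hypothetical image
      volumeMounts:
        - name: data
          mountPath: /var/lib/app    # application state written here persists
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data          # hypothetical, pre-created PVC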

kubernetes pods are restarting with new ID

The pods I am working with are managed by Kubernetes. When I use the docker restart command to restart a pod, sometimes the pod gets a new ID and sometimes it keeps the old one. When the pod gets a new ID, its state first goes from Running -> Error -> CrashLoopBackOff. Can anyone please tell me why this is happening? Also, how frequently does Kubernetes do the health check?
Kubernetes currently does not use the docker restart command for many reasons (e.g., preserving the logs of older containers). Kubelet, the daemon on the node, creates a new container if the existing container terminated. In any case, users should not perform container lifecycle operations (e.g., stop, restart) on kubernetes-managed containers directly using docker, as it could cause unexpected behaviors.
EDIT: If you want Kubernetes to restart your container automatically, set restartPolicy in your pod spec to "Always" or "OnFailure". For more details, see http://kubernetes.io/docs/user-guide/pod-states/
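For reference, a minimal sketch of a bare pod spec with an explicit restart policy (pods managed by a Deployment or ReplicaSet always use Always; the names and image here are hypothetical):

apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  restartPolicy: Always              # or OnFailure / Never
  containers:
    - name: app
      image: nginx:1.25              # hypothetical image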