When does a Pod get destroyed? - kubernetes

The Pod lifecycle is managed by the kubelet in the data plane.
As per the documentation: "If the liveness probe fails, the kubelet kills the container."
A Pod is essentially one or more containers sharing a dedicated network namespace and IPC namespace, set up via a sandbox container.
Say the Pod is a single-app-container Pod; then, upon liveness failure:
Does kubelet kill the Pod?
or
Does kubelet kill the container (only) within the Pod?

The kubelet uses liveness probes to know when to restart a container (NOT the entire Pod). If the liveness probe fails, the kubelet kills the container, and the container may then be restarted, depending on its restart policy.
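For reference, that restart behaviour is controlled by the Pod-level restartPolicy field (Always by default). A minimal sketch, just to show where the field sits (the pod name and probe values here are illustrative, not from the example below):

apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo
spec:
  restartPolicy: Always   # Always (default) | OnFailure | Never
  containers:
  - name: web
    image: nginx
    livenessProbe:
      httpGet:
        path: /           # probe the nginx default page
        port: 80

With restartPolicy: Never, a container killed by a failed liveness probe is not started again and the Pod ends up in a failed state.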
I've created a simple example to demonstrate how it works.
First, I've created an app-1 Pod with two containers (web and db).
The web container has a liveness probe configured, which always fails because the /healthz path is not configured.
$ cat app-1.yml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: app-1
  name: app-1
spec:
  containers:
  - image: nginx
    name: web
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080
        httpHeaders:
        - name: Custom-Header
          value: Awesome
  - image: postgres
    name: db
    env:
    - name: POSTGRES_PASSWORD
      value: example
After applying the above manifest and waiting some time, we can describe the app-1 Pod to check that only the web container has been restarted and the db container is running without interruption:
NOTE: I only provided important information from the kubectl describe pod app-1 command, not the entire output.
$ kubectl apply -f app-1.yml
pod/app-1 created
$ kubectl describe pod app-1
Name:         app-1
...
Containers:
  web:
    ...
    Restart Count:  4     <--- Note that the "web" container was restarted 4 times
    Liveness:       http-get http://:8080/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
    ...
  db:
    ...
    Restart Count:  0     <--- Note that the "db" container works fine
    ...
Events:
  Type    Reason   Age                  From     Message
  ----    ------   ----                 ----     -------
  ...
  Normal  Killing  78s (x2 over 108s)   kubelet  Container web failed liveness probe, will be restarted
  ...
We can connect to the db container to see if it is running:
NOTE: We can use the db container even when restarting the web container.
$ kubectl exec -it app-1 -c db -- bash
root@app-1:/#
In contrast, after connecting to the web container, we can observe that the liveness probe restarts this container:
$ kubectl exec -it app-1 -c web -- bash
root@app-1:/# command terminated with exit code 137

A Pod is indeed the smallest deployable unit in Kubernetes, but that does not mean it is in fact "empty" without a container.
In order to spawn a Pod, and with it the namespaces that further containers attach to, a very small container is created using the pause image.
This sandbox container is used to allocate an IP that is then used for the Pod.
Afterwards, the init containers and the regular containers declared for the Pod are started.
If the liveness probe fails, the container is restarted; the Pod survives this. That is even important: you might want to get the logs of the crashed/restarted container afterwards, which would not be possible if the Pod were destroyed and recreated.
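You can observe that sandbox container directly on the node. A rough sketch, assuming the app-1 Pod from above runs on a node with the Docker runtime (with containerd or CRI-O, crictl pods serves the same purpose):

$ docker ps --filter "name=k8s_POD_app-1"

The k8s_POD_* container listed there is the pause container; it stays up while the web container is killed and restarted, which is why the Pod keeps its IP.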

Related

How to resolve ERROR: epmd error for host nxdomain (non-existing domain)?

I am trying to set up the RabbitMQ Operator and a RabbitMQ Cluster on a bare-metal K8S cluster using this link.
The K8S cluster has 1 master and 1 worker node.
RabbitMQ Cluster pod log:
[root@re-ctrl01 containers]# kubectl logs definition-server-0 -n rabbitmq-system
BOOT FAILED (Tailored output)
===========
ERROR: epmd error for host definition-server-0.definition-nodes.rabbitmq-system: nxdomain (non-existing domain)
11:51:13.733 [error] Supervisor rabbit_prelaunch_sup had child prelaunch started with rabbit_prelaunch:run_prelaunch_first_phase() at undefined exit with reason {epmd_error,"definition-server-0.definition-nodes.rabbitmq-system",nxdomain} in context start_error. Crash dump is being written to: erl_crash.dump...
[root@re-ctrl01 containers]# kubectl describe pod definition-server-0 -n rabbitmq-system
Name:       definition-server-0
Namespace:  rabbitmq-system
Priority:   0
Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  44s               default-scheduler  Successfully assigned rabbitmq-system/definition-server-0 to re-ctrl01.local
  Normal   Pulled     43s               kubelet            Container image "rabbitmq:3.8.16-management" already present on machine
  Normal   Created    43s               kubelet            Created container setup-container
  Normal   Started    43s               kubelet            Started container setup-container
  Normal   Pulled     42s               kubelet            Container image "rabbitmq:3.8.16-management" already present on machine
  Normal   Created    42s               kubelet            Created container rabbitmq
  Normal   Started    42s               kubelet            Started container rabbitmq
  Warning  Unhealthy  4s (x3 over 24s)  kubelet            Readiness probe failed: dial tcp 10.244.0.xxx:5672: connect: connection refused
I added the following entries to the /etc/hosts file of the worker node because I am NOT sure whether the entry has to be added to the master or the worker:
[root@re-worker01 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
127.0.0.1 re-worker01.local re-worker01 definition-server-0.definition-nodes.rabbitmq-system
I have been stuck on this issue for almost 2 days. I googled and found similar issues, but none of them resolved mine.
I see multiple issues in the pod logs and the describe output, and I am unable to find the root cause:
Where can I find the erl_crash.dump file on K8S?
Is this really a hostname-related issue?
10.244.0.xxx:5672: connect: connection refused - Is this because of 'epmd' or something else?
I managed to resolve the issue after spending a lot of time on it.
I added the host definition-server-0.definition-nodes.rabbitmq-system to the /etc/hosts file of the RabbitMQ Cluster pod using hostAliases.
The YAML to add hostAliases is given below:
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: definition
  namespace: rabbitmq-system
spec:
  replicas: 1
  override:
    statefulSet:
      spec:
        template:
          spec:
            containers: []
            hostAliases:
            - ip: "127.0.0.1"
              hostnames:
              - "definition-server-0"
              - "definition-server-0.definition-nodes.rabbitmq-system"

Are healthchecks defined per container or per pod in Kubernetes?

In the Google Cloud blog they say that if the Readiness probe fails, traffic will not be routed to the pod, and if the Liveness probe fails, the pod will be restarted.
In the Kubernetes docs they say that the kubelet uses Liveness probes to know if a container needs to be restarted, and Readiness probes are used to check if a container is ready to start accepting requests from clients.
My current understanding is that a pod is considered Ready and Alive when all of its containers are ready. This in turn implies that if 1 out of 3 containers in a pod fails, then the entire pod will be considered as failed (not Ready / not Alive). And if 1 out of 3 containers was restarted, then it means that the entire pod was restarted. Is this correct?
A Pod is ready only when all of its containers are ready.
When a Pod is ready, it should be added to the load balancing pools of all matching Services because it means that this Pod is able to serve requests.
As you can see in the Readiness Probe documentation:
The kubelet uses readiness probes to know when a container is ready to start accepting traffic.
Using a readiness probe can ensure that traffic does not reach a container that is not ready for it.
Using a liveness probe can ensure that a container is restarted when it fails (the kubelet will kill and restart only the specific container).
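You can see the readiness side of this by looking at the endpoints of a matching Service: only Pods whose containers are all ready are listed. A quick, illustrative check, assuming a Service named web-svc that selects the Pod (the Service is not part of the original example):

$ kubectl get endpoints web-svc

If any container in the Pod becomes not ready, the Pod's address disappears from the ENDPOINTS column until it is ready again.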
Additionally, to answer your last question, I will use an example:
And if 1 out of 3 containers was restarted, then it means that the entire pod was restarted. Is this correct?
Let's have a simple Pod manifest file with livenessProbe for one container that always fails:
---
# web-app.yml
apiVersion: v1
kind: Pod
metadata:
labels:
run: web-app
name: web-app
spec:
containers:
- image: nginx
name: web
- image: redis
name: failed-container
livenessProbe:
httpGet:
path: /healthz # I don't have this endpoint configured so it will always be failed.
port: 8080
After creating the web-app Pod and waiting some time, we can check how the livenessProbe works:
$ kubectl describe pod web-app
Name:         web-app
Namespace:    default
Containers:
  web:
    ...
    State:          Running
      Started:      Tue, 09 Mar 2021 09:56:59 +0000
    Ready:          True
    Restart Count:  0
    ...
  failed-container:
    ...
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
    Ready:          False
    Restart Count:  7
    ...
Events:
  Type    Reason   Age                  From     Message
  ----    ------   ----                 ----     -------
  ...
  Normal  Killing  9m40s (x2 over 10m)  kubelet  Container failed-container failed liveness probe, will be restarted
  ...
As you can see, only the failed-container container was restarted (Restart Count: 7).
More information can be found in the Liveness, Readiness and Startup Probes documentation.
For Pods with multiple containers, there is an option to restart only a single container, provided you have the required access.
Command:
kubectl exec POD_NAME -c CONTAINER_NAME "Command used for restarting the container"
This way the Pod is not deleted and k8s doesn't need to recreate it.
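As a concrete sketch with the web-app Pod above (assuming the image has a shell and its main process, PID 1, handles SIGTERM, as nginx does): killing PID 1 makes only that container exit, and the kubelet restarts it according to the Pod's restartPolicy while the other containers keep running.

$ kubectl exec web-app -c web -- /bin/sh -c 'kill 1'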

Kubernetes Pod's containers not running when using sh commands

The Pod's containers are not ready and get stuck in the Waiting state over and over, every single time, after they run sh commands (/bin/sh as well).
For example, all the pod containers shown at https://v1-17.docs.kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/#define-container-environment-variables-with-data-from-multiple-configmaps just go to the "Completed" status after executing the sh command, or, if I set "restartPolicy: Always", they end up in the "Waiting" state with reason CrashLoopBackOff.
(Containers work fine if I do not set any command on them.)
If I use the sh command within the container, after creating it I can see using "kubectl logs" that the env variable was set correctly.
The expected behaviour is to get the pod's containers running after they execute the sh command.
I cannot find references regarding this particular problem and I need a little help if possible, thank you very much in advance!
Please note that I tried different images; the problem happens either way.
Environment: Kubernetes v1.17.1 on a QEMU VM
YAML:
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
data:
  how: very
---
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
  - name: test-container
    image: nginx
    ports:
    - containerPort: 88
    command: [ "/bin/sh", "-c", "env" ]
    env:
    # Define the environment variable
    - name: SPECIAL_LEVEL_KEY
      valueFrom:
        configMapKeyRef:
          # The ConfigMap containing the value you want to assign to SPECIAL_LEVEL_KEY
          name: special-config
          # Specify the key associated with the value
          key: how
  restartPolicy: Always
describe pod:
kubectl describe pod dapi-test-pod
Name:         dapi-test-pod
Namespace:    default
Priority:     0
Node:         kw1/10.1.10.31
Start Time:   Thu, 21 May 2020 01:02:17 +0000
Labels:       <none>
Annotations:  cni.projectcalico.org/podIP: 192.168.159.83/32
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"dapi-test-pod","namespace":"default"},"spec":{"containers":[{"command...
Status:       Running
IP:           192.168.159.83
IPs:
  IP:  192.168.159.83
Containers:
  test-container:
    Container ID:  docker://63040ec4d0a3e78639d831c26939f272b19f21574069c639c7bd4c89bb1328de
    Image:         nginx
    Image ID:      docker-pullable://nginx@sha256:30dfa439718a17baafefadf16c5e7c9d0a1cde97b4fd84f63b69e13513be7097
    Port:          88/TCP
    Host Port:     0/TCP
    Command:
      /bin/sh
      -c
      env
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 21 May 2020 01:13:21 +0000
      Finished:     Thu, 21 May 2020 01:13:21 +0000
    Ready:          False
    Restart Count:  7
    Environment:
      SPECIAL_LEVEL_KEY:  <set to the key 'how' of config map 'special-config'>  Optional: false
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-zqbsw (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-zqbsw:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-zqbsw
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                   From          Message
  ----     ------     ----                  ----          -------
  Normal   Scheduled  13m                   default-scheduler  Successfully assigned default/dapi-test-pod to kw1
  Normal   Pulling    12m (x4 over 13m)     kubelet, kw1  Pulling image "nginx"
  Normal   Pulled     12m (x4 over 13m)     kubelet, kw1  Successfully pulled image "nginx"
  Normal   Created    12m (x4 over 13m)     kubelet, kw1  Created container test-container
  Normal   Started    12m (x4 over 13m)     kubelet, kw1  Started container test-container
  Warning  BackOff    3m16s (x49 over 13m)  kubelet, kw1  Back-off restarting failed container
You can use this manifest. The command ["/bin/sh", "-c"] says "run a shell and execute the following instructions"; the args are then passed as commands to the shell. Multiline args keep it simple and easy to read. Your pod will display its environment variables and also start the NGINX process without stopping:
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
data:
  how: very
---
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
  - name: test-container
    image: nginx
    ports:
    - containerPort: 88
    command: ["/bin/sh", "-c"]
    args:
      - env;
        nginx -g 'daemon off;';
    env:
    # Define the environment variable
    - name: SPECIAL_LEVEL_KEY
      valueFrom:
        configMapKeyRef:
          # The ConfigMap containing the value you want to assign to SPECIAL_LEVEL_KEY
          name: special-config
          # Specify the key associated with the value
          key: how
  restartPolicy: Always
This happens because the process running in the container has completed, so the container shuts down and Kubernetes marks the pod as completed.
Whether the command is defined in the docker image as part of CMD, or you've added your own command as you have done, the container shuts down after the command completes. It's the same reason why, when you run Ubuntu using plain docker, it starts up and then shuts down directly afterwards.
For pods, and their underlying docker containers, to continue running, you need to start a process that keeps running. In your case, the env command completes right away.
If you set the pod to restart Always, then Kubernetes will keep trying to restart it until it has reached its back-off threshold.
One-off commands like the one you're running are useful for utility-type things, i.e. do one thing then get rid of the pod.
For example:
kubectl run tester --generator run-pod/v1 --image alpine --restart Never --rm -it -- /bin/sh -c env
To run something longer, start a process that continues running.
For example:
kubectl run tester --generator run-pod/v1 --image alpine -- /bin/sh -c "sleep 30"
That command will run for 30 seconds, and so the pod will also run for 30 seconds. It will also use the default restart policy of Always. So after 30 seconds the process completes, Kubernetes marks the pod as complete, and then restarts it to do the same things again.
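If you want to watch that cycle happen, one simple (illustrative) way is to watch the pod, assuming it is named tester as in the example above:

$ kubectl get pod tester --watch

You should see the status flip between Running and Completed (and eventually CrashLoopBackOff as the back-off grows) roughly every 30 seconds.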
Generally pods will start a long-running process, like a web server. For Kubernetes to know whether that pod is healthy, so it can do its high-availability magic and restart it if it crashes, it can use readiness and liveness probes.

Kubernetes , liveness probe is failing but pod in Running state

I'm trying to do a blue-green deployment with Kubernetes. I have followed https://www.ianlewis.org/en/bluegreen-deployments-kubernetes and that part works fine.
I have added a liveness probe to execute a health check:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: flask-1.3
spec:
  replicas: 2
  template:
    metadata:
      labels:
        name: app
        version: "1.3"
    spec:
      containers:
      - name: appflask
        image: 192.168.99.100:5000/fapp:1.2
        livenessProbe:
          httpGet:
            path: /index2
            port: 5000
          failureThreshold: 1
          periodSeconds: 1
          initialDelaySeconds: 1
        ports:
        - name: http
          containerPort: 5000
the path "index2" doesnt exist , I want to test a failed deployment. the problem is when I execute:
kubectl get pods -o wide
for some seconds one of the pods is in state "RUNNING"
NAME                         READY   STATUS             RESTARTS   AGE     IP             NODE    NOMINATED NODE   READINESS GATES
flask-1.3-6c644b8648-878qz   0/1     CrashLoopBackOff   6          6m19s   10.244.1.250   node    <none>           <none>
flask-1.3-6c644b8648-t6qhv   0/1     CrashLoopBackOff   7          6m19s   10.244.2.230   nod2e   <none>           <none>
After some seconds, one pod is Running even though the liveness probe is always failing:
NAME                         READY   STATUS             RESTARTS   AGE     IP             NODE    NOMINATED NODE   READINESS GATES
flask-1.3-6c644b8648-878qz   1/1     Running            7          6m20s   10.244.1.250   node    <none>           <none>
flask-1.3-6c644b8648-t6qhv   0/1     CrashLoopBackOff   7          6m20s   10.244.2.230   nod2e   <none>           <none>
And after Running it goes back to CrashLoopBackOff. The question is: why does it stay Running for some seconds if the liveness probe always fails?
Thanks in advance.
You should be looking at Readiness probe instead, or both of them.
Readiness and liveness probes can be used in parallel for the same container. Using both can ensure that traffic does not reach a container that is not ready for it, and that containers are restarted when they fail.
Liveness probe checks if your application is in a healthy state in your already running pod.
A readiness probe actually checks whether your pod is ready to receive traffic. Thus, if there is no /index2 endpoint, the pod will never be reported as Ready.
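A minimal sketch of adding a readinessProbe next to the existing livenessProbe, reusing the same assumed /index2 path and port from your manifest:

      containers:
      - name: appflask
        image: 192.168.99.100:5000/fapp:1.2
        readinessProbe:
          httpGet:
            path: /index2
            port: 5000
          periodSeconds: 1
        livenessProbe:
          httpGet:
            path: /index2
            port: 5000
          failureThreshold: 1
          periodSeconds: 1
          initialDelaySeconds: 1

With both probes, the pod stays at READY 0/1 (and receives no Service traffic) while the failing liveness probe keeps restarting the container.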
What's happening to you is this:
When you first start the pod (or the container), it starts and gets into the "Running" state. Now, if there is no process running in the container, or if there is a non-continuous process (say sleep 100), then when this process finishes, Kubernetes is going to consider this pod completed.
Now, since you have a Deployment, which is going to keep a certain number of replicas running, it re-creates the pod. But again, there is no long-running process, so again it gets into Completed. This is an infinite loop.
If you want to keep the pod up and running, even though no process keeps running inside it, you can pass the parameter tty: true in your YAML file:
apiVersion: v1
kind: Pod
metadata:
  name: debian
  labels:
    app: debian
spec:
  containers:
  - name: debian
    image: debian
    tty: true # this line will keep the terminal open
If you run the pod above without tty: true, the same thing is going to happen.
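An alternative sketch, not from the original answer, if you would rather not rely on tty: true: give the container an explicit long-running command so PID 1 never exits.

spec:
  containers:
  - name: debian
    image: debian
    command: ["sleep", "infinity"]  # keeps the main process alive so the container stays Running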

Kubernetes Keeps Restarting Pods of StatefulSet in Minikube With "Need to kill pod"

Minikube version v0.24.1
kubernetes version 1.8.0
The problem that I am facing is that I have several statefulsets created in minikube, each with one pod.
Sometimes when I start up minikube, my pods will start up initially and then keep being restarted by kubernetes. They go from the creating-container state, to running, to terminating, over and over.
Now, I've seen kubernetes kill and restart things before when it detects disk pressure, memory pressure, or some other condition like that, but that's not the case here: these flags are not raised and the only message in the pod's event log is "Need to kill pod".
What's most confusing is that this issue doesn't happen all the time, and I'm not sure how to trigger it. My minikube setup will work for a week or more without this happening then one day I'll start minikube up and the pods for my statefulsets just keep restarting. So far the only workaround I've found is to delete my minikube instance and set it up again from scratch, but obviously this is not ideal.
Seen here is a sample of one of the statefulsets whose pod keeps getting restarted. Seen in the logs kubernetes is deleting the pod and starting it again. This happens repeatedly. I'm unable to figure out why it keeps doing that and why it only gets into this state sometimes.
$ kubectl describe statefulsets mongo --namespace=storage
Name:               mongo
Namespace:          storage
CreationTimestamp:  Mon, 08 Jan 2018 16:11:39 -0600
Selector:           environment=test,role=mongo
Labels:             name=mongo
Annotations:        kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"apps/v1beta1","kind":"StatefulSet","metadata":{"annotations":{},"labels":{"name":"mongo"},"name":"mongo","namespace":"storage"},"...
Replicas:           1 desired | 1 total
Pods Status:        1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  environment=test
           role=mongo
  Containers:
   mongo:
    Image:  mongo:3.4.10-jessie
    Port:   27017/TCP
    Command:
      mongod
      --replSet
      rs0
      --smallfiles
      --noprealloc
    Environment:  <none>
    Mounts:
      /data/db from mongo-persistent-storage (rw)
   mongo-sidecar:
    Image:  cvallance/mongo-k8s-sidecar
    Port:   <none>
    Environment:
      MONGO_SIDECAR_POD_LABELS:       role=mongo,environment=test
      KUBERNETES_MONGO_SERVICE_NAME:  mongo
    Mounts:  <none>
  Volumes:   <none>
Volume Claims:
  Name:          mongo-persistent-storage
  StorageClass:
  Labels:        <none>
  Annotations:   volume.alpha.kubernetes.io/storage-class=default
  Capacity:      5Gi
  Access Modes:  [ReadWriteOnce]
Events:
  Type    Reason            Age                From         Message
  ----    ------            ----               ----         -------
  Normal  SuccessfulDelete  23m (x46 over 1h)  statefulset  delete Pod mongo-0 in StatefulSet mongo successful
  Normal  SuccessfulCreate  3m (x62 over 1h)   statefulset  create Pod mongo-0 in StatefulSet mongo successful
After some more digging, there seems to have been a bug that can affect statefulsets by creating multiple controllers for the same statefulset:
https://github.com/kubernetes/kubernetes/issues/56355
This issue seems to have been fixed; the fix was backported to kubernetes version 1.8 and included in version 1.9, but minikube does not yet ship the fixed version. A workaround, if your system enters this state, is to list the controller revisions like so:
$ kubectl get controllerrevisions --namespace=storage
NAME               CONTROLLER          REVISION   AGE
mongo-68bd5cbcc6   StatefulSet/mongo   1          19h
mongo-68bd5cbcc7   StatefulSet/mongo   1          7d
and delete the duplicate controller revisions for each statefulset:
$ kubectl delete controllerrevisions mongo-68bd5cbcc6 --namespace=storage
Or simply use kubernetes version 1.9 or above, which includes this bug fix.
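If you are unsure which of the duplicate revisions is safe to delete, one way (assuming the mongo StatefulSet above) is to check which revision the StatefulSet currently reports in its status, and keep that one:

$ kubectl get statefulset mongo --namespace=storage -o jsonpath='{.status.currentRevision}{"\n"}{.status.updateRevision}{"\n"}'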