preStop hook doesn't get executed - kubernetes

I am testing lifecycle hooks. post-start works pretty well, but I think pre-stop never gets executed. There is another answer, but it is not working, and actually, if it did work, it would contradict the k8s documentation. So, from the docs:
PreStop
This hook is called immediately before a container is terminated due
to an API request or management event such as liveness probe failure,
preemption, resource contention and others. A call to the preStop hook
fails if the container is already in terminated or completed state.
So, the API request makes me think I can simply do kubectl delete pod POD, and I am good.
More from the docs (pod shutdown process):
1.- User sends command to delete Pod, with default grace period (30s)
2.- The Pod in the API server is updated with the time beyond which the Pod is considered “dead” along with the grace period.
3.- Pod shows up as “Terminating” when listed in client commands
4.- (simultaneous with 3) When the Kubelet sees that a Pod has been marked as terminating because the time in 2 has been set, it begins the pod shutdown process.
4.1.- If one of the Pod’s containers has defined a preStop hook, it is invoked inside of the container. If the preStop hook is still running after the grace period expires, step 2 is then invoked with a small (2 second) extended grace period.
4.2.- The container is sent the TERM signal. Note that not all containers in the Pod will receive the TERM signal at the same time and may each require a preStop hook if the order in which they shut down matters.
...
So, since the pod goes into Terminating when you do kubectl delete pod POD, I assume I can do it.
According to the other answer, I can't do this, and the way to go is a rolling update. Well, I tried it in all possible ways and it didn't work either.
My tests:
I have a deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-deploy
spec:
  replicas: 1
  template:
    metadata:
      name: lifecycle-demo
      labels:
        lifecycle: demo
    spec:
      containers:
      - name: nginx
        image: nginx
        lifecycle:
          postStart:
            exec:
              command:
                - /bin/sh
                - -c
                - echo "Hello at" `date` > /usr/share/post-start
          preStop:
            exec:
              command:
                - /bin/sh"
                - -c
                - echo "Goodbye at" `date` > /usr/share/pre-stop
        volumeMounts:
        - name: hooks
          mountPath: /usr/share/
      volumes:
      - name: hooks
        hostPath:
          path: /usr/hooks/
I expect the pre-stop and post-start files to be created in /usr/hooks/, on the host (node where the pod is running). post-start is there, but pre-stop, never.
I tried kubectl delete pod POD, and it didn't work.
I tried kubectl replace -f deploy.yaml, with a different image, and when I do kubectl get rs, I can see the new replicaSet created, but the file isn't there.
I tried kubectl set image ..., and again, I can see the new replicaSet created, but the file isn't there.
I even tried putting them in completely separate volumes, as I thought that maybe, when I kill the pod and it gets re-created, it re-creates the folder where the files should be created, deleting the folder along with the pre-stop file. That was not the case.
Note: It always gets re-created on the same node. I made sure of that.
What I have not tried is to bomb the container and break it by setting low CPU limit, but that's not what I need.
Any idea what are the circumstances under which preStop hook would get triggered?

Posting this as community wiki for a better visibility.
There is a typo in the second "/bin/sh", the one for preStop: an extra double quote ("). Kubernetes still let me create the deployment, but the typo was the reason the file was not being created. All works fine now.
The exact point where the issue lay was here:
preStop:
  exec:
    command:
      - /bin/sh" # <- this quotation
      - -c
      - echo "Goodbye at" `date` > /usr/share/pre-stop
To be correct, it should look like this:
preStop:
  exec:
    command:
      - /bin/sh
      - -c
      - echo "Goodbye at" `date` > /usr/share/pre-stop
At the time of writing this community wiki post, this Deployment manifest is outdated. The following changes were needed to be able to run it:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: good-deployment
spec:
  selector:
    matchLabels:
      lifecycle: demo
  replicas: 1
  template:
    metadata:
      labels:
        lifecycle: demo
    spec:
      containers:
      - name: nginx
        image: nginx
        lifecycle:
          postStart:
            exec:
              command:
                - /bin/sh
                - -c
                - echo "Hello at" `date` > /usr/share/post-start
          preStop:
            exec:
              command:
                - /bin/sh
                - -c
                - echo "Goodbye at" `date` > /usr/share/pre-stop
        volumeMounts:
        - name: hooks
          mountPath: /usr/share/
      volumes:
      - name: hooks
        hostPath:
          path: /usr/hooks/
The changes were as follows:
1. apiVersion
+--------------------------------+---------------------+
| Old | New |
+--------------------------------+---------------------+
| apiVersion: extensions/v1beta1 | apiVersion: apps/v1 |
+--------------------------------+---------------------+
StackOverflow answer for more reference:
Stackoverflow.com: Questions: No matches for kind “Deployment” in version extensions/v1beta1
2. selector
Added selector section under spec:
spec:
  selector:
    matchLabels:
      lifecycle: demo
Additional links with reference:
What is spec - selector - matchLabels used for while creating a deployment?
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Deployment: Selector

Posting this as community wiki for a better visibility.
When a pod should be terminated:
A SIGTERM signal is sent to the main process (PID 1) in each container, and a “grace period” countdown starts (defaults to 30 seconds for k8s pod - see below to change it).
Upon receipt of the SIGTERM, each container should start a graceful shutdown of the running application and exit.
If a container doesn’t terminate within the grace period, a SIGKILL signal will be sent and the container forcibly terminated.
For a detailed explanation, please see:
Kubernetes: Termination of pods
Kubernetes: Pods lifecycle hooks and termination notice
Kubernetes: Container lifecycle hooks
Always confirm this:
Check whether preStop takes more than 30 seconds to run (longer than the default grace period). If it does, increase terminationGracePeriodSeconds to more than 30 seconds, for example 60. Refer to this for more info about terminationGracePeriodSeconds.
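As a minimal sketch of that advice, the Pod below raises the grace period to 60 seconds so that a slow preStop hook has time to finish. The Pod name and the `sleep 40` command are placeholders standing in for real cleanup; they are not taken from the question.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-shutdown-demo        # hypothetical name
spec:
  # Time budget shared by the preStop hook and SIGTERM handling (default: 30).
  terminationGracePeriodSeconds: 60
  containers:
  - name: nginx
    image: nginx
    lifecycle:
      preStop:
        exec:
          # Placeholder for cleanup that takes longer than the default 30s.
          command: ["/bin/sh", "-c", "sleep 40"]
```

Because terminationGracePeriodSeconds is set at the Pod level, it applies to every container in the Pod.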

I know it's too late to answer, but it is worth adding here.
I spent a full day figuring out this preStop in K8s.
K8s does not print any logs from the PreStop stage. PreStop is part of the lifecycle, also called a hook.
Generally, hook and probe (liveness and readiness) logs are not printed in kubectl logs.
Read this issue to understand it fully.
But there is an indirect way to print logs to the kubectl logs output. Follow the last comment in the above link.
Adding it here also.
lifecycle:
  postStart:
    exec:
      command:
        - /bin/sh
        - -c
        - sleep 10; echo 'hello from postStart hook' >> /proc/1/fd/1
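The same trick should carry over to preStop, which is the hook the original question is about. This is only a sketch: it assumes the stdout of the container's main process (PID 1) is what ends up in the container log, and that the hook runs as a user allowed to write to it. The message text is made up.

```yaml
lifecycle:
  preStop:
    exec:
      command:
        - /bin/sh
        - -c
        # Append to the stdout of PID 1 so the line shows up in `kubectl logs`.
        - echo 'hello from preStop hook' >> /proc/1/fd/1
```

Since the container is about to go away, it may help to follow the logs (kubectl logs -f) while deleting the Pod in order to catch the line.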

Related

Know when a Pod was killed after exceeding its termination grace period

The scenario is as follows:
Our pods have a terminationGracePeriodSeconds of 60, which gives them ~60 seconds to do any necessary cleanup before Kubernetes kills them ungracefully. In the majority of cases the cleanup happens well within the 60 seconds. But every now and then we (manually) observe pods that didn't complete their graceful termination and were killed by Kubernetes.
How does one monitor these situations? When I try replicating this scenario with a simple linux image and sleep, I don't see Kubernetes logging an additional event after the "Killed" one. Without an additional event this is impossible to monitor from the infrastructure side.
You can use container hooks and then monitor the events of those hooks. For example, the preStop hook, which is called when a Pod is being destroyed, will fire a FailedPreStopHook event if it cannot complete its work within terminationGracePeriodSeconds.
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: lifecycle-demo-container
    image: nginx
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
      preStop:
        exec:
          command: ["/bin/sh","-c","nginx -s quit; while killall -0 nginx; do sleep 1; done"]
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/

Run containers which intentionally exit periodically

How can I have Kubernetes automatically restart a container which purposefully exits in order to get new data from environment variables?
I have a container running on a Kubernetes cluster which operates in the following fashion:
Container starts, polls for work
If it receives a task, it does some work
It polls for work again, until ...
.. the container has been running for over a certain period of time, after which it exits instead of polling for more work.
It needs to be continually restarted, as it uses environment variables which are populated by Kubernetes secrets which are periodically refreshed by another process.
I've tried a Deployment, but it doesn't seem like the right fit as I get CrashLoopBackOff status, which means the worker is scheduled less and less often.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-fonky-worker
  labels:
    app: my-fonky-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-fonky-worker
  template:
    metadata:
      labels:
        app: my-fonky-worker
    spec:
      containers:
      - name: my-fonky-worker-container
        image: my-fonky-worker:latest
        env:
        - name: NOTSOSECRETSTUFF
          value: cats_are_great
        - name: SECRETSTUFF
          valueFrom:
            secretKeyRef:
              name: secret-name
              key: secret-key
I've also tried a CronJob, but that seems a bit hacky as it could mean that the container is left in the stopped state for several seconds.
As @Josh said, you need to exit with exit code 0, otherwise it will be treated as a failed container. Here is the reference. According to the first example there ("Pod is running and has one Container. Container exits with success."), if your restartPolicy is set to Always (which is the default, by the way), the container will be restarted. The Pod status still shows Running, but if you check the pod's logs you can see the restart of the container.
It needs to be continually restarted, as it uses environment variables which are populated by Kubernetes secrets which are periodically refreshed by another process.
I would take a different approach to this. I would mount the ConfigMap as a volume, as explained here; this automatically refreshes the mounted ConfigMap data (Ref). Note: take into account the "kubelet sync period (1 minute by default) + ttl of ConfigMaps cache (1 minute by default) in kubelet" when estimating how quickly ConfigMap data refreshes in the Pod.
What I see as a solution for this would be to run your container as a CronJob, but don't use startingDeadlineSeconds as your container killer.
It runs on its schedule.
In your container you can have it poll for work N times.
After N times it exits 0.
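A rough sketch of that CronJob idea follows. The schedule, the loop count of 10, and the /app/poll-once command are all hypothetical placeholders; only the container name, image, and Secret reference are taken from the question. Each scheduled run starts a fresh Pod, so the environment variable is read from the Secret again every time.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-fonky-worker
spec:
  schedule: "*/5 * * * *"          # assumed: start a run every 5 minutes
  concurrencyPolicy: Forbid        # do not start a run while one is active
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: my-fonky-worker-container
            image: my-fonky-worker:latest
            env:
            - name: SECRETSTUFF
              valueFrom:
                secretKeyRef:
                  name: secret-name
                  key: secret-key
            command:
            - /bin/sh
            - -c
            - |
              # Poll a fixed number of times (N = 10 here), then exit 0.
              i=0
              while [ "$i" -lt 10 ]; do
                /app/poll-once || exit 1   # hypothetical polling command
                i=$((i + 1))
              done
              exit 0
```

With concurrencyPolicy set to Forbid, a scheduled run is skipped while the previous one is still working, which avoids overlapping workers at the cost of a possible gap between runs.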
If I understood correctly in your example there are 2 problems:
Restarting container
Updating secret values
In order to keep your secrets up to date, you should consider using secrets as described in Amit Kumar Gupta's comment and mounting them as a volume instead of as environment variables; here is an example.
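A minimal sketch of mounting the Secret as a volume, reusing the container and Secret names from the question. The volume name and mount path are assumptions chosen for illustration. This is the Pod template spec of the Deployment, not a complete manifest.

```yaml
spec:
  containers:
  - name: my-fonky-worker-container
    image: my-fonky-worker:latest
    volumeMounts:
    - name: secret-volume          # hypothetical volume name
      mountPath: /etc/secrets      # hypothetical mount path
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: secret-name
```

Each key of the Secret appears as a file under the mount path (here /etc/secrets/secret-key), so the application has to re-read the file to pick up a refreshed value instead of reading an environment variable once at startup.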
As for the second problem, restarting the container, it depends on the exit code, as described by garlicFrancium.
From another point of view, you can use an init container that waits for new tasks and a main container that processes these tasks according to your requirements, or create a job scheduler.
# apps/v1 replaces extensions/v1beta1, which is no longer served for Deployments
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: complete
  name: complete
spec:
  replicas: 1
  selector:
    matchLabels:
      app: complete
  template:
    metadata:
      labels:
        app: complete
    spec:
      hostname: c1
      containers:
      - name: complete
        command:
        - "bash"
        args:
        - "-c"
        - "wa=$(shuf -i 15-30 -n 1)&& echo $wa && sleep $wa"
        image: ubuntu
        imagePullPolicy: IfNotPresent
        resources: {}
      initContainers:
      - name: wait-for
        image: ubuntu
        command: ['bash', '-c', 'sleep 30']
      restartPolicy: Always
Please note:
When a secret being already consumed in a volume is updated, projected keys are eventually updated as well. Kubelet is checking whether the mounted secret is fresh on every periodic sync. However, it is using its local cache for getting the current value of the Secret.
The type of the cache is configurable using the ConfigMapAndSecretChangeDetectionStrategy field in the KubeletConfiguration struct. It can be either propagated via watch (default), ttl-based, or simply redirecting all requests directly to kube-apiserver. As a result, the total delay from the moment when the Secret is updated to the moment when new keys are projected to the Pod can be as long as kubelet sync period + cache propagation delay, where cache propagation delay depends on the chosen cache type (it equals watch propagation delay, ttl of cache, or zero, correspondingly).
A container using a Secret as a subPath volume mount will not receive Secret updates.
Please refer also to:
Fine Parallel Processing Using a Work Queue

Pod failure and recovery events

We are listening to multiple mailboxes on a single pod, but if this pod goes down for some reason, we need the other pod that is up to listen to these mailboxes in order to keep receiving emails.
I would like to know whether it is possible to detect a pod going down as an event and trigger a script to perform the above action on the fly.
Approach 1:
kubernetes life cycle handler hook
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
  - name: lifecycle-demo-container
    image: nginx
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
      preStop:
        exec:
          command: ["/bin/sh","-c","nginx -s quit; while killall -0 nginx; do sleep 1; done"]
https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
Approach 2:
Write a script which checks the health of the pod every x seconds. When 3 consecutive liveness checks fail (the default failureThreshold), Kubernetes restarts the container. So, in your script, if 3 consecutive REST calls to the health endpoint fail, you can treat the pod as down and trigger your event.
Approach 3:
Maintain 2 replicas. The problem could be two pods processing the same mail; you can avoid this if you use Kafka.

Kubernetes Rolling Updates: Respect pod readiness before updating

My deployment's pods are doing work that should not be interrupted. Is it possible for K8s to poll an endpoint about update readiness, or to inform my pod that it is about to go down, so it can get its affairs in order and then declare itself ready for an update?
Ideal process:
An updated pod is ready to replace an old one
A request is sent to the old pod by k8s, telling it that it is about to be updated
Old pod gets polled about update readiness
Old pod gets its affairs in order (e.g. stop receiving new tasks, finishes existing tasks)
Old pod says it is ready
Old pod gets replaced
You could perhaps look into using container lifecycle hooks, specifically preStop in this case.
apiVersion: v1
kind: Pod
metadata:
  name: your-pod
spec:
  containers:
  - name: your-awesome-image
    image: image-name
    lifecycle:
      postStart:
        exec:
          command: ["/bin/sh", "my-app", "-start"]
      preStop:
        exec:
          # specifically by adding the cmd you want your image to run here
          command: ["/bin/sh","my-app","-stop"]
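Since the question is about a Deployment, here is a sketch of how the hook might be combined with rolling-update settings so that a new pod is ready before an old one is asked to stop. The Deployment name, the app label, the 300-second grace period, and the `my-app -stop` command are placeholders, and the sketch assumes that command blocks until in-flight tasks have finished.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-deployment            # hypothetical name
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0            # keep old pods until replacements are ready
      maxSurge: 1                  # bring up one extra pod at a time
  selector:
    matchLabels:
      app: your-app                # hypothetical label
  template:
    metadata:
      labels:
        app: your-app
    spec:
      # Upper bound on how long preStop plus SIGTERM handling may take.
      terminationGracePeriodSeconds: 300
      containers:
      - name: your-awesome-image
        image: image-name
        lifecycle:
          preStop:
            exec:
              # Assumed to stop accepting tasks and wait for running ones.
              command: ["/bin/sh", "-c", "my-app -stop"]
```

Kubernetes does not poll the old pod for readiness to be updated; the preStop hook and the grace period are the window in which the pod can finish its work.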

Not able to see Pod when I create a Job

When I try to create Deployment as Type Job, it's not pulling any image.
Below is .yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: copyartifacts
spec:
  backoffLimit: 1
  template:
    metadata:
      name: copyartifacts
    spec:
      restartPolicy: "Never"
      volumes:
      - name: sharedvolume
        persistentVolumeClaim:
          claimName: shared-pvc
      - name: dockersocket
        hostPath:
          path: /var/run/docker.sock
      containers:
      - name: copyartifacts
        image: alpine:3.7
        imagePullPolicy: Always
        command: ["sh", "-c", "ls -l /shared; rm -rf /shared/*; ls -l /shared; while [ ! -d /shared/artifacts ]; do echo Waiting for artifacts to be copied; sleep 2; done; sleep 10; ls -l /shared/artifacts; "]
        volumeMounts:
        - mountPath: /shared
          name: sharedvolume
Can you please guide here?
Regards,
Vikas
There could be two possible reasons for not seeing the pod:
The pod hasn't been created yet.
The pod has completed its task and terminated before you noticed.
1. Pod hasn't been created:
If the pod hasn't been created yet, you have to find out why the job failed to create it. You can view the job's events to see whether there are any failure events. Use the following command to describe a job:
kubectl describe job <job-name> -n <namespace>
Then, check the Events: field. There might be some events showing pod creation failure with respective reason.
2. Pod has completed and terminated:
Jobs are used to perform one-time tasks rather than to serve an application that needs to maintain a desired state. When the task is complete, the pod goes to the Completed state and terminates (but is not deleted). If your Job is intended for a task that does not take much time, the pod may terminate after completing the task before you notice.
As the pod is terminated, kubectl get pods will not show that pod. However, you will be able to see the pod using the kubectl get pods -a command, as it hasn't been deleted. Note that newer kubectl versions no longer have the -a flag and list completed pods by default.
You can also describe the job and check for completion or success event.
If you used kind to create the K8s cluster, all the cluster nodes run as Docker containers. If you have rebooted your computer or VM, the cluster (pod) IP addresses may change, leading to failed communication between the cluster nodes. In this case, check the cluster manager logs; they contain an error message. The Job is created, but the pod is not.
Try to re-create the cluster, or change the node configuration for the IP address.