Kubernetes Rolling Updates: Respect pod readiness before updating

My deployment's pods are doing work that should not be interrupted. Is it possible for K8s to poll an endpoint about update readiness, or to inform my pod that it is about to go down, so that it can get its affairs in order and then declare itself ready for an update?
Ideal process:
1. An updated pod is ready to replace an old one
2. A request is sent to the old pod by k8s, telling it that it is about to be updated
3. Old pod gets polled about update readiness
4. Old pod gets its affairs in order (e.g. stops receiving new tasks, finishes existing tasks)
5. Old pod says it is ready
6. Old pod gets replaced

You could perhaps look into using container lifecycle hooks, specifically preStop in this case.
apiVersion: v1
kind: Pod
metadata:
  name: your-pod
spec:
  containers:
  - name: your-awesome-image
    image: image-name
    lifecycle:
      postStart:
        exec:
          # "-c" makes the shell run the string as a command line
          command: ["/bin/sh", "-c", "my-app -start"]
      preStop:
        exec:
          # specifically by adding the cmd you want your image to run here
          command: ["/bin/sh", "-c", "my-app -stop"]
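For the rolling-update case in the question, the hook can be combined with the Deployment's update strategy and a longer grace period. The manifest below is only a sketch under stated assumptions: the name `worker`, the image `example/worker:1.0`, the script `/app/drain.sh` and the 600 second limit are all hypothetical. Kubernetes does not poll the old pod for "update readiness"; instead, the preStop hook blocks termination until it returns or until the grace period runs out.

```yaml
# Sketch only: names, image, script path and timings are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: worker
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # bring up one new pod before an old one is removed
      maxUnavailable: 0  # never drop below the desired replica count
  template:
    metadata:
      labels:
        app: worker
    spec:
      # Upper bound on how long the preStop hook plus shutdown may take.
      terminationGracePeriodSeconds: 600
      containers:
      - name: worker
        image: example/worker:1.0
        lifecycle:
          preStop:
            exec:
              # Hypothetical script: stop taking new tasks, then wait
              # until the tasks already in progress have finished.
              command: ["/bin/sh", "-c", "/app/drain.sh"]
```

If the drain script can take longer than the grace period, the container is killed anyway, so the limit should be chosen with the longest expected task in mind.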

Related

Kubernetes start pods in batches for a node

For some applications, starting or restarting needs more resources than normal running, for example ES or Flink. If a node has network jitter, all pods on that node restart at the same time. When this happens, CPU usage on the node becomes very high, which increases resource contention on the node.
Now I want to start pods in batches on a single node. How can I achieve this?
Kubernetes has auto-healing.
You can let the Pods crash, and Kubernetes will restart them automatically as soon as sufficient memory or other required resources are available.
Alternatively, if you want the deployment to wait and start Pods gradually, one by one,
you can use a sidecar and Pod lifecycle hooks to delay the start of the main container. This approach is not ideal, but it can resolve your issue.
Basic example:
apiVersion: v1
kind: Pod
metadata:
  name: sidecar-starts-first
spec:
  containers:
  - name: sidecar
    image: my-sidecar
    lifecycle:
      postStart:
        exec:
          command:
          - /bin/wait-until-ready.sh
  - name: application
    image: my-application
OR
You can also use an init container to check the health of another service, so that the main container of the Pod starts only once a Pod of that other service is up.
Init container
I would also recommend checking the Priority class: https://kubernetes.io/docs/concepts/scheduling-eviction/pod-priority-preemption/
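A minimal sketch of the init container variant, assuming a Service named `other-service` exists in the same namespace (that name, the Pod name and the 2 second interval are hypothetical). The main container is not started until the init container exits successfully. Note that this gates a Pod on a dependency; by itself it does not limit how many Pods start at once on a node.

```yaml
# Sketch only: Service name, Pod name and timings are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: app-waits-for-dependency
spec:
  initContainers:
  - name: wait-for-other-service
    image: busybox:1.28
    # Loop until the Service name resolves in cluster DNS.
    command: ["sh", "-c", "until nslookup other-service; do echo waiting; sleep 2; done"]
  containers:
  - name: application
    image: my-application
```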

Kubernetes: How to update a live busybox container's 'command'

I have the following manifest that created the running pod named 'test'
apiVersion: v1
kind: Pod
metadata:
  name: hello-world
  labels:
    app: blue
spec:
  containers:
  - name: funskies
    image: busybox
    command: ["/bin/sh", "-c", "echo 'Hello World'"]
I want to update the pod to include the additional command
apiVersion: v1
kind: Pod
metadata:
  name: hello-world
  labels:
    app: blue
spec:
  restartPolicy: Never
  containers:
  - name: funskies
    image: busybox
    command: ["/bin/sh", "-c", "echo 'Hello World' > /home/my_user/logging.txt"]
What I tried
kubectl edit pod test
What resulted
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
# pods "test" was not valid:
# * spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`...
Other things I tried:
Updated the manifest and then ran apply - same issue
kubectl apply -f test.yaml
Question: What is the proper way to update a running pod?
You can't modify most properties of a Pod. Typically you don't want to directly create Pods; use a higher-level controller like a Deployment.
The Kubernetes documentation for a PodSpec notes (emphasis mine):
containers: List of containers belonging to the pod. Containers cannot currently be added or removed. There must be at least one container in a Pod. Cannot be updated.
In all cases, no matter what, a container runs a single command, and if you want to change what that command is, you need to delete and recreate the container. In Kubernetes this always means deleting and recreating the containing Pod. Usually you shouldn't use bare Pods, but if you do, you can create a new Pod with the new command and delete the old one. Deleting Pods is extremely routine and all kinds of ordinary things cause it to happen (updating Deployments, a HorizontalPodAutoscaler scaling down, ...).
If you have a Deployment instead of a bare Pod, you can freely change the template: for the Pods it creates. This includes changing their command:. This will result in the Deployment creating a new Pod with the new command, and once it's running, deleting the old Pod.
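As an illustration, the Pod from the question could be wrapped in a Deployment roughly like this. It is a sketch under assumptions: the output path `/tmp/logging.txt` and the trailing `sleep 3600` are not from the question; they are there because a Deployment expects its containers to keep running, and the stock busybox image may not have a `/home/my_user` directory.

```yaml
# Sketch only: output path and the trailing sleep are assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  selector:
    matchLabels:
      app: blue
  template:
    metadata:
      labels:
        app: blue
    spec:
      containers:
      - name: funskies
        image: busybox
        # Editing this command and re-applying the manifest makes the
        # Deployment create a new Pod and delete the old one.
        command: ["/bin/sh", "-c", "echo 'Hello World' > /tmp/logging.txt; sleep 3600"]
```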
The sorts of very-short-lived single-command containers you show in the question aren't necessarily well-suited to running in Kubernetes. If the Pod isn't going to stay running and serve requests, a Job could be a better match; but a Job believes it will only be run once, and if you change the pod spec for a completed Job I don't think it will launch a new Pod. You'd need to create a new Job for this case.
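For the run-once case, a Job along these lines may fit better. This is a sketch; the Job name is hypothetical, and a changed command would go into a new Job with a new name rather than into an edit of a completed one.

```yaml
# Sketch only: the Job name is hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: hello-world-v2
spec:
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: funskies
        image: busybox
        command: ["/bin/sh", "-c", "echo 'Hello World'"]
```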
I am not sure what the whole requirement is, but you can exec into the pod and make the changes there:
$ kubectl exec <pod-name> -it -n <namespace> -- <command to execute>
For example:
$ kubectl exec pod/hello-world-xxxx-xx -it -- /bin/bash
If the image only provides a basic shell, use /bin/sh instead to update the content or run the command.
Changes made inside a running pod are not written back to the manifest file, so to keep them you have to run a new pod with the changes.

Is there a way to delete pods automatically through YAML after they have status 'Completed'?

I have a YAML file which creates a pod on execution. This pod extracts data from one of our internal systems and uploads to GCP. It takes around 12 mins to do so after which the status of the pod changes to 'Completed', however I would like to delete this pod once it has completed.
apiVersion: v1
kind: Pod
metadata:
  name: xyz
spec:
  restartPolicy: Never
  volumes:
  - name: mount-dir
    hostPath:
      path: /data_in/datos/abc/
  initContainers:
  - name: abc-ext2k8s
    image: registrysecaas.azurecr.io/secaas/oracle-client11c:11.2.0.4-latest
    volumeMounts:
    - mountPath: /media
      name: mount-dir
    command: ["/bin/sh","-c"]
    args: ["sqlplus -s CLOUDERA/MYY4nGJKsf#hal5:1531/dbmk #/media/ext_hal5_lk_org_localfisico.sql"]
  imagePullSecrets:
  - name: regcred
Is there a way to achieve this?
Typically you don't want to create bare Kubernetes pods. The pattern you're describing of running some moderate-length task in a pod, and then having it exit, matches a Job. (Among other properties, a job will reschedule a pod if the node it's on fails.)
Just switching this to a Job doesn't directly address your question, though. The documentation notes:
When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status.
So whatever task creates the pod (or job) needs to monitor it for completion, and then delete the pod (or job). (Consider using the watch API or equivalently the kubectl get -w option to see when the created objects change state.) There's no way to directly specify this in the YAML file since there is a specific intent that you can get useful information from a completed pod.
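Newer Kubernetes releases have since added a TTL mechanism for finished Jobs, which may cover this case without an external watcher. The sketch below assumes a cluster where `ttlSecondsAfterFinished` is available (it became stable in v1.23); the Job name, image, command and the 60 second value are hypothetical.

```yaml
# Sketch only: name, image, command and TTL value are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: extract-and-upload
spec:
  # Delete the Job and its Pods this many seconds after it finishes.
  ttlSecondsAfterFinished: 60
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: extract
        image: example/extractor:1.0
        command: ["/bin/sh", "-c", "echo 'extract and upload here'"]
```

A small non-zero TTL leaves a short window to read the logs before the Pod disappears.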
If this is actually a nightly task that you want to run at midnight or some such, you do have one more option. A CronJob will run a job on some schedule, which in turn runs a single pod. The important relevant detail here is that CronJobs have an explicit control for how many completed Jobs they keep. So if a CronJob matches your pattern, you can set successfulJobsHistoryLimit: 0 in the CronJob spec, and created jobs and their matching pods will be deleted immediately.
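A sketch of such a CronJob, assuming a midnight schedule; the name, image and command are hypothetical, and `batch/v1` assumes a reasonably recent cluster (older ones served CronJob from `batch/v1beta1`).

```yaml
# Sketch only: name, schedule, image and command are hypothetical.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-extract
spec:
  schedule: "0 0 * * *"          # every day at midnight
  successfulJobsHistoryLimit: 0  # keep no completed Jobs (and their Pods)
  failedJobsHistoryLimit: 1      # keep the last failure for debugging
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
          - name: extract
            image: example/extractor:1.0
            command: ["/bin/sh", "-c", "echo 'extract and upload here'"]
```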

preStop hook doesn't get executed

I am testing lifecycle hooks, and post-start works pretty well, but I think pre-stop never gets executed. There is another answer, but it is not working, and actually if it would work, it would contradict k8s documentation. So, from the docs:
PreStop
This hook is called immediately before a container is terminated due
to an API request or management event such as liveness probe failure,
preemption, resource contention and others. A call to the preStop hook
fails if the container is already in terminated or completed state.
So, the API request makes me think I can simply do kubectl delete pod POD, and I am good.
More from the docs (pod shutdown process):
1.- User sends command to delete Pod, with default grace period (30s)
2.- The Pod in the API server is updated with the time beyond which the Pod is considered “dead” along with the grace period.
3.- Pod shows up as “Terminating” when listed in client commands
4.- (simultaneous with 3) When the Kubelet sees that a Pod has been marked as terminating because the time in 2 has been set, it begins the pod shutdown process.
4.1.- If one of the Pod’s containers has defined a preStop hook, it is invoked inside of the container. If the preStop hook is still running after the grace period expires, step 2 is then invoked with a small (2 second) extended grace period.
4.2.- The container is sent the TERM signal. Note that not all containers in the Pod will receive the TERM signal at the same time and may each require a preStop hook if the order in which they shut down matters.
...
So, since when you do kubectl delete pod POD, the pod gets on Terminating, I assume I can do it.
From the other answer, I can't do this, but the way is to do a rolling-update. Well, I tried in all possible ways and it didn't work either.
My tests:
I have a deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-deploy
spec:
  replicas: 1
  template:
    metadata:
      name: lifecycle-demo
      labels:
        lifecycle: demo
    spec:
      containers:
      - name: nginx
        image: nginx
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - echo "Hello at" `date` > /usr/share/post-start
          preStop:
            exec:
              command:
              - /bin/sh"
              - -c
              - echo "Goodbye at" `date` > /usr/share/pre-stop
        volumeMounts:
        - name: hooks
          mountPath: /usr/share/
      volumes:
      - name: hooks
        hostPath:
          path: /usr/hooks/
I expect the pre-stop and post-start files to be created in /usr/hooks/, on the host (node where the pod is running). post-start is there, but pre-stop, never.
I tried kubectl delete pod POD, and it didn't work.
I tried kubectl replace -f deploy.yaml, with a different image, and when I do kubectl get rs, I can see the new replicaSet created, but the file isn't there.
I tried kubectl set image ..., and again, I can see the new replicaSet created, but the file isn't there.
I even tried putting them in completely separate volumes, as I thought that maybe, when I kill the pod and it gets re-created, the folder where the files should be created is re-created too, which would delete the pre-stop file, but that was not the case.
Note: It always gets re-created on the same node. I made sure of that.
What I have not tried is to bomb the container and break it by setting low CPU limit, but that's not what I need.
Any idea what are the circumstances under which preStop hook would get triggered?
Posting this as community wiki for better visibility.
There is a typo in the second "/bin/sh", the one for preStop: it has an extra double quote ("). The deployment could still be created, but the typo was the reason the file was not being created. All works fine now.
The exact location of the issue was here:
preStop:
  exec:
    command:
    - /bin/sh" # <- this quotation
    - -c
    - echo "Goodbye at" `date` > /usr/share/pre-stop
The correct version looks like this:
preStop:
  exec:
    command:
    - /bin/sh
    - -c
    - echo "Goodbye at" `date` > /usr/share/pre-stop
At the time of writing this community wiki post, this Deployment manifest is outdated. The following changes were needed to be able to run it:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: good-deployment
spec:
  selector:
    matchLabels:
      lifecycle: demo
  replicas: 1
  template:
    metadata:
      labels:
        lifecycle: demo
    spec:
      containers:
      - name: nginx
        image: nginx
        lifecycle:
          postStart:
            exec:
              command:
              - /bin/sh
              - -c
              - echo "Hello at" `date` > /usr/share/post-start
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - echo "Goodbye at" `date` > /usr/share/pre-stop
        volumeMounts:
        - name: hooks
          mountPath: /usr/share/
      volumes:
      - name: hooks
        hostPath:
          path: /usr/hooks/
The changes were as follows:
1. apiVersion
+--------------------------------+---------------------+
| Old | New |
+--------------------------------+---------------------+
| apiVersion: extensions/v1beta1 | apiVersion: apps/v1 |
+--------------------------------+---------------------+
StackOverflow answer for more reference:
Stackoverflow.com: Questions: No matches for kind “Deployment” in version extensions/v1beta1
2. selector
Added selector section under spec:
spec:
  selector:
    matchLabels:
      lifecycle: demo
Additional links with reference:
What is spec - selector - matchLabels used for while creating a deployment?
Kubernetes.io: Docs: Concepts: Workloads: Controllers: Deployment: Selector
Posting this as community wiki for better visibility.
When a pod is to be terminated:
1. A SIGTERM signal is sent to the main process (PID 1) in each container, and a “grace period” countdown starts (it defaults to 30 seconds for a k8s pod; see below for how to change it).
2. Upon receipt of the SIGTERM, each container should start a graceful shutdown of the running application and exit.
3. If a container doesn’t terminate within the grace period, a SIGKILL signal is sent and the container is forcibly terminated.
For a detailed explanation, please see:
Kubernetes: Termination of pods
Kubernetes: Pods lifecycle hooks and termination notice
Kubernetes: Container lifecycle hooks
Always confirm this:
Check whether preStop takes more than 30 seconds to run (longer than the default grace period). If it does, increase terminationGracePeriodSeconds to more than 30 seconds, for example 60. Refer to this for more info about terminationGracePeriodSeconds.
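In the Deployment from the question, the setting would sit at the pod level of the template, next to `containers`. The fragment below is a sketch; the 45 second `sleep` is a hypothetical stand-in for a slow hook, and 60 is just an example value.

```yaml
# Fragment of a Deployment; sketch only, values are examples.
spec:
  template:
    spec:
      # Must be longer than the preStop hook plus the shutdown after SIGTERM.
      terminationGracePeriodSeconds: 60
      containers:
      - name: nginx
        image: nginx
        lifecycle:
          preStop:
            exec:
              # Hypothetical slow hook that needs more than the default 30 s.
              command: ["/bin/sh", "-c", "sleep 45"]
```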
I know it's too late to answer, but it is worth adding here.
I spent a full day figuring out preStop in K8s.
K8s does not print any logs from the preStop stage. preStop is part of the container lifecycle, also called a hook.
Generally, the output of hooks and probes (liveness and readiness) is not printed in kubectl logs.
Read this issue to get the full picture.
But there is an indirect way to get the output into kubectl logs. Follow the last comment in the linked issue.
Adding it here as well:
lifecycle:
  postStart:
    exec:
      command:
      - /bin/sh
      - -c
      - sleep 10; echo 'hello from postStart hook' >> /proc/1/fd/1
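The same redirect can be tried for preStop. This is a sketch under assumptions: it presumes the hook process is allowed to write to the stdout of PID 1 in that container, and the message text is made up. Since the container is about to stop, the line is easiest to catch by streaming the logs while the pod is deleted.

```yaml
# Sketch only: assumes the hook may write to PID 1's stdout.
lifecycle:
  preStop:
    exec:
      command:
      - /bin/sh
      - -c
      - echo 'hello from preStop hook' >> /proc/1/fd/1
```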

Not able to see Pod when I create a Job

When I try to create a workload of kind Job, it does not pull any image.
Below is the .yaml:
apiVersion: batch/v1
kind: Job
metadata:
  name: copyartifacts
spec:
  backoffLimit: 1
  template:
    metadata:
      name: copyartifacts
    spec:
      restartPolicy: "Never"
      volumes:
      - name: sharedvolume
        persistentVolumeClaim:
          claimName: shared-pvc
      - name: dockersocket
        hostPath:
          path: /var/run/docker.sock
      containers:
      - name: copyartifacts
        image: alpine:3.7
        imagePullPolicy: Always
        command: ["sh", "-c", "ls -l /shared; rm -rf /shared/*; ls -l /shared; while [ ! -d /shared/artifacts ]; do echo Waiting for artifacts to be copied; sleep 2; done; sleep 10; ls -l /shared/artifacts; "]
        volumeMounts:
        - mountPath: /shared
          name: sharedvolume
Can you please guide here?
Regards,
Vikas
There are two possible reasons for not seeing the pod:
1. The pod hasn't been created yet.
2. The pod has completed its task and terminated before you noticed.
1. Pod hasn't been created:
If the pod hasn't been created yet, you have to find out why the job failed to create it. You can view the job's events to see whether there are any failure events. Use the following command to describe a job:
kubectl describe job <job-name> -n <namespace>
Then, check the Events: field. There might be some events showing pod creation failure with respective reason.
2. Pod has completed and terminated:
Jobs are used to perform one-time tasks rather than to serve an application that needs to maintain a desired state. When the task is complete, the pod goes to the Completed state and then terminates (but is not deleted). If your Job is intended for a task that does not take much time, the pod may terminate after completing the task before you have noticed.
As the pod is terminated, kubectl get pods will not show that pod. However, you will be able to see the pod using the kubectl get pods -a command, as it hasn't been deleted. (On recent kubectl versions the -a flag no longer exists, and kubectl get pods lists completed pods by default.)
You can also describe the job and check for a completion or success event.
If you used kind to create the K8s cluster, all the cluster nodes run as Docker containers. If you have rebooted your computer or VM, the cluster (pod) IP addresses may change, causing communication between the cluster nodes to fail. In this case, check the cluster manager logs; they contain an error message. The Job is created, but the pod is not.
Try re-creating the cluster, or change the node configuration for the IP address.