Kubernetes schedule restart pods - kubernetes

I have a Deployment that runs 5 pods.
I want to restart all pods every 5 minutes.
Currently I'm doing it with a Python script that runs kubectl get po and checks the AGE column; if the AGE is bigger than 5 minutes, it deletes the pod.
Is there another way to achieve that?

You could use a liveness probe to achieve this, but why would you want to? A Deployment is meant for long-running tasks.
A liveness probe will restart the container if the check fails (for example, an exec probe whose command returns a non-zero exit code).
For more info, see:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
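For illustration, here is a minimal sketch of that idea: an exec liveness probe that fails once the container has been up for roughly 5 minutes. The my-app names, the /tmp/started marker file, and the wrapper command are assumptions for the example, not part of your setup.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # hypothetical name
spec:
  replicas: 5
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest   # hypothetical image
        # Record the container start time, then exec the real entrypoint.
        command: ["sh", "-c", "date +%s > /tmp/started && exec /app/my-app"]
        livenessProbe:
          exec:
            # Fails once the container is older than 300 s; the kubelet then
            # restarts the container after failureThreshold (default 3) failures.
            command:
            - sh
            - -c
            - test $(( $(date +%s) - $(cat /tmp/started) )) -lt 300
          initialDelaySeconds: 30
          periodSeconds: 30
Note that a failing liveness probe restarts the container in place; it does not delete and reschedule the pod the way your script does.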

Related

pod - How to kill or stop only one pod from n replicas of a deployment

I have a testing scenario to check if the API requests are being handled by another pod if one goes down. I know this is the default behaviour, but I want to simulate the following scenario.
Pod replicas - 2 (pod A and B)
During my API requests, I want to kill/stop only pod A.
During downtime of A, requests should be handled by B.
I am aware that we can restart the deployment and also scale replicas to 0 and again to 2, but this won't work for me.
Is there any way to kill/stop/crash only pod A?
Any help will be appreciated.
If you want to simulate what happens if one of the pods just gets lost, you can scale down the deployment
kubectl scale deployment the-deployment-name --replicas=1
and Kubernetes will terminate all but one of the pods; you should almost immediately see all of the traffic going to the surviving pod.
But if instead you want to simulate what happens if one of the pods crashes and restarts, you can delete the pod
# kubectl scale deployment the-deployment-name --replicas=2
kubectl get pods
kubectl delete pod the-deployment-name-12345-f7h9j
Once the pod starts getting deleted, the Kubernetes Service should route all of the traffic to the surviving pod(s) (those with Running status). However, the pod is managed by a ReplicaSet that wants there to be 2 replicas, so as soon as one pod is deleted, the ReplicaSet creates a new one. This is similar to what happens when a container crashes and restarts, except that a crash brings it back in the same pod on the same node, while a deleted pod may be recreated somewhere else.
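For reference, the routing described above depends only on the Service's label selector; a minimal sketch (the my-app names and ports are assumptions) looks like this:
apiVersion: v1
kind: Service
metadata:
  name: my-app                 # hypothetical name
spec:
  selector:
    app: my-app                # must match the Deployment's pod labels
  ports:
  - port: 80
    targetPort: 8080           # hypothetical container port
Only pods that match this selector and are Ready stay in the Service's endpoints, which is why traffic shifts to the surviving pod as soon as the other one starts terminating.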
As you mentioned, manually killing or restarting the pod is the only way to test this case; you can also try crashing the single pod, but in the end it creates the same scenario, because the pod will be restarted automatically.
Alternatively, you can increase the graceful shutdown period for the deployment, so the pod takes longer to shut down and stays in the Terminating state for a good amount of time, during which you can perform the test.
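A minimal sketch of that idea as a fragment of the Deployment's pod template; the names and the 60-second sleep are illustrative assumptions, and the pod only stays in Terminating while its container has not yet exited, which the preStop hook arranges here:
spec:
  template:
    spec:
      # Allow up to 120 s between SIGTERM and SIGKILL (the default is 30 s).
      terminationGracePeriodSeconds: 120
      containers:
      - name: my-app              # hypothetical name
        image: my-app:latest      # hypothetical image
        lifecycle:
          preStop:
            exec:
              # Hold the pod in Terminating for ~60 s so there is a long enough
              # window to watch traffic move to the other replica.
              command: ["sh", "-c", "sleep 60"]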
In Kubernetes, where pods are controlled by a ReplicaSet, a pod you kill will be recreated again. So the only way to do this is to scale down the number of replicas.
Let's say your deployment has 4 replicas. You can scale down to 3 by running the command below:
kubectl scale deployment <deployment-name> --replicas=3
My example is shown below:
kubectl scale deployment hello-world --replicas=3
deployment.apps/hello-world scaled

How do I know why my SonarQube helm chart is getting auto-killed by Kubernetes

This question is about logging/monitoring.
I'm running a 3 node cluster on AKS, with 3 orgs: Dev, Test and Prod. The chart worked fine in Dev, but in Test the same chart's pod keeps getting killed by Kubernetes, recreated, and killed again. Is there a way to extract details on why this is happening? All I see when I describe the pod is Reason: Killed.
Please tell me more details on this or give me some suggestions. Thanks!
List Events sorted by timestamp
kubectl get events --sort-by=.metadata.creationTimestamp
There might be various reasons for it being killed, e.g. insufficient resources or a failed liveness probe.
For SonarQube there are liveness and readiness probes configured, so one of them might be failing. Also, as described in the Helm chart's values:
If an ingress path other than the root (/) is defined, it should be reflected here
A trailing "/" must be included
You can also check if there are sufficient resources on the node:
check which node the pods are running on: kubectl get pods -o wide, and
then run kubectl describe node <node-name> to check whether there is
disk/memory pressure.
You can also run kubectl logs <pod-name> and kubectl describe pod <pod-name>, which might give you some insight into the kill reason.
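If kubectl describe pod shows the container's Last State as Terminated with Reason: OOMKilled, the fix is usually to raise the memory limit. As a hedged sketch of what that could look like in the chart's values (the exact keys depend on the SonarQube chart version, so check them against its values.yaml; the numbers here are assumptions):
resources:
  limits:
    cpu: 800m
    memory: 4Gi        # raise this if the container is being OOM-killed
  requests:
    cpu: 400m
    memory: 2Gi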

kubectl apply vs kubernetes deployment - Terraform

I am trying to use the Kubernetes Deployment resource in Terraform. I would like to know whether this is the same as kubectl apply -f deployment.yaml, or whether it waits for the deployment to be up and running, because when I used it to create a basic pod that I knew would not work, I got this error:
Error: Waiting for rollout to finish: 0 of 1 updated replicas are available...
Is this just surfacing the error from Kubernetes, or does the entire Terraform run fail because of this?
According to the documentation
A Deployment ensures that a specified number of pod “replicas” are running at any one time. In other words, a Deployment makes sure that a pod or homogeneous set of pods are always up and available. If there are too many pods, it will kill some. If there are too few, the Deployment will start more.
So it will wait to ensure the expected number of replicas is up, which is why you see the rollout error when the pod cannot become available.

How to restart a failed pod in kubernetes deployment

I have 3 nodes in a Kubernetes cluster. I created a DaemonSet and deployed it to all 3 nodes. This DaemonSet created 3 pods and they were running successfully. But for some reason, one of the pods failed.
I need to know how we can restart this pod without affecting the other pods in the DaemonSet, and without creating a new DaemonSet deployment.
Thanks
kubectl delete pod <podname> will delete just this one pod, and the Deployment/StatefulSet/ReplicaSet/DaemonSet will schedule a new one in its place.
There are other possibilities to achieve what you want:
Use the rollout command:
kubectl rollout restart deployment mydeploy
Set an environment variable, which will force your deployment's pods to restart:
kubectl set env deployment mydeploy DEPLOY_DATE="$(date)"
Scale your deployment to zero and then back to some positive value:
kubectl scale deployment mydeploy --replicas=0
kubectl scale deployment mydeploy --replicas=1
Just for others reading this...
A better solution (IMHO) is to implement a liveness probe, so the kubelet restarts the container whenever it fails the probe check.
This is a great feature K8s offers out of the box: auto-healing.
Also look into the pod lifecycle docs.
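As a hedged sketch of that idea for a DaemonSet (the my-agent names, port, and /healthz path are assumptions about your workload, not anything Kubernetes requires):
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-agent                 # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-agent
  template:
    metadata:
      labels:
        app: my-agent
    spec:
      containers:
      - name: my-agent
        image: my-agent:latest   # hypothetical image
        livenessProbe:
          # If this endpoint stops answering, the kubelet restarts only this
          # container; the DaemonSet's pods on the other nodes are untouched.
          httpGet:
            path: /healthz       # assumed health endpoint
            port: 8080           # assumed container port
          initialDelaySeconds: 15
          periodSeconds: 20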
kubectl -n <namespace> delete pods --field-selector=status.phase=Failed
I think the above command is quite useful when you want to restart one or more failed pods :D
And you don't need to care about the names of the failed pods.

Spring Cloud Data Flow server Kubernetes liveness timeout values

I am using SCDF for Kubernetes to deploy streams. Some of the Kubernetes pods deployed by the SCDF server are constantly restarting because the livenessProbe initialDelaySeconds of 10s is too short:
#> kubectl get pods
NAME READY STATUS RESTARTS AGE
datapipeline-confirmation-0-g261e 0/1 CrashLoopBackOff 37 2h
As these pods are created by SCDF, I cannot figure out how to tell the SCDF server to use larger timeout values when deploying pods. I've tried just about everything short of diving into the SCDF Java source code, hence asking on StackExchange. Thanks in advance!
You could try to PATCH the pod config directly
something like:
kubectl patch pod datapipeline-confirmation-0-g261e -p '{"spec":{"containers":[{"name":"<container name>","readinessProbe":{"timeoutSeconds":60}}]}}'
assuming you have the credentials to do so and the API server accepts the change (most pod spec fields, probes included, cannot be modified on a running pod). Otherwise you might need to ask SCDF to fix this on their side.