How to restart a failed pod in a Kubernetes deployment

I have 3 nodes in a Kubernetes cluster. I created a DaemonSet and deployed it to all 3 nodes. The DaemonSet created 3 pods and they were running successfully, but for some reason one of the pods failed.
I need to know how I can restart this pod without affecting the other pods in the DaemonSet, and without recreating the DaemonSet itself.
Thanks

kubectl delete pod <podname> will delete just that one pod, and the controlling Deployment/StatefulSet/ReplicaSet/DaemonSet will reschedule a new one in its place.

There are other possibilities to achieve what you want:
Just use the rollout command:
kubectl rollout restart deployment mydeploy
You can set an environment variable, which will force your deployment's pods to restart:
kubectl set env deployment mydeploy DEPLOY_DATE="$(date)"
You can scale your deployment to zero and then back to some positive value:
kubectl scale deployment mydeploy --replicas=0
kubectl scale deployment mydeploy --replicas=1

Just for others reading this...
A better solution (IMHO) is to implement a liveness probe; if the probe fails, the kubelet will restart the container automatically.
This is a great self-healing feature K8s offers out of the box.
Also look into the pod lifecycle docs.
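As a minimal sketch, assuming an HTTP service, a liveness probe is declared on the container spec; the name, image, endpoint, and timings below are placeholders to adapt to your app:
apiVersion: v1
kind: Pod
metadata:
  name: my-app                  # hypothetical name
spec:
  containers:
  - name: my-app
    image: nginx:1.25           # placeholder image; nginx serves / by default
    livenessProbe:
      httpGet:
        path: /                 # assumed health endpoint
        port: 80
      initialDelaySeconds: 15   # grace period after the container starts
      periodSeconds: 10         # probe every 10 seconds
      failureThreshold: 3       # kubelet restarts the container after 3 consecutive failures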

kubectl -n <namespace> delete pods --field-selector=status.phase=Failed
I think the above command is quite useful when you want to restart one or more failed pods :D
And you don't need to care about the names of the failed pods.

Related

Does "kubectl rollout restart deploy" cause downtime?

I'm trying to get all the deployments in a namespace restarted for implementation reasons.
I'm using kubectl rollout restart deploy -n <namespace> and it works perfectly, but I'm not sure if that command causes downtime, or if it works like a rolling update, applying the restart one by one and keeping my services up.
Does anyone know?
In the documentation I can only find this:
Operation: rollout
Syntax: kubectl rollout SUBCOMMAND [options]
Description: Manage the rollout of a resource. Valid resource types include: deployments, daemonsets and statefulsets.
But I can't find details about the specific "rollout restart deploy".
I need to make sure it doesn't cause downtime. Right now it is very hard to tell, because the restart process is very quick.
Update: I know that for one specific deployment (kubectl rollout restart deployment/name), it works as expected and doesn't cause downtime, but I need to apply it to the whole namespace (without specifying a deployment), and that's the case I'm not sure about.
kubectl rollout restart deploy -n namespace1 will restart all deployments in the specified namespace with zero downtime.
The restart command works as follows:
After the restart it creates new pods for each deployment
Once the new pods are up (running and ready) it terminates the old pods
Add readiness probes to your deployments to configure initial delays (see the sketch below).
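As a rough illustration, a readiness probe with an initial delay might look like this fragment of a Deployment's pod template (name, image, endpoint, and timings are placeholders):
spec:
  containers:
  - name: my-app                # hypothetical name
    image: nginx:1.25           # placeholder image
    readinessProbe:
      httpGet:
        path: /                 # assumed readiness endpoint
        port: 80
      initialDelaySeconds: 10   # the new pod only counts as ready after this delay
      periodSeconds: 5          # re-check every 5 seconds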
#pcsutar's answer is almost correct. kubectl rollout restart $resourcetype $resourcename restarts your deployment, daemonset or stateful set according to its update strategy, so if it is set to rollingUpdate it will behave exactly as the above answer describes:
After the restart it creates new pods for each deployment
Once the new pods are up (running and ready) it terminates the old pods
Add readiness probes to your deployments to configure initial delays.
However, if the strategy is, for example, type: Recreate, all the currently running pods belonging to the deployment will be terminated before new pods are spun up! A sketch of where the strategy lives follows below.
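For reference, the update strategy is set on the Deployment spec; here is a minimal sketch with placeholder names and a rolling update configured:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydeploy                # placeholder name
spec:
  replicas: 2
  strategy:
    type: RollingUpdate         # change to Recreate to terminate all old pods first
    rollingUpdate:              # only valid with type: RollingUpdate
      maxUnavailable: 0         # keep full capacity during the rollout
      maxSurge: 1               # allow one extra pod while rolling
  selector:
    matchLabels:
      app: mydeploy
  template:
    metadata:
      labels:
        app: mydeploy
    spec:
      containers:
      - name: mydeploy
        image: nginx:1.25       # placeholder image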

How to delete pod created with rolling restart?

I ran kubectl rollout restart deployment.
It created a new pod which is now stuck in Pending state because there are not enough resources to schedule it.
I can't increase the resources.
How do I delete the new pod?
Please check whether that pod is managed by a Deployment controller (which would be recreating the pod):
kubectl get deployments
Then try to delete the Deployment with:
kubectl delete deployment DEPLOYMENT_NAME
Also, I would suggest checking resource allocation and usage on your GKE nodes with the following command:
kubectl describe nodes | grep -A10 "Allocated resources"
And if you need more resources, try enabling the GKE cluster autoscaler (CA), or, in case you already have it enabled, increase its maximum number of nodes. You can also add a new node manually by resizing the node pool you are using.

Correct way to scale/restart an application down/up in kubernetes (replicaset, deployments and pod deletion)?

I usually restart my applications by:
kubectl scale deployment my-app --replicas=0
Followed by:
kubectl scale deployment my-app --replicas=1
which works fine. I also have another running application but when I look at its replicaset I see:
$ kubectl get rs
NAME          DESIRED   CURRENT   READY   AGE
another-app   2         2         2       2d
So to restart that correctly I would of course need to:
kubectl scale deployment another-app --replicas=0
kubectl scale deployment another-app --replicas=2
But is there a better way to do this so I don't have to manually look at the replicasets before scaling/restarting my application (which might have replicas > 1)?
You can restart pods by deleting them by label:
kubectl delete pods -l name=myLabel
You can do a rolling restart of all pods for a deployment, so that you don't take the service down:
kubectl patch deployment your_deployment_name -p \
"{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"
And since Kubernetes version 1.15 you can use:
kubectl rollout restart deployment your_deployment_name
To make changes to your current deployment you can use kubectl rollout pause deployment/YOUR_DEPLOYMENT. This way the deployment will be marked as paused and won't be reconciled by the controller. While it's paused you can make the necessary changes to your configuration and then resume it with kubectl rollout resume deployment/YOUR_DEPLOYMENT. This will create a new replicaset with the updated configuration.
Pods with the new configuration will be started, and once they are in Running status, the pods with the old configuration will be terminated.
Using this method you will also be able to roll the deployment back to a previous version. Use:
kubectl rollout history deployment/YOUR_DEPLOYMENT
to check the history of the rollouts, and then execute the following command to roll back:
kubectl rollout undo deployment/YOUR_DEPLOYMENT --to-revision=REVISION_NO

How to autoscale with GKE

I have a GKE cluster with an autoscaling node pool.
After adding some pods, the cluster starts to autoscale and creates a new node, but the old running pods start to crash randomly.
I don't think it's directly related to autoscaling, unless some of your old nodes are being removed. The autoscaling is triggered by adding more pods, but most likely there is something wrong with your application or its connectivity to external services (a db, for example). I would check what's going on in the pod logs:
$ kubectl logs <pod-id-that-is-crashing>
You can also check for any other events on the pod or the deployment (if you are using a deployment):
$ kubectl describe deployment <deployment-name>
$ kubectl describe pod <pod-id> -c <container-name>
Hope it helps!

Error while creating pods in Kubernetes

I have installed Kubernetes on an Ubuntu server using the instructions here. I am trying to create pods using kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --hostport=8000 --port=8080 as listed in the example. However, when I do kubectl get pod the pod's status is Pending. I then ran kubectl describe pod for debugging and I see the message:
FailedScheduling pod (hello-minikube-3383150820-1r4f7) failed to fit in any node fit failure on node (minikubevm): PodFitsHostPorts.
I then tried to delete this pod with kubectl delete pod hello-minikube-3383150820-1r4f7, but when I run kubectl get pod again I see another pod with the prefix "hello-minikube-3383150820-" that I haven't created. Does anyone know how to fix this problem? Thank you in advance.
The PodFitsHostPorts predicate is failing because you have something else on your nodes using port 8000. You might be able to find what it is by running kubectl describe svc.
kubectl run creates a deployment object (you can see it with kubectl describe deployments) which makes sure that you always keep the intended number of replicas of the pod running (in this case 1). When you delete the pod, the deployment controller automatically creates another for you. If you want to delete the deployment and the pods it keeps creating, you can run kubectl delete deployments hello-minikube.
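For context, the --hostport=8000 flag translates into a hostPort in the generated pod spec, roughly like the sketch below; since only one pod per node can claim a given hostPort, the scheduler rejects the pod when port 8000 is already taken on every node:
spec:
  containers:
  - name: hello-minikube
    image: gcr.io/google_containers/echoserver:1.4
    ports:
    - containerPort: 8080     # port the app listens on inside the container
      hostPort: 8000          # binds port 8000 on the node itself; at most one pod per node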