How to delete a pod from Kubernetes master node? - kubernetes

Does anyone know how to delete pod from kubernetes master node? I have this one master node on bare-metal ubuntu server. When i'm trying to delete it with "kubectl delete pod .." or force deleting from there: https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/ it doesnt work. the pod is creating again and again...

The pods in a Statefulsets are managed by ReplicaSets and will be recreated again if the current and the desired replicas defined in the spec do not match.
The document you linked provides instructions as to how to kill the pods forcefully avoiding the graceful shutdown behaviour which can have unexpected behaviour depending on the application.
The link clearly states the pods will be recreated in the section:
Force deletions do not wait for confirmation from the kubelet that the Pod has been terminated. Irrespective of whether a force deletion is successful in killing a Pod, it will immediately free up the name from the apiserver. This would let the StatefulSet controller create a replacement Pod with that same identity; this can lead to the duplication of a still-running Pod, and if said Pod can still communicate with the other members of the StatefulSet, will violate the at most one semantics that StatefulSet is designed to guarantee.
If you want the pods to be stopped and new pods for the Statefulset do not get created, you need to scale down the Statefulset by changing the replicas to 0.
You can read the official docs for how to scale the Statefulset replicas.

The key to figuring out how to kill the pod will be to understand how it was created. For example, if the pod is part of a deployment with a declared replicas count as 1, Once you kill/ force kill, Kubernetes detects a mismatch between the desired state (the number of replicas defined in the deployment configuration) to the current state and will create a new pod to replace the one that was deleted - therefor in this example you will need to either scale the deployment to 0 or delete the deployment.

If we need to kill any pod we can just scale down the replica set.
kubectl scale deploy <deployment_name> --replicas=<expected_no_of_replicas>

Way of deleting pods will depends on how you created it. If you created it individually ( not part of a ReplicaSet/ReplicationController/Deployment ) then you can delete pod directly. otherwise the only option to delete is the scale option. In production setup what I believe is all are using Deployment option out of ReplicaSet/ReplicationController/Deployment( Please refer documents and understand the difference between all those three options )

Related

Auto delete CrashBackoffLoop pods in a deployment

In my kubernetes cluster, there are multiple deployments in a namespace.
For a specific deployment, there is a need to not allow "CrashLoopBackoff" pods to exist.
So basically, when any pod gets to this state, I would want it to be deleted and later a new pod to be created which is already handled by the ReplicaSet.
I tried with custom controllers, with the thought that the SharedInformer would alert about the state of Pod and then I would delete it from that loop.
However, this brings dependency on the pod on which the custom controller would run.
I also tried searching for any option to be configured in the manifest itself, but could not find any.
I am pretty new to Kuberenetes, so need help in the implementation of this behaviour.
Firstly, you should address the reason why the pod has entered the CrashLoopBackOff state rather than just delete it. If you do this, you'll potentially just recreate the problem again and you'll be deleting pods repeatedly. For example, if your pod is trying to access an external DB and that DB is down, it'll CrashLoop, and deleting and restarting the pod won't help fix that.
Secondly, if you want to do this deleting in an automated manner, an easy way would be to run a CronJob resource that goes through your deployment and deletes the CrashLooped pods. You could set the cronjob to run once an hour or whatever schedule you wish.
Deleting the POD and waiting for the New one is like restarting the deployment or POD.
Kubernetes will auto restart your CrashLoopBackoff POD if failing, you can check the Restart count.
NAME READY STATUS RESTARTS AGE
te-pod-1 0/1 CrashLoopBackOff 2 1m44s
This restarts will be similar to what you have mentioned
when any pod gets to this state, I would want it to be deleted and
later a new pod to be created which is already handled by the
ReplicaSet.
If you want to remove Crashing the POD fully and not look for new POD to come up, you have to rollback the deployment.
If there is any issue with your Replicaset and your POD is crashing it would be useless, any number of times you delete and restart the POD it will crash all time, unless you check logs & debug to solve the real issue in replicaset(Deployment).

Facing "The Pod "web-nginx" is invalid: spec.initContainers: Forbidden: pod updates may not add or remove containers" applying pod with initcontainers

I was trying to make file before application gets up in kubernetes cluster with initcontainers,
But when i am setting up the pod.yaml and trying to apply it with "kubectl apply -f pod.yaml" it throws below error
error-image
Like the error says, you cannot update a Pod adding or removing containers. To quote the documentation ( https://kubernetes.io/docs/concepts/workloads/pods/#pod-update-and-replacement )
Kubernetes doesn't prevent you from managing Pods directly. It is
possible to update some fields of a running Pod, in place. However,
Pod update operations like patch, and replace have some limitations
This is because usually, you don't create Pods directly, instead you use Deployments, Jobs, StatefulSets (and more) which are high-level resources that defines Pods templates. When you modify the template, Kubernetes simply delete the old Pod and then schedule the new version.
In your case:
you could delete the pod first, then create it again with the new specs you defined. But take into consideration that the Pod may be scheduled on a different node of the cluster (if you have more than one) and that may have a different IP Address as Pods are disposable entities.
Change your definition with a slightly more complex one, a Deployment ( https://kubernetes.io/docs/concepts/workloads/controllers/deployment/ ) which can be changed as desired and, each time you'll make a change to its definition, the old Pod will be removed and a new one will be scheduled.
From the spec of your Pod, I see that you are using a volume to share data between the init container and the main container. This is the optimal way but you don't necessarily need to use a hostPath. If the only needs for the volume is to share data between init container and other containers, you can simply use emptyDir type, which acts as a temporary volume that can be shared between containers and that will be cleaned up when the Pod is removed from the cluster for any reason.
You can check the documentation here: https://kubernetes.io/docs/concepts/storage/volumes/#emptydir

kubernetes creates more pods than scale amount

I have encountered a strange situation in one of our clusters, where all of a sudden a number of new pods have been created so that we end up with a greater number of running pods than the scale amount.
So in the dashboard it will show
serviceX pods: 8/2
and then 8 running instances of that service
Questions
How can this possibly happen?
Is there an easy way to get rid
of the extra pods (which all seem to be running)?
I have tried changing the scale amount in the dashboard and the extra pods do not disappear.
Both Pod and deployment are full-fledged objects in the Kubernetes API. Deployment manages creating Pods by means of ReplicaSets. What it boils down to is that Deployment will create Pods with spec taken from the template.
In your case deployment name edgeservicepublic-svc is set to have 13 replicas. Deployment is a kind of controller in Kubernetes. Its is naturally that this controller with continuously check if 13 pods are created. When a deployment is added to the cluster, it will automatically spin up the requested number of pods, and then monitor them. If a pod dies, the deployment will automatically re-create it. Probably at first not enough pods are created co controller with pursue to achieve desried number of them.
To make sure your deployment works properly you can delete deployment, make sure that that pods are deleted. Make sure that you haven't set up autoscaler ( $ kubectl get hpa ) if so, delete it. Then if you want to change deployment specification edit deployment configuration file and apply changes ($ kubectl apply -f deployment_configuration_file.yaml).
Useful documentation about deployment , autoscaling in context of GKE.
EDIT:
Basically at first place check autoscaler then delete it if it exists. I told you to delete deployment because you told that you try to change scale amount/ number of replicas. So if you want to be 100 % sure that changes are applied is to delete whole deployment end then recreate it with desired number of replicas. Of course you can just apply changes in deployment configuration file ($ kubectl edit ...) or ( $ kubectl apply -f ) but sometimes existing pods are not deleted so it will be saver. You could also create new deployment with the same parameters but different name.

What happens when you drain nodes in a Kubernetes cluster?

I'd like to get some clarification for preparation for maintenance when you drain nodes in a Kubernetes cluster:
Here's what I know when you run kubectl drain MY_NODE:
Node is cordoned
Pods are gracefully shut down
You can opt to ignore Daemonset pods because if they are shut down, they'll just be re-spawned right away again.
I'm confused as to what happens when a node is drained though.
Questions:
What happens to the pods? As far as I know, there's no 'live migration' of pods in Kubernetes.
Will the pod be shut down and then automatically started on another node? Or does this depend on my configuration? (i.e. could a pod be shut down via drain and not start up on another node)
I would appreciate some clarification on this and any best practices or advice as well. Thanks in advance.
By default kubectl drain is non-destructive, you have to override to change that behaviour. It runs with the following defaults:
--delete-local-data=false
--force=false
--grace-period=-1
--ignore-daemonsets=false
--timeout=0s
Each of these safeguard deals with a different category of potential destruction (local data, bare pods, graceful termination, daemonsets). It also respects pod disruption budgets to adhere to workload availability. Any non-bare pod will be recreated on a new node by its respective controller (e.g. daemonset controller, replication controller).
It's up to you whether you want to override that behaviour (for example you might have a bare pod if running jenkins job. If you override by setting --force=true it will delete that pod and it won't be recreated). If you don't override it, the node will be in drain mode indefinitely (--timeout=0s)).
I just want to add a few things to eamon1234's answer:
You may find this useful as well:
Link to official docummentation (in case default flags change etc.). According to it:
The 'drain' evicts or deletes all pods except mirror pods (which
cannot be deleted through the API server). If there are
DaemonSet-managed pods, drain will not proceed without
--ignore-daemonsets, and regardless it will not delete any DaemonSet-managed pods, because those pods would be immediately
replaced by the DaemonSet controller, which ignores unschedulable
markings. If there are any pods that are neither mirror pods nor
managed by ReplicationController, ReplicaSet, DaemonSet, StatefulSet
or Job, then drain will not delete any pods unless you use --force.
--force will also allow deletion to proceed if the managing resource of one or more pods is missing.
Simple chart illustrating what actually happens when using kubectl drain.
Using kubectl drain with --dry-run option may be also a good idea so you can see its outcome before any actual changes are applied e.g.:
kubectl drain foo --force --dry-run
however it will not show any errors about existing local data or daemonsets which you can see whithout using --dry-run flag:
... error: cannot delete DaemonSet-managed Pods (use --ignore-daemonsets to ignore) ...
We can use kubectl drain to safely evict all of our pods from a node before we perform maintenance on the node.
If you want to update or patch or any kind of maintenance on Hardware/Node you should first drain all the pods(Migrate pods one node to another) kubectl drain
When kubectl drain returns successfully, that indicates that all of the pods have been safely evicted. It is then safe to bring down the node
After maintenance work we can use kubectl uncordon to tell Kubernetes that it can resume scheduling new pods onto the node.

Why does scaling down a deployment seem to always remove the newest pods?

(Before I start, I'm using minikube v27 on Windows 10.)
I have created a deployment with the nginx 'hello world' container with a desired count of 2:
I actually went into the '2 hours' old pod and edited the index.html file from the welcome message to "broken" - I want to play with k8s to seem what it would look like if one pod was 'faulty'.
If I scale this deployment up to more instances and then scale down again, I almost expected k8s to remove the oldest pods, but it consistently removes the newest:
How do I make it remove the oldest pods first?
(Ideally, I'd like to be able to just say "redeploy everything as the exact same version/image/desired count in a rolling deployment" if that is possible)
Pod deletion preference is based on a ordered series of checks, defined in code here:
https://github.com/kubernetes/kubernetes/blob/release-1.11/pkg/controller/controller_utils.go#L737
Summarizing- precedence is given to delete pods:
that are unassigned to a node, vs assigned to a node
that are in pending or not running state, vs running
that are in not-ready, vs ready
that have been in ready state for fewer seconds
that have higher restart counts
that have newer vs older creation times
These checks are not directly configurable.
Given the rules, if you can make an old pod to be not ready, or cause an old pod to restart, it will be removed at scale down time before a newer pod that is ready and has not restarted.
There is discussion around use cases for the ability to control deletion priority, which mostly involve workloads that are a mix of job and service, here:
https://github.com/kubernetes/kubernetes/issues/45509
what about this :
kubectl scale deployment ingress-nginx-controller --replicas=2
Wait until 2 replicas are up.
kubectl delete pod ingress-nginx-controller-oldest-replica
kubectl scale deployment ingress-nginx-controller --replicas=1
I experienced zero downtime doing so while removing oldest pod.