K8s Pod Lifetime: Is Cleanup Necessary?

The Kubernetes Docs say the following:
In general, Pods do not disappear until someone destroys them. This
might be a human or a controller. The only exception to this rule is
that Pods with a phase of Succeeded or Failed for more than some
duration (determined by the master) will expire and be automatically
destroyed.
What is the default value for this duration and how do I set it? My pods also never seem to enter the Succeeded or Failed phase; rather, they show Completed or Error respectively. Is this to be expected; are the docs out of date?
I check the pod phases using kubectl get pods --show-all, where information about them seems to persist. Is there any additional cleanup necessary? Running kubectl get pods without --show-all does not show any pods after they are destroyed.
I am creating pods with kubectl apply -f k8/dummy-pod.yaml and the following yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: dummy.3
  labels:
    vara: a
    role: idk
spec:
  hostNetwork: true
  restartPolicy: Never
  containers:
  - image: gcr.io/gv-test-196801/dummy:v2
    name: dummy-1

I believe this documentation is out of date.
Pod garbage collection using a TTL was abandoned in favor of a threshold on the number of terminated pods, set via --terminated-pod-gc-threshold on the kube-controller-manager (docs here).
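A minimal sketch of setting it, assuming a kubeadm-style cluster where the controller manager runs as a static pod (the file path below is an assumption; the flag itself is the documented one and has historically defaulted to 12500):
# /etc/kubernetes/manifests/kube-controller-manager.yaml (path is an assumption)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --terminated-pod-gc-threshold=100   # keep at most 100 terminated pods before GC deletes the oldest
    # ...existing flags unchanged...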
Currently deleting a DaemonSet, Deployment, ReplicaSet or StatefulSet will orphan its pods by default.
You can work around this by enabling cascading deletes.
This behavior will change in 1.10:
Prior to apps/v1 the default garbage collection policy for Pods in a
DaemonSet, Deployment, ReplicaSet, or StatefulSet, was to orphan the
Pods. That is, if you deleted one of these kinds, the Pods that they
owned would not be deleted automatically unless cascading deletion was
explicitly specified
See the Kubernetes blog.
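For reference, a sketch of controlling cascading from kubectl (the deployment name my-deploy is a placeholder; newer kubectl versions take --cascade=orphan|background|foreground, older ones used --cascade=true|false):
kubectl delete deployment my-deploy --cascade=foreground   # delete the Deployment and wait for its ReplicaSets/Pods to go
kubectl delete deployment my-deploy --cascade=orphan       # delete only the Deployment and leave its Pods running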

Related

How to make sure Kubernetes autoscaler not deleting the nodes which runs specific pod

I am running a Kubernetes cluster (an AWS EKS one) with the Cluster Autoscaler pod, so that the cluster autoscales according to the resource requests within the cluster.
The cluster also shrinks the number of nodes when the load is reduced. As I observed, the autoscaler can delete any node in this process.
I want to control this behavior, for example by asking the autoscaler to stop deleting nodes that run a specific pod.
For example, if a node runs the Jenkins pod, the autoscaler should skip that node and delete another matching node from the cluster.
Is there a way to achieve this requirement? Please give your thoughts.
You can use "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
...
template:
  metadata:
    labels:
      app: jenkins
    annotations:
      "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
  spec:
    nodeSelector:
      failure-domain.beta.kubernetes.io/zone: us-west-2b
...
You should set a pod disruption budget that references specific pods by label. If you want to ensure that there is at least one Jenkins worker pod running at all times, for example, you could create a PDB like
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: jenkins-worker-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: jenkins
      component: worker
(adapted from the basic example in Specifying a Disruption Budget in the Kubernetes docs).
Doing this won't prevent nodes from being destroyed; the cluster autoscaler is still free to scale things down. What it will do is temporarily delay destroying a node until the disruption budget can be met again.
For example, say you've configured your Jenkins setup so that there are three workers. Two get scheduled on the same node, and the autoscaler takes that node offline. The ordinary Kubernetes Deployment system will create two new replicas on nodes that still exist. If the autoscaler also decides it wants to destroy the node that has the last worker, the pod disruption budget above will prevent it from doing so until at least one other worker is running.
When you say "the Jenkins pod" in the question, there are two other important implications to this. One is that you should almost always configure your applications using higher-level objects like Deployments or StatefulSets and not bare Pods. The other is that it is generally useful to run multiple copies of things for redundancy if nothing else. Even absent the cluster autoscaler, disks fail, Amazon occasionally arbitrarily decommissions EC2 instances, and nodes otherwise can go offline outside of your control; you often don't want just one copy of something running in your cluster, especially if you're considering it a critical service.
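If helpful, once the PDB above is created you can check how much disruption it currently allows (the name matches the example above):
kubectl get pdb jenkins-worker-pdb        # shows MIN AVAILABLE and ALLOWED DISRUPTIONS
kubectl describe pdb jenkins-worker-pdb   # shows the selector and current/desired healthy counts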
In the Cluster Autoscaler FAQ on GitHub you can read the following:
What types of pods can prevent CA from removing a node?
Pods with restrictive PodDisruptionBudget.
Kube-system pods that:
are not run on the node by default, *
don't have a pod disruption budget set or their PDB is too restrictive (since CA 0.6).
Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc). *
Pods with local storage. *
Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching anti-affinity, etc)
Pods that have the following annotation set: "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
* Unless the pod has the following annotation (supported in CA 1.0.3 or later): "cluster-autoscaler.kubernetes.io/safe-to-evict": "true"
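As a quick sketch, the annotation can also be applied to an already-running pod with kubectl (the pod name jenkins-0 is hypothetical); setting it in the controller's pod template, as shown earlier, is more durable because it survives pod recreation:
kubectl annotate pod jenkins-0 cluster-autoscaler.kubernetes.io/safe-to-evict="false"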

Kubernetes : How to delete a specific pod managed by StatefulSet without it being recreated?

I have a StatefulSet with 2 pods. It has a headless service and each pod has a LoadBalancer service that makes it available to the world.
Let's say pod names are pod-0 and pod-1.
If I want to delete pod-0 but keep pod-1 active, I am not able to do that.
I tried
kubectl delete pod pod-0
This deletes it but then restarts it because the StatefulSet has replicas set to 2.
So I tried
kubectl delete pod pod-0
kubectl scale statefulset some-name --replicas=1
This deletes pod-0, deletes pod-1 and then restarts pod-0. I guess this is because when replicas is set to 1, the StatefulSet wants to keep pod-0 active but not pod-1.
But how do I keep pod-1 active and delete pod-0?
This is not supported by the StatefulSet controller. Probably the best you could do is try to create that pod yourself with a sleep shim and maybe you could be faster. But then the sts controller will just be unhappy forever.
You can try using a custom controller like:
https://github.com/openkruise/kruise/
You can have more fine-grained control over selective pod removal if you use a CloneSet custom resource.
apiVersion: apps.kruise.io/v1alpha1
kind: CloneSet
spec:
  # ...
  replicas: 4
  scaleStrategy:
    podsToDelete:
    - sample-9m4hp # you select which pod to remove
https://openkruise.io/en-us/docs/cloneset.html
The issue of removing specific pods of a deployment or StatefulSet has been opened for years with no resolution:
https://github.com/kubernetes/kubernetes/issues/45509
A StatefulSet always creates pods with indices 0..(replicas-1).
If you really want finer control over individual pods, I guess you would need to create separate STS objects (with replicas = 1).
ReplicaSets in K8s 1.21+ have the pod deletion cost annotation to address this for Deployments, but so far there is nothing for STS.
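To illustrate that last point, a hedged sketch of the pod deletion cost annotation (beta since 1.22; it only applies to pods owned by a ReplicaSet, and the pod and deployment names below are hypothetical). Pods with a lower cost are preferred for removal when the ReplicaSet scales down:
kubectl annotate pod my-app-6d4cf56db6-abcde controller.kubernetes.io/pod-deletion-cost="-1000"
kubectl scale deployment my-app --replicas=2   # the ReplicaSet prefers to remove the lowest-cost pod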

Does ReplicaSets replace Pods?

I have a conceptual question: do ReplicaSets use Pod settings?
Before I applied my ReplicaSet I deleted my Pods, so there is no information about my old Pods. If I apply the ReplicaSet now, does it reference the Pod settings, with all settings like readinessProbe/livenessProbe, etc.?
My question came up because in my replicaset.yml there is a container section where I specified my Docker image, but why does it need that information? Isn't it redundant, because this information is in my pods.yml?
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
  name: test1
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: test1
        image: test/test
Pods are the smallest deployable units of computing that can be
created and managed in Kubernetes.
A Pod (as in a pod of whales or pea pod) is a group of one or more
containers (such as Docker containers), with shared storage/network,
and a specification for how to run the containers.
See, https://kubernetes.io/docs/concepts/workloads/pods/pod/.
So, you can specify how your Pod will be scheduled (one or more containers, ports, probes, volumes, etc.).
But in case of a node failure or anything bad that can harm the Pod, the Pod won't be rescheduled (you have to reschedule it manually). So, in that case, you need a controller. Kubernetes provides several controllers (each one for a different purpose). They are:
ReplicaSet
ReplicationController
Deployment
StatefulSet
DaemonSet
Job
CronJob
All of the above controllers, together with the Pod, are called Workloads, because they all have a podTemplate section and they all create some number of identical Pods as specified by the spec.replicas field (if this field exists in the corresponding workload manifest). They are all higher-level concepts than the Pod.
Though the Deployment is more suitable than the ReplicaSet, this answer focuses on ReplicaSet over Pod cause the question is between the Pod and ReplicaSet.
In addition, each one of the above controllers has it's own purpose. Like a ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
A ReplicaSet contains a podTemplate field, including selectors to identify and acquire Pod(s). The pod template specifies the configuration of the new Pods it should create to meet the replicas criterion. It creates and deletes Pod(s) as needed to reach the desired number. When a ReplicaSet needs to create new Pod(s), it uses its Pod template.
The Pod(s) maintained by a ReplicaSet have a metadata.ownerReferences field, which tells which resource owns the current Pod(s).
A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference or the OwnerReference is not a controller and it matches a ReplicaSet’s selector, it will be immediately acquired by said ReplicaSet.
Ref: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
Now, it's time to answer your questions.
Since a ReplicaSet is one of the Pod controllers (listed above), it obviously needs a podTemplate (using this template, your Pods will be scheduled). All of the Pods the ReplicaSet creates will have the same Pod configuration (same containers, same ports, same readiness/livenessProbe, volumes, etc.). Having this podTemplate is not redundant info; it's needed. So, if you have a Pod controller like a ReplicaSet or another one (as you need), you don't need the Pod itself anymore, because the ReplicaSet (or the other controller) will create the Pod(s).
I guess you got the answer.
Does ReplicaSets replace Pods?
Yes, if you have replicaset.yml you don't need pods.yml.
I have a conceptional question, does ReplicaSets use Pod settings?
Before i applied my ReplicaSets i deleted my Pods, so there is no
information about my old Pods ? If I apply now the Replicaset does
this reference to the Pod settings, so with all settings like
readinessProbe/livenessProbe ... ?
No, the ReplicaSet manifest has to contain the Pod specification in order to determine what is the configuration of the pods that should be deployed.
With the labels, you link the ReplicaSet to running Pods.
You don't link the ReplicaSet.yml manifest to the Pods.yml manifest.
Don’t use naked Pods (that is, Pods not bound to a ReplicaSet or Deployment) if you can avoid it. Naked Pods will not be rescheduled in the event of a node failure.
In 99% of the cases, there isn't a separate pods.yml manifest.
The pods + the ReplicaSet are defined in a single manifest, hence the containers section in the replicaset.yml.
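For example, a minimal apps/v1 ReplicaSet sketch, where the Pod settings live entirely inside the template (the names mirror the question; the readinessProbe path and port are illustrative assumptions). Note that in apps/v1 the selector is required and must match the template labels:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: test1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: test1
        image: test/test
        readinessProbe:          # probe settings belong here, in the pod template
          httpGet:
            path: /healthz
            port: 8080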

Can not delete pods in Kubernetes

I tried installing dgraph (single server) using Kubernetes.
I created pod using:
kubectl create -f https://raw.githubusercontent.com/dgraph-io/dgraph/master/contrib/config/kubernetes/dgraph-single.yaml
Now all I need to do is to delete the created pods.
I tried deleting the pod using:
kubectl delete pod pod-name
The result shows pod deleted, but the pod keeps recreating itself.
I need to remove those pods from my Kubernetes. What should I do now?
I faced the same issue. Run the command:
kubectl get deployment
You will get the deployment corresponding to your pod. Copy its name and then run:
kubectl delete deployment xyz
Then check: no new pods will be created.
The link provided by the OP may be unavailable. See the update section.
As you specified, you created your dgraph server using https://raw.githubusercontent.com/dgraph-io/dgraph/master/contrib/config/kubernetes/dgraph-single.yaml, so just use the same file to delete the resources you created:
$ kubectl delete -f https://raw.githubusercontent.com/dgraph-io/dgraph/master/contrib/config/kubernetes/dgraph-single.yaml
Update
Basically, this is an explanation of the reason.
Kubernetes has several workload objects (those that contain a PodTemplate in their manifest). These are:
Pods
Controllers (basically Pod controllers)
ReplicationController
ReplicaSet
Deployment
StatefulSet
DaemonSet
Job
CronJob
See, who controls whom:
ReplicationController -> Pod(s)
ReplicaSet -> Pod(s)
Deployment -> ReplicaSet(s) -> Pod(s)
StatefulSet -> Pod(s)
DaemonSet -> Pod(s)
Job -> Pod
CronJob -> Job(s) -> Pod
a -> b means a creates and controls b, and the .metadata.ownerReferences field in b's manifest contains a reference to a. For example,
apiVersion: v1
kind: Pod
metadata:
  ...
  ownerReferences:
  - apiVersion: apps/v1
    controller: true
    blockOwnerDeletion: true
    kind: ReplicaSet
    name: my-repset
    uid: d9607e19-f88f-11e6-a518-42010a800195
  ...
This way, deletion of the parent object will also delete the child objects via garbage collection.
So, a's controller ensures that a's current status matches a's spec. Say one deletes b; then b is deleted. But a is still alive, and a's controller sees that there is a difference between a's current status and a's spec. So a's controller creates a new b object to match a's spec.
The OP created a Deployment, which created a ReplicaSet, which in turn created the Pod(s). So here the solution was to delete the root object, which was the Deployment.
$ kubectl get deploy -n {namespace}
$ kubectl delete deploy {deployment name} -n {namespace}
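If you're not sure which object owns a pod, a quick check of its ownerReferences (same placeholders as above) shows the parent you should delete; for the example pod shown earlier this would print something like ReplicaSet/my-repset:
$ kubectl get pod {pod name} -n {namespace} -o jsonpath='{.metadata.ownerReferences[0].kind}/{.metadata.ownerReferences[0].name}'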
Note
Another problem that may arise during deletion is as follows:
If there is any finalizer in the .metadata.finalizers[] section, then only after completing the task(s) performed by the associated controller, the deletion will be performed. If one wants to delete the object without performing the finalizer(s)' action(s), then he/she has to delete those finalizer(s) first. For example,
$ kubectl patch -n {namespace} deploy {deployment name} --patch '{"metadata":{"finalizers":[]}}'
$ kubectl delete -n {namespace} deploy {deployment name}
You can perform a graceful pod deletion with the following command:
kubectl delete pods <pod>
If you want to delete a Pod forcibly using kubectl version >= 1.5, do the following:
kubectl delete pods <pod> --grace-period=0 --force
If you’re using any version of kubectl <= 1.4, you should omit the --force option and use:
kubectl delete pods <pod> --grace-period=0
If even after these commands the pod is stuck in the Unknown state, use the following command to remove the pod from the cluster:
kubectl patch pod <pod> -p '{"metadata":{"finalizers":null}}'
How Pods behave in Kubernetes also depends on their type, i.e. which kind of workload owns them.
Like
Replication Controllers
Replica Sets
StatefulSets
Deployments
Daemon Sets
Pod
Run kubectl describe pod <podname> and check which kind of object controls it; the owning object's manifest will start with something like:
apiVersion: apps/v1
kind: StatefulSet
metadata:
Now run kubectl get <pod-kind>.
Finally, delete that object and the pod will also be deleted.
@Shudipta Sharma's answer is obviously the correct way to delete the pods; I would just like to make sure the author understands why this is happening.
The reason is the "mindset" of Kubernetes, in which Pods are considered to be ephemeral, throwaway entities. As Pods come and go, StatefulSets are one way of ensuring that a given number of pods with unique identities will be running at any given time. Looking at the yaml file you used to deploy:
# This StatefulSet runs 1 pod with one Zero, one Alpha & one Ratel containers.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dgraph
spec:
  serviceName: "dgraph"
  replicas: 1
By deploying this you are basically saying that you want Kubernetes to always run 1 replica of that Pod, at any time. When you delete the Pod, that condition is no longer true so after deletion, there is another Pod spawning to make sure the condition above will be valid.
The approach that @Shudipta Sharma provided is simply to delete that StatefulSet, so you no longer have a desired state keeping an eye on the number of running Pods.
You can find more about that in Kubernetes documentation on:
StatefulSets
Cluster's desired state
More about Kubernetes objects and difference between each of them
Delete the deployment, not the pods. It is the deployment that is creating another pod. You can see a different pod name after you delete the pods.
kubectl get all
kubectl delete deployment DEPLOYMENTNAME

What is the difference between a pod and a deployment?

I have been creating pods with type:deployment but I see that some documentation uses type:pod, more specifically the documentation for multi-container pods:
apiVersion: v1
kind: Pod
metadata:
  name: ""
  labels:
    name: ""
  namespace: ""
  annotations: []
  generateName: ""
spec:
  ? "// See 'The spec schema' for details."
  : ~
But to create pods I can just use a deployment type:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ""
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: ""
    spec:
      containers:
        etc
I noticed the pod documentation says:
The create command can be used to create a pod directly, or it can
create a pod or pods through a Deployment. It is highly recommended
that you use a Deployment to create your pods. It watches for failed
pods and will start up new pods as required to maintain the specified
number. If you don’t want a Deployment to monitor your pod (e.g. your
pod is writing non-persistent data which won’t survive a restart, or
your pod is intended to be very short-lived), you can create a pod
directly with the create command.
Note: We recommend using a Deployment to create pods. You should use
the instructions below only if you don’t want to create a Deployment.
But this raises the question of what kind: Pod is good for. Can you somehow reference pods in a deployment? I didn't see a way. It looks like what you get with pods is some extra metadata but none of the deployment options such as replicas or a restart policy. What good is a pod that doesn't persist data and doesn't survive a restart? I think I'd be able to create a multi-container pod with a deployment as well.
Radek's answer is very good, but I would like to pitch in from my experience: you will almost never use an object with kind Pod, because that doesn't make much sense in practice.
That is because you need a deployment object - or another Kubernetes API object like a replication controller or replicaset - to keep the replicas (pods) alive (that's kind of the point of using Kubernetes).
What you will use in practice for a typical application are:
Deployment object (where you will specify your apps container/containers) that will host your app's container with some other specifications.
Service object (which is like a grouping object and gives a so-called virtual IP (cluster IP) to the pods that have a certain label - and those pods are basically the app containers that you deployed with the former deployment object).
You need to have the service object because the pods from the deployment object can be killed, scaled up and down, and you can't rely on their IP addresses because they will not be persistent.
So you need an object like a service, that gives those pods a stable IP.
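A minimal sketch of such a Service object (assuming the Deployment's pod template carries the label app: my-app; the names and ports are illustrative):
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # matches the label on the Deployment's pod template
  ports:
  - port: 80           # stable cluster-internal port
    targetPort: 8080   # the container port the pods listen on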
Just wanted to give you some context around pods, so you know how things work together.
Hope that clears a few things for you, not long ago I was in your shoes :)
Both Pod and Deployment are full-fledged objects in the Kubernetes API. Deployment manages creating Pods by means of ReplicaSets. What it boils down to is that Deployment will create Pods with spec taken from the template. It is rather unlikely that you will ever need to create Pods directly for a production use-case.
Kubernetes has three Object Types you should know about:
Pods - runs one or more closely related containers
Services - sets up networking in a Kubernetes cluster
Deployment - Maintains a set of identical pods, ensuring that they have the correct config and that the right number of them exist.
Pods:
Runs a single set of containers
Good for one-off dev purposes
Rarely used directly in production
Deployment:
Runs a set of identical pods
Monitors the state of each pod, updating as necessary
Good for dev
Good for production
And I would agree with other answers, forget about Pods and just use Deployment. Why? Look at the second bullet point, it monitors the state of each pod, updating as necessary.
Instead of struggling with error messages such as this one:
Forbidden: pod updates may not change fields other than spec.containers[*].image
just refactor or completely recreate your Pod as a Deployment that creates a pod to do what you need done. With a Deployment you can change any piece of configuration you want, and you need not worry about seeing that error message.
A Pod is a container instance.
That is the output of replicas: 3.
Think of it this way: one deployment can have many running instances (replicas).
# deployment.yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: tomcat-deployment222
spec:
  selector:
    matchLabels:
      app: tomcat
  replicas: 3
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat:9.0
        ports:
        - containerPort: 8080
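A quick way to see those three instances after applying the manifest above (assuming it is saved as deployment.yaml):
kubectl apply -f deployment.yaml
kubectl get pods -l app=tomcat         # lists the 3 pods created from the template
kubectl get replicaset -l app=tomcat   # the ReplicaSet the Deployment created under the hood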
I want to add some information from the Kubernetes in Action book, so you can see the whole picture and the relationships between Kubernetes resources like Pod, Deployment and ReplicationController (ReplicaSet).
Pods
are the basic deployable unit in Kubernetes. But in real-world use cases, you want your deployments to stay up and running automatically and remain healthy without any manual intervention. For this, the recommended approach is to use a Deployment, which under the hood creates a ReplicaSet.
A ReplicaSet, as the name implies, is a set of replicas (Pods) maintained with their Revision history.
(ReplicaSet extends an older object called ReplicationController -- which is exactly the same but without the Revision history.)
A ReplicaSet constantly monitors the list of running pods and makes sure the running number of pods matching a certain specification always matches the desired number.
Removing a pod from the scope of the ReplicationController comes in handy
when you want to perform actions on a specific pod. For example, you might
have a bug that causes your pod to start behaving badly after a specific amount
of time or a specific event.
A Deployment
is a higher-level resource meant for deploying applications and updating them declaratively.
When you create a Deployment, a ReplicaSet resource is created underneath (eventually more of them). ReplicaSets replicate and manage pods, as well. When using a Deployment, the actual pods are created and managed by the Deployment’s ReplicaSets, not by the Deployment directly
Let’s think about what has happened. By changing the pod template in your Deployment resource, you’ve updated your app to a newer version—by changing a single field!
Finally, Roll back a Deployment either to the previous revision or to any earlier revision so easy with Deployment resource.
A Pod is a collection of containers and the basic object of Kubernetes. All containers of a pod run on the same node.
Not suitable for production
No rolling updates
A Deployment is a kind of controller in Kubernetes.
Controllers use a Pod Template that you provide to create the Pods for which they are responsible.
A Deployment creates a ReplicaSet, which in turn makes sure that currentReplicas is always the same as desiredReplicas.
Advantages :
You can roll out and roll back your changes using a deployment (see the commands sketched below)
Monitors the state of each pod
Best suitable for production
Supports rolling updates
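For the rollout/rollback point above, a hedged sketch of the usual commands (the deployment and container names are placeholders):
kubectl set image deployment/my-deploy tomcat=tomcat:9.0.1   # trigger a rolling update
kubectl rollout status deployment/my-deploy                  # watch it progress
kubectl rollout history deployment/my-deploy                 # list revisions
kubectl rollout undo deployment/my-deploy                    # roll back to the previous revision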
In Kubernetes we can deploy our workloads using different types of API objects like Pods, Deployments, ReplicaSets, ReplicationControllers and StatefulSets.
Of those, Pods are the smallest deployable unit in Kubernetes. Any workload/application that runs in Kubernetes has to run inside a container that is part of a Pod. A Pod can run multiple containers (meaning multiple applications) within it. A Pod is a wrapper on top of one or many running containers. Using a Pod, Kubernetes can control, monitor, and operate the containers.
Now, using standalone Pods we can't do a lot of things. We can't change configurations or volumes inside Pods. We can't restart the Pod if it is down.
So there is another API object, called Deployment, which comes into the picture and maintains the desired state (how many instances, how much compute resource the application uses) of the application. The Deployment maintains multiple instances of the same application by running multiple Pods. Deployments, unlike Pods, are mutable. Deployments use another API object, called a ReplicaSet, to maintain the desired state. Deployments, through ReplicaSets, spawn another Pod if one goes down.
So Pod runs applications in containers. Deployments run Pods and maintains desired state of the application.
Try to avoid bare Pods and implement Deployments instead for managing containers, as objects of kind Pod will not be rescheduled (or self-healed) in the event of a node failure or pod termination.
A Deployment is generally preferable because it defines a ReplicaSet to ensure that the desired number of Pods is always available and specifies a strategy to replace Pods, such as RollingUpdate.
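As an illustration of that strategy field, a sketch of a Deployment spec fragment (the values are assumptions, not from the answer above):
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod down during the update
      maxSurge: 1         # at most one extra pod created above the desired count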
Maybe this example will be helpful for beginners!
1) Listing Pods
controlplane $ kubectl -n my-namespace get pods
NAME                            READY   STATUS    RESTARTS   AGE
mysql                           1/1     Running   0          92s
webapp-mysql-75dfdf859f-9c54j   1/1     Running   0          92s
2) Deleting the web-app pod, which was created using a Deployment
controlplane $ kubectl -n my-namespace delete pod webapp-mysql-75dfdf859f-9c54j
pod "webapp-mysql-75dfdf859f-9c54j" deleted
3) Listing Pods (you can see it is recreated automatically)
controlplane $ kubectl -n my-namespace get pods
NAME                            READY   STATUS    RESTARTS   AGE
mysql                           1/1     Running   0          2m42s
webapp-mysql-75dfdf859f-mqrcx   1/1     Running   0          45s
4) Deleting the mysql pod, which was created directly (without a Deployment)
controlplane $ kubectl -n my-namespace delete pod mysql
pod "mysql" deleted
5) Listing Pods (you can see the mysql pod is lost forever)
controlplane $ kubectl -n my-namespace get pods
NAME                            READY   STATUS    RESTARTS   AGE
webapp-mysql-75dfdf859f-mqrcx   1/1     Running   0          76s
In Kubernetes, Pods are the smallest deployable units. Every time we create a Kubernetes object like a Deployment, ReplicaSet, StatefulSet, or DaemonSet, it creates Pods.
As mentioned above, Deployments create Pods based on the desired state specified in your Deployment object. So, for example, if you want 5 replicas of an application, you set replicas: 5 in your Deployment manifest. The Deployment controller is then responsible for creating 5 identical replicas (no less, no more) of the given application, with all metadata like RBAC policy, network policy, labels, annotations, health checks, resource quotas, taints/tolerations and others, and associating them with each Pod it creates.
There are some cases when you do want to create a bare Pod, for example if you are running a test sidecar where you don't need to run the application forever and don't need multiple replicas, and you run the application only when you want to execute it; in that case a Pod is suitable. An example is helm test, which is a pod definition that specifies a container with a given command to run.
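A sketch of such a one-shot test Pod (the name, image and command are illustrative; the target URL is hypothetical). It runs once and is never restarted or replaced:
apiVersion: v1
kind: Pod
metadata:
  name: smoke-test
spec:
  restartPolicy: Never       # run once; don't bring it back
  containers:
  - name: check
    image: curlimages/curl
    command: ["curl", "-fsS", "http://my-app/healthz"]   # hypothetical endpoint to probe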
I am also a beginner in k8s so correct me if I am wrong.
We know that a pod is created when we create a deployment. What I observed is that if you look at the YAML file of the deployment, you can see its kind: Deployment. But if you look at the YAML file of the pod, you see its kind: Pod.