How to clean up Kubernetes ConfigMaps that are no longer used by Pods

We use kustomize to create a unique ConfigMap for our Deployments whenever the ConfigMap data changes. We are now left with a number of old ConfigMaps that are no longer in use by any Pods. I can find them in Rancher, but that's a pain. How can I automate cleaning up the ConfigMaps that are no longer used by any Pods?
I've tried running: kubectl get configmaps --namespace mynamespace --output=json
I was hoping to see a reverse reference to the Pod that's using it - but I can't find the right info in there.

If your ConfigMaps can be identified using a label, you can just use the --prune flag of kubectl apply to get rid of the dangling resources. If you add it to your deployment pipelines, the orphaned resources will gradually be cleaned out of the cluster.
See this comment for how people are using this in conjunction with kustomize.
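As a minimal sketch, assuming your rendered manifests live in a manifests/ directory and everything they create carries a label such as app=myapp (both names are illustrative):
# Apply the current manifests; any labelled resource in the cluster that is
# no longer present in the applied set (e.g. an old ConfigMap) gets deleted.
kubectl apply -f manifests/ \
  --namespace mynamespace \
  -l app=myapp \
  --prune
Note that --prune is still considered alpha, so try it against a non-production namespace first.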

Related

Is kubectl apply safe enough to update all pods no matter how they were created?

A Pod can be created by a Deployment, ReplicaSet, or DaemonSet. If I am updating a Pod's container spec, is it OK for me to simply modify the YAML file that created the Pod, and would that cause any problems?
Brief Question:
Is kubectl apply -f xxx.yml the silver bullet for all Pod updates?
...if I am updating a pod's container specs, is it OK for me to simply modify the yaml file that created the pod?
Since the Pod spec is part of the controller spec (e.g. Deployment, DaemonSet), to update the container spec you naturally start with the controller spec. Also, a running Pod is largely immutable; there isn't much you can change directly unless you do a replace, which is what the controller is already doing for you.
You should not make changes to the Pods directly; instead, update the spec.template.spec section of the Deployment used to create the Pods.
The reason for this is that the Deployment is the controller that manages the ReplicaSets and therefore the Pods created for your application. If you apply changes to the Pod manifest directly and something like a Pod rescheduling or restart happens, those changes will be lost, because the ReplicaSet will recreate the Pod according to its own specification, not the specification of the last running Pod.
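For example (a sketch with placeholder names), you change the container spec on the Deployment and let it roll the Pods for you:
# Update a container image in the Deployment's pod template;
# the Deployment rolls out new Pods with the updated spec.
kubectl set image deploy/<deployment_name> <container_name>=<new_image> --namespace <namespace>
# Or edit spec.template.spec interactively.
kubectl edit deploy/<deployment_name> --namespace <namespace>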
You are safe to use kubectl apply to apply changes to existing resources, but if you are unsure, you can always extract the current state of the Deployment from Kubernetes and redirect the output into a YAML file as a backup:
kubectl get deploy/<name> --namespace <namespace> -o yaml > deploy.yaml
Another option is to use the built-in rollback mechanism of Kubernetes to restore a previous revision of your Deployment. See https://learnk8s.io/kubernetes-rollbacks for more information on that.
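A short sketch of that rollback flow (placeholder names):
# List the recorded revisions of the Deployment.
kubectl rollout history deploy/<deployment_name> --namespace <namespace>
# Roll back to the previous revision, or pass --to-revision=<n> for a specific one.
kubectl rollout undo deploy/<deployment_name> --namespace <namespace>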

Recreate Pod managed by a StatefulSet with a fresh PersistentVolume

On an occasional basis I need to perform a rolling replacement of all Pods in my StatefulSet such that all PVs are also recreated from scratch. The reason is to get rid of the underlying hard drives that still use an old version of the encryption key. This operation should not be confused with a regular rolling upgrade, during which I still want the volumes to survive Pod terminations. The best routine I have figured out so far is the following:
1. Delete the PV.
2. Delete the PVC.
3. Delete the Pod.
4. Wait until all deletions complete.
5. Manually recreate the PVC deleted in step 2.
6. Wait for the new Pod to finish streaming data from other Pods in the StatefulSet.
7. Repeat from step 1 for the next Pod.
I'm not happy about step 5. I wish the StatefulSet recreated the PVC for me, but unfortunately it does not. I have to do it myself, otherwise Pod creation fails with the following error:
Warning FailedScheduling 3s (x15 over 15m) default-scheduler persistentvolumeclaim "foo-bar-0" not found
Is there a better way to do that?
I just recently had to do this. The following worked for me:
# Delete the PVC
$ kubectl delete pvc <pvc_name>
# Delete the underlying statefulset WITHOUT deleting the pods
$ kubectl delete statefulset <statefulset_name> --cascade=false
# Delete the pod with the PVC you don't want
$ kubectl delete pod <pod_name>
# Apply the statefulset manifest to re-create the StatefulSet,
# which will also recreate the deleted pod with a new PVC
$ kubectl apply -f <statefulset_yaml>
This is described in https://github.com/kubernetes/kubernetes/issues/89910. The workaround proposed there, deleting the new Pod while it is stuck in Pending, works: the second time it gets replaced, a new PVC is created. It was marked as a duplicate of https://github.com/kubernetes/kubernetes/issues/74374 and reported as potentially fixed in 1.20.
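In other words, once the replacement Pod is stuck in Pending because its PVC is missing (placeholder names below):
# Delete the Pending Pod; when the StatefulSet controller recreates it a
# second time, it also creates a fresh PVC for it.
kubectl delete pod <pod_name> --namespace <namespace>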
It seems like you're using persistent volumes the wrong way. They are designed to keep data across roll-outs, not to be deleted. There are other ways to renew the keys: you can use a Kubernetes Secret (or ConfigMap) to mount the key into the Pod, and then you only need to recreate the Secret during a rolling update.
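A minimal sketch of that approach, with an illustrative Secret name and mount path added to the pod template:
# Excerpt from a pod template; the Secret and path names are made up.
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeMounts:
    - name: encryption-key
      mountPath: /etc/keys
      readOnly: true
  volumes:
  - name: encryption-key
    secret:
      secretName: disk-encryption-key
Rotating the key then means updating the Secret and triggering a normal rolling restart, with the PVs left untouched.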

Delete namespace and also delete Helm deployment

I deploy my Helm releases to isolated namespaces.
Deleting a namespace deletes all the resources in it - except the Helm deployment.
Deleting a Helm deployment deletes all resources in it - except the namespace.
I have to do this, which seems redundant:
helm del `helm ls NAMESPACE --short` --purge
kubectl delete namespace NAMESPACE
I'd rather just delete my namespace and have the Helm deployment purged along with it - is this possible?
Deleting a namespace deletes all the resources in it - except the Helm deployment
This can't be the case (deleting a namespace deletes everything in it; there are no exceptions), and it must mean that the state representing Helm's concept of a deployment doesn't live in that namespace. Helm stores this state as ConfigMaps in the TILLER_NAMESPACE. See here and here.
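For example, with Helm 2 and Tiller running in the default kube-system namespace, you can list those release records (OWNER=TILLER is the label Tiller puts on its ConfigMaps):
# Helm 2 release state lives as ConfigMaps in Tiller's namespace.
kubectl get configmaps --namespace kube-system -l OWNER=TILLER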
It's not surprising that if you create some resources with Helm and then go "under the hood" and delete those resources directly with kubectl, Helm's view of the world is unchanged and the deployment does not disappear from its records.
Deleting a Helm deployment deletes all resources in it - except the namespace
That sounds like expected behaviour. Presumably you created the namespace out of band with kubectl; it's not part of your Helm deployment, so deleting the Helm deployment won't delete that namespace.
If you kubectl create namespace NS and helm install CHART --namespace NS then it's not surprising that to clean up, you need to helm delete the release and then kubectl delete the namespace.
The only way I could imagine to do that would be for the Helm chart itself to both create a namespace and create all subsequent namespace-scoped resources within that namespace. Here is an example that appears to do such a thing.
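Roughly, and not taken from the linked example (the value and file names here are made up), such a chart would template a Namespace object and set metadata.namespace explicitly on its other resources:
# templates/namespace.yaml (illustrative)
apiVersion: v1
kind: Namespace
metadata:
  name: {{ .Values.targetNamespace }}
# templates/deployment.yaml (excerpt, illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: {{ .Values.targetNamespace }}
Deleting the release then removes the Namespace object as well, and the namespace deletion cascades to everything left inside it.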
There is an open issue about cleaning up all resources deployed from Helm; follow the link: https://github.com/helm/helm/issues/1464.
Hopefully it will be addressed in a future release.

Kubernetes: increase storage for a Pod

My Jenkins Nexus Pod has run out of disk space and I need to increase the size of its PersistentVolumeClaim.
I can see the YAML for it in the Kubernetes dashboard; however, when I try to change it I get: PersistentVolumeClaim "jenkins-x-nexus" is invalid: spec: Forbidden: field is immutable after creation
Deleting the pod and quickly trying to update the yaml doesn't work either.
Our version of Kubernetes (1.8) doesn't have kubectl stop, so is there a way to stop the replication controller in order to change the YAML?
Our version of Kubernetes (1.8) doesn't have kubectl stop, so is there a way to stop the replication controller in order to change the YAML?
You can scale the RC to 0 replicas, and it will stop spawning Pods.
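For example (placeholder names):
# Scale the ReplicationController to zero so it stops recreating the Pod.
kubectl scale rc <rc_name> --replicas=0 --namespace <namespace>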
I can see the YAML for it in the Kubernetes dashboard; however, when I try to change it I get: PersistentVolumeClaim "jenkins-x-nexus" is invalid: spec: Forbidden: field is immutable after creation
That message means that you cannot change the size of your volume. There are several tickets on GitHub about that limitation for different types of volumes, that one for example.
So, to change the size, you need to create a new, bigger PVC and somehow migrate your data from the old volume to the new one.
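A minimal sketch of such a replacement claim; the name, storage class, and size are placeholders:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-x-nexus-v2   # new claim, kept alongside the old one until data is migrated
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: <storage_class>
  resources:
    requests:
      storage: 50Gi          # the new, larger size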

Schedule pod on specific node without modifying PodSpec

As a k8s cluster administrator, I want to specify on which nodes (using labels) pods will be scheduled, but without modifying any PodSpec section.
So, nodeSelector, affinity and taints can't be options.
Is there any other solution ?
PS: the reason I can't modify the PodSpec is that the deployed applications come as Helm charts and I don't have control over those files. Moreover, if I changed the PodSpec, the change would be lost on the next release upgrade.
You can use the PodNodeSelector admission controller for this:
This admission controller has the following behavior:
If the Namespace has an annotation with the key scheduler.alpha.kubernetes.io/node-selector, use its value as the node selector.
If the namespace lacks such an annotation, use the clusterDefaultNodeSelector defined in the PodNodeSelector plugin configuration file as the node selector.
Evaluate the pod’s node selector against the namespace node selector for conflicts. Conflicts result in rejection.
Evaluate the pod’s node selector against the namespace-specific whitelist defined in the plugin configuration file. Conflicts result in rejection.
First of all you will need to enable this admission controller. The way to enable it depends on your environment, but it's done via the parameter kube-apiserver --enable-admission-plugins=PodNodeSelector.
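The plugin configuration file mentioned in the excerpt above is referenced from the API server's --admission-control-config-file; a minimal sketch of the PodNodeSelector section (the selector values are illustrative):
podNodeSelectorPluginConfig:
  clusterDefaultNodeSelector: env=shared
  namespace1: env=dev
  namespace2: env=prod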
Then create a namespace and annotate it with whatever node label you want all Pods in that namespace to have:
kubectl create ns node-selector-test
kubectl annotate ns node-selector-test \
scheduler.alpha.kubernetes.io/node-selector=mynodelabel=mynodelabelvalue
To test it you could do something like this:
kubectl run busybox \
-n node-selector-test -it --restart=Never --attach=false --image=busybox
kubectl get pod busybox -n node-selector-test -o yaml
It should output something like this:
apiVersion: v1
kind: Pod
metadata:
name: busybox
....
spec:
...
nodeSelector:
mynodelabel: mynodelabelvalue
Now, unless that label exists on some nodes, this Pod will never be scheduled, so put this label on a node to see it scheduled:
kubectl label node myfavoritenode mynodelabel=mynodelabelvalue
No, there are no other ways to specify on which nodes a Pod will be scheduled; only labels and selectors.
I think the problem with Helm is related to that issue.
For now, the only way is to change the spec, remove the release, and deploy a new one with the updated spec.
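A rough sketch of that flow with Helm 2 commands (placeholder names; whether something like a nodeSelector can be set through values depends entirely on the chart):
# Capture the values currently used by the release.
helm get values <release> > values.yaml
# Edit values.yaml (or the chart itself), then remove the release and
# install it again with the updated spec.
helm delete <release> --purge
helm install <chart> --name <release> --namespace <namespace> -f values.yaml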
UPDATE:
@Janos Lenart's answer provides a way to manage scheduling per namespace. That is a good idea if your releases are already split among namespaces and you don't need to spawn different Pods on different nodes within a single release. Otherwise, you would have to create new releases in new namespaces, and in that case I highly recommend using selectors in the Pod spec instead.