How to delete files from EFS mounted into a K8s pod?

I have a kubernetes deployment which generates hundreds of thousands of files. I've mounted an EFS instance into my pod with a persistent volume and persistent volume claim. I tried running my deployment but ran into an issue, and now I need to wipe the persistent volume. What's the best way to do this?
I've tried exec-ing into my pod and running rm -rf, but that didn't seem to make any progress after 30 minutes. I also tried using rsync, but that was also incredibly slow.
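For reference, the rsync approach usually refers to the empty-directory trick, which batches deletions and is often faster than rm -rf; a sketch with illustrative paths (on EFS it is still bounded by per-file network round-trips):

    mkdir -p /tmp/empty
    # Sync an empty directory over the data directory: rsync deletes everything
    # in the target that is absent from the source, i.e. all of it.
    rsync -a --delete /tmp/empty/ /mnt/efs/data/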
Does EFS offer a mechanism to delete files from the console or command line? Does k8s offer a mechanism to wipe a persistent volume (claim)? What's the best way to give my pod a fresh slate to start working with again?
EDIT: I tried deleting and recreating the PVC but that didn't seem to work since my pod crashlooped once the deployment was restarted with the new PVC.
EDIT 2: I was mounting my PVC with a subPath. Changing the subPath gave my pod a fresh directory to work with. This was a nice workaround, but I would still like to delete the old data in the EFS volume so I don't have to pay for it.
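For anyone hitting the same issue, the subPath workaround looks roughly like this; the names and image are illustrative:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: file-generator            # hypothetical name
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: file-generator
      template:
        metadata:
          labels:
            app: file-generator
        spec:
          containers:
          - name: worker
            image: my-worker:latest   # hypothetical image
            volumeMounts:
            - name: efs-data
              mountPath: /data
              subPath: run-2          # changing this value points the pod at a fresh directory
          volumes:
          - name: efs-data
            persistentVolumeClaim:
              claimName: efs-pvc      # hypothetical PVC name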

Related

How to save changes on pod in Kubernetes after pod deployed

I have a Jenkins deployment with one pod. I want to make changes to the pod, for example, to install and set up Maven. I mounted a volume to the pod. But when I restart the pod, changes made with kubectl exec are gone, while changes made in the Jenkins GUI persist. What is the reason behind this, and is there a way to save changes after the pod is deployed?
A kubernetes pod (a docker container in general) is stateless by default. To make it stateful, you need to store the state somewhere: a database, cloud storage, a persistent disk, etc.
In your case you mount a volume into the pod, and the state is restored when you use Jenkins, so here are a few things to check:
is the volume mounted after every deployment/restart?
do you execute the same command manually and in the Jenkins GUI?
do you use the correct mount path when you execute the command manually?
...I mounted a volume to the pod...when I make changes in Jenkins GUI, changes are persistent.
By default, changes made in the Jenkins GUI are saved to the Jenkins home; presumably the location that you have mounted with a persistent volume.
What is the reason behind it,
When your pod goes away, the persistent volume remains in the system. You get your changes back when your pod comes back online and mounts the same volume. This means any changes that were not persisted in the mounted volume will not be retained. It also means that if your new pod cannot mount the same persistent volume for any reason, you lose all the previous changes as well.
...and is there a way to save changes after the pod is deployed?
GUI or kubectl exec: for any change that you want to persist through the Pod lifecycle, ensure the change is always saved to the mounted volume, and that the same volume is always available for a new pod to mount.
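As a minimal sketch, assuming the official Jenkins image with its default home of /var/jenkins_home (the claim name is hypothetical), the relevant pod template fragment might look like:

    containers:
    - name: jenkins
      image: jenkins/jenkins:lts        # assumed image; yours may differ
      volumeMounts:
      - name: jenkins-home
        mountPath: /var/jenkins_home    # changes under this path survive restarts
    volumes:
    - name: jenkins-home
      persistentVolumeClaim:
        claimName: jenkins-home-pvc     # hypothetical claim name

Anything written outside the mounted path, for example tools installed with kubectl exec under /usr or /opt, is lost on restart; that is consistent with the behaviour described above.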

Kubernetes pod went down

I am pretty new to Kubernetes, so I don't have much idea. The other day a pod went down, and I was wondering if I would be able to recover the tmp folder.
So basically I want to know that when a pod in Kubernetes goes down, does it lose access to the "/tmp" folder ?
Unless you configure otherwise, this folder is considered ephemeral storage within the container, and its contents are lost when the container terminates.
This is similar to how you can run a container in docker, write something to the filesystem within the container, then stop and remove the container, start a new one, and find that the file you wrote is no longer there.
If you want to keep the /tmp folder contents between restarts, you'll need to attach a persistent volume and mount it as /tmp within the container, with the caveat that you cannot share that volume across replicas in a deployment unless you use a read-write-many capable filesystem underneath, like NFS.
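A minimal sketch of that pod template fragment, assuming a single replica and an existing claim (image and claim name are illustrative):

    containers:
    - name: app
      image: my-app:latest              # hypothetical image
      volumeMounts:
      - name: tmp-storage
        mountPath: /tmp                 # /tmp contents now survive container restarts
    volumes:
    - name: tmp-storage
      persistentVolumeClaim:
        claimName: tmp-pvc              # hypothetical ReadWriteOnce claim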

Debugging nfs volume "Unable to attach or mount volumes for pod"

I've set up an nfs server that serves a ReadWriteMany PV according to the example at https://github.com/kubernetes/examples/tree/master/staging/volumes/nfs
This setup works fine for me in lots of production environments, but in one specific GKE cluster, mounts stopped working after pods restarted.
From the kubelet logs I see the following repeating many times:
Unable to attach or mount volumes for pod "api-bf5869665-zpj4c_default(521b43c8-319f-425f-aaa7-e05c08282e8e)": unmounted volumes=[shared-mount], unattached volumes=[geekadm-net deployment-role-token-6tg9p shared-mount]: timed out waiting for the condition; skipping pod
Error syncing pod 521b43c8-319f-425f-aaa7-e05c08282e8e ("api-bf5869665-zpj4c_default(521b43c8-319f-425f-aaa7-e05c08282e8e)"), skipping: unmounted volumes=[shared-mount], unattached volumes=[geekadm-net deployment-role-token-6tg9p shared-mount]: timed out waiting for the condition
Manually mounting the nfs on any of the nodes works just fine: mount -t nfs <service ip>:/ /tmp/mnt
How can I further debug the issue? Are there any other logs I could look at besides kubelet?
If the pod gets kicked off the node because the mount is too slow, you may see messages like this in the logs; the kubelet even reports the issue explicitly. Sample kubelet log:
Setting volume ownership for /var/lib/kubelet/pods/c9987636-acbe-4653-8b8d-aa80fe423597/volumes/kubernetes.io~gce-pd/pvc-fbae0402-b8c7-4bc8-b375-1060487d730d and fsGroup set. If the volume has a lot of files then setting volume ownership could be slow, see https://github.com/kubernetes/kubernetes/issues/69699
Cause:
The pod.spec.securityContext.fsGroup setting causes kubelet to run chown and chmod on all the files in the volumes mounted for a given pod. This can be very time consuming for big volumes with many files.
By default, Kubernetes recursively changes ownership and permissions for the contents of each volume to match the fsGroup specified in a Pod's securityContext when that volume is mounted. (From the Kubernetes documentation.)
Solution:
You can deal with it in the following ways:
Reduce the number of files in the volume.
Stop using the fsGroup setting.
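Depending on your Kubernetes version and volume plugin, a third option is the fsGroupChangePolicy field from the securityContext documentation quoted further down; a sketch, with illustrative names:

    apiVersion: v1
    kind: Pod
    metadata:
      name: example                     # hypothetical name
    spec:
      securityContext:
        fsGroup: 1000
        # Skip the recursive chown/chmod when the volume root already has the
        # expected owner and permissions; avoids the slow walk on remounts.
        fsGroupChangePolicy: "OnRootMismatch"
      containers:
      - name: app
        image: my-app:latest            # hypothetical image
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: big-volume-pvc     # hypothetical claim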
Did you specify an NFS version when mounting from the command line? I had the same issue on AKS, but inspired by https://stackoverflow.com/a/71789693/1382108 I checked the NFS versions. I noticed my PV had vers=3. When I tried mounting from the command line with mount -t nfs -o vers=3, the command just hung; with vers=4.1 it worked immediately. I changed the version in my PV and the next Pod worked just fine.
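For reference, pinning the NFS version on the PV can be done with spec.mountOptions; a sketch, with illustrative name, capacity, server, and path:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: nfs-pv                      # hypothetical name
    spec:
      capacity:
        storage: 10Gi
      accessModes:
      - ReadWriteMany
      mountOptions:
      - vers=4.1                        # was vers=3, which hung on this cluster
      nfs:
        server: 10.0.0.10               # illustrative server/service IP
        path: /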

Kubernetes Edit File In A Pod

I have used some bitnami charts in my kubernetes app. In my pod there is a file whose path is /etc/settings/test.html, and I want to override it. When I searched, I figured out that I should mount my file by creating a configmap. But how can I use the created configmap with the existing pod? Many of the examples create a new pod and use the created config map, but I don't want to create a new pod; I want to use the existing pod.
Thanks
Most, if not all, pod spec fields are immutable, meaning that you can't change them without destroying the old pod and creating a new one with the desired parameters. There is no way to edit a pod's volume list without recreating it.
The reason behind this is that pods aren't meant to be immortal. Pods are meant to be temporary units that can be spawned and destroyed according to scheduler needs. In general, you need a workload object that does pod management for you (a Deployment, StatefulSet, Job, or DaemonSet, depending on deployment strategy and application nature).
There are two ways to edit a file in an existing pod: either use kubectl exec and console commands to edit the file in place, or use kubectl cp to copy an already-edited file into the pod. I advise you against both, because neither is permanent. Better to back up the necessary data, switch the workload to a Deployment with one replica, and then mount a configMap as you read on the Internet.
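For completeness, a sketch of the configMap approach once you are on a Deployment; mounting with subPath replaces just the one file rather than shadowing the whole /etc/settings directory (all names are illustrative):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: settings-override             # hypothetical name
    data:
      test.html: |
        <html><body>overridden content</body></html>

and then in the Deployment's pod template:

    containers:
    - name: app
      volumeMounts:
      - name: settings
        mountPath: /etc/settings/test.html
        subPath: test.html                # mounts just this key over the single file
    volumes:
    - name: settings
      configMap:
        name: settings-override

One trade-off of subPath mounts: unlike whole-directory configMap mounts, the file is not updated in place when the ConfigMap changes; the pod must be restarted to pick up new content.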

Kubernetes: fsGroup has different impact on hostPath versus pvc and different impact on nfs versus cifs

Many of my workflows use pod IAM roles. As documented here, I must include fsGroup in order for non-root containers to read the generated identity token. The problem is that when I additionally include PVCs that point to CIFS PVs, the volumes fail to mount because they time out. Seemingly this is because kubelet tries to chown all of the files on the volume, which takes too much time and causes the timeout. Questions…
Why doesn't Kubernetes try to chown all of the files when hostPath is used instead of a PVC? All of the workflows were fine until I switched from hostPath to PVCs, and now the timeout issue happens.
Why does this problem happen on CIFS PVCs but not NFS PVCs? I have noticed that NFS PVCs continue to mount just fine, and the fsGroup seemingly doesn't take effect, as I don't see the group id change on any of the files. However, the CIFS PVCs can no longer be mounted, seemingly due to the timeout issue. If it matters, I am using the native NFS PV and this CIFS flexVolume plugin, both of which have worked great up until now.
Overall, the goal of this post is to better understand how Kubernetes decides when to chown all of the files on a volume when fsGroup is included, in order to make a good design decision going forward. Thanks for any help you can provide!
Kubernetes Chowning Files References
https://learn.microsoft.com/en-us/azure/aks/troubleshooting
Since gid and uid are mounted as root or 0 by default. If gid or uid are set as non-root, for example 1000, Kubernetes will use chown to change all directories and files under that disk. This operation can be time consuming and may make mounting the disk very slow.
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods
By default, Kubernetes recursively changes ownership and permissions for the contents of each volume to match the fsGroup specified in a Pod's securityContext when that volume is mounted. For large volumes, checking and changing ownership and permissions can take a lot of time, slowing Pod startup.
I posted this question on the Kubernetes Repo a while ago and it was recently answered in the comments.
The gist is that fsGroup support is implemented and decided per volume plugin. The NFS plugin ignores it, which is why I have never seen kubelet chown files on NFS PVCs. A FlexVolume plugin can opt out of fsGroup-based permission changes by returning FSGroup false. So that is why kubelet was trying to chown the CIFS PVCs: the FlexVolume plugin I am using does not return fsGroup false.
So, in the end you don't need to worry about this for NFS, and if you are using a FlexVolume plugin for a shared file system, you should make sure it returns fsGroup false if you don't want kubelet to chown all of the files.
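Based on that answer, a FlexVolume driver (often a shell script) would opt out in its init handler; a sketch, assuming the capability key is spelled fsGroup as described above:

    # init is the first call kubelet makes to a FlexVolume driver.
    case "$1" in
      init)
        # Advertising "fsGroup": false tells kubelet to skip the recursive
        # chown/chmod for volumes handled by this driver.
        echo '{"status": "Success", "capabilities": {"attach": false, "fsGroup": false}}'
        exit 0
        ;;
    esac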