Kubernetes delete POD with hostPath data files - kubernetes

I am new to Kubernetes. I create a Pod at runtime to push data, and after pushing and collecting the data I delete the Pod.
For processing the files I have an SSD attached, and I assigned its path as hostPath: /my-drive/example when creating the Pod. When I run the Pod I can see the files in the defined path.
Now I want to delete the files created by the Pod in the hostPath directory when the Pod is deleted. Is that possible?
My Pod manifest looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: pod-example
  labels:
    app: pod-example
spec:
  containers:
  - name: pod-example
    image: "myimage.com/abcd:latest"
    imagePullPolicy: Always
    workingDir: /pod-example
    env:
    volumeMounts:
    - name: "my-drive"
      mountPath: "/my-drive"
  volumes:
  - name: "my-drive"
    persistentVolumeReclaimPolicy: Recycle
    hostPath:
      path: /my-drive/example
  restartPolicy: Never
  imagePullSecrets:
  - name: regcred
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: "kubernetes.io/hostname"
            operator: In
            values:
            - my-node
        topologyKey: "kubernetes.io/hostname"

You can achieve this by using lifecycle hooks in K8s. Among them, the preStop hook can be used here, since you need to run an action when the Pod is stopping.
Check the docs on lifecycle hooks: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
If you know exactly which files, or say a directory, need to be deleted, you can use the Exec hook handler. Check the sample below that I've added for your reference.
lifecycle:
  preStop:
    exec:
      command:
      - "sh"
      - "-c"
      - |
        echo "Deleting files in my-drive/example/to-be-deleted" > /proc/1/fd/1  # send the preStop hook's stdout to the main process's stdout
        rm -r /my-drive/example/to-be-deleted
P.S. From your problem statement, it seems you are not using the Pod continuously. If the task runs periodically or is otherwise not continuous, I would suggest using a K8s CronJob or Job rather than a bare Pod; see the sketch below.
Make sure the user inside the container has the required access to delete the files/folders.
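As a rough illustration of that suggestion, here is a minimal sketch of a Job that mounts the same hostPath and cleans up after itself; the cleanup path and the /run-my-task.sh command are placeholders, not taken from the question, and this is untested.
apiVersion: batch/v1
kind: Job
metadata:
  name: push-data-job                     # hypothetical name
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: push-data
        image: "myimage.com/abcd:latest"  # image taken from the question
        # Run the workload, then remove whatever it left behind on the host path.
        # /run-my-task.sh stands in for whatever the image normally runs.
        command: ["sh", "-c", "/run-my-task.sh; rm -rf /my-drive/to-be-deleted"]
        volumeMounts:
        - name: my-drive
          mountPath: /my-drive
      volumes:
      - name: my-drive
        hostPath:
          path: /my-drive/example
A CronJob would wrap the same pod template under spec.jobTemplate together with a schedule.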

Update persistentVolumeReclaimPolicy to Delete as shown below
persistentVolumeReclaimPolicy: Delete
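For context, and as an assumption about the setup (the question uses an inline hostPath volume, not a PersistentVolume): persistentVolumeReclaimPolicy is a field of a PersistentVolume object, not of an entry under a Pod's volumes. A minimal sketch of where the field would live:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-drive-pv                  # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  # Delete is only honored by volume plugins that support deletion (e.g. cloud disks);
  # a hostPath PersistentVolume will not remove files from the host by itself.
  persistentVolumeReclaimPolicy: Delete
  hostPath:
    path: /my-drive/example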

Related

Why is a directory created by hostPath when no type is specified?

According to the Kubernetes documentation, we should specify type: DirectoryOrCreate if we want the directory to be created on the host. The default, when type is unset, is that "no checks will be performed before mounting the hostPath volume".
However, I am seeing the directory get created on the host even when no type is specified:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox-user-hostpath
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox-user-local-storage1
  template:
    metadata:
      labels:
        app: busybox-user-local-storage1
    spec:
      containers:
      - name: busybox
        image: busybox:latest
        command: ["/bin/sh", "-ec", "while :; do echo $(date '+%Y-%m-%d %H:%M:%S') deployment1 >> /home/test.txt; sleep 5 ; done"]
        volumeMounts:
        - name: busybox-hostpath
          mountPath: /home
      volumes:
      - name: busybox-hostpath
        hostPath:
          path: /home/maintainer/data
The /home/maintainer/data directory did not exist before running the pod. After deployment, I can see that the directory is created. This goes against the documentation, unless I am missing something. I was expecting the pod to crash, but I can see the files being created. Any ideas, please?
This goes back in time, before type was even implemented for the hostPath volume. When unset, the behaviour simply defaults to creating an empty directory. It is a backward-compatible implementation: before the field existed nobody had the option to add type, and forcing an error when it is not defined would have broken all previously created pods that omit it. You can take a look at the actual design proposal: https://github.com/kubernetes/design-proposals-archive/blob/main/storage/volume-hostpath-qualifiers.md#host-volume
The design proposal clearly specifies that "unset - If nothing exists at the given path, an empty directory will be created there. Otherwise, behaves like exists".
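If the goal is the opposite behaviour, i.e. to fail instead of silently creating the directory, a minimal sketch is to set type: Directory explicitly so the kubelet verifies that the path already exists before mounting (the snippet below just applies the documented type field to the volume from the deployment above; it is not part of the original question):
      volumes:
      - name: busybox-hostpath
        hostPath:
          path: /home/maintainer/data
          type: Directory            # must already exist on the host, otherwise the mount fails and the pod cannot start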

kubectl copy logs from pod when terminating

We are trying to get the logs of pods after multiple restarts, but we don't want to use any external solution like EFK.
I tried the config below but it's not working. Does the command below run at the pod level, or will it run at the node level?
lifecycle:
  preStop:
    exec:
      command: ["/bin/sh", "-c", "kubectl logs appworks-0 > /container-stoped.txt"]
Regarding "does the command below run at the pod level or at the node level": it will run at the Pod level, not at the node level.
You can use hostPath in the Pod configuration:
apiVersion: v1
kind: Pod
metadata:
  name: test-pd
spec:
  containers:
  - image: alpine
    name: test-container
    command: ["tail"]
    args: ["-f", "/dev/null"]
    volumeMounts:
    - mountPath: /host
      name: test-volume
  volumes:
  - name: test-volume
    hostPath:
      path: /
      type: Directory
hostPath will create a directory at the node level and the logs will be saved there. If you don't want this approach you can still add your lifecycle-hook solution, but when you can write application logs directly to the host there is no need for an extra lifecycle hook.
Note: if your node goes down, you will lose the logs stored in hostPath or emptyDir volumes.
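As a variant of the example above (an assumption, not part of the original answer): instead of mounting the host's root filesystem, you could mount a dedicated log directory with type: DirectoryOrCreate and have the application write its logs there.
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app          # where the application writes its logs
  volumes:
  - name: app-logs
    hostPath:
      path: /var/log/my-app            # hypothetical directory on the node
      type: DirectoryOrCreate          # created on the node if it does not already exist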

Copy file inside Kubernetes pod from another container

I need to copy a file into my pod at creation time. I don't want to use a ConfigMap or Secrets. I am trying to create a volumeMount and copy the source file using the kubectl cp command. My manifest looks like this:
apiVersion: v1
kind: Pod
metadata:
  name: copy
  labels:
    app: hello
spec:
  containers:
  - name: init-myservice
    image: bitnami/kubectl
    command: ['kubectl', 'cp', './test.json', 'init-myservice:./data']
    volumeMounts:
    - name: my-storage
      mountPath: data
  - name: init-myservices
    image: nginx
    volumeMounts:
    - name: my-storage
      mountPath: data
  volumes:
  - name: my-storage
    emptyDir: {}
But I am getting a CrashLoopBackOff error. Any help or suggestion is highly appreciated.
It's not possible.
Let me explain: you need to think of it as two different machines. Normally your local machine is the one where the file exists, and you copy it to another machine with cp. Here, though, you are trying to copy a file from your machine into the pod's filesystem from inside the pod itself, and that won't work.
What you can do instead is build your own Docker image for the init container and copy the file you want into it before building the image (see the sketch below). The init container can then copy that file into the shared volume where you want it stored.
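A minimal sketch of that idea, with hypothetical names (my-registry/file-provider is a placeholder for an image whose Dockerfile simply COPYs test.json to /test.json): an init container copies the baked-in file into the shared emptyDir, and the main container reads it from there.
apiVersion: v1
kind: Pod
metadata:
  name: copy
spec:
  initContainers:
  - name: provide-file
    image: my-registry/file-provider:latest   # hypothetical image containing /test.json
    command: ['sh', '-c', 'cp /test.json /data/test.json']
    volumeMounts:
    - name: my-storage
      mountPath: /data
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: my-storage
      mountPath: /data
  volumes:
  - name: my-storage
    emptyDir: {}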
I agree with the answer provided by H.R. Emon; it explains why you can't just run kubectl cp inside of the container. I also think there are some resources worth adding that show how you can tackle this particular setup.
For this particular use case it is recommended to use an initContainer.
initContainers - specialized containers that run before app containers in a Pod. Init containers can contain utilities or setup scripts not present in an app image.
Kubernetes.io: Docs: Concepts: Workloads: Pods: Init-containers
You could use the example from the official Kubernetes documentation (assuming that downloading your test.json is feasible):
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    volumeMounts:
    - name: workdir
      mountPath: /usr/share/nginx/html
  # These containers are run during pod initialization
  initContainers:
  - name: install
    image: busybox
    command:
    - wget
    - "-O"
    - "/work-dir/index.html"
    - http://info.cern.ch
    volumeMounts:
    - name: workdir
      mountPath: "/work-dir"
  dnsPolicy: Default
  volumes:
  - name: workdir
    emptyDir: {}
-- Kubernetes.io: Docs: Tasks: Configure Pod Initialization: Create a pod that has an initContainer
You can also modify above example to your specific needs.
Also, referring to your particular example, there are some things that you will need to be aware of:
To use kubectl inside of a Pod you will need the required permissions to access the Kubernetes API. You can do this with a serviceAccount that has the appropriate permissions (a sketch follows after these links). More can be found here:
Kubernetes.io: Docs: Reference: Access authn authz: Authentication: Service account tokens
Kubernetes.io: Docs: Reference: Access authn authz: RBAC
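A minimal sketch of such a ServiceAccount setup (all names are placeholders; the rules cover reading pod logs plus the exec call that kubectl cp is built on, which is what in-cluster kubectl usage of this kind typically needs):
apiVersion: v1
kind: ServiceAccount
metadata:
  name: in-pod-kubectl               # hypothetical name
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["pods/exec"]           # kubectl cp is implemented on top of exec
  verbs: ["create"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: in-pod-kubectl-binding
subjects:
- kind: ServiceAccount
  name: in-pod-kubectl
  namespace: default                 # adjust to your namespace
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
The Pod would then reference it via spec.serviceAccountName: in-pod-kubectl.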
Your bitnami/kubectl container runs into CrashLoopBackOff because you are passing it a single command that runs to completion. After that the container reports status Completed and is restarted, which results in the aforementioned CrashLoopBackOff. To avoid that you would need to use an initContainer.
You can read more about what is happening in your setup by following this answer (connected with previous point):
Stackoverflow.com: Questions: What happens one of the container process crashes in multiple container POD?
Additional resources:
Kubernetes.io: Pod lifecycle
A side note!
I also think it is important to include the reason why Secrets and ConfigMaps cannot be used in this particular setup.

Volume shared between two containers "is busy or locked"

I have a deployment that runs two containers. One of the containers attempts to build (during deployment) a javascript bundle that the other container, nginx, tries to serve.
I want to use a shared volume to place the javascript bundle after it's built.
So far, I have the following deployment file (with irrelevant pieces removed):
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    ...
    spec:
      hostNetwork: true
      containers:
      - name: personal-site
        image: wheresmycookie/personal-site:3.1
        volumeMounts:
        - name: build-volume
          mountPath: /var/app/dist
      - name: nginx-server
        image: nginx:1.19.0
        volumeMounts:
        - name: build-volume
          mountPath: /var/app/dist
      volumes:
      - name: build-volume
        emptyDir: {}
To the best of my ability, I have followed these guides:
https://kubernetes.io/docs/concepts/storage/volumes/#emptydir
https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/
One other thing to point out is that I'm trying to run this locally at the moment, using minikube.
EDIT: The Dockerfile I used to build this image is:
FROM node:alpine
WORKDIR /var/app
COPY . .
RUN npm install
RUN npm install -g @vue/cli@latest
CMD ["npm", "run", "build"]
I realize that I do not need to build this when I actually run the image, but my next goal is to insert pod instance information as environment variables, so with javascript unfortunately I can only build once that information is available to me.
Problem
The logs from the personal-site container reveal:
- Building for production...
ERROR Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
Error: EBUSY: resource busy or locked, rmdir '/var/app/dist'
I'm not sure why the build is trying to remove /dist, but I also have a feeling that this is irrelevant. I could be wrong?
I thought that maybe this could be related to the lifecycle of containers/volumes, but the docs suggest that "An emptyDir volume is first created when a Pod is assigned to a Node, and exists as long as that Pod is running on that node".
Question
What are some reasons that a volume might not be available to me after the containers are already running? Given that you probably have much more experience than I do with Kubernetes, what would you look into next?
The best way is to customize your image's entrypoint as follows:
Once you finish building the /var/app/dist folder, copy (or move) it to another, empty path (e.g. /opt/dist):
cp -r /var/app/dist/* /opt/dist
PAY ATTENTION: this step must be done in the ENTRYPOINT script, not in a RUN layer.
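For illustration only, a sketch of what such an entrypoint script could look like for the Dockerfile shown in the question (the file name docker-entrypoint.sh and the final tail command are assumptions, not taken from the actual image):
#!/bin/sh
# docker-entrypoint.sh (hypothetical)
set -e
npm run build                      # the build still writes to /var/app/dist inside the container filesystem
mkdir -p /opt/dist
cp -r /var/app/dist/* /opt/dist/   # /opt/dist is where the shared emptyDir is mounted
exec tail -f /dev/null             # placeholder to keep the container running
In the Dockerfile this would replace the CMD line with something like ENTRYPOINT ["/docker-entrypoint.sh"].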
Now use /opt/dist instead:
apiVersion: apps/v1
kind: Deployment
metadata:
  ...
spec:
  ...
  template:
    ...
    spec:
      hostNetwork: true
      containers:
      - name: personal-site
        image: wheresmycookie/personal-site:3.1
        volumeMounts:
        - name: build-volume
          mountPath: /opt/dist    # <--- make it consistent with the image's entrypoint logic
      - name: nginx-server
        image: nginx:1.19.0
        volumeMounts:
        - name: build-volume
          mountPath: /var/app/dist
      volumes:
      - name: build-volume
        emptyDir: {}
Good luck!
If it's not clear how to customize the entrypoint, share your image's entrypoint with us and we will implement it.

Can I have different host mounts in pods from the same deployment?

For logs, I mount a volume from the host onto the pod. This is written in the deployment YAML.
But if my two pods run on the same host, there will be a conflict, as both pods will produce log files with the same name.
Can I use some dynamic variable in the deployment file so that the mount on the host is created with a different name for each pod?
You can use subPathExpr to achieve uniqueness in the absolute path; this is one of the use cases for this feature. As of now it is alpha in Kubernetes 1.14.
In this example, a Pod uses subPathExpr to create a directory pod1 within the hostPath volume /var/log/pods, using the pod name from the Downward API. The host directory /var/log/pods/pod1 is mounted at /logs in the container.
apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  containers:
  - name: container1
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    image: busybox
    command: [ "sh", "-c", "while [ true ]; do echo 'Hello'; sleep 10; done | tee -a /logs/hello.txt" ]
    volumeMounts:
    - name: workdir1
      mountPath: /logs
      subPathExpr: $(POD_NAME)
  restartPolicy: Never
  volumes:
  - name: workdir1
    hostPath:
      path: /var/log/pods
Look at pod affinity/anti-affinity to avoid scheduling replicas on the same node. That way each replica of a given deployment is deployed on a separate node, and you will not have to worry about multiple pods using the same folder. A sketch follows below.
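A minimal sketch of that anti-affinity approach, assuming the deployment's pods carry a label such as app: my-logger (the label is illustrative, not from the question):
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: my-logger                        # hypothetical label on the deployment's pods
            topologyKey: "kubernetes.io/hostname"     # at most one such pod per node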
I had to spend hours on this; your solution worked like a charm!
I had tried the following, and none of them worked despite appearing in multiple documents:
subPathExpr: "$POD_NAME"
subPathExpr: $POD_NAME
subPathExpr: ${POD_NAME}
Finally this worked: subPathExpr: $(POD_NAME)