Retrieve output from a completed pod in Kubernetes

How can I retrieve a directory/file from the completed pod of a particular job in Kubernetes?
I am trying to store the container's output file locally.

Kubernetes pods don't maintain state by default; for data persistence you will need persistent volumes (PV/PVC).
To retrieve files/directories from a completed Job, you will have to mount the same persistent volume into another pod after the Job has finished and copy the "output file" from there.
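A minimal sketch of that pattern, assuming a pre-existing PVC named job-output-pvc (all names and images here are hypothetical stand-ins):

apiVersion: batch/v1
kind: Job
metadata:
  name: producer-job
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: busybox
          # stand-in for the real workload that produces the output file
          command: ["sh", "-c", "echo result > /output/output.txt"]
          volumeMounts:
            - name: output
              mountPath: /output
      volumes:
        - name: output
          persistentVolumeClaim:
            claimName: job-output-pvc
---
# Helper pod started after the Job completes, mounting the same PVC
apiVersion: v1
kind: Pod
metadata:
  name: output-reader
spec:
  containers:
    - name: reader
      image: busybox
      command: ["sleep", "3600"]
      volumeMounts:
        - name: output
          mountPath: /output
  volumes:
    - name: output
      persistentVolumeClaim:
        claimName: job-output-pvc

While the helper pod is running, something like kubectl cp output-reader:/output/output.txt ./output.txt copies the file to your machine.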

Thanks for the suggestion!! I finally managed to do it by using an init container along with the main container: one runs the real work I want the Job for, and the other just lets the container sleep. I then mounted an emptyDir (a PVC can of course be used too) and shared it between the main container and the init container.
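For reference, a rough sketch of that approach as described, with hypothetical names and images: the init container does the real work and writes into a shared emptyDir, and the main container just sleeps so the file can be copied out while the pod is still running:

apiVersion: batch/v1
kind: Job
metadata:
  name: job-with-output
spec:
  template:
    spec:
      restartPolicy: Never
      initContainers:
        - name: worker
          image: busybox
          # stand-in for the real work the Job is for
          command: ["sh", "-c", "echo result > /work/output.txt"]
          volumeMounts:
            - name: work
              mountPath: /work
      containers:
        - name: holder
          image: busybox
          # keeps the pod alive long enough to run kubectl cp against it
          command: ["sleep", "600"]
          volumeMounts:
            - name: work
              mountPath: /work
      volumes:
        - name: work
          emptyDir: {}

While the holder container is sleeping, kubectl cp <pod-name>:/work/output.txt ./output.txt retrieves the file.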

Related

Automatically transfer files between containers using Kubernetes

I want to make a container that is able to transfer files between itself and other containers on the cluster. I have multiple containers that are responsible for executing a task, and they are waiting to get an input file to do so. I want a separate container to be responsible for handling files before and after the task is executed by the other containers. As an example:
have all files on the file manager container.
let the file manager container automatically copy a file to a task executing container.
let task executing container run the task.
transfer the output of the task executing container to the file manager container.
And I want to do this automatically, so that for example 400 input files can be processed into output files in this way. What would be the best way to realise such a process with Kubernetes? Where should I start?
A simple approach would be to set up NFS, or to use a managed file system such as AWS EFS.
You can mount the file system or NFS share directly into the Pods using the ReadWriteMany access mode.
ReadWriteMany - multiple Pods can access the same file system.
If you don't want to use a managed service like EFS, you can also set up the storage on Kubernetes yourself; check out MinIO: https://min.io/
All files will be saved in the file system, and each Pod can simply access them from there as required.
You can create different directories to separate the outputs.
If you only need read operations, meaning all Pods only read the files, you can use the ReadOnlyMany access mode instead.
If you are on GCP, you can check out this nice document: https://cloud.google.com/filestore/docs/accessing-fileshares
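As a sketch of the shared-volume idea with an NFS-backed PersistentVolume (the server address, export path, sizes and names are placeholders, not a tested setup):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-files-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  nfs:
    server: 10.0.0.10        # placeholder NFS server address
    path: /exports/shared
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""       # bind to the statically created PV above
  resources:
    requests:
      storage: 10Gi

Both the file manager pod and each task-executing pod can then mount shared-files-pvc at some path (for example /data) and read and write the same directories.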

Does a Kubernetes CronJob spin up a new container to run, or does it just create a new process on an existing one?

For example, if a Kubernetes CronJob runs and saves a file to disk, will it save it in one of the containers my application is running in, or will it save it in a separate CronJob pod and then be destroyed?
A Kubernetes CronJob spins up a new container for every run. If you want to save a file and keep it across runs, use a persistent volume.
A Kubernetes CronJob (CJ) creates a Job, which in turn creates a Pod with a container inside. Containers are ephemeral by nature, meaning that any files/logs generated inside them are tied to the container's life cycle: as long as the container lives, those files/logs exist, and they disappear when the container terminates.
In short, each CronJob run will create its own file. If you want to persist files permanently, attach a PersistentVolume (PV) to the CronJob.
From the CronJob docs:
A CronJob creates Jobs on a repeating schedule.
From the Jobs docs:
A Job creates one or more Pods
In other words, without configuration of a shared volume, the resulting Pod will be completely isolated from the rest of the cluster, and therefore your file will be written to the internal filesystem of the Pod container and destroyed on termination.
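To make the persistent-volume suggestion concrete, a minimal sketch of a CronJob that mounts a PVC (schedule, image and claim name are illustrative, and the PVC is assumed to already exist):

apiVersion: batch/v1
kind: CronJob
metadata:
  name: report-writer
spec:
  schedule: "0 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: writer
              image: busybox
              # appends to a file on the mounted volume instead of the
              # container's ephemeral filesystem, so it survives each run
              command: ["sh", "-c", "date >> /data/runs.log"]
              volumeMounts:
                - name: data
                  mountPath: /data
          volumes:
            - name: data
              persistentVolumeClaim:
                claimName: cron-data-pvc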

What is the best way to mount files generated by one pod on another pod (before it starts) on a different node in GCP?

I have a simple use case. I am trying to deploy two pods on two different nodes in Kubernetes. Pod A is a server which creates a file abc.txt after receiving an API request. I want to mount this abc.txt file onto Pod B.
If the file jhsdiak.conf (the name of this file is randomly generated) is not present on Pod B before it starts, Pod B will create its own default file. Hence, to avoid this, the file has to be mounted onto Pod B before it starts.
Here are the things I have tried
Shared volume using a dynamically provisioned PVC -> This approach works fine if both pods are created on the same node, but not otherwise, since GCE persistent disks don't support ReadWriteMany.
Using kubectl cp to copy the files from Pod A to a host path and then creating ConfigMaps/Secrets to mount onto Pod B -> This approach fails because the name of the file jhsdiak.conf is randomly generated.
InitContainers -> I am not sure how I can use an init container to move files from one pod to another.
Using NFS persistent storage -> I haven't tried it yet, but it seems like a lot of overhead just to move one file between pods.
Is there a better or more efficient way to solve this problem?
A similar solution is to use Cloud Storage for storing your files.
But I have another solution that avoids files altogether: create a Pub/Sub topic and have Pod A push your files to it, then create a pull subscription and have Pod B poll it.
That way you achieve what you want, sending data from A to B, without having to worry about the file system.
If your cluster is Knative-compliant, the eventing solution can help you stay inside the cluster (if that's a requirement).

Using initContainers in a Job to do some stuff after pod initialization

I'm currently trying to create a job in Kubernetes in a project I've just joined.
This job should wait until 2 other pods are up and running, then run a .sh script to create some data for the testers in the application.
I've figured out I should use initContainers. However, there is one point I don't understand.
Why should I include some environment values under the env tag under initContainers in the Job's .yaml description file?
I thought I was just waiting for the pods to be initialised, not creating them again. Am I missing something?
Thanks in advance.
initContainers are like the containers running in the Pod, but they are executed before the ones defined under the spec key containers.
Like the app containers, they share the Pod's namespaces and IPC. The kubelet runs the declared initContainers to completion, one after another, and only once they have all succeeded does it start the containers.
Keep in mind that when you create a Pod you're basically creating an empty container named pause that provides the namespace base for the following containers. So, in the end, an initContainer is not creating a new Pod again; as its name suggests, it's an initializer.
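To make the env question concrete, here is a hypothetical sketch of such a Job: the initContainer only waits until the two Services resolve in cluster DNS, and the env entries merely tell it which names to look for; nothing is created twice (the images, Service names and script path are assumptions):

apiVersion: batch/v1
kind: Job
metadata:
  name: seed-test-data
spec:
  template:
    spec:
      restartPolicy: Never
      initContainers:
        - name: wait-for-deps
          image: busybox:1.28
          env:
            - name: APP_SERVICE
              value: my-app     # hypothetical Service in front of the first pod
            - name: DB_SERVICE
              value: my-db      # hypothetical Service in front of the second pod
          command:
            - sh
            - -c
            - |
              # block until both Service names resolve in cluster DNS
              until nslookup "$APP_SERVICE" && nslookup "$DB_SERVICE"; do
                echo "waiting for dependencies..."
                sleep 5
              done
      containers:
        - name: seed
          image: my-registry/test-data-seeder:latest   # hypothetical image containing the script
          command: ["sh", "/scripts/create-test-data.sh"]

Note that nslookup only confirms the Services exist in DNS; if you need the backing pods to actually be serving, swap in a real readiness check such as polling a health endpoint.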

Difficulty with different Kubernetes pods, run using kubectl apply with the same container image, sharing directories

I am attempting to run two separate pods using the same container image on a cluster by applying a config file. Despite there being no shared or persistent volume, when both pods are active the same directory on both pods is updated with files created by the other pod, and write access changes suddenly. The container being used is the jupyter-docker-stacks jupyter/minimal-notebook image pulled directly from Docker Hub. The pods running this container are created by applying a manifest. The two pods have different labels and names, and a service with a unique name is created for each pod for access.
Do resources for containers persist over time on a cluster, as they do for Docker containers? I cannot find anything equivalent to a --rm flag to use alongside kubectl apply.
Thanks
If you want the pod deleted after the work is completed, you might want to apply a Job instead of a Pod. The idea of a Job in k8s is to launch a pod, do the job, and then have the pod stop. For more info: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/
$ kubectl apply -f <fileName> will create the pod or make some changes to it. If you want to delete a pod created with apply, you must use $ kubectl delete -f <fileName>.
About sharing: if you have 2 separate manifests, you can specify volumeMounts for each container. For more information, please read the documentation relevant to your needs.
Also, as @Kaizhe Huang advised, you can use a Job if you want to execute something one time, or try initContainers if you want to install something in the Pod before the main container runs. More about initContainers here.
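To illustrate the separate-manifests point above, a sketch of two pods from the same image, each declaring its own emptyDir (names and mount path are illustrative); without a shared PersistentVolume, files written in one pod cannot appear in the other:

apiVersion: v1
kind: Pod
metadata:
  name: notebook-a
spec:
  containers:
    - name: notebook
      image: jupyter/minimal-notebook
      volumeMounts:
        - name: scratch
          mountPath: /home/jovyan/work
  volumes:
    - name: scratch
      emptyDir: {}    # per-pod scratch space, deleted with the pod
---
apiVersion: v1
kind: Pod
metadata:
  name: notebook-b
spec:
  containers:
    - name: notebook
      image: jupyter/minimal-notebook
      volumeMounts:
        - name: scratch
          mountPath: /home/jovyan/work
  volumes:
    - name: scratch
      emptyDir: {}    # separate volume, isolated from notebook-a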
You could check the Dockerfile of your image and see whether any VOLUME instructions are declared. If there are, maybe the pods share the same volume on the host. Not sure, but you could check.