Mount text file greater than 3 MB in Kubernetes for locust

I am using the locust helm chart. I need to perform load testing by supplying a text file whose size is greater than 3 MB. ConfigMaps do not work with such big files. I tried splitting my file into smaller chunks of less than 1 MB, but I am still getting the same error I got before when the file was a single huge one: "requested entity too large: limit is 3145728". Is there any option available?

Put the file somewhere accessible via the network and curl it down to a shared emptyDir volume using an initContainer. You can also use a ROX volume that you've populated manually if your hosting environment/provider offers those.
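For illustration, a minimal sketch of that approach, assuming the file lives at a placeholder URL and that you can add extra volumes and initContainers to the locust pod spec (the names and paths below are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: locust-loadtest
spec:
  volumes:
    - name: test-data
      emptyDir: {}
  initContainers:
    - name: fetch-test-data
      image: curlimages/curl
      # Download the large file into the shared emptyDir before the main container starts
      # (URL and file name are placeholders)
      command: ["sh", "-c", "curl -fsSL -o /data/payload.txt https://example.com/payload.txt"]
      volumeMounts:
        - name: test-data
          mountPath: /data
  containers:
    - name: locust
      image: locustio/locust
      volumeMounts:
        - name: test-data
          mountPath: /data
          readOnly: true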

You got this error because of size limits on ConfigMaps: the documented ConfigMap limit is 1 MiB, while the 3145728 bytes (3 MiB) in the error message comes from the API server's request size limit. According to the official ConfigMap documentation:
A ConfigMap is not designed to hold large chunks of data. The data stored in a ConfigMap cannot exceed 1 MiB. If you need to store settings that are larger than this limit, you may want to consider mounting a volume or use a separate database or file service.
At this point I can suggest you check @coderanger's answer - that's the only available option for you.
To read and check:
1)Configure a Pod to Use a Volume for Storage
2)Great SO answer Make large static data files available to kubernetes pods

You may also place the file on a persistent volume and mount the volume into the pod. Alternatively, you can use a hostPath volume, but your pod will then have to be scheduled on that specific host.

You can use a PVC or use the node's path (hostPath) for storing the file.
PVC: https://cloud.google.com/kubernetes-engine/docs/concepts/persistent-volumes
If you are using the hostPath method, you have to make sure your pod gets scheduled on the same node or host every time. For that you can use node assignment: https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/
Or you can use taints & tolerations.
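A minimal sketch of the hostPath route with the pod pinned to one node (the hostname label value, path, and names are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: locust-master
spec:
  # Pin the pod to one specific node so the hostPath data is always available to it
  nodeSelector:
    kubernetes.io/hostname: node-1
  volumes:
    - name: test-data
      hostPath:
        path: /mnt/data/locust
        type: Directory
  containers:
    - name: locust
      image: locustio/locust
      volumeMounts:
        - name: test-data
          mountPath: /data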

Related

Kubernetes configMap or persistent volume?

What is the best approach to passing multiple configuration files into a POD?
Assume that we have a legacy application that we have to dockerize and run in a Kubernetes environment. This application requires more than 100 configuration files to be passed. What is the best solution to do that? Create hostPath volume and mount it to some directory containing config files on the host machine? Or maybe config maps allow passing everything as a single compressed file, and then extracting it in the pod volume?
Maybe helm somehow allows iterating over some directory and automatically creating one big configMap that will act as a directory?
Any suggestions are welcome.
Create hostPath volume and mount it to some directory containing config files on the host machine
This should be avoided.
Accessing hostPaths may not always be allowed. Kubernetes may use PodSecurityPolicies (soon to be replaced by OPA/Gatekeeper/whatever admission controller you want ...), and OpenShift has similar SecurityContextConstraints objects, allowing you to define policies for which user can do what. As a general rule: accessing hostPaths would be forbidden.
Besides, hostPath devices are local to one of your nodes. You won't be able to schedule your Pod someplace else if there's any outage. Either you've set a nodeSelector restricting its deployment to a single node, and your application would be down as long as your node is. Or there's no placement rule, and your application may restart without its configuration.
Now you could say: "if I mount my volume from an NFS share of some sort, ...". Which is true. But then, you would probably be better using a PersistentVolumeClaim.
Create automatically one big configMap that will act as a directory
This could be an option. Although, as noted by @larsks in comments to your post: beware that ConfigMaps are limited in terms of size, and manipulating large objects (frequent edits/updates) could grow your etcd database size.
If you really have ~100 files, ConfigMaps may not be the best choice here.
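On the Helm side of your question, there is a documented way to pack an entire directory of files from the chart into a single ConfigMap using .Files.Glob and AsConfig. A minimal sketch (the config/ directory and the ConfigMap name are placeholders, and the 1 MiB total limit still applies):

apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-app-config
data:
  # Pack every file under the chart's config/ directory into this ConfigMap
{{ (.Files.Glob "config/*").AsConfig | indent 2 }}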
What next?
There's no single good answer without knowing exactly what we're talking about.
If you want to allow editing those configurations without restarting containers, it would make sense to use some PersistentVolumeClaim.
If that's not needed, ConfigMaps could be helpful, if you can somewhat limit their volume and stick to non-critical data, while Secrets could be used for storing passwords or any sensitive configuration snippet.
Some emptyDir could also be used, assuming you can figure out a way to automate provisioning of those configurations during container startup (e.g. git clone in some initContainer, and/or some shell script contextualizing your configuration based on some environment variables); see the sketch after this list.
If there are files that are not expected to change over time, or whose lifecycle is closely related to that of the application version shipping in your container image: I would consider adding them to my Dockerfile. Maybe even add some startup script -- something you could easily call from an initContainer, generating whichever configuration you couldn't ship in the image.
Depending on what you're dealing with, you could combine PVC, emptyDirs, ConfigMaps, Secrets, git stored configurations, scripts, ...
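As announced above, a minimal sketch of the emptyDir + initContainer option, assuming a hypothetical configuration repository URL, image name, and mount path:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: legacy-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: legacy-app
  template:
    metadata:
      labels:
        app: legacy-app
    spec:
      volumes:
        - name: app-config
          emptyDir: {}
      initContainers:
        - name: fetch-config
          image: alpine/git
          # Clone the configuration repository into the shared emptyDir (URL is a placeholder)
          command: ["git", "clone", "--depth=1", "https://example.com/config-repo.git", "/config"]
          volumeMounts:
            - name: app-config
              mountPath: /config
      containers:
        - name: app
          image: legacy-app:latest   # placeholder image
          volumeMounts:
            - name: app-config
              mountPath: /etc/legacy-app
              readOnly: true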

Kubernetes how different mountPath share data in single pod

I read an article from here in which data is shared within the same Pod between 2 different containers. These 2 containers both have a volumeMount on the name shared-data, but each of them uses a different mountPath.
My question is: if these mountPaths are not the same, how are they sharing data? And what is the path for the volume shared-data? My thought is that both should have the same path in order to share data, and it seems like I've mistaken some concept, but I'm not sure what.
Kubernetes maintains the storage internally. It doesn't have a fixed path that you can see, and it doesn't matter if it gets mounted in the same place in different containers.
By way of analogy, imagine you have an external USB drive. If you've unplugged the drive, it doesn't make sense to ask "what is its path"; and if you plug it in and mount it on /mnt/usb on one machine, that doesn't stop you from mounting it on /home/me/app/data when you plug it into a different machine.
The volume does have a name within its pod (in your example, shared-data). If the volume is backed by a PersistentVolumeClaim that will also have a name. Potentially the matching PersistentVolume is something like an AWS EBS volume, and that will have a name. But none of these names are fixed filesystem paths, and for the most part you can't directly use these to access the file content.
There is only one volume being created, "shared-data", which is declared in the pod and is initially empty:
volumes:
  - name: shared-data
    emptyDir: {}
and shared between these two containers. That volume exists at the pod level, and its existence only depends on the pod, not on the two containers. However, it is bind-mounted by both: meaning whatever you add/edit in the one container or the other will affect the volume (in your case, adding index.html from the debian container). And yes, you can find the path of the volume on the node: /var/lib/kubelet/pods/PODUID/volumes/kubernetes.io~empty-dir/VOLUMENAME. There is a similar question answered here.
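For reference, a minimal sketch of the two-container pod being discussed; it mirrors the nginx/debian shared-volume example from the Kubernetes documentation, with the same mount paths:

apiVersion: v1
kind: Pod
metadata:
  name: two-containers
spec:
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    - name: nginx-container
      image: nginx
      volumeMounts:
        # Same volume, mounted at nginx's web root
        - name: shared-data
          mountPath: /usr/share/nginx/html
    - name: debian-container
      image: debian
      volumeMounts:
        # Same volume, mounted at a different path in this container
        - name: shared-data
          mountPath: /pod-data
      command: ["/bin/sh"]
      args: ["-c", "echo Hello from the debian container > /pod-data/index.html"]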

How can a file inside a pod be copied to the outside?

I have an audit pod, which has logic to generate a report file. Currently, this file is present in the pod itself. I have only one pod having only one replica.
I know, I can run kubectl cp to copy those files from my pod. This command has to be executed on the Kubernetes node itself, but the task is to copy the file from the pod itself due to many restrictions.
I cannot use a Persistent Volume due to restrictions. I checked the Kubernetes API, but couldn't find anything by which I can do a copy.
Is there another way to copy that file out of the pod?
This is a community wiki answer posted to sum up the whole scenario and for better visibility. Feel free to edit and expand on it.
Taking under consideration all the mentioned restrictions:
not supposed to use the Kubernetes volumes
no cloud storage
pod names not accessible to your user
no sidecar containers
the only workaround for your use case is the one you currently use:
a dynamic PV with the annotation "helm.sh/resource-policy": keep
use PVCs and explicitly tell the user not to delete the namespace
If anyone has a better idea, feel free to contribute.
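For reference, a minimal sketch of a PVC carrying the Helm keep annotation mentioned above (the claim name and size are placeholders):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: audit-report-pvc
  annotations:
    # Tell Helm not to delete this PVC when the release is uninstalled or upgraded
    "helm.sh/resource-policy": keep
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi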

kubernetes volume: One shared volume and one dedicated volume between replicated pods

I'm new to Kubernetes and learning it.
I have a Deployment kind of pods with replicas=3.
Is there any way I can mount a separate volume for each pod and one volume for all pods?
Requirements:
Case 1: My application generates a temp file named tempfile.txt, so each of the three replica pods will generate tempfile.txt, but the content might be different. So if I use a shared volume, they will overwrite each other.
Case 2: I have a common file that is not part of the image and that will be used by all pods when starting the application, i.e. copy files from the host to every pod's container.
Thanks in Advance.
There are multiple ways to achieve the first part. Here is mine:
Instead of a deployment, use a statefulSet to create the replicas. StatefulSets allow you to include a volume claim template so that each pod has a volume created along with it; thus each new pod will have a new PV created specifically for it.
This does require your cluster to allow for dynamically provisioned volumes.
Depending on the size of your tempfile.txt, your use case, and your cluster/node configuration, you might also want to consider using a hostPath volume which will use the local storage of your node.
For the second part of your question, using any readWriteMany volume will work (such as any NFS option).
On the note of subPath, this should also work, so long as you define different subPaths for each pod. The example in the link provided by DT does this by creating a subPath based off the pod name.
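A minimal sketch combining both parts, assuming dynamic provisioning is available and that a ReadWriteMany claim named shared-rwx-pvc (e.g. backed by NFS) already exists; all names and sizes are placeholders:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: myapp
spec:
  serviceName: myapp
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: app
          image: myapp:latest   # placeholder image
          volumeMounts:
            # Per-pod volume: each replica gets its own PVC (data-myapp-0, data-myapp-1, ...)
            - name: data
              mountPath: /tmp/work
            # Shared volume: the same ReadWriteMany PVC mounted into every replica
            - name: shared
              mountPath: /shared
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-rwx-pvc   # assumed pre-existing RWX claim
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi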

Where to store files in GKE container?

I'm having trouble understanding where to store files in a GKE container? I've seen the following documentation of the filesystem layout:
https://cloud.google.com/kubernetes-engine/docs/concepts/node-images#file_system_layout
But then there are also Dockerfile examples on the web that copy executable files to other paths not listed in the layout, such as /usr or /go. One of these examples is here:
https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/master/hello-app/Dockerfile
Another question is: if I have runtime code that needs to download certain configuration information after the container starts, can I write the configuration file to the same directory as my executable? Or do I have to choose /etc or /tmp?
And finally, the layout documentation states that /home and /var store data for the lifetime of the boot disk. What does that mean? How does that compare to the lifetime of the pod or the node?
When you want to store something in a container, you can store it either ephemerally or permanently.
To store it ephemerally, just choose a path like /tmp, /var, /opt, etc. (this depends on the container setup as well); once the container is restarted, the information you have is the same as at the moment the container was created, for instance your binary files and initial config files.
To store it permanently, you have to mount a volume: this is backing storage for your container, where a path in the container is linked to external storage. With this, if your container is restarted, the volume will be mounted again once the container is ready and you are not going to lose anything.
In Kubernetes this is called a Persistent Volume, and you can leverage it even if you are on another cloud provider.
Steps to use it:
Define a path where you would mount the volume in your source code, for example /myfiles/private
Create a storage class in your GKE: https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/ssd-pd
Create a Persistent Volume Claim in your GKE: https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/ssd-pd
Relate this storage class to your Kubernetes deployment
Example
Link the volume with your container:
volumeMounts:
  - mountPath: /myfiles/private
    name: any-name-you-want
Relate the persistent volume with your deployment:
volumes:
  - name: any-name-you-want
    persistentVolumeClaim:
      claimName: my-claim-name
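For steps 2 and 3 above, a minimal sketch of an SSD-backed StorageClass and the matching claim, using the in-tree GCE persistent disk provisioner (the names and size are placeholders):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd-storage
provisioner: kubernetes.io/gce-pd   # in-tree GCE persistent disk provisioner
parameters:
  type: pd-ssd
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim-name
spec:
  storageClassName: ssd-storage
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi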
This is really up to you. By default most base images will leave /tmp writeable as per normal. But anything written inside the container will be gone if/when the container restarts for any reason. For something like config data, that might be fine, for a database probably less so. To get more stable storage you need to use a Volume. The exact type to use depends on your environment and how long the data should live. An emptyDir volume lives only as long as the pod but can be shared between containers in the same pod. Beyond that you would probably use a PersistentVolumeClaim to dynamically provision a new Google Cloud disk which will last unless the claim is deleted (or forever depending on your Reclaim setting).