Persistent volume with multiple local disks - Kubernetes

I have a home Kubernetes cluster with multiple SSDs attached to one of the nodes.
I currently have one persistent volume per mounted disk. Is there an easy way to create a persistent volume that can access data from multiple disks? I thought about symlinks, but that doesn't seem to work.

You would have to combine them at a lower level. The simplest approach would be Linux LVM, but there is a wide range of storage strategies. Kubernetes orchestrates mounting volumes, but it is not a storage management solution itself; it only handles the last-mile bits.
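For example, a minimal LVM sketch (assuming two spare disks, /dev/sdb and /dev/sdc; the device and volume names are illustrative) that combines the disks into one logical volume you can then expose as a single local PersistentVolume:
# Turn each disk into an LVM physical volume (this destroys existing data on them)
sudo pvcreate /dev/sdb /dev/sdc
# Group them into a single volume group
sudo vgcreate k8s-local /dev/sdb /dev/sdc
# Create one logical volume spanning all free space in the group
sudo lvcreate -l 100%FREE -n data k8s-local
# Format and mount it; the mount point can back a single PV
sudo mkfs.ext4 /dev/k8s-local/data
sudo mkdir -p /mnt/k8s-data
sudo mount /dev/k8s-local/data /mnt/k8s-data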

As coderanger already mentioned, Kubernetes does not manage your storage at a lower level. While cloud solutions may have provisioners that do some of that work for you, on bare metal there aren't any.
The closest thing that helps you manage local storage is the local volume static provisioner.
The local volume static provisioner manages the PersistentVolume lifecycle for pre-allocated disks by detecting and creating PVs for each local disk on the host, and cleaning up the disks when released. It does not support dynamic provisioning.
Have a look at this article for more examples.
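As a rough sketch of how the provisioner is usually fed disks (device names and the discovery directory /mnt/disks are illustrative; the provisioner's documentation describes the exact layout): mount each pre-formatted disk under one discovery directory, and the provisioner creates one PV per mount point it finds there.
# Mount each pre-formatted disk under the discovery directory
sudo mkdir -p /mnt/disks/ssd1 /mnt/disks/ssd2
sudo mount /dev/sdb /mnt/disks/ssd1
sudo mount /dev/sdc /mnt/disks/ssd2
# Once the provisioner is deployed and pointed at /mnt/disks,
# one PersistentVolume should appear per mounted disk
kubectl get pv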

I have a trick that works for me.
You can mount these disks under a directory like /disks/, then create a loop filesystem, mount it, and make symbolic links from the disks into the loop filesystem.
For example:
touch ~/disk-bunch1 && truncate -s 32M ~/disk-bunch1 && mke2fs -t ext4 -F ~/disk-bunch1
Mount it and make symbolic links from the disks into the loop filesystem:
mkdir -p /local-pv/bunch1 && mount -o loop ~/disk-bunch1 /local-pv/bunch1
ln -s /disks/disk1 /local-pv/bunch1/disk1
ln -s /disks/disk2 /local-pv/bunch1/disk2
Finally, use sig-storage-local-static-provisioner: change "hostDir" to "/local-pv" in the values.yaml and deploy the provisioner. A pod can then use multiple disks through a single PV.
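For example, roughly like this, assuming you are working from a checkout of the sig-storage-local-static-provisioner repository (the chart path and values layout may differ between versions):
# After setting hostDir to /local-pv in helm/provisioner/values.yaml
helm template ./helm/provisioner -f helm/provisioner/values.yaml > provisioner.yaml
kubectl apply -f provisioner.yaml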
But this method has a drawback: when you run "kubectl get pv", the CAPACITY shown is just the size of the loop filesystem instead of the sum of the disk capacities...
By the way, this method is not really recommended... you'd be better off with something like RAID 0 or LVM.

Related

How to see what a k8s container is writing to ephemeral storage

One of our containers is using ephemeral storage but we don't know why. The app running in the container shouldn't be writing anything to the disk.
We set the storage limit to 20MB but the pod is still being evicted. We could increase the limit, but this seems like a band-aid fix.
We're not sure what or where this container is writing to, and I'm not sure how to check that. When a container is evicted, the only information I can see is that the container exceeded its storage limit.
Is there an efficient way to know what's being written, or is our only option to comb through the code?
Adding details to the topic.
Pods use ephemeral local storage for scratch space, caching, and logs.
Pods can be evicted due to other pods filling the local storage, after which new pods are not admitted until sufficient storage has been reclaimed.
The kubelet can provide scratch space to Pods using local ephemeral storage to mount emptyDir volumes into containers.
For container-level isolation, if a container's writable layer and log usage exceeds its storage limit, the kubelet marks the Pod for eviction.
For pod-level isolation the kubelet works out an overall Pod storage limit by summing the limits for the containers in that Pod. In this case, if the sum of the local ephemeral storage usage from all containers and also the Pod's emptyDir volumes exceeds the overall Pod storage limit, then the kubelet also marks the Pod for eviction.
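To see the numbers the kubelet itself uses for these decisions, you can query the kubelet summary stats through the API server proxy (replace <node-name> with one of your node names); the per-pod entries include ephemeral-storage usage:
# Dump the kubelet's summary stats for one node; the JSON contains per-pod
# ephemeral-storage usage (pipe through jq or python3 -m json.tool to pretty-print)
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/stats/summary"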
To see what files have been written since the pod started, you can run:
find / -mount -newer /proc -print
This will output a list of files modified more recently than '/proc'.
/etc/nginx/conf.d
/etc/nginx/conf.d/default.conf
/run/secrets
/run/secrets/kubernetes.io
/run/secrets/kubernetes.io/serviceaccount
/run/nginx.pid
/var/cache/nginx
/var/cache/nginx/fastcgi_temp
/var/cache/nginx/client_temp
/var/cache/nginx/uwsgi_temp
/var/cache/nginx/proxy_temp
/var/cache/nginx/scgi_temp
/dev
Also, try without the '-mount' option.
To see if any new files are being modified, you can run some variations of the following command in a Pod:
while true; do rm -f a; touch a; sleep 30; echo "monitoring..."; find / -mount -newer a -print; done
and check the file size using the du -h someDir command.
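For example, a quick way to rank the biggest directories on the container's root filesystem (this assumes GNU coreutils; minimal busybox images may not support all of these options):
# -x stays on the root filesystem so /proc, /sys and mounted volumes are skipped;
# the last lines of the sorted output are the largest directories
du -xh / 2>/dev/null | sort -h | tail -20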
Also, as gohm'c pointed out in his answer, you can use sidecar/ephemeral debug containers.
Read more about Local ephemeral storage here.
We're not sure what or where this container is writing to, and I'm not sure how to check that.
Try looking into the container's volumeMounts section for anything mounted with emptyDir, then add a sidecar container (e.g. busybox) to start a shell session where you can check the path. If your cluster supports ephemeral debug containers, you don't need the sidecar container.
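If your cluster has ephemeral containers enabled, the debug session looks roughly like this (pod and container names are placeholders):
# Attach an ephemeral busybox container to the running pod, targeting the suspect container
kubectl debug -it <pod-name> --image=busybox --target=<container-name>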

How to mimic Docker ability to pre-populate a volume from a container directory with Kubernetes

I am migrating my previous deployment made with docker-compose to Kubernetes.
In my previous deployment, some containers have data created at build time in certain paths, and those paths are mounted in persistent volumes.
Therefore, as the Docker volume documentation states, the volume (not a bind mount) is pre-populated with the content of the container directory.
I'd like to achieve this behavior with Kubernetes and its persistent volumes. How can I do that? Do I need to add some kind of logic, using scripts, to copy my container's files to the mounted path when the data is not present the first time the container starts?
Possibly related question: Kubernetes mount volume on existing directory with files inside the container
I think your options are:
ConfigMap (is "some data" just configuration files?)
Init containers (as mentioned; see the sketch after this answer)
CSI volume cloning (a clone combined with an init container or your first app container)
There used to be a gitRepo volume; it was deprecated in favour of init containers, from which you can clone your config and data
A hostPath volume mount is an option too
An NFS volume is probably a very reasonable option and similar, approach-wise, to your Docker volumes
Storage types: NFS, iSCSI, awsElasticBlockStore, gcePersistentDisk and others can be pre-populated. There are constraints; NFS is probably the most flexible for sharing bits and bytes.
FYI
subPath might be of interest too depending on your use case, and
PodPreset (since removed from Kubernetes) might help in streamlining the operation across your fleet of pods
HTH
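To illustrate the init container option from the list above, here is a minimal sketch (the image my-app:latest, the paths, and the claim my-claim are placeholders for your own setup): the init container copies the image's build-time data onto the volume only when the volume is still empty, and the app container then mounts the populated volume over the original path.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: prepopulate-demo
spec:
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-claim        # placeholder claim
  initContainers:
    - name: seed-data
      image: my-app:latest         # image that contains the build-time data
      command: ["sh", "-c", "[ -z \"$(ls -A /data)\" ] && cp -a /app/data/. /data/ || true"]
      volumeMounts:
        - name: data
          mountPath: /data
  containers:
    - name: app
      image: my-app:latest
      volumeMounts:
        - name: data
          mountPath: /app/data     # the volume now holds the pre-populated files
EOF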

How to transfer files from container to container within a pod in Kubernetes?

There are 5 containers in my pod in a Kubernetes deployment. I want to transfer files from one container to another.
How do I go about this?
The most common approach would be to use an emptyDir volume and run an init container that spins up the image you want to copy from, mounts the target volume, and performs the copy before the actual containers that form your pod take the same volume and mount it for their own use.
If you need to run the copy (transfer) operation while the pod is running, mount a shared volume (most likely emptyDir as well) in both containers and use it as shared storage space.
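A minimal sketch of the shared-volume approach (image and paths are placeholders): both containers mount the same emptyDir, so anything one of them writes is immediately visible to the other.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo
spec:
  volumes:
    - name: shared
      emptyDir: {}
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo hello > /shared/hello.txt && sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /shared
    - name: reader
      image: busybox
      command: ["sh", "-c", "sleep 5; cat /shared/hello.txt; sleep 3600"]
      volumeMounts:
        - name: shared
          mountPath: /shared
EOF
kubectl logs shared-volume-demo -c reader should then show the file written by the writer container.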
You can do that by using a shared volume.
Follow this

Use SSD or local SSD in GKE cluster

I would like to have Kubernetes use the local SSDs in my Google Kubernetes Engine cluster without using alpha features. Is there a way to do this?
Thanks in advance for any suggestions or your help.
https://cloud.google.com/kubernetes-engine/docs/concepts/local-ssd explains how to use local SSDs on your nodes in Google Kubernetes Engine. Based on the gcloud commands, the feature appears to be beta (not alpha) so I don't think you need to rely on any alpha features to take advantage of it.
You can use local SSDs with your Kubernetes nodes as explained in the documentation below. To create a cluster with local SSD disks:
Visit the Kubernetes Engine menu in GCP Console.
Click Create cluster.
Configure your cluster as desired. Then, from the Local SSD disks (per node) field, enter the desired number of SSDs as an absolute number.
Click Create.
To create a node pool with local SSD disks in an existing cluster:
Visit the Kubernetes Engine menu in GCP Console.
Select the desired cluster.
Click Edit.
From the Node pools menu, click Add node pool.
Configure the node pool as desired. Then, from the Local SSD disks (per node) field, enter the desired number of SSDs as an absolute number.
Click Save.
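The gcloud equivalents, if you prefer the CLI (the cluster, zone and node-pool names are placeholders):
# Create a cluster whose default node pool gets 1 local SSD per node
gcloud container clusters create my-cluster --zone us-central1-a --local-ssd-count 1
# Or add a node pool with local SSDs to an existing cluster
gcloud container node-pools create ssd-pool --cluster my-cluster --zone us-central1-a --local-ssd-count 1
GKE then mounts each local SSD on the node under /mnt/disks/ (ssd0, ssd1, ...), which you can reference from a hostPath or local PersistentVolume.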
Be aware of the disadvantages/limitations of local SSD storage in Kubernetes as explained in this documentation link:
Because local SSDs are physically attached to the node's host virtual machine instance, any data stored in them only exists on that node. As the data stored on the disks is local, you should ensure that your application is resilient to having this data being unavailable.
A Pod that writes to a local SSD might lose access to the data stored on the disk if the Pod is rescheduled away from that node. Additionally, upgrading a node causes the data to be erased.
You cannot add local SSDs to an existing node pool.
The points above are very important if you want high availability in your Kubernetes deployment.
Kubernetes local SSD storage is ephemeral and presents some problems for non-trivial applications when running in containers.
In Kubernetes, when a container crashes, kubelet will restart it, but the files in it will be lost because the container starts with a clean state.
Also, when running containers together in a Pod it is often necessary that those containers share files.
You can use the Kubernetes volume abstraction to solve the above problems, as explained in the Volumes documentation.
If you're looking to run the whole of Docker on SSDs in your Kubernetes cluster, this is how I did it on my node pool (Ubuntu nodes):
Go to Compute Engine > VM Instances
Edit your node to add a new SSD (explained in the first step "Create and attach a persistent disk in the Google Cloud Platform Console" here: https://cloud.google.com/compute/docs/disks/add-persistent-disk)
On your server:
# stop docker
sudo service docker stop
# format and mount disk
sudo mkfs.ext4 -m 0 -F -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/sdb
sudo rm -rf /var/lib/docker
sudo mkdir -p /var/lib/docker
sudo mount -o discard,defaults /dev/sdb /var/lib/docker
sudo chmod 711 /var/lib/docker
# backup and edit fstab
sudo cp /etc/fstab /etc/fstab.backup
echo UUID=`sudo blkid -s UUID -o value /dev/sdb` /var/lib/docker ext4 discard,defaults,nofail 0 2 | sudo tee -a /etc/fstab
# start docker
sudo service docker start
As mentioned by others, you might want to look into the "Local SSD" option provided by GKE first. The reason that option didn't cut it for me was that my nodes needed a single 4TB SSD, and as I understand it, local SSDs come in a fixed size.

glusterfs volume creation failed - brick is already part of volume

In a cloud environment, we have a cluster of GlusterFS nodes (participating in a Gluster volume) and clients (that mount the Gluster volumes). These nodes are created with HashiCorp Terraform.
Once the cluster is up and running, if we want to change the Gluster machine configuration, like increasing the compute size from 4 CPUs to 8 CPUs, Terraform can recreate the nodes with the new configuration. The existing Gluster nodes are destroyed and new instances are created, but with the same IPs. On the newly created instance, the volume creation command fails, saying the brick is already part of a volume.
sudo gluster volume create VolName replica 2 transport tcp ip1:/mnt/ppshare/brick0 ip2:/mnt/ppshare/brick0
volume create: VolName: failed: /mnt/ppshare/brick0 is already part
of a volume
But no volumes are present in this instance.
I understand if I have to expand or shrink volume, I can add or remove bricks from existing volume. Here, I'm changing the compute of the node and hence it has to be recreated. I don't understand why it should say brick is already part of volume as it is a new machine altogether.
It would be very helpful if someone can explain why it says Brick is already part of volume and where it is storing the volume/brick information. So that I can recreate the volume successfully.
I also tried the steps below, from this link, to clear the GlusterFS volume-related attributes from the mount, but no luck.
https://linuxsysadm.wordpress.com/2013/05/16/glusterfs-remove-extended-attributes-to-completely-remove-bricks/.
apt-get install attr
cd /glusterfs
for i in $(attr -lq .); do setfattr -x trusted.$i .; done
attr -lq /glusterfs   # for testing; the output should be empty
Simply put "force" in the end of "gluster volume create ..." command.
Please check whether the /mnt/ppshare/brick0 directories already exist.
You should have /mnt/ppshare without the brick0 folder. The create command creates those folders; the error indicates that the brick0 folders are already present.