Unable to mount volume to spark.kubernetes.executor - Scala

I am trying to read a file from a server in Spark cluster mode on Kubernetes, so I put the file on all workers and mounted the driver volume using:
val conf = new SparkConf().setAppName("sparksetuptest")
  .set("spark.kubernetes.driver.volumes.hostPath.host.mount.path", "/file-directory")
Everything works fine up to here, but when I execute the job it reports that the file is not found at that location.
So I mounted the directory on the executors as well with .set("spark.kubernetes.executor.volumes.hostPath.host.mount.path", "/file-directory").
But now the program no longer runs; it gets stuck in a never-ending process while fetching data.
Please suggest how I can mount the directory on the executors and read that file.

Here is a snippet from the NFS example in the docs:
spark.kubernetes.driver.volumes.nfs.images.options.server=example.com
spark.kubernetes.driver.volumes.nfs.images.options.path=/data
I think you need to declare the host path you want to mount under options.path; spark.kubernetes.driver.volumes.[VolumeType].[VolumeName].mount.path is the mount path inside your container.
For example:
If I want to mount /home/lemon/data on the k8s node to the path /data in the Docker container, with the volume name exepv, then:
conf.set("spark.kubernetes.executor.volumes.hostPath.exepv.mount.path","/data")
conf.set("spark.kubernetes.executor.volumes.hostPath.exepv.options.path", "/home/lemon/data")
After this, you can access the path /data in your executor container.
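Putting the two halves together, here is a minimal Scala sketch that declares the same hostPath volume for both the driver and the executors, so every pod sees the file at the same mount path. The volume name myvol and the file myfile.txt are illustrative assumptions, not from the original question:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  .setAppName("sparksetuptest")
  // host directory -> driver pod
  .set("spark.kubernetes.driver.volumes.hostPath.myvol.options.path", "/file-directory")
  .set("spark.kubernetes.driver.volumes.hostPath.myvol.mount.path", "/file-directory")
  // same host directory -> every executor pod
  .set("spark.kubernetes.executor.volumes.hostPath.myvol.options.path", "/file-directory")
  .set("spark.kubernetes.executor.volumes.hostPath.myvol.mount.path", "/file-directory")

val spark = SparkSession.builder.config(conf).getOrCreate()
// read with an explicit file:// scheme so executors resolve the local mount
// (myfile.txt is a placeholder file name)
val lines = spark.read.textFile("file:///file-directory/myfile.txt")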

Related

Docker-compose: modify the volume parameter

I modified the mount directory in the docker-compose.yml file. Which command should I use to make the new mount take effect?
Should I use docker-compose restart?
In the past, I have used the docker-compose restart command.
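For what it's worth: docker-compose restart only restarts the existing containers and does not re-read docker-compose.yml, so a changed volume mapping is not picked up by it. Recreating the container applies the change, along the lines of:

docker-compose up -d                    # recreates only the services whose configuration changed
docker-compose up -d --force-recreate   # or force a recreate explicitly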

Copying files from docker volume to kubernetes volume

I am moving my service from Docker to Kubernetes, and I also have to copy over some files from my Docker volume.
I am using a PersistentVolumeClaim and a StorageClass in Kubernetes, and that is already implemented.
But now I need to copy the contents of the folder /opt/checker/dataFiles to the same mount path on Kubernetes. How best to do it? Is there a better way than copying the files into the folder inside the Kubernetes container manually?
Apparently I couldn't find one, and I ended up doing this manually:
Copy the contents from the Docker volume to your local disk (docker cp container:source_path local_dest_path)
Copy the contents into the Kubernetes volume at the mount path (kubectl cp local_dest_path namespace/pod:final_dest_path)
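As a concrete sketch of those two steps with the path from the question; the container name checker, the pod name checker-0, and the namespace default are made up for the example:

docker cp checker:/opt/checker/dataFiles ./dataFiles
kubectl cp ./dataFiles default/checker-0:/opt/checker/dataFiles

Both docker cp and kubectl cp copy directories recursively, so the whole folder comes across in one command each.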

How to mount "/" in kubernetes

Data saved in Kubernetes is not persistent, so we should use volumes.
For example, we can mount "/apt" to save data in "apt".
Now I want to mount "/", but I get this error:
Error: Error response from daemon: invalid volume specification:
'/var/lib/kubelet/pods/26c39eeb-85d7-11e9-933c-7c8bca006fec/volumes/kubernetes.io~rbd/pvc-d66d9039-853d-11e9-8aa3-7c8bca006fec:/':
invalid mount config for type "bind": invalid specification:
destination can't be '/'
The question is: how can I mount "/" in Kubernetes?
Not completely sure about your environment, but I ran into this issue today because I wanted to browse the entire root filesystem of a container via SSH (WinSCP) to the host. I am using Docker in a Photon OS VM environment. The answer I've come to is: you can't do what you're trying to do, but you may be able to accomplish what you're trying to accomplish. Say I create a volume called mysql and run a new (oversimplified) mysql container with that volume as root:
docker volume create --name mysql
docker run -d --name=mysqldb -v /var/lib/docker/volumes/mysql:/ mysql:5.7
Docker will cry and say you can't mount to root (destination can't be '/'). However, since we know where volumes live (/var/lib/docker/volumes/), we can simply create the container as normal and an arbitrarily named volume will be placed in that folder. So if your goal is (as mine was) to SSH to the host and browse the files in the root of your container, you CAN do that; you just need to look in the correct arbitrarily named volume. In my case it is /var/lib/docker/volumes/12dccb66f2eeaeefe8e1feabb86f3c6def87b091dabeccad2902851caa97f04c/_data, which isn't as pretty as /var/lib/docker/volumes/mysql, but it gets the job done.
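If you'd rather not guess the hash, docker inspect on the container lists its mounts; a small sketch, using mysqldb from the example above:

docker inspect -f '{{ range .Mounts }}{{ .Name }} -> {{ .Source }}{{ "\n" }}{{ end }}' mysqldb

This prints each volume name together with its host path under /var/lib/docker/volumes/.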
Hope that helps someone.

dockerized postgresql with volumes

I am relatively new to Docker. I'd like to set up a Postgres database, but I wonder how to make sure that the data isn't lost if I recreate the container.
Then I stumbled over named volumes (not bind mounts) and how to use them.
But... in a Dockerfile you can't use named volumes. E.g. data:/var/lib etc.
As I understand it, using a Dockerfile always gives you an anonymous volume, so every single time I recreated the container it would get its own new volume.
So here come my questions:
Firstly: how do I make sure, if the container gets updated or recreated, that the Postgres database within the new container references the same data and doesn't lose the reference to the previously created anonymous volume?
Secondly: how does this work with a yml file? Is it possible to reference multiple replicas of such a database container to one volume (high-availability mode)?
It would really be great if someone could give me a hint or best practices.
Thank you in advance.
Looking at the Dockerfile for Postgres, you see that it declares a volume instruction:
VOLUME /var/lib/postgresql/data
Every time you run a new Postgres container without specifying a --volume option, Docker automatically creates a new volume. The volume is given a random name.
You can see all volumes by running the command:
docker volume ls
You can also inspect the files the volume stores on the host by looking up its host path with:
docker volume inspect <volume-name>
So when you don't specify the --volume option for the run command, Docker creates volumes for all volumes declared in the Dockerfile. This is mainly a safety net so that the data isn't lost if you forget to name your volume.
Firstly: how do I make sure, if the container gets updated or recreated, that the Postgres database within the new container references the same data and doesn't lose the reference to the previously created anonymous volume?
If you want Docker to reuse the same volume, you need to specify the --volume option. Once specified, Docker won't create a new volume; it will simply mount the existing volume onto the specified folder in the container.
As a best practice, name your volumes that have valuable data. For example:
docker run --volume postgresData:/var/lib/postgresql/data ...
If you run this command for the first time, the volume postgresData will be created and the container's /var/lib/postgresql/data will be stored in it on the host. The next time you run it, the same data on the host will be mounted back into the container.
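A quick way to convince yourself that the data survives recreation; a sketch where the container names and password are illustrative:

docker run -d --name pg1 -e POSTGRES_PASSWORD=secret --volume postgresData:/var/lib/postgresql/data postgres
docker rm -f pg1        # the container is gone, the named volume stays
docker run -d --name pg2 -e POSTGRES_PASSWORD=secret --volume postgresData:/var/lib/postgresql/data postgres
# pg2 now starts against the same database files pg1 wrote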
Secondly: how does this work with a yml file? Is it possible to reference multiple replicas of such a database container to one volume?
Yes, volumes can be shared between multiple containers. You can mount the same volume onto multiple containers, and the containers will use the same files. Docker Compose allows you to do that; for example:
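A minimal docker-compose.yml sketch along those lines; the service and volume names are illustrative:

version: "3"
services:
  db:
    image: postgres
    environment:
      POSTGRES_PASSWORD: secret          # required by the postgres image
    volumes:
      - postgresData:/var/lib/postgresql/data   # named volume, survives recreation
volumes:
  postgresData:                          # declaring it here makes compose manage it by name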
However, beware that volumes are limited to the host on which they were created. When running containers on multiple machines, the volume needs to be accessible from all of them. There are ways and tools to achieve that, but they are a bit complex; this is still a limitation to be addressed in Docker.

Kubernetes in vmware vsphere issues

I am following this guide to set up my cluster, and it all works fine.
However, when I install fabric8 in this cluster, I run out of disk space on the minions. The image, kube.vmdk, is only about 6 GB. It is /var/lib/docker that gets filled up. How do I solve this?
Using the vmware GUI, the option to resize the disk is greyed out.
Should I attach a second disk to the minions and then mount this disk? Where should I mount it? /var/lib/docker?
I would appreciate any input.
Docker's images are stored in /var/lib/docker (more precisely, in the storage driver's directory, e.g. /var/lib/docker/aufs when using the aufs storage driver), so when Kubernetes reports that the disk is filled up, it is checking that directory.
So you can (a shell sketch follows this list):
Remove all the images in Docker (not strictly necessary; you can instead copy everything to the new directory).
Stop the docker daemon.
Mount your new disk to /var/lib/docker (or to the storage driver's directory, e.g. /var/lib/docker/aufs).
Start the docker daemon.
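A hedged shell sketch of those steps on a systemd host; the device name /dev/sdb1 and the ext4 filesystem are assumptions about your second disk:

systemctl stop docker                 # stop the daemon before touching its directory
mkfs.ext4 /dev/sdb1                   # format the new disk (destroys anything on it)
mount /dev/sdb1 /mnt
cp -a /var/lib/docker/. /mnt/         # optional: keep existing images and containers
umount /mnt
mount /dev/sdb1 /var/lib/docker       # add an /etc/fstab entry to make this permanent
systemctl start docker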
If you are not sure which storage driver your Docker is using, type docker info on your node; the output will contain something like this:
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 139
Dirperm1 Supported: true
It seems that you have run out of disk space. You can remove all the files in /var/lib/docker, mount the second disk, and finally restart dockerd.