Able to get basic Rook/Ceph example to work, but all data apparently sits on a single node - kubernetes

Using Rook 0.9.3 I was able to bring up a Ceph-based directory for a MySQL database on a three-node Kubernetes cluster (one master, two workers) simply as follows:
kubectl create -f cluster/examples/kubernetes/ceph/operator.yaml
kubectl create -f cluster/examples/kubernetes/ceph/cluster.yaml
vim cluster/examples/kubernetes/ceph/storageclass.yaml # change xfs to ext4
kubectl create -f cluster/examples/kubernetes/ceph/storageclass.yaml
kubectl create -f cluster/examples/kubernetes/mysql.yaml
When I now bash into the pod wordpress-mysql-* I can see that /var/lib/mysql is mounted from /dev/rbd1. If I create a random file in this directory and then delete the pod, the file has persisted when a new instance of the pod comes up.
My first worker contains these directories in /var/lib/rook: mon-a mon-c mon-d osd0 rook-ceph. My second worker contains only one directory in /var/lib/rook: mon-b. This and other evidence (from df) suggest that Rook (and by extension Ceph) stores all of its file data (e.g. all blocks that constitute the mounted /var/lib/mysql) in /var/lib/rook/osd0, i.e. once on a single node.
I would have expected that blocks are distributed across several nodes so that when one node (the first worker, in my case) fails, data access is still available. Is this a naive expectation? If not, how can I configure Rook accordingly? Also, I have second, unformatted disks on both worker nodes, and I would prefer for Rook/Ceph to use those. How can this be accomplished?

To use another partition or disk as an OSD, you should change cluster.yaml and add:
nodes:
  - name: "kube-node1"
    devices:
      - name: "sdb"
  - name: "kube-node2"
    devices:
      - name: "sdb"
  - name: "kube-node3"
    devices:
      - name: "sdb"

Related

How to mount a single file from the local Kubernetes cluster into the pods

I set up a local Kubernetes cluster using Kind, and then I run Apache Airflow on it using Helm.
To actually create the pods and run Airflow, I use the command:
helm upgrade -f k8s/values.yaml airflow bitnami/airflow
which uses the chart airflow from the bitnami/airflow repo, and "feeds" it with the configuration of values.yaml.
The file values.yaml looks something like:
web:
  extraVolumeMounts:
    - name: functions
      mountPath: /dir/functions/
  extraVolumes:
    - name: functions
      hostPath:
        path: /dir/functions/
        type: Directory
where web is one component of Airflow (and one of the pods on my setup), and the directory /dir/functions/ is successfully mapped from the cluster inside the pod. However, I fail to do the same for a single, specific file, instead of a whole directory.
Does anyone know the syntax for that? Or does anyone have an idea for an alternative way to map the file into the pod (its whole directory is successfully mapped into the cluster)?
There is a File type for hostPath which should behave as you want, as stated in the docs:
File: A file must exist at the given path
which you can then use with the precise file path in mountPath. Example:
web:
  extraVolumeMounts:
    - name: singlefile
      mountPath: /path/to/mount/the/file.txt
  extraVolumes:
    - name: singlefile
      hostPath:
        path: /path/on/the/host/to/the/file.txt
        type: File
Or if it's not a problem, you could mount the whole directory containing it at the expected path.
That said, I want to point out that using hostPath is almost never a good idea.
If you have a cluster with more than one node, mounting a hostPath in your Pod does not restrict it to run on a specific host (even though you can enforce that with nodeSelectors and so on), which means that if the Pod starts on a different node, it may behave differently, not finding the directory and/or file it was expecting.
But even if you restrict the application to run on a specific node, you need to be OK with the idea that, if that node becomes unavailable, the Pod will not be rescheduled somewhere else on its own, meaning you'll need manual intervention to recover from a single node failure (unless the application is multi-instance and can tolerate one instance going down).
To conclude:
if you want to mount a path on a particular host, for whatever reason, I would go for local volumes, or at least use hostPath and restrict the Pod to run on the specific node it needs to run on.
if you want to mount small, textual files, you could consider mounting them from ConfigMaps (see the sketch after this list).
if you want to configure an application, providing a set of files at a certain path when the app starts, you could go for an init container which prepares the files for the main container in an emptyDir volume.
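To illustrate the ConfigMap option for the single-file case above, here is a minimal sketch. The ConfigMap name, the file name helpers.py and its content are made-up placeholders; the values.yaml keys assume the same extraVolumes/extraVolumeMounts fields of the bitnami/airflow chart used earlier:
# Hypothetical ConfigMap holding the single file to mount.
apiVersion: v1
kind: ConfigMap
metadata:
  name: functions-file          # placeholder name
data:
  helpers.py: |                 # placeholder file name and content
    def hello():
        return "hello"
And the corresponding values.yaml fragment, using subPath to mount only that one key as a file:
web:
  extraVolumeMounts:
    - name: functions-file
      mountPath: /dir/functions/helpers.py   # final path of the single file in the pod
      subPath: helpers.py                    # pick just this key out of the volume
  extraVolumes:
    - name: functions-file
      configMap:
        name: functions-file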

Graylog in Kubernetes (1.20) cluster

I am trying to set up Graylog.
This deployment works, but I need to add a volume to the Graylog deployment because I want to install plugins.
When I add a volume (hostPath) and start the pod, I get an error in my pod:
ERROR StatusLogger File not found in file system or classpath: /usr/share/graylog/data/config/log4j2.xml
ERROR StatusLogger Reconfiguration failed: No configuration found for '70dea4e' at 'null' in 'null'
06:37:19.707 [main] ERROR org.graylog2.bootstrap.CmdLineTool - Couldn't load configuration: Properties file /usr/share/graylog/data/config/graylog.conf doesn't exist!
I can see that the pod creates directories (owner id 1100:1100) in the volume, but there are no files in them.
The Kubernetes version is 1.20.
The runtime in my Kubernetes cluster is containerd.
My Graylog deployment:
Volume mounts for the container:
volumeMounts:
  - mountPath: /usr/share/graylog/data
    name: graylog-data
Volume:
volumes:
  - name: graylog-data
    hostPath:
      path: /mnt/k8s-storage/graylog-data
      type: DirectoryOrCreate
There are a few things to look at here, starting with the concept of the hostPath volume:
A hostPath volume mounts a file or directory from the host node's filesystem into your Pod.
Pods with identical configuration (such as created from a PodTemplate) may behave differently on different nodes due to different files on the nodes.
The files or directories created on the underlying hosts are only writable by root. You either need to run your process as root in a privileged Container or modify the file permissions on the host to be able to write to a hostPath volume.
A hostPath would be fine if, for example, you wanted to use it for a log collector running in a DaemonSet, but in your described use case it might not be ideal because you don't directly control which node your pods will run on, so you're not guaranteed that the pod will actually be scheduled on the node that has the data volume.
But if that is not the case, also notice that type: DirectoryOrCreate is not the best choice here, since you want to load a file. It would be better to use either:
File: A file must exist at the given path
or
FileOrCreate: If nothing exists at the given path, an empty file will be created there as needed with permission set to 0644, having the same group and ownership with Kubelet.
Lastly, there might be a permissions problem. As already stated:
The files or directories created on the underlying hosts are only writable by root.
Graylog runs with the user id 1100, which might cause a permission denial. Also, I have found a similar issue that might be helpful for you.
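One common way around the root-only ownership of a hostPath directory is an init container that fixes the ownership before Graylog starts. This is only a sketch, assuming the Graylog process really runs as uid/gid 1100 as described above; the busybox image choice is arbitrary:
# Fragment of the Deployment's pod spec -- illustrative only.
initContainers:
  - name: fix-permissions
    image: busybox:1.35                  # any small image containing chown will do
    command: ["sh", "-c", "chown -R 1100:1100 /usr/share/graylog/data"]
    securityContext:
      runAsUser: 0                       # must run as root to change ownership
    volumeMounts:
      - name: graylog-data
        mountPath: /usr/share/graylog/data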

How to run kubectl commands in a cron

I created a schedule configuration inside my Gcloud project to create snapshots of a bunch of virtual disks.
Now I want to attach my schedule configuration to my disks, but I don't know how to do it in an automated way, because I have more than 1200 disks.
I tried to use a Pod with a cron inside, but I cannot execute the kubectl command to list all my persistent volumes:
kubectl describe pv | grep "Name" | awk 'NR % 2 == 1' | awk '{print $2}'
I want to use this list with the next command in a loop to add my programmed schedule to my disks automatically:
gcloud compute disks add-resource-policies [DISK_NAME] --resource-policies [SCHEDULE_NAME] --zone [ZONE]
Thanks in advance for your help.
Edit 1: After some comments I changed my code to use a Kubernetes CronJob, but the result is the same; the code doesn't work (the pod is created, but it gives me an error: ImagePullBackOff):
resource "kubernetes_cron_job" "schedulerdemo" {
metadata {
name = "schedulerdemo"
}
spec {
concurrency_policy = "Replace"
failed_jobs_history_limit = 5
schedule = "*/5 * * * *"
starting_deadline_seconds = 10
successful_jobs_history_limit = 10
job_template {
metadata {}
spec {
backoff_limit = 2
ttl_seconds_after_finished = 10
template {
metadata {}
spec {
container {
name = "scheduler"
image = "imgscheduler"
command = ["/bin/sh", "-c", "date; kubectl describe pv | grep 'Name' | awk 'NR % 2 == 1' | awk '{print $2}'"]
}
}
}
}
}
}
}
Answering the comment:
Ok, shame on me, wrong image name. Now I have an error in the Container Log: /bin/sh: kubectl: not found
It means that the image you are using doesn't have kubectl installed (or it's not in the PATH). You can use image: google/cloud-sdk:latest. This image already has cloud-sdk installed, which includes:
gcloud
kubectl
To run a CronJob that will get the information about PVs and change the configuration of GCP storage, you will need the following access:
Kubernetes/GKE API (kubectl) - a ServiceAccount with a Role and RoleBinding (a minimal sketch follows after the links below).
GCP API (gcloud) - a Google service account with IAM permissions for storage operations.
I found these links helpful when assigning permissions to list PVs:
Kubernetes.io: RBAC
Success.mirantis.com: Article: User unable to list persistent volumes
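As mentioned above, here is a minimal sketch of the RBAC objects the CronJob's ServiceAccount could use to list PVs. PersistentVolumes are cluster-scoped, so a ClusterRole and ClusterRoleBinding are used rather than a namespaced Role; all names below are placeholders:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pv-reader                 # placeholder name
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pv-reader
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: pv-reader
subjects:
  - kind: ServiceAccount
    name: pv-reader
    namespace: default
roleRef:
  kind: ClusterRole
  name: pv-reader
  apiGroup: rbac.authorization.k8s.io
The CronJob's pod template would then reference this account with serviceAccountName: pv-reader.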
The recommended way to assign specific permissions for GCP access:
Workload Identity is the recommended way to access Google Cloud services from applications running within GKE due to its improved security properties and manageability.
-- Cloud.google.com: Kubernetes Engine: Workload Identity: How to
I encourage you to read the documentation I linked above and check the other alternatives.
As for the script used inside the CronJob: you should look for pdName instead of Name, as pdName is the representation of the gce-pd disk in GCP (assuming that we are talking about the in-tree plugin).
You will have multiple options to retrieve the disk name from the API to use it in the gcloud command.
One of the options:
kubectl get pv -o yaml | grep "pdName" | cut -d " " -f 8 | xargs -n 1 gcloud compute disks add-resource-policies --zone=ZONE --resource-policies=POLICY
Disclaimer!
Please treat the above command only as an example.
The above command will get the pdName attribute from the PVs and iterate over each of them in the command after xargs.
Some of the things to take into consideration when creating a script/program:
Running this command more than once on a single disk will produce an error saying that you cannot assign multiple policies. You could keep a list of already configured disks that do not require assigning a policy.
Consider using .spec.concurrencyPolicy: Forbid instead of Replace. A replaced CronJob will start from the beginning and iterate over all of those disks again; the command might not complete in the desired time and the CronJob would be replaced.
You will need to check for the correct kubectl version, as the official support policy allows a +1/-1 version difference between client and server (cloud-sdk:latest uses v1.19.3).
I highly encourage you to look at other methods to back up your PVCs (like, for example, VolumeSnapshots).
Take a look at the links below for more references/ideas:
Stackoverflow.com: Answer: Periodic database backup in kubernetes
Stash.run: Guides: Latest: Volumesnapshot: PVC
Velero.io
It's worth mentioning that:
CSI drivers are the future of storage extension in Kubernetes. Kubernetes has announced that the in-tree volume plugins are expected to be removed from Kubernetes in version 1.21. For details, see Kubernetes In-Tree to CSI Volume Migration Moves to Beta. After this change happens, existing volumes using in-tree volume plugins will communicate through CSI drivers instead.
-- Cloud.google.com: Kubernetes Engine: Persistent Volumes: GCE PD CSI Driver: Benefits of using
Switching to the CSI plugin for your StorageClass will allow you to use Volume Snapshots inside GKE:
Volume snapshots let you create a copy of your volume at a specific point in time. You can use this copy to bring a volume back to a prior state or to provision a new volume.
-- Cloud.google.com: Kubernetes Engine: Persistent Volumes: Volume snapshots: How to
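To illustrate, this is roughly what a VolumeSnapshot object looks like once a CSI driver and a VolumeSnapshotClass are installed; the class and PVC names are placeholders, and depending on the cluster version the API group may still be at v1beta1 rather than v1:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: disk-snapshot                                 # placeholder name
spec:
  volumeSnapshotClassName: csi-gce-pd-snapshot-class  # placeholder class name
  source:
    persistentVolumeClaimName: my-pvc                 # placeholder: the PVC to snapshot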
Additional resources:
Cloud.google.com: Kubernetes Engine: Persistent Volumes
Cloud.google.com: Kubernetes Engine: Cronjobs: How to
Terraform.io: Kubernetes: CronJob
Cloud.google.com: Compute: Disks: Create snapshot

Kubernetes Pod - Sync pod directory with a local directory

I have a python pod running.
This Python pod uses several shared libraries. To make it easier to debug the shared libraries, I would like to have the libraries directory on my host too.
The python dependencies are located in /usr/local/lib/python3.8/site-packages/ and I would like to access this directory on my host to modify some files.
Is that possible and if so, how? I have tried with emptyDir and PV but they always override what already exists in the directory.
Thank you!
This is by design. The kubelet is responsible for preparing the mounts for your container. At the time of mounting they are empty, and the kubelet has no reason to put any content in them.
That said, there are ways to achieve what you seem to expect by using an init container. In your pod, define an init container using your Docker image, mount your volume in it at some path (i.e. /target), but instead of running the regular content of your container, run something like
cp -r /my/dir/* /target/
which will initialize your directory with the expected content and then exit, allowing further startup of the pod.
Please take a look: overriding-directory.
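A minimal sketch of what that could look like in the pod spec; the image name, volume name and host path are placeholders, and the copied path matches the site-packages directory mentioned in the question:
# Illustrative fragment of a Pod spec -- not a complete manifest.
initContainers:
  - name: seed-libs
    image: my-python-image:latest        # placeholder: your application image
    command: ["sh", "-c", "cp -r /usr/local/lib/python3.8/site-packages/* /target/"]
    volumeMounts:
      - name: libs
        mountPath: /target
containers:
  - name: app
    image: my-python-image:latest        # placeholder
    volumeMounts:
      - name: libs
        mountPath: /usr/local/lib/python3.8/site-packages
volumes:
  - name: libs
    hostPath:                            # hostPath so the files are also visible on the host
      path: /srv/python-libs             # placeholder host path (use emptyDir if host access is not needed)
      type: DirectoryOrCreate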
Another option is to use subPath. subPath references files or directories that are controlled by the user, not the system. Take a look at this example of how to mount single files into an existing directory:
---
volumeMounts:
  - name: "config"
    mountPath: "/<existing folder>/<file1>"
    subPath: "<file1>"
  - name: "config"
    mountPath: "/<existing folder>/<file2>"
    subPath: "<file2>"
restartPolicy: Always
volumes:
  - name: "config"
    configMap:
      name: "config"
---
Check full example here. See: mountpath, files-in-folder-overriding.
You can also, as #DavidMaze said, debug your setup in a non-container Python virtual environment if you can, or, as a second choice, debug the image in Docker without Kubernetes.
You can also take into consideration the third-party tools below, which were created especially for Kubernetes app developers with exactly this functionality in mind (keeping source and remote files in sync).
Skaffold's Continuous Deployment workflow - it takes care of keeping source and remote files (the Pod-mounted directory) in sync.
Telepresence's Volume access feature.

Expose container in Kubernetes

I want to create a specific version of Redis to be used as a cache. Task:
Pod must run in web namespace
Pod name should be cache
Image name is lfccncf/redis with the 4.0-alpine tag
Expose port 6379
The pod needs to be running after completion
These are my steps:
k create ns web
k -n web run cache --image=lfccncf/redis:4.0-alpine --port=6379 --dry-run=client -o yaml > pod1.yaml
vi pod1.yaml
pod looks like this
k create -f pod1.yaml
When the expose service name is not defined, is this the right command to fully complete the task?
k expose pod cache --port=6379 --target-port=6379
Is using a command like command: ["/bin/sh", "-ec", "sleep 1000"] the best way to keep the pod running?
You should not use sleep to keep a Redis pod running. As long as the Redis process runs in the container, the pod will keep running.
The best way to go about it is to take a stable helm chart from https://hub.helm.sh/charts/stable/redis-ha. Do a helm pull and modify the values as you need.
Redis should be defined as a StatefulSet for various reasons. You could also do:
mkdir my-redis
helm fetch --untar --untardir . 'stable/redis'        # makes a directory called redis
helm template --output-dir './my-redis' './redis'     # render the local chart in ./redis into the my-redis directory
and then use Kustomize if you like.
You will notice that a redis deployment definition is not so trivial when you see how much code there is in the stable chart.
You can then expose it in various ways, but normally you need the access only within the cluster.
If you need a fast way to test from outside the cluster, or want to use it as a development environment, check the official ways to do that.
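For in-cluster access, a plain ClusterIP Service in front of the cache pod is usually enough. A minimal sketch, assuming the pod carries the run=cache label that kubectl run adds by default:
apiVersion: v1
kind: Service
metadata:
  name: cache
  namespace: web
spec:
  type: ClusterIP            # reachable only from inside the cluster
  selector:
    run: cache               # assumes the default label set by `kubectl run cache`
  ports:
    - port: 6379
      targetPort: 6379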