PVCs and multiple namespaces - kubernetes

I have a small SaaS app that customers can sign up for, and each customer gets their own instance, completely separated from the rest of the clients. It's a simple REST API with a DB (Postgres) and Caddy, deployed using docker-compose.
This works fine, but it requires me to create a VPS and deploy the different services by hand, and it is essentially really hard to manage, as most of the work is manual.
I have decided to use Kubernetes, and I have gotten to the point where I can create a separate instance of the system in its own isolated namespace for each client, fully automated. This creates the different deployments, services, and pods. I also create a PVC for each namespace/client.
The issue has to do with Persistent Volume Claims and how they work with namespaces. As I want to keep the data completely separate from the other instances, I wanted to create a PVC for each client, so that only that client's DB can access it (and the server, as it requires some data to be written to disk).
This works fine in minikube, but the issue comes with the hosting provider. I use DigitalOcean's managed cluster, and they do not allow multiple PVCs to be created, making it impossible to achieve the level of isolation that I want. They allow you to mount a block storage volume (of whatever size you need) and then use it. This would mean that all the data is stored on the "same disk" and all namespaces can access it.
My question is: is there a way to achieve the same level of isolation, i.e. separate mount points for each of the DB instances, so that I can still get (or at least get close to) the level of separation that I require? The idea would be something along the lines of the layout below (a rough sketch of how it could be approximated with subPath mounts follows it):
/pvc-root
  /client1
    /server
    /db
  /client2
    /server
    /db
  ...
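For illustration only, here is a minimal sketch of how that layout could be approximated on one shared disk by mounting a per-client subPath into each client's database Deployment. The claim name shared-storage and the subPath values are made up, and this assumes the single block-storage volume is reachable from every client namespace, which is exactly the weaker isolation described above:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database
  namespace: client1
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: database
  template:
    metadata:
      labels:
        io.kompose.service: database
    spec:
      containers:
        - name: postgres
          image: postgres:10
          volumeMounts:
            - name: shared-storage
              mountPath: /var/lib/postgresql/data
              subPath: client1/db          # only this client's directory is visible to the pod
      volumes:
        - name: shared-storage
          persistentVolumeClaim:
            claimName: shared-storage      # assumed claim backed by the single block-storage volume
Note that subPath only separates directories; it does not prevent another namespace that can also mount the shared volume from reading the data.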
This is what I have for now:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: database-claim
  name: database-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: database
  name: database
spec:
  ports:
    - name: "5432"
      port: 5432
      targetPort: 5432
  selector:
    io.kompose.service: database
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    io.kompose.service: database
  name: database
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: database
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        io.kompose.service: database
    spec:
      containers:
        - env:
            - name: POSTGRES_DB
              value: db_name
            - name: POSTGRES_PASSWORD
              value: db_password
            - name: POSTGRES_USER
              value: db_user
          image: postgres:10
          imagePullPolicy: ""
          name: postgres
          ports:
            - containerPort: 5432
          resources: {}
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: database-claim
      restartPolicy: Always
      serviceAccountName: ""
      volumes:
        - name: database-claim
          persistentVolumeClaim:
            claimName: database-claim
---
apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: server
  name: server
spec:
  ports:
    - name: "8080"
      port: 8080
      targetPort: 8080
  selector:
    io.kompose.service: server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: server
  name: server
spec:
  replicas: 1
  selector:
    matchLabels:
      io.kompose.service: server
  template:
    metadata:
      labels:
        io.kompose.service: server
    spec:
      containers:
        - env:
            - name: DB_HOST
              value: database
            - name: DB_NAME
              value: db_name
            - name: DB_PASSWORD
              value: db_password
            - name: DB_PORT
              value: "5432"
            - name: DB_USERNAME
              value: db_user
          image: <REDACTED>
          name: server-image
          ports:
            - containerPort: 8080
      restartPolicy: Always
      volumes: null
EDIT Feb 02, 2021
I have been in contact with DO's customer support, and they clarified a few things:
You need to manually attach a volume to the cluster, so the PVC deployment file is ignored. The volume is then mounted and available to the cluster, but NOT in a ReadWriteMany configuration, which could have served this case quite well.
They provide an API, so in theory I could create a volume for each client programmatically and then attach it to that specific client, keeping ReadWriteOnce (a sketch of how such a pre-provisioned volume could be wired up follows this edit).
This of course locks me in to them as a vendor and makes things a bit harder to configure and migrate.
I am still looking for suggestions on whether this is the correct approach for my case. If you have a better way, let me know!
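To illustrate the API route: a volume created per client through DigitalOcean's API could be referenced by a pre-provisioned PersistentVolume and then claimed from that client's namespace. This is only a sketch under the assumption that the managed cluster ships the DigitalOcean CSI driver (dobs.csi.digitalocean.com); the names, size, and volume ID are placeholders.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: client1-db-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: dobs.csi.digitalocean.com      # assumed CSI driver name on DO managed clusters
    volumeHandle: VOLUME_ID_FROM_DO_API    # placeholder: the ID returned when creating the volume via the API
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-claim
  namespace: client1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""                     # bind to the pre-created PV, skip dynamic provisioning
  volumeName: client1-db-pv
  resources:
    requests:
      storage: 5Gi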
In theory this should be achievable.

Is there a way to achieve the same level of isolation, i.e. separate the mount points for each of the DB instances, in such a way that I can still achieve (or at least get close) to the level of separation that I require?
Don't run a production database on a single volume. You want to run a database with some form of replication, in case a volume or node crashes.
Either run the database in a distributed setup, e.g. using Crunchy PostgreSQL for Kubernetes, or use a managed database, e.g. DigitalOcean Managed Databases.
Within that DBMS, create logical databases or schemas for each customer, if you really need that strong isolation. Hint: it is probably easier to maintain with less isolation, e.g. using multi-tenancy within the tables.
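As a small illustration of the per-customer options (all names and the password here are made up), creating a logical database and owner role per tenant on the existing single Postgres instance could look like this:
# Hypothetical tenant provisioning against the existing "database" Deployment.
kubectl exec deploy/database -- psql -U db_user -d postgres \
  -c "CREATE ROLE client1_user LOGIN PASSWORD 'change-me';" \
  -c "CREATE DATABASE client1_db OWNER client1_user;"

# Or, for the lighter schema-per-tenant variant inside one shared database:
kubectl exec deploy/database -- psql -U db_user -d db_name \
  -c "CREATE SCHEMA client1 AUTHORIZATION client1_user;"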

Late to the party, but here's a solution that may work well for your use case, if you still haven't found one.
Create a highly available NFS/Ceph server, then export it in a way that your pods can attach to it. Then create PVs and PVCs as you like and bypass all that DO blockage.
I support an application very similar to what you describe, and I went with a highly available NFS server using DRBD, Corosync, and Pacemaker. It all works as expected; no issues so far.
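For reference, here is a minimal sketch of what an NFS-backed PersistentVolume plus a per-client claim could look like; the server address, export path, and names are placeholders:
# Sketch: one export (or subdirectory) per client on the HA NFS server.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: client1-db-nfs
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    server: nfs.example.internal      # placeholder address of the NFS server
    path: /exports/client1/db
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-claim
  namespace: client1
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""                # bind to the pre-created PV above, no dynamic provisioning
  volumeName: client1-db-nfs
  resources:
    requests:
      storage: 5Gi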

Related

Is there an efficient way to create a mechanism for automatic updating osrm map data in kubernetes?

We have created a .yaml file to deploy osrm/osrm-backend (https://hub.docker.com/r/osrm/osrm-backend/tags) in a Kubernetes cluster.
We initially download the pbf file to the node's volume, then we create the necessary files for the service, and finally the service starts.
You may find the yaml file below:
apiVersion: v1
kind: Service
metadata:
  name: osrm-albania
  labels:
    app: osrm-albania
spec:
  ports:
    - port: 5000
      targetPort: 5000
      name: http
  selector:
    app: osrm-albania
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: osrm-albania
spec:
  replicas: 1
  selector:
    matchLabels:
      app: osrm-albania
  template:
    metadata:
      labels:
        app: osrm-albania
    spec:
      containers:
        - name: osrm-albania
          image: osrm/osrm-backend:latest
          command: ["/bin/sh", "-c"]
          args: ["osrm-extract -p /opt/car.lua /data/albania-latest.osm.pbf && osrm-partition /data/albania-latest.osrm && osrm-customize /data/albania-latest.osrm && osrm-routed --algorithm mld /data/albania-latest.osrm"]
          ports:
            - containerPort: 5000
              name: osrm-port
          volumeMounts:
            - name: albania
              readOnly: false
              mountPath: /data
      initContainers:
        - name: get-osrm-file
          image: busybox
          command: ['wget', 'http://download.geofabrik.de/europe/albania-latest.osm.pbf', '--directory-prefix=/data']
          volumeMounts:
            - name: albania
              readOnly: false
              mountPath: /data
      volumes:
        - name: albania
          emptyDir: {}
The problem is that we need to update the map data used by the osrm service regularly, which means re-downloading the pbf file and recreating the files used by the service.
This might be achieved via Kubernetes CronJobs, which might have to use persistent volumes instead (Cron Jobs in Kubernetes - connect to existing Pod, execute script).
Is this the only way to fetch new map data and refresh the data used by the osrm service?
How exactly?
Is there a better, easier way to achieve this?
This is a tricky situation; I had the same problem in my cluster and I fixed it by splitting the job across more pods:
1 wget into a volume mount ('volume A')
2 extract, partition, customize in 'volume A'
3 copy 'volume A' to volume mount B
4 run osrm-routed with 'volume B'
This way, steps 1, 2, and 3 can be run as a cronjob, and each pod does all of its work without breaking the service (see the sketch below).
The split is needed because the first three operations take a long time (2 to 3 hours).
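To make that split concrete, here is a rough sketch of such a refresh CronJob. It assumes the prepared .osrm files live on a PersistentVolumeClaim (called osrm-data here, which is made up) that the serving Deployment mounts instead of the emptyDir, and that the cluster supports batch/v1 CronJobs (older clusters use batch/v1beta1); the schedule is a placeholder.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: osrm-albania-refresh
spec:
  schedule: "0 2 * * 0"                # placeholder: every Sunday at 02:00
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          initContainers:
            - name: get-osrm-file      # step 1: download into scratch space ("volume A")
              image: busybox
              command: ['wget', 'http://download.geofabrik.de/europe/albania-latest.osm.pbf', '--directory-prefix=/work']
              volumeMounts:
                - name: work
                  mountPath: /work
          containers:
            - name: prepare            # steps 2-3: build the .osrm files, then copy them to "volume B"
              image: osrm/osrm-backend:latest
              command: ["/bin/sh", "-c"]
              args:
                - >
                  osrm-extract -p /opt/car.lua /work/albania-latest.osm.pbf &&
                  osrm-partition /work/albania-latest.osrm &&
                  osrm-customize /work/albania-latest.osrm &&
                  cp /work/albania-latest.osrm* /data/
              volumeMounts:
                - name: work
                  mountPath: /work
                - name: osrm-data
                  mountPath: /data
          volumes:
            - name: work
              emptyDir: {}
            - name: osrm-data
              persistentVolumeClaim:
                claimName: osrm-data   # assumed PVC, also mounted by the serving Deployment
The serving Deployment would mount the same claim at /data in place of the emptyDir and would need a restart to pick up the refreshed files.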

Passing values from initContainers to container spec

I have a kubernetes deployment with the below spec that gets installed via helm 3.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gatekeeper
spec:
  replicas: 1
  template:
    spec:
      containers:
        - name: gatekeeper
          image: my-gatekeeper-image:some-sha
          args:
            - --listen=0.0.0.0:80
            - --client-id=gk-client
            - --discovery-url={{ .Values.discoveryUrl }}
I need to pass the discoveryUrl value as a helm value, which is the public IP address of the nginx-ingress pod that I deploy via a different helm chart. I install the above deployment like below:
helm3 install my-nginx-ingress-chart
INGRESS_IP=$(kubectl get svc -lapp=nginx-ingress -o=jsonpath='{.items[].status.loadBalancer.ingress[].ip}')
helm3 install my-gatekeeper-chart --set discovery_url=${INGRESS_IP}
This works fine. However, instead of these two helm3 install commands, I now want a single helm3 install that creates both the nginx-ingress and the gatekeeper deployment.
I understand that in an initContainer of my-gatekeeper-image we can get the nginx-ingress IP address, but I am not able to understand how to set that as an environment variable or pass it to the container spec.
There are some Stack Overflow questions that mention that we can create a persistent volume or a secret to achieve this, but I am not sure how that would work if we have to delete them. I do not want to create any extra objects and have to maintain their lifecycle.
It is not possible to do this without mounting a volume, but the volume can be backed by an in-memory store (an emptyDir with medium: Memory) instead of a block storage device, so there is no extra lifecycle management to do. The way to achieve that is:
apiVersion: v1
kind: ConfigMap
metadata:
  name: gatekeeper
data:
  gatekeeper.sh: |-
    #!/usr/bin/env bash
    set -e
    INGRESS_IP=$(kubectl get svc -lapp=nginx-ingress -o=jsonpath='{.items[].status.loadBalancer.ingress[].ip}')
    # Do other validations/cleanup
    echo $INGRESS_IP > /opt/gkconf/discovery_url;
    exit 0
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gatekeeper
  labels:
    app: gatekeeper
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gatekeeper
  template:
    metadata:
      name: gatekeeper
      labels:
        app: gatekeeper
    spec:
      initContainers:
        - name: gkinit
          command: [ "/opt/gk-init.sh" ]
          image: 'bitnami/kubectl:1.12'
          volumeMounts:
            - mountPath: /opt/gkconf
              name: gkconf
            - mountPath: /opt/gk-init.sh
              name: gatekeeper
              subPath: gatekeeper.sh
              readOnly: false
      containers:
        - name: gatekeeper
          image: my-gatekeeper-image:some-sha
          # ENTRYPOINT of the above image should read the
          # file /opt/gkconf/discovery_url and then launch
          # the actual gatekeeper binary
          imagePullPolicy: Always
          ports:
            - containerPort: 80
              protocol: TCP
          volumeMounts:
            - mountPath: /opt/gkconf
              name: gkconf
      volumes:
        - name: gkconf
          emptyDir:
            medium: Memory
        - name: gatekeeper
          configMap:
            name: gatekeeper
            defaultMode: 0555
Using init containers is indeed a valid solution, but you need to be aware that by doing so you are adding complexity to your deployment.
This is because you would also need to create a ServiceAccount with permissions to read Service objects from inside the init container (a sketch of the RBAC objects follows below). Then, once you have the IP, you can't just set an environment variable for the gatekeeper container without recreating the pod, so you would need to save the IP to e.g. a shared file and read it from there when starting gatekeeper.
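For completeness, here is a minimal sketch of the RBAC objects such an init container would need in order to read Services; the names are placeholders:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gatekeeper
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: service-reader
rules:
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gatekeeper-service-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: service-reader
subjects:
  - kind: ServiceAccount
    name: gatekeeper
The Deployment's pod template would then set serviceAccountName: gatekeeper so the init container's kubectl calls are authorized.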
Alternatively, you can reserve an IP address if your cloud provider supports this feature, and use this static IP when deploying the nginx service:
apiVersion: v1
kind: Service
[...]
  type: LoadBalancer
  loadBalancerIP: "YOUR.IP.ADDRESS.HERE"
Let me know if you have any questions or if something needs clarification.

Openshift: Is it possible to make different pods of the same deployment to use different resources?

In OpenShift, say there are two pods of the same deployment in a test env. Is it possible to make one pod use/connect to database1 and the other pod use/connect to database2, via a label or configuration?
I have created two different pods from the same code base, i.e. the same image containing the same compiled code. Using Spring profiles, I passed two different arguments for the connection to the Oracle database, for example.
How about using a StatefulSet to deploy the pods? A StatefulSet gives each pod its own PersistentVolume, so if you place a configuration file with different database connection data on each PersistentVolume, each pod can connect to a different database, because each pod reads its own config file.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app
spec:
  serviceName: "app"
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
        - name: app
          image: example.com/app:1.0
          ports:
            - containerPort: 8080
              name: web
          volumeMounts:
            - name: databaseconfig
              mountPath: /usr/local/databaseconfig
  volumeClaimTemplates:
    - metadata:
        name: databaseconfig
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Mi

REDIS cluster without persistence on KUBERNETES

I am trying to set up a Redis cluster without persistence on a Kubernetes cluster. Is there a way I can do that without a persistent volume? I need auto recovery after a pod reboot. Is there an easy way to do that?
I tried updating the node info with a script on startup, which doesn't really work, as the rebooted pod comes up with a new private IP.
FYI, I have created a StatefulSet and a ConfigMap as referred to here: https://github.com/rustudorcalin/deploying-redis-cluster
and an emptyDir setup for the volumes.
ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-volume-storage/
You cannot do this; the state of your Redis is lost when a pod is restarted. Even with persistent storage it is not so easy. You will need some kind of orchestration to manage and reconnect Redis.
Do you mean actual cluster mode or just running Redis in general without persistence? This is what I usually use.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ...
  namespace: ...
  labels:
    app.kubernetes.io/name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: redis
  template:
    metadata:
      labels:
        app.kubernetes.io/name: redis
    spec:
      containers:
        - name: default
          image: redis:latest
          imagePullPolicy: Always
          ports:
            - containerPort: 6379
          args:
            - "--save"
            - ""
            - "--appendonly"
            - "no"

Can we create multiple databases in the same Persistent Volume in Kubernetes?

I have an Azure machine (Kubernetes) that has an agent with 2 cores and 1 GB. My two services are running on this machine, each with its own Postgres (Deployment, Service, PV, PVC).
I want to host my third service on the same machine too.
So when I tried to create the Postgres Deployment (this too has its own Service, PV, PVC), the Pod was stuck in status=ContainerCreating.
After some digging I got to know that my VM only supports data-disks.
So I thought, why not use the PVC of an earlier deployment in the current service, like:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: third-postgres
  labels:
    name: third-postgres
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: third-postgres
    spec:
      containers:
        - name: third-postgres
          image: postgres
          env:
            - name: PGDATA
              value: /var/lib/postgresql/data/pgdata
            - name: POSTGRES_USER
              value: third-user
            - name: POSTGRES_PASSWORD
              value: <password>
            - name: POSTGRES_DB
              value: third_service_db
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: third-postgresdata
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: third-postgresdata
          persistentVolumeClaim:
            claimName: <second-postgres-data>
Now this Deployment runs successfully, but it doesn't create the new database third_service_db.
Maybe because the second PVC already exists, it skips the DB creation part?
So is there any way I can use the same PVC for all my services, with that single PVC holding multiple databases, so that when I run kubectl create -f <path-to-third-postgres.yaml> it takes the database configuration from the env variables and creates the DB on the same PVC?
You have to create one PVC per Deployment. Once a PVC has been claimed, it must be released before it can be used again.
In the case of AzureDisk, the auto-created volumes can only be mounted by a single node (ReadWriteOnce access mode), so there's one more constraint: each of your Deployments can have at most 1 replica. A sketch of a dedicated claim for the third instance follows below.
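As a small illustration of that first approach, a separate claim for the third Postgres might look like the following; the claim name is made up, and storageClassName is omitted so the cluster's default AzureDisk class is used:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: third-postgres-data
spec:
  accessModes:
    - ReadWriteOnce     # AzureDisk volumes can only be attached to one node at a time
  resources:
    requests:
      storage: 5Gi
The third Deployment would then reference it via claimName: third-postgres-data instead of reusing the second instance's claim.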
Yes, you can create as many databases as you want on the same Persistent Volume. You have to change the path value to store each database in a different directory. See the example below.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ...
  namespace: ...
  labels:
    type: ...
spec:
  storageClassName: ...
  capacity:
    storage: ...
  accessModes:
    - ...
  hostPath:
    path: "/mnt/data/DIFFERENT_PATH_FOR_EACH_DATABASE"