Kubernetes Citus setup with individual hostname/ip - postgresql

I am in the process of learning Kubernetes with a view to setting up a simple cluster with Citus DB and I'm having a little trouble with getting things going, so would be grateful for any help.
I have a Docker image containing my base Debian image configured with Citus for the project, and at this point I want to set it up with one master that mounts a GCP persistent disk holding a Postgres DB, which I'll then distribute among the other containers, each mounted with an individual disk holding empty tables (configured with the Citus extension) to receive what gets distributed to it. I'd like to automate this further at some point, but for now I'm aiming for just a master container and eight nodes. My plan is to create a deployment that opens ports 5432 and 80 on each node, and I thought I could create two pods, one to hold the master and one to hold the eight nodes. Ideally I'd want to mount all the disks and then run a post-mount script on the master that finds all the node containers (by IP or hostname??), adds them as Citus nodes, then runs create_distributed_table to distribute the data.
My confusion at present is about how to label the individual nodes so that each keeps its internal address or hostname, and so that if one goes down it is replaced and resumes with the data on its persistent disk. I've read about ConfigMaps and setting hostname aliases, but I'm still unclear about how to proceed. Is this possible, or is this the wrong way to approach this kind of setup?

You are looking for a StatefulSet. That gives you a known number of pod replicas, attached storage (PersistentVolumes), and consistent DNS names. In the pod spec I would launch only a single copy of the server and use the StatefulSet's replica count to control the number of "nodes" (also a Kubernetes term); if the replica is #0 then it's the master.
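As a minimal sketch (the names and the image below are placeholders, not from your setup): a headless Service plus a StatefulSet with volumeClaimTemplates gives each Citus container a stable DNS name (citus-0.citus, citus-1.citus, ...) and its own GCP persistent disk that is re-attached when a pod is replaced:

```yaml
# Placeholder names throughout; "my-citus-image" is assumed.
apiVersion: v1
kind: Service
metadata:
  name: citus
spec:
  clusterIP: None            # headless: gives each pod a stable DNS name
  selector:
    app: citus
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: citus
spec:
  serviceName: citus         # pods resolve as citus-<ordinal>.citus
  replicas: 9                # citus-0 = master, citus-1..8 = workers
  selector:
    matchLabels:
      app: citus
  template:
    metadata:
      labels:
        app: citus
    spec:
      containers:
        - name: citus
          image: my-citus-image
          ports:
            - containerPort: 5432
          volumeMounts:
            - name: pgdata
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:      # one GCP PD per pod, re-attached on replacement
    - metadata:
        name: pgdata
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 50Gi
```

Your post-mount script on citus-0 could then register the workers by their stable DNS names rather than IPs, e.g. SELECT master_add_node('citus-1.citus', 5432); for each ordinal.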

Related

Shared file system among pods

We are running a cluster of x nodes.
Every node in the cluster pulls some files from remote storage. Unfortunately, the remote server is getting overloaded, so we are exploring a solution in which only a subset of the nodes pulls the files and serves them to the remaining nodes (read-only; the other nodes do not need to write). Some subset of nodes may often undergo maintenance and be taken offline.
I was experimenting with running NFS as a pod in a ReplicaSet, with a Service (fixed IP) in front of each NFS pod. If one node with the NFS pod goes down, Kubernetes will take care of bringing up an NFS pod on another node behind the same sticky IP.
But this new NFS share would still need to be remounted on the other nodes.
Any better solution for this storage problem?
Note that we would ideally not like to use remote storage since this adds extra latency.
Try expanding the Persistent Volume Claims. That is some overhead for you to maintain, so I'd recommend going with something locally managed; after that it's your choice.
Two other options are commonly recommended: a hostPath volume or a GlusterFS volume. Please refer to this SO answer for more information.
What @scenox suggested is also a good option.
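Building on the NFS-pod idea from the question, a hedged sketch (the ClusterIP and export path are placeholders): point a ReadOnlyMany PersistentVolume at the NFS Service's fixed ClusterIP (the kubelet performs the mount and can't always resolve in-cluster DNS names, hence the IP), and have the reader pods mount the claim:

```yaml
# Placeholder IP and path; the IP is the NFS Service's fixed ClusterIP.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-files
spec:
  capacity:
    storage: 10Gi
  accessModes: ["ReadOnlyMany"]
  nfs:
    server: 10.96.100.10     # sticky Service IP in front of the NFS pod
    path: /exports
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files
spec:
  accessModes: ["ReadOnlyMany"]
  storageClassName: ""       # bind statically to the PV above
  volumeName: shared-files
  resources:
    requests:
      storage: 10Gi
```

If Kubernetes replaces the NFS pod behind the Service, the PV definition stays valid, though already-mounted clients may still need the mount to recover or be remounted, as you observed.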

MariaDB Server vs MariaDB Galera Cluster HA Replication

I am planning to deploy HA database cluster on my kubernetes cluster. I am new to database and I am confused by the various database terms. I have decided on MariaDB and I have found two charts, MariaDB and MariaDB Galera Cluster.
I understand that both can achieve the same goal, but what are the main differences between the two? Under what scenarios should I use one or the other?
Thanks in advance!
I'm not an expert, so take my explanation with caution (and double-check it).
The main difference between the MariaDB chart and the MariaDB Galera Cluster chart is that the first one deploys a standard master-slave (or primary-secondary) database, while the second one deploys a resilient master-master (or primary-primary) database cluster.
In more detail, this means the following:
The MariaDB chart will deploy a master StatefulSet and a slave StatefulSet, which (with default values) will spawn one master Pod and two slave Pods. Once your database is up and running, you can connect to the master and write or read data, which is then replicated to the slaves, so that you have safe copies of your data available.
The copies can be used to read data, but only the master Pod can write new data to the database. Should that Pod crash, or the Kubernetes cluster node where it is running malfunction, you will not be able to write new data until the master Pod is once more up and running (which may require manual intervention), unless you perform a failover, promoting one of the other Pods to be the new temporary master (which also requires manual intervention or some setup with proxies, virtual IPs and so on).
The Galera Cluster chart instead deploys something more resilient. With default values, it creates a single StatefulSet with three Pods, and each of these Pods can both read and write data, acting virtually as a master.
This means that if one of the Pods stops working for whatever reason, the other two will continue serving the database as if nothing happened, making the whole thing far more resilient. When the Pod that stopped working comes back up, it will fetch the new data from the other Pods and get back in sync.
In exchange for the resilience of the whole infrastructure (it would be too easy if the Galera Cluster solution offered extreme resilience with no drawbacks), there are some cons to a multi-master setup, the most common being some added latency on operations (needed to keep everything in sync and consistent) and added complexity, which often brings headaches.
There are several other limitations with Galera Cluster, such as explicit LOCKs on tables not working, or the requirement that every table declare a primary key. You can find the full list here (https://mariadb.com/kb/en/mariadb-galera-cluster-known-limitations/).
Deciding between the two solutions mostly depends on the following question:
Should one of your Kubernetes cluster nodes fail while one of the database Pods is running on it, do you need the database to keep working (and remain usable by your apps) as if nothing happened?
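For illustration, hedged values.yaml sketches for the two charts; the key names follow the Bitnami-style charts and are assumptions, so verify them against your chart version's own values.yaml:

```yaml
# MariaDB chart (Bitnami-style keys assumed): one writable primary,
# replicated read-only secondaries.
architecture: replication
secondary:
  replicaCount: 2
```

```yaml
# MariaDB Galera Cluster chart (Bitnami-style keys assumed): three peers,
# each able to serve both reads and writes.
replicaCount: 3
```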

Ephemeral Storage usage in AKS

I have a simple 3-node cluster created using AKS. Everything went fine for 3 months, but I'm now starting to have disk space usage issues that seem related to the OS disks attached to each node.
I have no error in kubectl describe node and all disk-related checks are fine. However, when I try to run kubectl logs on some pods, I sometimes get "no space left on device".
How can one manage the storage used on those disks? I can't seem to find a way to SSH into those nodes, as they seem to be manageable only via the Azure CLI / web interface. Is there also a way to clean up what takes up this space? (I assume unused Docker images take up space, but I was under the impression that those would get cleaned automatically...)
Generally, the AKS nodes just run the pods and other resources for you; the data should be stored in separate storage, much like a remote storage server. In Azure, that means managed disks and Azure File shares. You could also store the growing data on the nodes themselves, but then you would need to configure a big disk for each node, and I don't think that's a good approach.
There are ways to SSH into the AKS nodes. One is to manually set a NAT rule in the load balancer for the node you want to SSH into. Another is to create a pod as a jump box, following the steps here.
Lastly, AKS deletes unused images regularly and automatically; it's not recommended to delete unused images manually.
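That automatic cleanup is the kubelet's image garbage collection. For reference, these are the upstream KubeletConfiguration knobs that drive it (on AKS you don't edit the kubelet directly; similar settings are exposed through AKS's custom node configuration):

```yaml
# Upstream kubelet defaults shown; on AKS, tune via custom node configuration.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
imageGCHighThresholdPercent: 85   # start deleting unused images above 85% disk usage
imageGCLowThresholdPercent: 80    # stop once usage drops back below 80%
```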
Things you can do to fix this:
Create the AKS cluster with a bigger OS disk (I usually use 128 GB)
Upgrade AKS to a newer version (this replaces all the existing VMs with new ones, so they won't have stale Docker images on them)
Manually clean up space on the nodes (see the sketch after this list)
Manually extend the OS disk on the nodes (this will only work until you scale/upgrade the cluster)
I'd probably go with option 1, else this problem would haunt you forever :(
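For option 3, a hedged sketch (assuming Docker-based nodes, which AKS used at the time; the docker:cli image tag is also an assumption): a DaemonSet that mounts each node's Docker socket and prunes unused images:

```yaml
# One-shot cleanup per node; delete the DaemonSet once it has run.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-prune
spec:
  selector:
    matchLabels:
      app: image-prune
  template:
    metadata:
      labels:
        app: image-prune
    spec:
      containers:
        - name: prune
          image: docker:cli                   # tag is an assumption
          command: ["sh", "-c", "docker image prune -af && sleep infinity"]
          volumeMounts:
            - name: docker-sock
              mountPath: /var/run/docker.sock
      volumes:
        - name: docker-sock
          hostPath:
            path: /var/run/docker.sock
```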

Kubernetes with hybrid containers on one VM?

I have played around a little with Docker and Kubernetes and need some advice: is it a good idea to have one Pod on a VM with all of these deployed in multiple (hybrid) containers?
This is our POC plan:
Customers access a public API endpoint (behind an nginx reverse proxy), e.g. abc.xyz.com or def.xyz.com
List of containers that we need
Identity server, connected to SQL Server
Our API server with Hangfire, connected to SQL Server
The API server that connects to the Redis server
Redis, which in turn has 3 Hangfire agents, load-balanced (scalable in the future)
Setup 1 or 2 VMs?
Combination of Windows and Linux Containers, is that advisable?
How many Pods per VM? How many containers per Pod?
Should we attach volumes for DB?
Thank you for your help
Cluster size can differ depending on the Kubernetes platform you want to use. For managed solutions like GKE/EKS/AKS you don't need to create a master node, but you have less control over your cluster and you can't always use the latest Kubernetes version.
It is safer to have at least 2 worker nodes. (More is better). In case of node failure, pods will be rescheduled on another healthy node.
I'd say Linux containers are more lightweight and have less overhead, but it's up to you to decide what to use.
Number of pods per VM is defined during scheduling process by the kube-scheduler and depends on the pods' requested resources and amount of resources available on cluster nodes.
All data inside the running containers in a Pod is lost after pod restart/deletion. You can import/restore DB content during pod startup using Init Containers (or DB replication), or configure volumes to persist data between pod restarts.
You can more easily decide which containers to put in the same Pod if you look at your application from the perspective of scaling, updating and availability.
If you can benefit from scaling and updating application parts independently, and from having several replicas of crucial parts of your application, it's better to put them in separate Deployments. If the application parts must always run on the same node and it's fine to restart them all at once, you can put them in one Pod, as sketched below.
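As a sketch of that rule of thumb (all names and images are placeholders): tightly coupled containers share one Pod template, while independently scalable parts get their own Deployments:

```yaml
# Variant A (placeholders): two tightly coupled containers in one Pod --
# scheduled, restarted and scaled together.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api-server
          image: example/api:1.0
        - name: hangfire-sidecar
          image: example/hangfire:1.0
---
# Variant B: the reverse proxy as its own Deployment, so it can be scaled
# and updated independently of the API.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-proxy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-proxy
  template:
    metadata:
      labels:
        app: nginx-proxy
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
```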

Kubernetes - Persistent storage for PostgreSQL

We currently have a 2-node Kubernetes environment running on bare-metal machines (no GCE) and now we wish to set up a PostgreSQL instance on top of this.
Our plan was to map a data volume for the PostgreSQL data directory to the node using the volumeMounts option in Kubernetes. However, this would be a problem, because if the Pod ever gets stopped, Kubernetes will re-launch it at random on one of the other nodes, so we have no guarantee that it will use the correct data directory on re-launch...
So what is the best approach for maintaining a consistent and persistent PostgreSQL Data Directory across a Kubernetes cluster?
One solution is to deploy HA PostgreSQL, for example https://github.com/sorintlab/stolon
Another is to have some network storage attached to all nodes (NFS, GlusterFS) and use volumeMounts in the pods, as in the sketch below.
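A minimal sketch of the second option (the NFS server address and export path are placeholders): the PostgreSQL pod mounts a PersistentVolumeClaim backed by network storage, so the data directory follows it to whichever node it is rescheduled on:

```yaml
# Placeholder server/path; any NFS export reachable from all nodes works.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pg-data
spec:
  capacity:
    storage: 20Gi
  accessModes: ["ReadWriteOnce"]
  nfs:
    server: 192.168.1.100
    path: /exports/pgdata
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pg-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ""       # bind statically to the PV above
  volumeName: pg-data
  resources:
    requests:
      storage: 20Gi
---
# The pod references the claim, not a node-local path.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:13
          env:
            - name: POSTGRES_PASSWORD
              value: change-me       # placeholder
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: pg-data
```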