ActiveMQ in HA Shared Database (Master/Slave) on Kubernetes with StatefulSet

I am in the process of deploying ActiveMQ 5.15 in HA on Kubernetes. Previously I was using a Deployment and a ClusterIP Service, and it was working fine. The master would boot up and the slave would wait to acquire the lock. If I deleted the master pod, the slave picked up and became the master.
Now I want to try a StatefulSet, based on this thread.
The deployment succeeded and two pods were created, id0 and id1. But I noticed that both pods were master; both had started. I also noticed that two PVCs were created, id0 and id1, with the StatefulSet, whereas the Deployment had only one PVC. Could that be the issue, since the storage is no longer shared? Can we still achieve a master/slave setup with a StatefulSet?

I also noticed that two PVCs were created, id0 and id1, with the StatefulSet, whereas the Deployment had only one PVC. Could that be the issue, since the storage is no longer shared?
You are right. When using k8s StatefulSets each Pod gets its own persistent storage (dedicated PVC and PV), and this persistent storage is not shared.
When a Pod gets terminated and is rescheduled on a different Node, the Kubernetes controller will ensure that the Pod is associated with the same PVC which will guarantee that the state is intact.
In your case, to achieve a master/slave setup, consider using a shared network location or filesystem for persistent storage, so that both brokers see the same data and lock, for example:
NFS storage for an on-premises k8s cluster.
AWS EFS for EKS.
Azure Files for AKS.
Check the complete list of PersistentVolume types currently supported by Kubernetes (implemented as plugins).
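A minimal sketch of the shared-storage approach, assuming a storage class named nfs-shared that supports ReadWriteMany (a hypothetical name; substitute whatever your NFS, EFS, or Azure Files provisioner exposes). Instead of volumeClaimTemplates, both replicas mount one pre-created claim, so they contend for the same lock. The image name, data path, and sizes are placeholders, not values from the question.

```yaml
# One claim shared by every broker pod (assumes an RWX-capable storage class).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: activemq-shared-data          # hypothetical name
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: nfs-shared        # hypothetical; must support ReadWriteMany
  resources:
    requests:
      storage: 10Gi                   # placeholder size
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: activemq
spec:
  serviceName: activemq               # headless Service assumed to exist
  replicas: 2
  selector:
    matchLabels:
      app: activemq
  template:
    metadata:
      labels:
        app: activemq
    spec:
      containers:
        - name: activemq
          image: example/activemq:5.15        # placeholder image
          volumeMounts:
            - name: data
              mountPath: /opt/activemq/data   # assumed broker data directory
      # A plain volume referencing the shared claim, rather than
      # volumeClaimTemplates, which would give each pod its own PVC.
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: activemq-shared-data
```

With this layout only one PVC exists, as in the original Deployment, while the pods keep their stable StatefulSet names.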

Related

Persistent Volume and Kubernetes upgrade

What happens to the persistent volume after a cluster upgrade?
The Kubernetes cluster is for a stateful application. It has one PV and a corresponding PVC for storing input data. I would like to understand if there is a way to preserve the input data during a K3s upgrade.
Kubernetes PVs are typically not created on the node's disk: when you kill your StatefulSet pod, it may be rescheduled on a different node with the same PV.
Most cloud providers use their block storage services as the default backend for Kubernetes PVs (e.g. AWS EBS), and they provide other CSI (Container Storage Interface) drivers for other storage services (e.g. an NFS service).
So when you upgrade your cluster, you can reuse your data as long as it is stored outside the cluster. You just need to check which CSI driver you are using and read its documentation to understand where the volume is created.
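As a rough illustration of where to look, this is what a CSI-backed PV object can look like. The driver name, volume handle, and sizes are assumptions for the example, not values from the question. The spec.csi.driver field identifies the driver; a PV that shows a hostPath or local source instead lives on a node's disk.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-input-data                   # hypothetical name
spec:
  capacity:
    storage: 10Gi                       # placeholder size
  accessModes:
    - ReadWriteOnce
  # Retain keeps the backing volume even if the claim is deleted.
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: ebs.csi.aws.com             # assumed driver; yours may differ
    volumeHandle: vol-0123456789abcdef0 # made-up ID of the external volume
```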

Kubernetes Statefulset problem with Cluster Autoscaler and Multi-AZ

I have an EKS cluster with the cluster autoscaler set up, spanning three availability zones. I have deployed a Redis Cluster using Helm and it works fine. Basically it is a StatefulSet of 6 replicas with dynamic PVCs.
Currently, my EKS cluster has two worker nodes, which I will call Worker-1A and Worker-1B, in AZ 1A and 1B respectively; there is no worker node in AZ 1C. I am doing some testing to make sure the Redis Cluster can always spin up and attach its volumes properly. All the Redis Cluster pods are created on Worker-1B. In my test, I kill all the pods in the Redis Cluster and, before new pods spin up, I deploy some other Deployments that use up all the resources on Worker-1A and Worker-1B. Since the worker nodes now have no resources left for new pods, the cluster autoscaler creates a worker node in AZ 1C (to balance nodes across AZs). Here is the problem: when the Redis Cluster StatefulSet tries to recreate the pods, it cannot place them on Worker-1B because there are no resources, so it tries Worker-1C instead, and the pods hit the following error: node(s) had volume node affinity conflict.
I know this situation might be rare, but how do I fix this issue if it ever happens? I am hoping there is an automated way to solve this instead of fixing it manually.

Does a stellar core deployment on k8s need persistent storage?

I want to deploy stellar core on k8s with CATCHUP_COMPLETE. I'm using this Docker image: satoshipay/stellar-core
The Docker image docs mention that /data is used to store some information about the DB. I've also seen that the Helm template uses a persistent volume and mounts it at /data.
I was wondering what will happen if I use a Deployment instead of the StatefulSet and I restart the pod, update its Docker image version, or delete it. Does it initialize the DB again?
Also does the stellar core need any extra storage for the catchup?
StatefulSet vs Deployment
A StatefulSet "provides guarantees about the ordering and uniqueness of these Pods".
If your application needs to be brought up in a specific order, use a StatefulSet.
Storage
Definitely leverage a persistent volume for the database. From the Kubernetes docs:
On-disk files in a Container are ephemeral
Since it appears you're deploying some kind of blockchain application, losing that on-disk data on every restart could cause significant startup delays.
In a Deployment, you specify a PersistentVolumeClaim that is shared by all pod replicas. In other words, a shared volume.
The backing storage must support the ReadWriteMany or ReadOnlyMany access mode if the replica pods can land on more than one node.
In a StatefulSet, you specify volumeClaimTemplates so that each replica pod gets a unique PersistentVolumeClaim associated with it.
In other words, no shared volume.
A StatefulSet is useful for running clustered software, e.g. a Hadoop or MySQL cluster, where each node has its own storage.
So in your case, to have more isolation (no shared volumes), it is better to use a StatefulSet-based solution.
If you use a Deployment-based solution without a persistent volume, then restarting the pod, updating its Docker image version, or deleting it will initialize your DB again.
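A minimal sketch of the StatefulSet variant, assuming a single replica and the /data mount path mentioned in the question. The resource names and the storage size are placeholders. Each replica gets its own claim generated from the template (here it would be named data-stellar-core-0), and that claim is reattached when the pod is restarted or replaced.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: stellar-core
spec:
  serviceName: stellar-core            # headless Service assumed to exist
  replicas: 1
  selector:
    matchLabels:
      app: stellar-core
  template:
    metadata:
      labels:
        app: stellar-core
    spec:
      containers:
        - name: stellar-core
          image: satoshipay/stellar-core   # image from the question; pin a tag in practice
          volumeMounts:
            - name: data
              mountPath: /data
  # One PVC per replica is created from this template; nothing is shared.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi              # placeholder size
```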
Regarding catchup:
In general, running CATCHUP_COMPLETE=true is not recommended in docker containers as they have limited resources by default (if you really want to do it, make sure to give them access to more resources: CPU, memory and disk space).
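If you do go ahead, resource requests and limits are the Kubernetes way to give the container more room. The sketch below is illustrative only: the CPU and memory figures are invented placeholders, not sizing guidance for stellar core, and the pod name is hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: stellar-core-catchup           # hypothetical name
spec:
  containers:
    - name: stellar-core
      image: satoshipay/stellar-core
      env:
        - name: CATCHUP_COMPLETE
          value: "true"
      resources:
        requests:                      # what the scheduler reserves on a node
          cpu: "2"                     # placeholder value
          memory: 8Gi                  # placeholder value
        limits:                        # hard ceiling enforced at runtime
          memory: 8Gi                  # placeholder value
```

Disk space is sized separately, through the storage request on the PersistentVolumeClaim mounted at /data.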

Creating a Deployment and PersistentVolumeClaim with dynamic AWS EBS backed claims

I create a Deployment with a volumeMount that references a PersistentVolumeClaim, along with a memory request, on a cluster with nodes in 3 different AZs: us-west-2a, us-west-2b, and us-west-2c.
The Deployment takes a while to start while the PersistentVolume is being dynamically created but they both eventually start up.
The problem I am running into is that the PersistentVolume is created in us-west-2c, and the only node there that the pod can run on is already overallocated.
Is there a way for me to create the Deployment and claim such that the claim is not made in a zone where no pod can start up?
I believe you're looking for the Topology Awareness feature.
Topology Awareness
In Multi-Zone clusters, Pods can be spread across Zones in a Region. Single-Zone storage backends should be provisioned in the Zones where Pods are scheduled. This can be accomplished by setting the Volume Binding Mode.
Kubernetes released the topology-aware dynamic provisioning feature in version 1.12, and I believe this will solve your issue.
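A minimal sketch of what setting the volume binding mode looks like, assuming the AWS EBS CSI driver as the provisioner. The class name, claim name, and size are hypothetical. With WaitForFirstConsumer, the volume is not provisioned until a pod using the claim has been scheduled, so it is created in the zone of the node the scheduler picked.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-topology-aware             # hypothetical name
provisioner: ebs.csi.aws.com           # assumed; use the provisioner your cluster runs
# Delay binding and provisioning until a pod that uses the claim is scheduled.
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                       # hypothetical name
spec:
  storageClassName: ebs-topology-aware
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi                    # placeholder size
```

The Deployment then references app-data as usual. The claim stays Pending until the first pod that mounts it is scheduled.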

In GCP Kubernetes (GKE), how do I assign a stateless pod created by a deployment to a provisioned VM?

I have several operational deployments on minikube locally and am trying to deploy them on GCP with Kubernetes.
When I describe a pod created by a deployment (which created a replica set that spawned the pod):
kubectl get po redis-sentinel-2953931510-0ngjx -o yaml
It indicates that it landed on one of the Kubernetes VMs.
I'm having trouble with deployments that work separately but fail together due to lack of resources, e.g. CPU, even though I provisioned a VM above the requirements. I suspect the cluster is placing the pods on its own nodes and running out of resources.
How should I proceed?
Do I introduce a VM to be orchestrated by Kubernetes?
Do I enlarge the Kubernetes nodes?
Or something else altogether?
It was a resource problem: the node pool size was inhibiting the deployments. I was mistaken in trying to provision Google Compute Engine instances and disks myself.
I ended up provisioning Kubernetes node pools with more CPU and disk space, which solved it. I also added elasticity by enabling autoscaling.
Here is the node pool documentation.
Here is a Terraform Kubernetes deployment.
Here is the machine type documentation.