How to control the order of scale and upgrade operations for a StatefulSet in Kubernetes

I have the following scenario:
A StatefulSet with 1 replica
Update the template section and scale it in the same operation, using Helm as the application manager
The order of operations is the following:
1. Scale to 3 replicas
2. Update the pod with ordinal 0
Because I cannot make the update happen first and the scale afterwards, I am losing data, since the new StatefulSet template contains some specific logic.
Is there a way to control the ordering of those operations?
The service in question is Redis; we are trying to go from standalone mode (1 replica) to replication (HA) without losing data.

For the moment I resolved the problem using a Helm pre-install hook Job that scales the StatefulSet to zero; after that, Helm comes in with the update.
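For illustration, a minimal sketch of such a hook Job, assuming a StatefulSet named redis and a service account sts-scaler with permission to scale it (both names are placeholders, not from the question):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: scale-down-redis                      # hypothetical name
  annotations:
    "helm.sh/hook": pre-install,pre-upgrade   # run before Helm applies the release
    "helm.sh/hook-delete-policy": hook-succeeded
spec:
  template:
    spec:
      serviceAccountName: sts-scaler          # assumed SA with RBAC to scale StatefulSets
      restartPolicy: Never
      containers:
        - name: scale-down
          image: bitnami/kubectl:latest
          command:
            - kubectl
            - scale
            - statefulset/redis               # assumed StatefulSet name
            - --replicas=0
```

With the StatefulSet at zero replicas, the subsequent Helm release applies the new template first and then brings the pods back up to the desired replica count.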

I am not a Redis expert, but I think the solution below should help you.
I would try to install another Redis HA instance (B) next to the existing one (A), using a snapshot of A's PV as the data source for B. This should help avoid losing your data. For more information, you can read about volume snapshots.
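A minimal sketch of that snapshot-and-restore idea, assuming a CSI driver with snapshot support and placeholder resource names:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: redis-a-snapshot                       # snapshot of instance A's data
spec:
  volumeSnapshotClassName: csi-snapclass       # assumed VolumeSnapshotClass
  source:
    persistentVolumeClaimName: redis-data-a-0  # assumed PVC name of instance A
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data-b-0                         # PVC for instance B, pre-populated from the snapshot
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi                            # must be at least the source volume size
  dataSource:
    name: redis-a-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
```

Instance B's StatefulSet can then mount the restored PVC, so the HA setup starts from A's data instead of from empty volumes.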
See also this related problem.

Related

Whole Application level rolling update

My kubernetes application is made of several flavors of nodes, a couple of “schedulers” which send tasks to quite a few more “worker” nodes. In order for this app to work correctly all the nodes must be of exactly the same code version.
The deployment is performed using a standard ReplicaSet, and when my CI/CD kicks in it just does a simple rolling update. This causes a problem, though: during the rolling update, nodes of different code versions co-exist for a few seconds, so a few tasks during this time get wrong results.
Ideally what I would want is that deploying a new version would create a completely new application that only communicates with itself and has time to warm its cache, then on a flick of a switch this new app would become active and start to get new client requests. The old app would remain active for a few more seconds and then shut down.
I’m using Istio sidecar for mesh communication.
Is there a standard way to do this? How is such a requirement usually handled?
I also had such a situation. Kubernetes alone cannot satisfy your requirement, and I was not able to find any tool that coordinates multiple deployments together (although Flagger looks promising).
So the only way I found was by using CI/CD: Jenkins in my case. I don't have the code, but the idea is the following:
Deploy all application Deployments using a single Helm chart. Every Helm release name and the corresponding Kubernetes labels must be based on some sequential number, e.g. the Jenkins $BUILD_NUMBER. The Helm release can be named like example-app-${BUILD_NUMBER} and all Deployments must have the label version: $BUILD_NUMBER. The important part here is that your Services should not be part of your Helm chart, because they will be handled by Jenkins.
Start your build with detecting the current version of the app (using bash script or you can store it in ConfigMap).
Start helm install example-app-${BUILD_NUMBER} with the --atomic flag set. The atomic flag makes sure that the release is properly removed on failure. Don't delete the previous version of the app yet.
Wait for Helm to complete and, in case of success, run kubectl set selector service/example-app version=$BUILD_NUMBER. That instantly switches the Kubernetes Service from one version to the other (see the sketch after this list). If you have multiple Services you can issue multiple set selector commands (each command takes effect immediately).
Delete previous Helm release and optionally update ConfigMap with new app version.
Depending on your app, you may want to run tests against the non-user-facing Services as part of step 4 (after the Helm release succeeds).
Another good idea is to have preStop hooks on your worker pods so that they can finish their jobs before being deleted.
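A minimal sketch of the Service side of this approach (names, ports, and labels are placeholders): the Service is managed outside the Helm chart and its selector pins the currently live version label, which the CI job flips after a new release passes its checks.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app          # managed by Jenkins, not by the Helm chart
spec:
  selector:
    app: example-app
    version: "41"            # currently live build number; flipped by CI
  ports:
    - port: 80
      targetPort: 8080       # assumed container port
```

After helm install example-app-42 --atomic succeeds, something like kubectl set selector service/example-app 'app=example-app,version=42' points the Service at the new pods in one step; note that set selector replaces the whole selector, so every label key has to be repeated in the command.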
You should consider the Blue/Green deployment strategy.

I want to stop/hibernate the cluster to save cost, any best approach/practice for it?

I have created a GKE cluster for a POC. Later on, I want to stop/hibernate the cluster to save cost. Is there any best approach/practice for it?
You can scale all your node pools down to 0 VMs, but be careful about data loss (depending on your node pool configuration, you can lose data if you delete all the VMs). However, you will continue to pay for the control plane.
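A minimal sketch of that, assuming a zonal cluster named poc-cluster with a node pool called default-pool (both placeholders):

```sh
# Scale the node pool down to zero nodes; workloads stop, but the control plane is still billed
gcloud container clusters resize poc-cluster \
  --node-pool default-pool \
  --num-nodes 0 \
  --zone us-central1-a

# Scale back up when the cluster is needed again
gcloud container clusters resize poc-cluster \
  --node-pool default-pool \
  --num-nodes 3 \
  --zone us-central1-a
```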
Another approach is to back up your data and use IaC (infrastructure as code, such as Terraform) to destroy and rebuild your cluster as needed.
Both approaches are valid; which one to pick depends on your use case and how long you need to hibernate your cluster.
An alternative is to use GKE Autopilot if your workloads are compliant with this deployment mode.

If I declare 2 replicas of PostgreSQL StatefulSet pods in k8s, are they the same database or they just share the volume?

After creating 2 replicas of PostgreSQL StatefulSet pods in k8s, are they the same database?
If they are, why is it that I created a DB and user in one pod and cannot find them in the other?
If they are not, is there no point in creating replicas?
There isn't one simple answer here; it depends on how you configured things. Postgres doesn't support multiple instances sharing the same underlying volume without massive corruption, so if you did set things up that way, it's definitely a mistake. More common would be to use the volumeClaimTemplates system so each pod gets its own distinct storage. Then you set up Postgres streaming replication yourself.
Or look at using an operator which handles that setup (and probably more) for you.
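A minimal sketch of the volumeClaimTemplates part (names, image, and sizes are placeholders; this alone does not configure streaming replication):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres                  # assumed headless Service
  replicas: 2
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          env:
            - name: POSTGRES_PASSWORD
              value: changeme            # placeholder; use a Secret in practice
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:                  # each pod gets its own PVC: data-postgres-0, data-postgres-1
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

Each replica therefore has its own independent data directory; making postgres-1 follow postgres-0 is Postgres-level configuration (streaming replication), which is what the operators mentioned above automate for you.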
To add to coderanger's answer: as he said, it's hard to say how Postgres will behave with multiple replicas, and how data is replicated across the cluster, without checking more in depth. Setting multiple replicas directly without reading the documentation on data replication might lead to big issues.
Here is one nice example from Google for reference: https://cloud.google.com/architecture/deploying-highly-available-postgresql-with-gke
For examples of Postgres database replication and clustering config files: https://github.com/CrunchyData/crunchy-containers/tree/master/examples/kube

Kubernetes deployment with Recreate strategy and maxSurge?

Summary
Can I give a deployment the rollout strategy Recreate and also set a fixed maxSurge for the deployment?
More details
I am developing an application that runs in Kubernetes. The backend will have multiple replicas, and runs EF Core with database migrations. I understand there are several ways to solve this; here's my idea at the moment.
On a new release, I would like all replicas to be stopped. Then a single replica at a time should start, and for each replica there should be an init container that runs the migrations (if needed).
This seems to be possible, using the following two configuration values:
.spec.strategy.type==Recreate and
.spec.strategy.rollingUpdate.maxSurge==1
Is it possible to use these two together? If not, is there any way to control how many replicas a controller will start at once with the Recreate strategy?
"No! You should do this in a completely different way!"
Feel free to suggest other methods as well, if you think I am coming at this from the completely wrong angle.
A StatefulSet might help you in this case (see the sketch after the list below).
StatefulSets are valuable for applications that require one or more of the following:
Stable, unique network identifiers.
Stable, persistent storage.
Ordered, graceful deployment and scaling.
Ordered, automated rolling updates.
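A minimal sketch of how that could look for the backend described in the question (names, image, and the migration command are assumptions; with the default OrderedReady pod management, pods are created and replaced one at a time, in order):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: backend
spec:
  serviceName: backend                               # assumed headless Service
  replicas: 3
  podManagementPolicy: OrderedReady                  # default: pods start/stop one by one, in order
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      initContainers:
        - name: migrate
          image: registry.example.com/backend:1.2.3        # placeholder image
          command: ["dotnet", "ef", "database", "update"]  # hypothetical EF Core migration command
      containers:
        - name: backend
          image: registry.example.com/backend:1.2.3        # placeholder image
          ports:
            - containerPort: 8080
```

Note that a rolling StatefulSet update replaces pods one by one rather than stopping all replicas first; if every old replica really must be gone before the migrations run, you would still need to scale to zero (or use a hook) before the rollout.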

How Databases synchronize data between persistent volumes in Kubernetes

I've just read the Deploying Cassandra with Stateful Sets topic in the Kubernetes documentation.
The deployment process:
1. Creation of a StorageClass
2. Creation of PersistentVolumes (in my case 4 PersistentVolumes), each with the storageClassName created in step 1
3. Creation of the Cassandra headless Service
4. Using a StatefulSet to create a Cassandra ring, setting the storageClassName created in step 1 in the StatefulSet YAML definition
As a result, there are 4 pods: cassandra-0, cassandra-1, cassandra-2, cassandra-3, which are mounted to the volumes created in step 2 (pv-0, pv-1, pv-2, pv-3).
I wonder how / whether these persistent volumes synchronize data with each other.
E.g. if I add some record that is written by pod cassandra-0 to persistent volume pv-0, will someone who retrieves data from the database a moment later, using the cassandra-1 pod / pv-1, see the data that was added to pv-0? Can anyone tell me how exactly this works?
This is not related to Kubernetes.
The replication is done by the database and is configurable.
See the CAP theorem and eventual consistency for Cassandra.
You can control the level of consistency in Cassandra; whether a record is updated across the cluster immediately or later depends on the configuration you choose in Cassandra.
See also: Synchronous Replication, Asynchronous Replication
Cassandra Consistency:
how to set cassandra read and write consistency
How is the consistency level configured?
The mechanism used to spread data across the cluster is independent of whether it was deployed on Kubernetes or on bare-metal instances. Cassandra will try to spread the data randomly across the nodes based on a hash value (known as the token), and will use the same algorithm to retrieve the information.
There are other factors to take into consideration: the replication factor (number of copies) and the consistency level used.
You may want to take a look at DS201: DataStax Enterprise Foundations of Apache Cassandra™ in DataStax Academy, where they cover the basics of Cassandra.
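As a rough illustration of those two knobs (keyspace, table, and host names are made up), the replication factor is set per keyspace and the consistency level is chosen per session or per query, e.g. via cqlsh:

```sh
# Replication factor: how many copies of each row the cluster keeps (set per keyspace)
cqlsh cassandra-0.cassandra -e "
  CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
  CREATE TABLE IF NOT EXISTS demo.users (id int PRIMARY KEY, name text);"

# Consistency level: how many replicas must acknowledge a write/read before it succeeds
cqlsh cassandra-0.cassandra -e "
  CONSISTENCY QUORUM;
  INSERT INTO demo.users (id, name) VALUES (1, 'alice');
  SELECT * FROM demo.users WHERE id = 1;"
```

With replication_factor 3 and QUORUM, a write is acknowledged once 2 of the 3 replicas have it, which is why a read issued through a different pod shortly afterwards normally sees the new data.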
Just to slightly extend Carlos' answer: Kubernetes is not involved, and the volumes are completely isolated. The replication and distribution work is entirely up to the database software to handle. As far as K8s is concerned, they are just separate processes and separate volumes.
Thanks for the comments, guys!
So, when I have my DB with 3 PVs:
cassandra-pod0   cassandra-pod1   cassandra-pod2
      |                |                |
cassandra-pv0    cassandra-pv1    cassandra-pv2
Data is divided across the 3 PVs. When I kill cassandra-pod1, is it possible that I will (temporarily) lose part of the data? Am I right?