Can a ReplicaSet be configured to allow in progress updates to complete? - kubernetes

I currently have a Kubernetes setup where we are running a decoupled Drupal/Gatsby app. Drupal acts as a content repository that Gatsby pulls from when building. Drupal is also configured, through a custom module, to connect to the k8s API and patch the deployment Gatsby runs under. Gatsby doesn't run persistently; instead, this deployment uses Gatsby as an init container to build the site so that it can then be served by an nginx container. Patching the deployment (modifying a label) creates a new replicaset, which forces a new Gatsby build that ultimately replaces the old build.
This seems to work well and I'm reasonably happy with it except for one aspect: the default scaling behaviour of replicasets when several content edits are made in quick succession. Each subsequent content edit within Drupal still contacts the k8s API and patches the deployment. This results in a new replicaset being created, the original replicaset being left as is, the previous replicaset being scaled down, and any pods that are currently being created (Gatsby building) being killed. I can see why this is probably desirable in most situations, but for me it increases the time it takes for the changes to become visible on the site. If multiple people are making edits in Drupal at the same time, this will be compounded and could become problematic.
Ideally I would like the containers that are currently building to be able to complete and for those replicasets to finish scaling up, queuing another replicaset to be created once this is completed. This would allow any updates in the first build to be deployed asap, whilst queueing up another build immediately after to include any subsequent content, and this could continue for as long as the load is there to require it and no longer. Is there any way to accomplish this?

This is the regular behavior of Kubernetes. When you update a Deployment, it creates a new ReplicaSet, and with it new Pods, according to the new settings. Kubernetes keeps some old ReplicaSets around in case a rollback is needed.
If I understand your question correctly, you cannot change this behavior, so you will need to change the architecture of your application.
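One possible architectural change, sketched here under assumptions that are not part of the original answer, is to put a small coalescing step between Drupal and the Kubernetes API: while a build is running, further edits are only recorded, and a single follow-up build is triggered once the running one finishes. The names `BuildCoalescer`, `trigger_build`, `content_changed` and `build_finished` are hypothetical; `trigger_build` stands for whatever currently patches the Deployment, and something (for example a rollout watcher) would have to call `build_finished` when the new pods are ready.

```python
from typing import Callable


class BuildCoalescer:
    """Let a running build finish and fold later edits into one follow-up build."""

    def __init__(self, trigger_build: Callable[[], None]) -> None:
        # Hypothetical callback: in the setup described above it would patch
        # the Deployment so that a new ReplicaSet (and Gatsby build) starts.
        self._trigger_build = trigger_build
        self._running = False
        self._pending = False

    def content_changed(self) -> None:
        """Call on every content edit."""
        if self._running:
            # Do not patch now: that would kill the build in progress.
            self._pending = True
            return
        self._running = True
        self._trigger_build()

    def build_finished(self) -> None:
        """Call once the rollout of the current build has completed."""
        if self._pending:
            # One follow-up build picks up every edit queued in the meantime.
            self._pending = False
            self._trigger_build()
        else:
            self._running = False


if __name__ == "__main__":
    started = []
    coalescer = BuildCoalescer(lambda: started.append("build"))
    coalescer.content_changed()  # starts the first build
    coalescer.content_changed()  # queued, first build keeps running
    coalescer.content_changed()  # still only one build queued
    coalescer.build_finished()   # starts the single follow-up build
    coalescer.build_finished()   # nothing queued, back to idle
    print(len(started))
```

The sketch is single-threaded and keeps its state in memory; a real version would need locking and would have to survive restarts of whatever process hosts it.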

Related

Whole Application level rolling update

My kubernetes application is made of several flavors of nodes, a couple of “schedulers” which send tasks to quite a few more “worker” nodes. In order for this app to work correctly all the nodes must be of exactly the same code version.
The deployment is performed using a standard ReplicaSet and when my CICD kicks in it just does a simple rolling update. This causes a problem though since during the rolling update, nodes of different code versions co-exist for a few seconds, so a few tasks during this time get wrong results.
Ideally what I would want is that deploying a new version would create a completely new application that only communicates with itself and has time to warm its cache, then on a flick of a switch this new app would become active and start to get new client requests. The old app would remain active for a few more seconds and then shut down.
I’m using Istio sidecar for mesh communication.
Is there a standard way to do this? How is such a requirement usually handled?
I also had such a situation. Kubernetes alone cannot satisfy your requirement, and I was not able to find any tool that can coordinate multiple deployments together (although Flagger looks promising).
So the only way I found was by using CI/CD: Jenkins in my case. I don't have the code, but the idea is the following:
1. Deploy all application deployments using a single Helm chart. Every Helm release name and the corresponding Kubernetes labels must be based on some sequential number, e.g. Jenkins $BUILD_NUMBER. The Helm release can be named like example-app-${BUILD_NUMBER}, and all deployments must have the label version: $BUILD_NUMBER. The important part here is that your Services should not be part of your Helm chart, because they will be handled by Jenkins.
2. Start your build by detecting the current version of the app (using a bash script, or you can store it in a ConfigMap).
3. Run helm install example-app-${BUILD_NUMBER} with the --atomic flag set. The atomic flag makes sure that the release is properly removed on failure. Don't delete the previous version of the app yet.
4. Wait for Helm to complete and, in case of success, run kubectl set selector service/example-app version=$BUILD_NUMBER. That will instantly switch the Kubernetes Service from one version to the other. If you have multiple services, you can issue multiple set selector commands (each command executes immediately).
5. Delete the previous Helm release and optionally update the ConfigMap with the new app version.
Depending on your app you may want to run tests on non user facing Services as a part of step 4 (after Helm release succeeds).
Another good idea is to have preStop hooks on your worker pods so that they can finish their jobs before being deleted.
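The answer does not include code, so the following is only a minimal sketch of the cut-over sequence described above, written in Python rather than as a Jenkins pipeline. The function names, the chart path argument, and the `--set version=...` value (which assumes the chart exposes a `version` value that ends up in the labels) are assumptions; the `helm install --atomic`, `kubectl set selector` and release naming follow the steps as given, and `helm uninstall` assumes Helm 3.

```python
import subprocess
from typing import List, Optional


def cutover_commands(build_number: int, previous: Optional[int], chart: str,
                     app: str = "example-app") -> List[List[str]]:
    """Return the commands for one blue/green style cut-over, in order."""
    commands = [
        # Install the new release next to the old one; --atomic removes it
        # again if the install fails.
        ["helm", "install", f"{app}-{build_number}", chart, "--atomic",
         "--set", f"version={build_number}"],  # assumed chart value
        # Point the Service (managed outside the chart) at the new version.
        ["kubectl", "set", "selector", f"service/{app}",
         f"version={build_number}"],
    ]
    if previous is not None:
        # Only now remove the previous release.
        commands.append(["helm", "uninstall", f"{app}-{previous}"])
    return commands


def run_all(commands: List[List[str]]) -> None:
    """Run the commands one by one, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in cutover_commands(42, 41, "./chart"):
        print(" ".join(command))
```

Because `run_all` stops at the first failing command, a failed `helm install` leaves the Service selector untouched and the previous release in place.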
You should consider a Blue/Green deployment strategy.

How to reduce downtime caused by pulling images in the Kubernetes Recreate deployment strategy

Assuming I have a Kubernetes Deployment object with the Recreate strategy and I update the Deployment with a new container image version. Kubernetes will:
1. scale down/kill the existing Pods of the Deployment,
2. create the new Pods,
3. which will pull the new container images
4. so the new containers can finally run.
Of course, the Recreate strategy is expected to cause a downtime between steps 1 and 4, where no Pod is actually running. However, step 3 can take a lot of time if the container images in question are large or the container registry connection is slow, or both. In a test setup (Azure Kubernetes Services pulling a Windows container image from Docker Hub), I see it taking 5 minutes and more, which makes for a really long downtime.
So, what is a good option to reduce that downtime? Can I somehow get Kubernetes to pull the new images before killing the Pods in step 1 above? (Note that the solution should work with Windows containers, which are notoriously large, in case that is relevant.)
On the Internet, I have found this Codefresh article using a DaemonSet and Docker in Docker, but I guess Docker in Docker is no longer compatible with containerd.
I've also found this StackOverflow answer that suggests using an Azure Container Registry with Project Teleport, but that is in private preview and doesn't support Windows containers yet. Also, it's specific to Azure Kubernetes Services, and I'm looking for a more general solution.
Surely, this is a common problem that has a "standard" answer?
Update 2021-12-21: Because I've got a corresponding answer, I'll clarify that I cannot easily change the deployment strategy. The application in question does not support running Pods of different versions at the same time because it uses a database that needs to be migrated to the corresponding application version, without forwards or backwards compatibility.
Implement a "blue-green" deployment strategy. For instance, the service might be running and active in the "blue" state. A new deployment is created with a new container image, which deploys the "green" pods with the new container image. When all of the "green" pods are ready, the "switch live" step is run, which switches the active color. Very little downtime.
Obviously, this has tradeoffs. Your cluster will need more memory to run the additional transitional pods. The deployment process will be more complex.
Via https://www.reddit.com/r/kubernetes/comments/oeruh9/can_kubernetes_prepull_and_cache_images/, I've found these ideas:
Implement a DaemonSet that runs a "sleep" loop on all the images I need.
Use http://github.com/mattmoor/warm-image, which has no Windows support.
Use https://github.com/ContainerSolutions/ImageWolf, which says, "ImageWolf is currently alpha software and intended as a PoC - please don't run it in production!"
Use https://github.com/uber/kraken, which seems to be a registry, not a pre-pulling solution.
Use https://github.com/dragonflyoss/Dragonfly (now https://github.com/dragonflyoss/Dragonfly2), which also seems to do something completely different.
Use https://github.com/senthilrch/kube-fledged, which looks exactly right and more mature than the others, but has no Windows support.
Use https://github.com/dcherman/image-cache-daemon, which has no Windows support.
Use https://goharbor.io/blog/harbor-2.1/, which also seems to be a registry, not a pre-pulling solution.
Use https://openkruise.io/docs/user-manuals/imagepulljob/, which also looks right, but a) OpenKruise is huge and I'm not sure I want to install this just to preload images, and b) it seems it has no Windows support.
So, it seems I have to implement this on my own, with a DaemonSet. I still hope someone can provide a better answer than this one 🙂 .
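As a rough sketch of what such a DaemonSet could look like (not a tested solution), the following Python snippet generates a manifest with one long-sleeping container per image, so that every matching node pulls the images ahead of time. The function name `prepull_daemonset`, the image name, and the PowerShell sleep command are assumptions; in particular, the command only works if the image actually contains PowerShell.

```python
import json
from typing import Dict, List


def prepull_daemonset(name: str, images: List[str], sleep_command: List[str],
                      node_os: str = "windows") -> Dict:
    """Build a DaemonSet manifest that keeps each image pulled on every node."""
    labels = {"app": name}
    containers = [
        # One container per image; each one just sleeps.
        {"name": f"prepull-{index}", "image": image, "command": sleep_command}
        for index, image in enumerate(images)
    ]
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Well-known node label; restricts the pods to one OS.
                    "nodeSelector": {"kubernetes.io/os": node_os},
                    "containers": containers,
                },
            },
        },
    }


if __name__ == "__main__":
    manifest = prepull_daemonset(
        "image-prepull",
        ["registry.example.com/my-app:2.0"],  # hypothetical image
        ["powershell", "-Command", "Start-Sleep -Seconds 2147483"],  # assumed
    )
    print(json.dumps(manifest, indent=2))
```

The JSON output can be passed to `kubectl apply -f -`. The idea would be to update this DaemonSet to the new image tag first, wait until its pods are running on all nodes, and only then update the Deployment that uses the Recreate strategy.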

Service Fabric Application - changing instance count on application update fails

I am building a CI/CD pipeline to release SF Stateless Application packages into clusters using parameters for everything. This is to ensure environments (DEV/UAT/PROD) can be scoped with different settings.
For example in a DEV cluster an application package may have an instance count of 3 (in a 10 node cluster)
I have noticed that if an application is in the cluster and running with an instance count (for example) of 3, and I change the deployment parameter to anything else (e.g. 5), the application package will upload and register the type, but will fail on attempting to do a rolling upgrade of the running application.
This also happens the other way, e.g. if the running app has an instance count of -1 and you want to reduce the count on the next rolling deployment.
Have I missed a setting or config somewhere, or is this how it is supposed to be? At present it's not lending itself to being something that is easily scaled without downtime.
At its simplest form we just want to be able to change instance counts on application updates, as we have an infrastructure-as-code approach to changes, builds and deployments for full tracking ability.
Thanks in advance
This is a common error when using Default services.
This has been already answered multiple times in these places:
Default service descriptions can not be modified as part of upgrade set EnableDefaultServicesUpgrade to true
https://blogs.msdn.microsoft.com/maheshk/2017/05/24/azure-service-fabric-error-to-allow-it-set-enabledefaultservicesupgrade-to-true/
https://github.com/Microsoft/service-fabric/issues/253#issuecomment-442074878

How can we route a request to every pod under a kubernetes service on Openshift?

We are building a Jboss BRMS application with two microservices in spring-boot, one for rule generation (SRV1) and one for rule execution (SRV2).
The idea is to generate the rules using the generation microservice (SRV1) and persist them in the database with versioning. The next part of the process is having the execution microservice load these persisted rules into each pods memory by querying the information from the shared database.
There are two scenarios in which this should happen:
When the rule execution service pod/pods start up, they query the db for the latest version, and every pod running the execution application loads those rules from the shared db.
The second scenario is that we want to manually trigger the loading of a specific version of rules on every pod running the execution application, preferably via a REST call.
Which is where the problem lies!
Whenever we try to issue a REST request to the API, since it is load balanced under a Kubernetes service, the request hits only one of the pods and the rest of them do not load the specific rules.
Is there a programmatic or design change that may help us achieve that, or is there any other way we can construct our application so that a certain version of rules can be loaded on all pods serving the execution microservice?
"The second scenario is that we want to manually trigger the loading of a specific version of rules on every pod running the execution application, preferably via a REST call."
What about using Rolling Updates? When you want to change the version of rules to be fetched within all execution pods, tell OpenShift to do a rolling update, which kills/starts all your pods one by one until all pods are on the new version; thus, they fetch the specific version of rules at startup. The trigger of the rolling update and the way you define the version resolution are up to you. For instance: have an ENV var within a pod that defines the version of rules to be fetched from the db, then change the ENV var to a new value and perform a rolling update. At the end, you should end up with a new set of pods, all of them fetching the version of rules based on the new value of the ENV var you set.
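To make the ENV var idea concrete, here is a minimal sketch that builds a strategic merge patch changing one environment variable in the pod template; applying such a patch changes the template and therefore triggers a rolling update. The function name, the variable name `RULES_VERSION` and the container name are assumptions, and the sketch assumes a plain Kubernetes Deployment.

```python
import json
from typing import Dict


def rules_version_patch(container: str, version: str,
                        env_name: str = "RULES_VERSION") -> Dict:
    """Build a patch that sets one env var on one container of a Deployment."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            # Containers are matched by name when merging.
                            "name": container,
                            "env": [{"name": env_name, "value": version}],
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # "rule-execution" is a hypothetical container name.
    print(json.dumps(rules_version_patch("rule-execution", "7")))
```

The printed JSON could be passed to `kubectl patch deployment <name> --patch '<json>'`. Each new pod would then read the variable at startup and load that version of the rules from the shared db.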

How to handle recurring short-lived tasks with Kubernetes

I have a setup with a webserver (NGINX) and a react-based frontend that uses webpack to build the final static sources.
The webserver has its own kubernetes deployment + service.
The frontend needs to be built before the webserver can serve the static html/js/css files - but after that, the pod/container can stop.
My idea was to share a volume between the webserver and the frontend pod. The frontend will write the generated files to the volume and the webserver can serve them from there. Whenever there is an update to the frontend sourcecode, the files need to be regenerated.
What is the best way to accomplish that using kubernetes tools?
Right now, I'm using an init container to build - but this leads to a restart of the webserver pod as well, which wouldn't be necessary.
Is this the best/only solution to this problem, or should I use Kubernetes Jobs for this kind of task?
There are multiple ways to do this. Here's how I think about this:
Option 1: The static files represent built source code
In this case, the static files that you want to serve should actually be packaged and built into the docker image of your nginx webserver (in the html directory say). When you want to update your frontend, you update the version of the image used and update the pod.
Option 2: The static files represent state
In this case, your approach is correct. Your 'state' (like a database) is stored in a folder. You then run an init container/job to initialise 'state' and then your webserver pod works fine.
I believe option 1 to be better for 2 reasons:
You can horizontally scale your webserver trivially by increasing the pod replica number. In option 2, you're actually dealing with state so that's a problem when you want to add more nodes to your underlying k8s cluster (you'll have to copy files/folders from one volume/folder to another).
The static files are actually the source code of your app. These are not uploaded media files or similar. In this case, it absolutely makes sense to make them a part of your docker image. Otherwise, it kind of defeats the advantage of containerising and deploying.
Jobs, Init containers, or alternatively a gitRepo type of Volume would work for you.
http://kubernetes.io/docs/user-guide/volumes/#gitrepo
It is not clear in your question why you want to update the static content without simply re-deploying / updating the Pod.
Since somewhere, somehow, you have to build the webserver Docker image, it seems best to build the static content into the image: no moving parts once deployed, no need for volumes or storage. Overall it is simpler.
If you use any kind of automation tool for Docker builds, it's easy.
I personally use Jenkins to build Docker images based on a hook from git repo, and the image is simply rebuilt and deployed whenever the code changes.
Running a Job or Init container doesn't gain you much: sure, the web server keeps running, but it's just as easy to have a Deployment with rolling updates, which will deploy the new Pod before the old one is torn down, so your server will always be up too.
Keep it simple...