Does Kubernetes natively support "blue-green"-like deployments? - kubernetes

I have a single-page app. It is served by (and talks to) an API server running on a Kubernetes Deployment with 2 replicas. I have added an X-API-Version header that my API sends in response to every request, so my client can compare it against its own version to figure out whether it needs to inform the user that their client code is outdated.
One issue I am facing, however, is that when I deploy, I want to ensure that only one version of the API is ever running. I do not want a situation where a client refreshes many times in a loop because it receives different API versions.
I basically want to go from 2 replicas running version A, to 2 replicas running version A plus 2 running version B, then switch the traffic to version B once health checks pass, and then tear down the old version A replicas.
Does Kubernetes support this using the RollingUpdate strategy?

For blue-green deployments in Kubernetes, I would recommend using a third-party solution such as Argo Rollouts, NGINX, or Istio. They let you split traffic between the versions of your application.
However, Kubernetes is introducing the Gateway API, which has built-in support for traffic splitting.

What you are asking for isn't really a blue/green deploy. If you require two or more pods to keep running during the upgrade for performance reasons, a rolling update will produce an overlap in which some responses come from version A pods and some from version B pods.
You can fine-tune it a little. For instance, you can configure the rollout (via maxSurge and maxUnavailable) to start all of the new pods at once and to remove one old pod for each new pod that goes from running to ready. If your pods start fast, or at least equally fast, the overlap will be really short.
Or, if you can accept temporary downtime, there is a deployment strategy (Recreate) that completely decommissions all old pods before rolling out the new ones. Depending on how fast your service starts, this could give a short or long downtime.
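As a rough sketch of the two options above, the following Python builds the strategy section of a Deployment manifest as plain dictionaries. The field names (strategy, rollingUpdate, maxSurge, maxUnavailable, Recreate) are the standard Deployment fields; the helper names and the choice of 100% surge are assumptions made for illustration.

```python
import json


def surge_all_strategy():
    """Rolling update that may start every new pod at once and never drops below the desired count."""
    return {
        "type": "RollingUpdate",
        "rollingUpdate": {"maxSurge": "100%", "maxUnavailable": 0},
    }


def recreate_strategy():
    """Stop all old pods before starting new ones (implies downtime)."""
    return {"type": "Recreate"}


if __name__ == "__main__":
    # Either dictionary would go under spec.strategy of a Deployment.
    print(json.dumps({"spec": {"strategy": surge_all_strategy()}}, indent=2))
    print(json.dumps({"spec": {"strategy": recreate_strategy()}}, indent=2))
```

Neither setting removes the overlap entirely for a rolling update; the first only shortens it, and the second trades it for downtime.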
Or, if you don't mind a little extra work, you can deploy version B in parallel with version A and add the version to the set of labels.
Then, in your Service, make sure the version label is part of the selector. Once the pods for version B are running, change the Service selector from version A to version B and it will almost immediately start routing to those pods instead.
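The selector switch described above could be sketched like this. The Service name "api", the label keys "app" and "version", and the port are hypothetical; the code only builds the manifest and the patch body, it does not talk to a cluster.

```python
import json


def service_manifest(name, app, version, port=80):
    """Service whose selector pins traffic to one version label."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "selector": {"app": app, "version": version},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def selector_patch(new_version):
    """Patch body that repoints the Service at another version."""
    return {"spec": {"selector": {"version": new_version}}}


if __name__ == "__main__":
    print(json.dumps(service_manifest("api", "api", "a"), indent=2))
    # Once version B pods are ready, this body could be applied with
    # something like: kubectl patch service api -p '<json>'
    print(json.dumps(selector_patch("b")))
```

Because the switch is a single change to the selector, rolling back is the same patch with the old version value, provided the version A pods are still running.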

I recently started using Kubernetes. My experience is that yes, K8s behaves this way out of the box. If I have e.g. two pods running and I perform a deployment, K8s will create two fresh pods and then, only once those two fresh pods are healthy, terminate the original two pods.
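How many pods are created and removed at a time depends on the rollout settings, so the behavior above may vary between clusters. The sketch below works through the arithmetic, assuming the documented defaults of 25% for both maxSurge and maxUnavailable, with the surge percentage rounded up and the unavailable percentage rounded down. The function names are made up for this example.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute number or a 'NN%' string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer arithmetic avoids floating-point rounding surprises.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Upper bound on total pods and lower bound on available pods during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


if __name__ == "__main__":
    print(rollout_bounds(2))                     # defaults
    print(rollout_bounds(2, "100%", 0))          # start all new pods at once
```

Under these assumptions, two replicas with default settings allow at most three pods at once, so new pods come up one at a time; seeing both new pods start together would need a larger maxSurge. Either way, old and new pods serve traffic side by side for part of the rollout.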

Related

k8s: Is it possible to have two identical deployments but route different traffic to them?

Here is my use case:
I have a microservice which gets sent traffic via an ingress gateway in real time and via a batch process. What I'd like to be able to do is be able to conceptually define a deployment and have it create two sets of pods:
One set for real time request
Another for batch.
When a new version of the microservice gets deployed, the k8s deployment is updated and both real time and batch use the new version.
Is this possible in k8s or will I need to create two deployments and manage them separately?
This is a community wiki answer posted for better visibility. Feel free to expand it.
Since we don't have complete information about the architecture in use, the following suggestions from the comments may help solve the problem.
1. With Deployments, Services, Selectors
You can have two identical deployments and route different traffic to them.
It may be implemented by Services:
In Kubernetes, a Service is an abstraction which defines a logical set
of Pods and a policy by which to access them (sometimes this pattern
is called a micro-service). The set of Pods targeted by a Service is
usually determined by a selector.
This approach has some advantages.
By default, if you are using the iptables proxy mode, traffic is routed to endpoints at random. If you try to send different kinds of traffic to specific pods within the same Deployment, you may see large differences in CPU and memory usage between pods, leading to resource exhaustion or wasted resources.
With separate Deployments, it is easier to manage service versioning, CPU and memory assignment, and rollouts.
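One way to keep the two Deployments identical is to generate both from a single definition, as in the sketch below. The names ("my-service", the "track" label, the image tag, the replica counts) are placeholders, not anything from the question.

```python
def deployment(track, image, replicas):
    """Deployment for one traffic class; only the track label and replica count differ."""
    labels = {"app": "my-service", "track": track}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"my-service-{track}"},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "app", "image": image}]},
            },
        },
    }


def service(track):
    """Service that selects only the pods of one track."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": f"my-service-{track}"},
        "spec": {
            "selector": {"app": "my-service", "track": track},
            "ports": [{"port": 80}],
        },
    }


def manifests(image):
    """Both tracks always share the same image, so they are upgraded together."""
    tracks = {"realtime": 3, "batch": 2}  # hypothetical replica counts
    docs = []
    for track, replicas in tracks.items():
        docs.append(deployment(track, image, replicas))
        docs.append(service(track))
    return docs


if __name__ == "__main__":
    for doc in manifests("registry.example/my-service:1.2.0"):
        print(doc["kind"], doc["metadata"]["name"])
```

Templating tools such as Helm or Kustomize can play the same role; the point is only that the two Deployments are rendered from one source, so a version bump reaches both.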
2. With Istio
From David M. Karr
If a service is defined as a VirtualService, you can route to
different DestinationRule objects depending on header values (or other
qualifications).
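A header-based route of the kind mentioned in the quote might look roughly like the following, built here as a Python dictionary. The header name "x-traffic-type", the host, and the subset names are assumptions; the subsets would have to be defined in a matching DestinationRule, which is not shown.

```python
def header_routed_virtual_service(host):
    """VirtualService that sends requests with a batch header to one subset and the rest to another."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    # Hypothetical header set by the batch process.
                    "match": [{"headers": {"x-traffic-type": {"exact": "batch"}}}],
                    "route": [{"destination": {"host": host, "subset": "batch"}}],
                },
                {
                    # Fallback route for everything else (real-time traffic).
                    "route": [{"destination": {"host": host, "subset": "realtime"}}],
                },
            ],
        },
    }


if __name__ == "__main__":
    vs = header_routed_virtual_service("my-service")
    for rule in vs["spec"]["http"]:
        print(rule["route"][0]["destination"]["subset"])
```

Rules are evaluated in order, so the unconditional route has to come last.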
Additionally
If you need to deploy a new version of the microservice, you can choose among different strategies, whichever is more suitable for your needs.
Kubernetes deployment strategies:
recreate: terminate the old version and release the new one
ramped: release a new version in a rolling-update fashion, one pod after the other
blue/green: release a new version alongside the old version, then switch traffic
canary: release a new version to a subset of users, then proceed to a full rollout
a/b testing: release a new version to a subset of users in a precise way (HTTP headers, cookie, weight, etc.). A/B testing is really a technique for making business decisions based on statistics, but we will briefly describe the process. This doesn’t come out of the box with Kubernetes; it implies extra work to set up a more advanced infrastructure (Istio, Linkerd, Traefik, custom nginx/haproxy, etc.).

Synchronize and rollback independent deployments in kubernetes

I have k8s setup that contains 2 deployments: client and server deployed from different images. Both deployments have replica sets inside, liveness and readiness probes defined. The client communicates with the server via k8s' service.
Currently, the deployment scripts for both client and server are separated (separate yaml files applied via kustomization). Rollback works correctly for both parts independently but let's consider the following scenario:
1. deployment is starting
2. both deployment configurations are applied
3. k8s master starts replacing pods of server and client
4. server pods start correctly so new replica set has all the new pods up and running
5. client pods have an issue, so the old replica set is still running
In many cases it's not a problem, because client and server work independently, but there are situations when a breaking change to the server API is released and both client and server must be updated. In that case, if either of the two fails, both should be rolled back (it doesn't matter which one fails; both need to be rolled back to stay in sync).
Is there a way to achieve that in k8s? I spent quite a lot of time searching for a solution, but everything I have found so far describes deployments/rollbacks of one thing at a time, and that doesn't solve the issue above.
The problem here is something covered in Blue/Green deployments.
Here is a good reference of Blue/Green deployments with k8s.
The basic idea is that you deploy the new version (Green deployment) while keeping the previous version (Blue deployment) up and running, and only allow traffic to the new version (Green deployment) once everything has gone fine.

Can i only change one pod in kubernetes?

I only want to deploy one pod in k8s.
For example, I have deployed several pods in one pool with the same code, but I want to change only one pod to run some tests. Can it be done?
What you're describing in your question is actually the closest to what we call Canary Deployment.
In a nutshell, Canary Deployment (also known as Canary Release) is a technique that reduces the risk of introducing into production a new software version that may be faulty. It is achieved by rolling out the change to only a small subset of servers (in Kubernetes it may be just one pod) before deploying it to the entire infrastructure and making it available to everybody.
If, for example, you already have a working Deployment consisting of 3 replicas and you deploy one more pod using the new image version, only about 25% of traffic (1 pod out of 4) will be routed to the new pod, assuming all four pods sit behind the same Service. Once you decide the test was successful, you may continue rolling out the update to the other pods.
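The 25% figure is simply the canary's share of all pods behind the Service, under the assumption that traffic is spread evenly across pods. A small helper (hypothetical, for illustration only) makes the arithmetic explicit:

```python
from fractions import Fraction


def canary_share(stable_pods, canary_pods):
    """Expected fraction of traffic hitting canary pods, assuming an even spread across all pods."""
    return Fraction(canary_pods, stable_pods + canary_pods)


if __name__ == "__main__":
    share = canary_share(stable_pods=3, canary_pods=1)
    print(f"{float(share):.0%}")  # 3 stable + 1 canary
```

To expose a smaller share without a service mesh, you would need more stable pods per canary pod; one canary next to nine stable pods gives 10%.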
Here you can find an article describing in detail how you can perform such kind of deployment on Kubernetes.
It's actually a similar approach to the Blue-Green Deployment already mentioned by @Malathi and has a lot in common with it.
Perhaps you meant Blue-Green Deployments.
The common release process involves adding new pods with the latest release and perhaps routing a certain percentage of the traffic to the new release pods. If everything goes well, you can remove the old pods running the old release and replace them with new pods running the new release.
This article talks about blue-green deployments with Kubernetes.
It is also possible to use a service mesh like Istio with Kubernetes for advanced blue-green deployments, such as redirecting traffic to a new release based on header values or cookies.

How to implement Blue-Green Deployment with HPA?

I have two colored tracks where I deployed two different versions of my webapp (nginx+php-fpm). These tracks are exposed by Services called live and next.
The classic way would be to deploy the new version of the webapp to next and, after checking it, release it to live by switching their Services.
So far so good.
Considering autoscaling with HPA:
Before doing a release, I have to prescale next to the number of live pods to prevent excessive load after the switch.
The problem here is the nature of the HPA's CPU load measurement. In the worst case, the autoscaler will downscale the prescaled track immediately, because it calculates from the (still low) CPU load on next.
Another problem I found is keepalive connections, which make releasing new pods to live very hard without killing old pods.
How to solve the problem?
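The downscaling concern in the question follows from the HPA's documented scaling rule, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), clamped to the configured minimum and maximum. The sketch below reproduces that rule in simplified form (it ignores the tolerance band and stabilization window) and shows one possible mitigation, temporarily raising minReplicas on the idle track. That mitigation is an assumption of this example, not something proposed in the thread, and the numbers are made up.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas=1, max_replicas=100):
    """Simplified HPA rule: scale proportionally to metric/target, then clamp."""
    raw = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # Prescaled "next" track: 10 pods, almost idle (5% CPU against a 50% target).
    print(desired_replicas(10, 5, 50))
    # Same situation with minReplicas temporarily raised to the prescaled size.
    print(desired_replicas(10, 5, 50, min_replicas=10))
```

With the floor raised, the autoscaler cannot shrink the idle track before the switch; the floor would be lowered again once real traffic arrives on it.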
We have a few deployment strategies (there are more, but I will point out the most common).
1) Rolling Update - We need only one Deployment. It adds pods with the new content to the current Deployment while terminating old-version pods at the same time. For a while, the Deployment will contain a mix of old and new versions.
2) Blue-Green Deployment - It is the safest strategy and is recommended for production workloads. We need two Deployments coexisting, i.e. v1 and v2. In most cases the old Deployment is drained (all connections/sessions to it are closed) and all new sessions/connections are redirected to the new Deployment. Usually both Deployments are kept for a while as Production and Stage.
3) Canary Deployment - The hardest one. Here you also need at least two Deployments running at the same time. Some users will be connected to the old application, others will be redirected to the new one. It can be achieved via load balancing/proxy layer configuration. In this case a single HPA does not fit, because we are using two Deployments at the same time and each Deployment will have its own independent autoscaler.
As @Mamuz pointed out in a comment, a Blue-Green strategy without a switch at the service level sounds much better in this case than a rolling update.
Another option that might be useful in this scenario is Blue-Green Deployment with Istio using Traffic Shifting. With this option you can divide traffic by percentage of requests, e.g. from 100-0 through 80-20, 60-40, and 20-80 to 0-100.
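The traffic-shifting steps listed above could be expressed as a sequence of weighted routes for an Istio VirtualService. This is only a sketch: the host and the subset names "blue" and "green" are hypothetical, the subsets would need a DestinationRule that is not shown, and the pause or health check between steps is left out.

```python
# Percentages taken from the steps listed in the answer: (old, new).
STEPS = [(100, 0), (80, 20), (60, 40), (20, 80), (0, 100)]


def weighted_route(host, old_weight, new_weight):
    """Route list for one http rule of a VirtualService, split by weight."""
    return [
        {"destination": {"host": host, "subset": "blue"}, "weight": old_weight},
        {"destination": {"host": host, "subset": "green"}, "weight": new_weight},
    ]


def shifting_plan(host):
    """One route list per step; each would be applied in turn to spec.http[0].route."""
    return [weighted_route(host, old, new) for old, new in STEPS]


if __name__ == "__main__":
    for step in shifting_plan("webapp"):
        print([(r["destination"]["subset"], r["weight"]) for r in step])
```

Because each step moves only part of the load, the HPA on the new track sees CPU usage rise gradually and can scale up along the way, which is the appeal of this approach for the prescaling problem.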
Using Istio and HPA step by step is described in this article.
You can read about Traffic Management here.
Example of Istio and K8s here.

Kubernetes job that consists of two pods (that must run on different nodes and communicate with each other)

I am trying to create a Kubernetes job that consists of two pods that have to be scheduled on separate nodes in our hybrid cluster. Our requirement is that one of the pods runs on a Windows Server node and the other pod runs on a Linux node (thus we cannot just run two Docker containers in the same pod, which I know is possible but would not work in our scenario). The Linux pod (which you can imagine as a client) will communicate over the network with the Windows pod (which you can imagine as a stateful server), exchanging data while the job runs. When the Linux pod terminates, we want to also terminate the Windows pod. However, if one of the pods fails, then we want to fail both pods (as they are designed to be a single job).
Our current design is to write a K8s Service that handles the communication between the pods, and then apply the Service and the two pods to the cluster to "emulate" a job. However, this is not ideal, since the two pods are not tightly coupled as a single job, and it adds quite a bit of overhead to manage this setup manually (e.g. when the job fails or completes, we probably need to manually kill the Service and the deployment of the Windows pod). Plus, we would need to deploy a new Service for each "job", as we require the Linux pod to always communicate with the same Windows pod for the duration of the job due to underlying state (thus we cannot use a single Service for all Windows pods).
Any thoughts on how this could best be achieved on Kubernetes would be much appreciated! Hopefully this scenario is supported natively, and I would not need to resort to the kind of pod-service-pod setup that I described above.
Many thanks
I am trying to distinguish your distaste for creating and wiring the Pods from your distaste at having to do so manually. Because, in theory, a Job that creates Pods is very similar to what you are describing, and would be able to have almost infinite customization for those kinds of rules. With a custom controller like that, one need not create a Service for the client(s) to speak to their server, as the Job could create the server Pod first, obtain its Pod-specific-IP, and feed that to the subsequently created client Pods.
I would expect one could create a Job controller using only bash and either curl or kubectl: generate the JSON or YAML that describes the situation you wish to have, feed it to the Kubernetes API (since the Job would have a service account, just like any other in-cluster container), and use normal traps to clean up after itself. Without more of the specific edge cases loaded in my head it's hard to say if that's a good idea or not, but I believe it's possible.
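The answer suggests bash with curl or kubectl; the sketch below uses Python instead, and only covers the parts that can be shown without a cluster: generating the two Pod descriptions and deciding the combined outcome from the pod phases. Pod names, images, and the SERVER_IP variable are hypothetical. The kubernetes.io/os node label and the pod phases (Pending, Running, Succeeded, Failed) are standard Kubernetes concepts.

```python
def pod(name, image, node_os, env=None):
    """Pod pinned to nodes of one operating system; never restarted automatically."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "nodeSelector": {"kubernetes.io/os": node_os},
            "containers": [{
                "name": "main",
                "image": image,
                "env": [{"name": k, "value": v} for k, v in (env or {}).items()],
            }],
        },
    }


def client_pod_for(server_ip):
    """Client Pod created after the server is up, with the server's pod IP passed in."""
    return pod("job-client", "example/client:latest", "linux", {"SERVER_IP": server_ip})


def combined_status(client_phase, server_phase):
    """Treat the pair as one job: any failure fails both; the client finishing ends the job."""
    if "Failed" in (client_phase, server_phase):
        return "Failed"      # controller would delete both pods
    if client_phase == "Succeeded":
        return "Succeeded"   # controller would delete the server pod
    return "Running"


if __name__ == "__main__":
    server = pod("job-server", "example/server:latest", "windows")
    client = client_pod_for("10.0.0.12")  # IP would come from the server pod's status
    print(server["spec"]["nodeSelector"], client["spec"]["containers"][0]["env"])
    print(combined_status("Running", "Failed"))
```

A real controller would submit these manifests to the API server, poll or watch the pod phases, and delete whatever remains on exit, which is the role the answer assigns to shell traps.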