I have a k8s setup that contains two deployments, client and server, deployed from different images. Both deployments have replica sets underneath and liveness and readiness probes defined. The client communicates with the server via a k8s Service.
Currently, the deployment manifests for client and server are separate (separate yaml files applied via kustomization). Rollback works correctly for both parts independently, but let's consider the following scenario:
1. a deployment is started
2. both deployment configurations are applied
3. the k8s master starts replacing the server and client pods
4. the server pods start correctly, so the new replica set has all new pods up and running
5. the client pods have an issue, so the old replica set is still running
In many cases this is not a problem, because client and server work independently, but there are situations when a breaking change to the server API is released and both client and server must be updated. In that case, if either of the two fails, then both should be rolled back (it doesn't matter which one fails; both need to be rolled back to stay in sync).
Is there a way to achieve that in k8s? I spent quite a lot of time searching for a solution, but everything I found so far describes deployments/rollbacks of one thing at a time, which doesn't solve the issue above.
The problem you describe is what Blue/Green deployments cover.
Here is a good reference for Blue/Green deployments with k8s.
The basic idea is: you deploy the new version (the Green deployment) while keeping the previous version (the Blue deployment) up and running, and you only switch traffic to the new version (Green) once everything has come up healthy.
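A minimal sketch of that idea; the names, labels, and ports are illustrative, not from your setup. Both the blue and green Deployments run side by side with the same app label plus their own version label, and the Service selector decides which one receives traffic:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-server
spec:
  selector:
    app: my-server
    version: blue       # edit to "green" once the green pods are healthy
  ports:
  - port: 80
    targetPort: 8080
```

Because the cut-over is a selector edit rather than a pod rollout, you can wait until both the client's and the server's green deployments are healthy before switching either Service, and flipping both selectors back to blue is an equally cheap rollback.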
Related
I have a single page app. It is served by (and talks to) an API server running on a Kubernetes deployment with 2 replicas. I have added an X-API-Version header that my API sends on every request, which my client can compare against to figure out whether it needs to inform the user that their client code is outdated.
One issue I am facing, however, is that when I deploy, I want to ensure only ever one version of the API is running. I do not want a situation where a client can be refreshed many times in a loop because it receives a different API version on each request.
I basically want it to go from 2 replicas running version A, to 2 replicas running version A plus 2 running version B, then switch the traffic to version B once health checks pass, then tear down the old version A replicas.
Does Kubernetes support this using the RollingUpdate strategy?
For blue-green deployment in Kubernetes, I would recommend using a third-party solution like Argo Rollouts, NGINX, or Istio. They let you split traffic between the versions of your application.
However, Kubernetes is introducing the Gateway API, which has built-in support for traffic splitting.
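A hedged sketch of what Gateway API traffic splitting looks like, assuming a Gateway named my-gateway and Services api-v1/api-v2 already exist (all names are illustrative):

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: api-split
spec:
  parentRefs:
  - name: my-gateway      # an existing Gateway resource
  rules:
  - backendRefs:
    - name: api-v1        # Service for the current version
      port: 80
      weight: 90          # 90% of traffic
    - name: api-v2        # Service for the new version
      port: 80
      weight: 10          # 10% of traffic; raise gradually, or jump 0 -> 100 for blue/green
```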
What you are asking for isn't really a blue/green deploy. If you require two or more pods to keep running during the upgrade for capacity reasons, you will get an overlap where some requests are answered by version A pods and some by version B pods.
You can fine-tune it a little: for instance, you can configure it to start all of the new pods at once, and for each one that turns Running -> Ready, one of the old pods is removed. If your pods start fast, or at least equally fast, the overlap will be really short.
Or, if you can accept temporary downtime, there is a deployment strategy (Recreate) that decommissions all old pods before rolling out the new ones. Depending on how fast your service starts, this could give a short or a long downtime.
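As a rough sketch, these two variants go under the Deployment's spec (the values are illustrative):

```yaml
# Tuned rolling update: start all new pods at once, and only remove an
# old pod when a new one becomes Ready (short overlap, no downtime).
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 100%
    maxUnavailable: 0
---
# Recreate: terminate all old pods first, then start the new ones
# (no version overlap, but downtime while the new pods start).
strategy:
  type: Recreate
```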
Or, if you don't mind a little extra work, you deploy version B in parallel with version A and add the version to the pods' set of labels.
Then, in your Service, you make sure the version label is part of the selector, and once the pods for version B are running you change the Service selector from version A to version B; it will instantly start using those pods instead.
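A rough sketch of the label side of this, with hypothetical names and image tags; version B gets its own Deployment whose pods carry the version label, and the existing Service's selector is then edited from version: a to version: b:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api-b            # runs alongside the existing version A Deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-api
      version: b
  template:
    metadata:
      labels:
        app: my-api
        version: b          # the Service selector is flipped to this value once these pods are Ready
    spec:
      containers:
      - name: api
        image: my-api:b     # illustrative image tag
        ports:
        - containerPort: 8080
```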
I recently started using Kubernetes. My experience is that yes, K8s behaves this way out of the box. If I have, e.g., two pods running and I perform a deployment, K8s will create two fresh pods and then, only once those two fresh pods are healthy, terminate the original two.
I only want to deploy one pod in k8s.
For example, I deploy several pods in one pool with the same code, but I only want to change one pod to run a test. Can it be done?
What you're describing in your question is actually closest to what we call a Canary Deployment.
In a nutshell, a Canary Deployment (also known as a Canary Release) is a technique that reduces the risk of introducing a broken new software version into production. It works by rolling out the change to only a small subset of servers (in Kubernetes it may be just one pod) before deploying it to the entire infrastructure and making it available to everybody.
If you deploy, e.g., one more pod using the new image version alongside an already working deployment of, say, 3 replicas, only 25% of the traffic will be routed to the new pod. Once you decide the test was successful, you can continue rolling out the update to the other pods.
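A rough sketch of that pod-ratio canary, assuming an existing Service that selects only on the app: my-app label (names and image tags are illustrative); with 3 stable replicas and 1 canary replica, roughly 25% of requests land on the canary:

```yaml
# Stable version: 3 replicas, matched by the Service via the app label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-stable
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
      track: stable
  template:
    metadata:
      labels:
        app: my-app
        track: stable
    spec:
      containers:
      - name: my-app
        image: my-app:1.0
---
# Canary: 1 replica of the new image, sharing the app label so the
# same Service sends it ~25% of the traffic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-canary
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
      track: canary
  template:
    metadata:
      labels:
        app: my-app
        track: canary
    spec:
      containers:
      - name: my-app
        image: my-app:2.0
```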
Here you can find an article describing in detail how you can perform this kind of deployment on Kubernetes.
It's actually a similar approach to the Blue-Green Deployment already mentioned by Malathi and has a lot in common with it.
Perhaps you meant Blue-Green Deployments.
The common release process involves adding new pods with the latest release and perhaps exposing a certain percentage of the traffic to be routed to the new release pods. If everything goes well, you can remove the old pods running the old release and replace them with new pods running the new release.
This article talks of blue-green deployments with Kubernetes.
It is also possible to use a service mesh like Istio with Kubernetes for advanced blue-green deployments, such as redirecting traffic to a new release based on header values or cookies.
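A hedged Istio sketch of the header-based variant, assuming a DestinationRule already defines v1/v2 subsets for the my-app host (all names and the x-canary header are illustrative):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
  - my-app
  http:
  - match:
    - headers:
        x-canary:
          exact: "true"     # requests carrying this header go to the new release
    route:
    - destination:
        host: my-app
        subset: v2
  - route:                   # everyone else stays on the current release
    - destination:
        host: my-app
        subset: v1
```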
I need to keep some configuration, maybe files or otherwise, in all instances of a Kubernetes Docker image deployment.
I need the ability to remotely update the configuration in all of the running pods of the deployment. This is to be followed by invocation of some Java code in all of the running pods of the deployment.
Whenever a new pod of the same deployment comes up, it should have the updated configuration.
As much as possible, I don't want the configuration stored anywhere centrally; I want it in each pod of the deployment.
What are my choices?
As a last resort I could do it as a rolling deployment update.
A rolling deployment, or something similar (an update to a mounted ConfigMap, etc.), is the Kubernetes option. It always results in an application restart.
Having an application support live configuration updates, and run some code after receiving those updates, without a restart: that's an application feature.
A handwavy way of doing this:
1. Have the correct configuration live in a ConfigMap.
2. Have the application listen on a separate port for either a signal to retrieve updated configuration (if the application is k8s-aware) or the configuration bits themselves. Have the application be able to handle this live configuration update process; the difficulty of that depends on the framework in use.
3. Have another application be responsible for delivering these updates: watch for changes to the ConfigMap, get the list of Pods in the deployment, and deliver either a signal or the updated configuration to each of the Pods.
4. Have the first application not reach what k8s recognizes as the Ready state until it has received updated configuration from the second.
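A partial sketch of the Kubernetes side of that, with hypothetical names, ports, and endpoints; the admin port, the /ready endpoint, and the updater component watching the ConfigMap are things your own code would have to provide:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.properties: |
    feature.enabled=true
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-java-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-java-app
  template:
    metadata:
      labels:
        app: my-java-app
    spec:
      containers:
      - name: app
        image: my-java-app:1.0
        ports:
        - containerPort: 8080   # serving traffic
        - containerPort: 9090   # hypothetical admin port for receiving config updates
        volumeMounts:
        - name: config
          mountPath: /etc/app   # mounted copy of the ConfigMap
        readinessProbe:
          httpGet:
            path: /ready        # only Ready once the app reports it has loaded the current config
            port: 9090
      volumes:
      - name: config
        configMap:
          name: app-config
```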
I'm currently setting up a POC Spinnaker pipeline to deploy to a Kubernetes cluster.
Experimenting with Spinnaker's red/black strategy, I've noticed that it does not behave as I expect it to. I expect it to guarantee that only 1 version gets traffic, with the following steps:
1. deploy the black server group (a Kubernetes ReplicaSet) & ensure it's healthy
2. reroute the service's traffic to the black server group by updating the load balancer's targets
3. disable the red server group
But in reality, at least when using it with Kubernetes, step 2 here seems to map to several steps:
1. add the black targets to the load balancer
2. remove the red targets from the load balancer
Therefore, I get 2 versions serving traffic for a minute here.
To my understanding, blue/green can be achieved in Kubernetes by updating the Service's (load balancer's) pod selector, so I'm confused as to why Spinnaker's Kubernetes driver does not seem to leverage this.
Can anybody help me see what I'm missing here?
Thanks
Can you verify whether the deployment isn't still in the middle of rolling out? It may be that your Spinnaker setup just spins up a new version of the current Deployment. If this is the case, your Deployment will do a rolling upgrade with the max surge you provided (or the default one), and that's why you have 2 versions running at the same time.
If I'm not mistaken, most of the people that do blue/green deploys have 2 separate networks (for example with Flannel) and just spin up a new deployment that gets switched, either gradually or instantly, via their ingress controllers.
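A sketch of that ingress-level switch, with illustrative names; the new deployment gets its own Service, and the Ingress backend is repointed once it's healthy:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  rules:
  - host: my-app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app-green   # was my-app-blue; edit this once the green pods are healthy
            port:
              number: 80
```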
I've been trying to figure out what happens when the Kubernetes master fails in a cluster that only has one master. Do web requests still get routed to pods if this happens, or does the entire system just shut down?
According to the documentation for OpenShift 3, which is built on top of Kubernetes (https://docs.openshift.com/enterprise/3.2/architecture/infrastructure_components/kubernetes_infrastructure.html), if a master fails, nodes continue to function properly, but the system loses its ability to manage pods. Is this the same for vanilla Kubernetes?
In typical setups, the master nodes run both the API and etcd and are either largely or fully responsible for managing the underlying cloud infrastructure. When they are offline or degraded, the API will be offline or degraded.
In the event that they, etcd, or the API are fully offline, the cluster ceases to be a cluster and is instead a bunch of ad-hoc nodes for that period. The cluster will not be able to respond to node failures, create new resources, move pods to new nodes, etc., until both:
Enough etcd instances are back online to form a quorum and make progress (for a visual explanation of how this works and what these terms mean, see this page).
At least one API server can service requests
In a partially degraded state, the API server may be able to respond to requests that only read data.
However, in any case, life for applications will continue as normal unless nodes are rebooted or there is a dramatic failure of some sort during this time, because TCP/UDP services, load balancers, DNS, the dashboard, etc. should all continue to function for at least some time. Eventually these things will all fail, on different timescales. In single-master setups or under complete API failure, DNS failure will probably happen first as caches expire (on the order of minutes, though the exact timing is configurable; see the CoreDNS cache plugin documentation). This is a good reason to consider a multi-master setup: DNS and service routing can continue to function indefinitely in a degraded state, even if etcd can no longer make progress.
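For reference, the cache TTL lives in the CoreDNS Corefile, which on kubeadm-style clusters ships as a ConfigMap in kube-system; a trimmed example with other plugins omitted:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        kubernetes cluster.local in-addr.arpa ip6.arpa
        forward . /etc/resolv.conf
        cache 30          # answers are cached for up to 30 seconds
    }
```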
There are actions you could take as an operator which would accelerate failures, especially in a fully degraded state. For instance, rebooting a node would break DNS queries, and in fact probably all pod and service networking functionality, until at least one master comes back online. Restarting the DNS pods or kube-proxy would also be bad.
If you'd like to test this out yourself, I recommend kubeadm-dind-cluster, kind, or, for more exotic setups, kubeadm on VMs or bare metal. Note: kubectl proxy will not work during API failure, as that routes traffic through the master(s).
A Kubernetes cluster without a master is like a company running without a Manager.
No one else can instruct the workers (the k8s components) other than the Manager (the master node); even you, the owner of the cluster, can only instruct the Manager.
Everything works as usual until the work is finished or something stops them (because the master node died after assigning the work).
As there is no Manager to re-assign any work for them, the workers will wait and wait until the Manager comes back.
The best practice is to assign multiple managers (masters) to your cluster.
Although your data plane and running applications do not immediately start breaking, there are several scenarios where cluster admins will wish they had a multi-master setup. Key to understanding the impact is understanding which components talk to the master, for what, how, and, more importantly, when they will fail if the master fails.
Although your application pods running on the data plane will not be immediately impacted, imagine a very plausible scenario: your traffic suddenly surges and your Horizontal Pod Autoscaler kicks in. The autoscaling would not work, because the Metrics Server collects resource metrics from the kubelets and exposes them in the Kubernetes API server through the Metrics API for use by the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler (but your API server is already dead). If a pod's memory shoots up because of the high load, it will eventually get killed by the k8s OOM killer. And if any pods die, then, since the controller manager and scheduler talk to the API server to watch the current state of pods, they will fail too. In short, a new pod will not be scheduled and your application may stop responding.
One thing to highlight is that Kubernetes system components communicate only with the API server; they don't talk to each other directly, so each of their functions can fail on its own. An unavailable control plane can mean several things: failure of any or all of these components (API server, etcd, kube-scheduler, controller manager), or, worst of all, the entire node having crashed.
If the API server is unavailable, no one can use kubectl, as virtually all commands talk to the API server (meaning you cannot connect to the cluster or log into any pods to check anything on the container file system, and you will not be able to see application logs unless you have an additional centralized log management system).
If the etcd database fails or gets corrupted, your entire cluster state is gone, and admins will want to restore it from backups as early as possible.
In short, a failed single-master control plane may not immediately impact traffic-serving capability, but it cannot be relied on to keep serving your traffic.