Speed up rollout of Kubernetes deployment - kubernetes

How can I speed up the rollout of new images in Kubernetes?
Currently, we have an automated build job that modifies a yaml file to point to a new revision and then runs kubectl apply on it.
It works, but there are long delays (up to 20 minutes PER POD) before all pods with the previous revision are replaced by the latest.
Also, the deployment is configured for 3 replicas. We see that only one pod at a time is started with the new revision. (Is this the Kubernetes "surge"?) But that is too slow; I would rather kill all 3 pods and have 3 new ones with the new image.

I would rather kill all 3 pods and have 3 new ones with the new image.
You can do that. Set strategy.type to Recreate instead of the default RollingUpdate in your Deployment. See the strategy section of the Deployment documentation.
But you will probably get some downtime during the deployment.
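For reference, a minimal sketch of what that could look like in the Deployment manifest (the replica count here is just taken from the question):
spec:
  replicas: 3
  strategy:
    type: Recreate   # all existing Pods are killed before new ones are created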

Jonas and SYN are right, but I would like to expand on this topic with some additional info and examples.
You have two types of strategies to choose from when specifying the way of updating your deployments:
Recreate Deployment: All existing Pods are killed before new ones are created.
Rolling Update Deployment: The Deployment updates Pods in a rolling update fashion.
The default and recommended one is .spec.strategy.type==RollingUpdate. See the examples below:
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
In this example there would be one additional Pod (maxSurge: 1) above the desired number of 3, and the number of available Pods cannot go lower than that number (maxUnavailable: 0).
With this config, Kubernetes will spin up an additional Pod, then stop an “old” one. If there’s another Node available to deploy this Pod, the system will be able to handle the same workload during deployment. If not, the Pod will be deployed on an already used Node at the cost of resources from other Pods hosted on the same Node.
You can also try something like this:
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
With the example above there would be no additional Pods (maxSurge: 0) and only a single Pod at a time would be unavailable (maxUnavailable: 1).
In this case, Kubernetes will first stop a Pod before starting up a new one. The advantage is that the infrastructure doesn’t need to scale up, but the maximum capacity during the update will be lower.
If you choose to use percentage values for maxSurge and maxUnavailable, you need to remember that:
maxSurge - the absolute number is calculated from the percentage by rounding up
maxUnavailable - the absolute number is calculated from percentage by rounding down
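For example, assuming replicas: 3, percentage values would resolve like this (a sketch, not tied to any manifest from the question):
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%         # 25% of 3 = 0.75, rounded up to 1 extra Pod
      maxUnavailable: 25%   # 25% of 3 = 0.75, rounded down to 0 unavailable Pods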
With the RollingUpdate defined correctly, you also have to make sure your applications provide endpoints that Kubernetes can query for the app’s status. Below is an example with a /greeting endpoint that returns HTTP 200 when the app is ready to handle requests and HTTP 500 when it’s not:
readinessProbe:
  httpGet:
    path: /greeting
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 1
initialDelaySeconds - Time (in seconds) before the first check for readiness is done.
periodSeconds - Time (in seconds) between two readiness checks after the first one.
successThreshold - Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
timeoutSeconds - Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
More on the topic of liveness/readiness probes can be found in the Kubernetes documentation.

Try setting spec.strategy.rollingUpdate.maxUnavailable (keeping spec.strategy.type set to RollingUpdate).
Setting it to 2, the first two pods should be redeployed together, keeping the service running on the third one. Or go with 3 if you don't care about keeping the service up during the rollout.
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable
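As a rough sketch, assuming the 3 replicas from the question:
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2   # replace two pods at once, keep one serving traffic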

Related

kubernetes, have 2 replicas of the pod

I have a front end app on a Pod. Right now, when I deploy and the Pod rebuilds, the site content on that Pod is down for some minutes; it is down for all the time it takes the Pod to rebuild. So I need to create a replica of the Pod:
one that is active and will stay available for users, while
a new one is being created and will replace the active one.
Currently I have these values in my config:
strategy:
  rollingUpdate:
    maxSurge: 1
    maxUnavailable: 1
What should I change to have those 2 replicas?
Should maxUnavailable be 0?
What else? I couldn't find much info about how to have 2 replicas.
In the rolling update, you have two variables:
maxSurge: The number of pods that can be created above the desired amount of pods during an update
maxUnavailable: The number of pods that can be unavailable during the update process
So if you want to keep (the number of available replicas >= the replicas count) at any given moment, you can set maxUnavailable to 0.
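Applied to your snippet, a minimal sketch (assuming 2 replicas) could look like this:
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # allow one extra Pod above the desired count during the update
      maxUnavailable: 0   # never drop below 2 available Pods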

How to scale up all OpenShift pods before scaling down old ones

I have a basic OpenShift deployment configuration:
kind: DeploymentConfig
spec:
  replicas: 3
  strategy:
    type: Rolling
Additionally I've put:
maxSurge: 3
maxUnavailable: 0%
because I want to scale up all the new pods first and after that scale down the old pods (so there will be 6 pods running during deployment; that's why I decided to set maxSurge).
I want to have all old pods running until all new pods are up but with this set of parameters there is something wrong. During deployment:
all 3 new pods are initialized at once and are trying to start, old pods are running (as expected)
if the first new pod starts successfully, then one old pod is terminated
if the second new pod is ready, then another old pod is terminated
I want to terminate all old pods ONLY if all new pods are ready to handle requests, otherwise all the old pods should handle requests.
What did I miss in this configuration?
The behavior you document is expected for a deployment rollout (OpenShift will shut down each old pod as a new pod becomes ready). It will also start routing traffic to the new pods as they become available, which you say you don't want either.
A service is pretty much by definition going to route to pods as they are available. And a deployment pretty much handles pods independently, so I don't believe that anything will really give you the behavior you are looking for there either.
If you want a blue green style deployment like you describe, you are essentially going to have to deploy the new pods as a separate deployment. Then, once the new deployment is completely up, you can change the corresponding service to point at the new pods. Then you can shut down the old deployment.
Service Mesh can help with some of that. So could an operator. Or you could do it manually.
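As a rough sketch of the manual approach on the service side (all names and labels here are illustrative, not taken from the question): the Service selects pods by a version label, and you repoint it once the new deployment is fully up:
apiVersion: v1
kind: Service
metadata:
  name: my-app            # illustrative name
spec:
  selector:
    app: my-app
    version: green        # switch from "blue" to "green" once all new pods are ready
  ports:
  - port: 8080
    targetPort: 8080
Switching the version value in the selector moves all traffic to the new pods at once, and the old deployment can then be scaled down or deleted.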
You can combine the rollout strategy with readiness checks with an initial delay to ensure that all the new pods have time to start up before the old ones are all shut down at the same time.
In the case below, the 3 new pods will be spun up (for a total of 6 pods), and then after 60 seconds the readiness check will occur and the old pods will be shut down. You would just want to adjust your readiness delay to a large enough timeframe to give all of your new pods time to start up.
apiVersion: v1
kind: DeploymentConfig
spec:
  replicas: 3
  strategy:
    rollingParams:
      maxSurge: 3
      maxUnavailable: 0
    type: Rolling
  template:
    spec:
      containers:
      - readinessProbe:
          httpGet:
            path: /actuator/health/readiness
            port: 8099
          initialDelaySeconds: 60

How to handle Rolling updates deployment strategy for RWO volume service?

We have a service hosted in AKS that has RWO volumes, with the deployment strategy set to Recreate.
We recently went live with this new service and we have many features/issues to deliver every day. Since the deployment strategy is Recreate, the business team is experiencing some downtime (2 min max), which is annoying. Is there a better approach to managing RWO volumes with a rolling update strategy?
You have two types of strategies to choose from when specifying the way of updating your deployments:
Recreate Deployment: All existing Pods are killed before new ones are created.
Rolling Update Deployment: The Deployment updates Pods in a rolling update fashion.
The default and recommended one is .spec.strategy.type==RollingUpdate. See the examples below:
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
In this example there would be one additional Pod (maxSurge: 1) above the desired number of 2, and the number of available Pods cannot go lower than that number (maxUnavailable: 0).
With this config, Kubernetes will spin up an additional Pod, then stop an “old” one. If there’s another Node available to deploy this Pod, the system will be able to handle the same workload during deployment. If not, the Pod will be deployed on an already used Node at the cost of resources from other Pods hosted on the same Node.
You can also try something like this:
spec:
  replicas: 2
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 0
      maxUnavailable: 1
With the example above there would be no additional Pods (maxSurge: 0) and only a single Pod at a time would be unavailable (maxUnavailable: 1).
In this case, Kubernetes will first stop a Pod before starting up a new one. The advantage is that the infrastructure doesn’t need to scale up, but the maximum capacity during the update will be lower.
If you choose to use percentage values for maxSurge and maxUnavailable, you need to remember that:
maxSurge - the absolute number is calculated from the percentage by rounding up
maxUnavailable - the absolute number is calculated from percentage by rounding down
With the RollingUpdate defined correctly, you also have to make sure your applications provide endpoints that Kubernetes can query for the app’s status. Below is an example with a /greeting endpoint that returns HTTP 200 when the app is ready to handle requests and HTTP 500 when it’s not:
readinessProbe:
  httpGet:
    path: /greeting
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 1
initialDelaySeconds - Time (in seconds) before the first check for readiness is done.
periodSeconds - Time (in seconds) between two readiness checks after the first one.
successThreshold - Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
timeoutSeconds - Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
More on the topic of liveness/readiness probes can be found in the Kubernetes documentation.
These are only examples, but they should give you an idea of how that particular update strategy can be used to eliminate downtime.

Does Kubernetes support green-blue deployment?

I would like to ask about the mechanism for stopping pods in Kubernetes.
I read https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods before ask the question.
Suppose we have an application with graceful shutdown support
(for example we use simple http server on Go https://play.golang.org/p/5tmkPPMiSSt).
The server has two endpoints:
/fast, which always sends a 200 HTTP status code.
/slow, which waits 10 seconds and then sends a 200 HTTP status code.
There is a deployment/service resource with this configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app/name: test
  template:
    metadata:
      labels:
        app/name: test
    spec:
      terminationGracePeriodSeconds: 120
      containers:
      - name: service
        image: host.org/images/grace:v0.1
        livenessProbe:
          httpGet:
            path: /health
            port: 10002
          failureThreshold: 1
          initialDelaySeconds: 1
        readinessProbe:
          httpGet:
            path: /health
            port: 10002
          failureThreshold: 1
          initialDelaySeconds: 1
---
apiVersion: v1
kind: Service
metadata:
  name: test
spec:
  type: NodePort
  ports:
  - name: http
    port: 10002
    targetPort: 10002
  selector:
    app/name: test
To make sure the pods are deleted gracefully, I conducted two tests.
First option (slow endpoint) flow:
Create a deployment with replicas equal to 1.
Wait for pod readiness.
Send a request to the /slow endpoint (curl http://ip-of-some-node:nodePort/slow) and delete the pod (simultaneously, 1 second out of sync).
Expected:
The pod must not terminate before the HTTP server has completed my request.
Got:
Yes, the HTTP server processes the request in 10 seconds and returns the response.
(If we pass the --grace-period=1 option to kubectl, then curl prints: curl: (52) Empty reply from server.)
Everything works as expected.
Second option (fast endpoint) flow:
Create a deployment with replicas equal to 10.
Wait for pod readiness.
Start wrk with "Connection: close" header.
Randomly delete one or two pods (kubectl delete pod/xxx).
Expected:
No socket errors.
Got:
$ wrk -d 2m --header "Connection: Close" http://ip-of-some-node:nodePort/fast
Running 2m test @ http://ip-of-some-node:nodePort/fast
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   122.35ms  177.30ms   1.98s   91.33%
    Req/Sec     66.98     33.93    160.00    65.83%
  15890 requests in 2.00m, 1.83MB read
  Socket errors: connect 0, read 15, write 0, timeout 0
Requests/sec:    132.34
Transfer/sec:     15.64KB
15 socket errors on read; that is, some pods were disconnected from the service before all requests were processed (maybe).
The problem also appears when a new deployment version is applied, during scale down, and during rollout undo.
Questions:
What's reason of that behavior?
How to fix it?
Kubernetes version: v1.16.2
Edit 1.
The number of errors changes each time, but remains in the range of 10-20, when removing 2-5 pods in two minutes.
P.S. If we do not delete a pod, we don't get errors.
Does Kubernetes support green-blue deployment?
Yes, it does. You can read about it in Zero-downtime Deployment in Kubernetes with Jenkins:
A blue/green deployment is a change management strategy for releasing software code. Blue/green deployments, which may also be referred to as A/B deployments require two identical hardware environments that are configured exactly the same way. While one environment is active and serving end users, the other environment remains idle.
Container technology offers a stand-alone environment to run the desired service, which makes it super easy to create identical environments as required in the blue/green deployment. The loosely coupled Services - ReplicaSets, and the label/selector-based service routing in Kubernetes make it easy to switch between different backend environments.
I would also recommend reading Kubernetes Infrastructure Blue/Green deployments.
Here is a repository with examples from codefresh.io about blue green deployment.
This repository holds a bash script that allows you to perform blue/green deployments on a Kubernetes cluster. See also the respective blog post.
Prerequisites
As a convention the script expects
The name of your deployment to be $APP_NAME-$VERSION
Your deployment should have a label that shows its version
Your service should point to the deployment by using a version selector, pointing to the corresponding label in the deployment
Notice that the new color deployment created by the script will follow the same conventions. This way each subsequent pipeline you run will work in the same manner.
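A minimal sketch of those conventions (the name, labels, and image below are illustrative placeholders, not values from the script): the Deployment is named $APP_NAME-$VERSION and carries a version label, and the Service's version selector points at that label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app-1.1                  # $APP_NAME-$VERSION (illustrative)
  labels:
    version: "1.1"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
      version: "1.1"
  template:
    metadata:
      labels:
        app: my-app
        version: "1.1"              # the Service's version selector targets this label
    spec:
      containers:
      - name: my-app
        image: example/my-app:1.1   # illustrative image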
You can see examples of the tags with the sample application:
service
deployment
You might be also interested in Canary deployment:
Another deployment strategy is using Canaries (a.k.a. incremental rollouts). With canaries, the new version of the application is gradually deployed to the Kubernetes cluster while getting a very small amount of live traffic (i.e. a subset of live users are connecting to the new version while the rest are still using the previous version).
...
The small subset of live traffic to the new version acts as an early warning for potential problems that might be present in the new code. As our confidence increases, more canaries are created and more users are now connecting to the updated version. In the end, all live traffic goes to canaries, and thus the canary version becomes the new “production version”.
EDIT
Questions:
What's reason of that behavior?
When a new deployment is applied, old pods are removed and new ones are scheduled.
This is done by the Control Plane:
For example, when you use the Kubernetes API to create a Deployment, you provide a new desired state for the system. The Kubernetes Control Plane records that object creation, and carries out your instructions by starting the required applications and scheduling them to cluster nodes–thus making the cluster’s actual state match the desired state.
You have only set up a readinessProbe, which tells your service whether it should send traffic to the pod or not. This is not a complete solution: as you can see in your example, if you have 10 pods and remove one or two, there is a gap and you receive socket errors.
How to fix it?
You have to understand this is not broken, so it doesn't need a fix.
It might be mitigated by implementing a check in your application to make sure it's sending requests to a working address, or by using other features such as load balancing, for example with an Ingress.
Also, when you are updating a deployment, you can run checks before deleting a pod to see whether it has any incoming/outgoing traffic, and roll the update only to pods that are not in use.

Kubernetes: Limit number of simultaneous deployments

Is there a way to limit the number of deployments a Kubernetes cluster will implement at once? With rolling deployments and 100% uptime, it's possible that updating all deployments at once could overload the nodes.
I know it is possible to limit the number of pods deployed per namespace, but I was wondering if it is also possible to limit simultaneous deployments in a similar way. Say, for example, a maximum of 10 deployments at once.
I could probably script out a limit to the number of deployments I send to the k8s API at once, but it would be nice if there was a setting I could use instead.
The first thing that comes to my mind is to use resource limits and requests to make sure you're not overloading the cluster. This way, even if you update all the deployments, some pods will be in the "pending" state until other deployments are successfully updated.
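A sketch of such requests and limits on a container (the name, image, and values are illustrative; tune them to your workloads):
spec:
  containers:
  - name: app                    # illustrative name
    image: example/app:1.0       # illustrative image
    resources:
      requests:
        cpu: 250m                # the scheduler only places the Pod on a node with this much free
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 512Mi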
This solution can be helpful:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 50%
  template:
---
Set up RollingUpdate in your deployment.
rollingUpdate:
  maxUnavailable: 50%
// The maximum number of pods that can be unavailable during the update.
// Value can be an absolute number (ex: 5) or a percentage of desired pods (ex: 10%).
// Absolute number is calculated from percentage by rounding down.
// This can not be 0 if MaxSurge is 0.
// Defaults to 25%.
// Example: when this is set to 30%, the old RC can be scaled down to 70% of desired pods
// immediately when the rolling update starts. Once new pods are ready, old RC
// can be scaled down further, followed by scaling up the new RC, ensuring
// that the total number of pods available at all times during the update is at
// least 70% of desired pods.
In this way you can limit simultaneous deployments.
I have not tried this, but I assume that one can do this with an Admission controller. Set up an ExternalAdmissionWebhook:
https://kubernetes.io/docs/admin/extensible-admission-controllers/#external-admission-webhooks
When it receives an admissionReview request for a Deployment object, check the count and state of Deployments through the API, and reject the request if the criteria for concurrent Deployments are exceeded.
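I haven't verified this either, but a rough sketch of registering such a webhook with the current admissionregistration.k8s.io/v1 API could look like the following (all names and the backing service are hypothetical; the logic that counts in-progress Deployments would live in the webhook server itself):
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: deployment-concurrency-limit          # hypothetical name
webhooks:
- name: deployments.limit.example.com         # hypothetical webhook name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["deployments"]
  clientConfig:
    service:
      namespace: webhooks                     # hypothetical namespace
      name: deployment-limiter                # hypothetical Service backing the webhook server
      path: /validate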
If you set resource requests on all your pods, k8s will queue pods if resources are saturated.