Kubernetes Deployment with Zero Downtime

I am learning Kubernetes: its concepts, how they work, and how to deploy with them. I have a couple of cases that I don't know how to achieve, and I am looking for advice or guidelines.
I am using the Google Cloud Platform. The current flow is described below: a push to the Google Source Repository triggers Cloud Build, which creates a Docker image and pushes the image to the running cluster nodes.
Case 1: I want traffic to be routed to new pods only once they are up and running, and old pods to be killed only after each has completed its in-flight requests. Zero downtime is what I'm looking to achieve.
Case 2: What will happen if a running pod's disk usage reaches 100%, or, in the Debian case, the inode count reaches full capacity? Will Kubernetes create new pods to compensate?
Case 3: How do I manage pod-to-database connection limits?

Like the other answer says, use liveness and readiness probes. Basically, a new pod is added to the Service pool but only serves traffic after its readiness probe has passed. The old pod is removed from the Service pool, then drained, and then terminated. This happens in a rolling fashion, one pod at a time.
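As a minimal sketch of what that can look like, assuming an HTTP application that exposes a /healthz health endpoint on port 8080 and an image built by your Cloud Build pipeline (all names and paths below are placeholders, not your actual setup):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                               # placeholder name
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1                            # bring up at most one extra pod during a rollout
      maxUnavailable: 0                      # never drop below the desired replica count
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      terminationGracePeriodSeconds: 60      # time for old pods to finish in-flight requests
      containers:
      - name: my-app
        image: gcr.io/my-project/my-app:latest   # placeholder: image pushed by Cloud Build
        ports:
        - containerPort: 8080
        readinessProbe:                      # the pod only receives Service traffic once this passes
          httpGet:
            path: /healthz                   # placeholder: the app's health endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:                       # restarts the container if it stops responding
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
With maxUnavailable: 0, a new pod must be Ready before an old one is drained, which is what gives you the zero-downtime behavior.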
This really depends on the capacity of your cluster and the ability to schedule pods given the limits set for the containers in them. For more about setting up limits for containers, refer to here. As for the inode limit: if you reach it on a node, the kubelet won't be able to run any more pods on that node. The kubelet's eviction manager also has a mechanism whereby it evicts the pods using the most inodes. You can also configure the eviction thresholds on the kubelet.
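For example, a container section with requests and limits might look like this (the values are purely illustrative, not recommendations):
containers:
- name: my-app
  image: gcr.io/my-project/my-app:latest    # placeholder
  resources:
    requests:                               # what the scheduler reserves for the container
      cpu: "250m"
      memory: "256Mi"
    limits:                                 # hard caps enforced at runtime
      cpu: "500m"
      memory: "512Mi"
      ephemeral-storage: "1Gi"              # caps local disk usage, relevant to the "disk full" case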
This would be more of a limitation at the OS level combined with your stateful application's configuration. You can keep this configuration in a ConfigMap. For example, in MySQL the relevant option would be max_connections.
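A minimal sketch of such a ConfigMap, assuming a MySQL pod that mounts it as a configuration file (the name and value are placeholders):
apiVersion: v1
kind: ConfigMap
metadata:
  name: mysql-config                        # placeholder
data:
  my.cnf: |
    [mysqld]
    max_connections = 100
The MySQL pod would then mount this ConfigMap as a volume (for example at /etc/mysql/conf.d) so the setting takes effect; the cap should be chosen so that connections from all pods combined stay within the database's limit.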

I can answer case 1 since I've done it myself.
Use Deployments with readinessProbes and livenessProbes.


Advantage of multiple pods on the same node

I'm new to Kubernetes and I'm testing locally with Minikube. I need some advice on Kubernetes's horizontal scaling.
Consider the following scenario:
Cluster composed of only 1 node
There is only 1 pod on this node
Only one application running on this pod
Is there a benefit to deploying a new pod on this node just to scale my application?
If I understand correctly, pods share the system's resources, so if I deploy 2 pods instead of 1 on the same node, there will be no performance increase.
There will be no availability increase either, because if the node fails, the two pods will also shut down.
Am I right about my two previous statements?
Thanks
Yes, you are right. Pods on the same node share the same CPU and memory resources and are therefore expected to go down in the event of node failure.
But you also need to consider it at the pod level. There can be situations where the pod itself fails while the node is working fine. In such cases, multiple pods can keep serving requests and make the application highly available.
From a performance perspective, a higher number of pods can serve requests in parallel, reducing latency for your application.
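For instance, bumping the replica count in a Deployment is enough to get that redundancy at the pod level (names and image are placeholders):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                    # placeholder
spec:
  replicas: 2                     # two pods; one keeps serving if the other crashes
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: nginx              # placeholder image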

Is it possible to schedule a pod to run for, say, 24 hours and then remove the Deployment/StatefulSet, or do I need to use Jobs?

We have a bunch of pods running in dev environment. The pods are auto-provisioned by an application on every business action. The problem is that across various namespaces they are accumulating and eating available resources in EKS.
Is there a way, without Jenkins or Kubernetes Jobs, to simply put some parameter in the pod manifest telling it to self-destruct after, say, 24 hours?
Add to your pod.spec:
activeDeadlineSeconds: 86400
After the deadline, your Pod will be stopped for good with the status DeadlineExceeded.
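In context, a complete Pod manifest would look something like this (the name, image, and command are placeholders):
apiVersion: v1
kind: Pod
metadata:
  name: short-lived-pod                     # placeholder
spec:
  activeDeadlineSeconds: 86400              # 24 hours; the pod is terminated once this elapses
  restartPolicy: Never
  containers:
  - name: app
    image: busybox                          # placeholder image
    command: ["sh", "-c", "while true; do sleep 3600; done"]
Note that this applies to bare Pods: a Deployment or StatefulSet would simply recreate the pod after it is killed, so for controller-managed workloads the deadline alone will not remove anything.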
If I understood your situation properly, you would like to scale your cluster down in order to save resources.
Kubernetes offers the ability to autoscale your application in a cluster: it can start additional pods when the load increases and terminate excess pods when the load decreases.
It is possible to downscale the application to zero pods, but, in this case, you will have a delay serving the first request while the pod is starting.
This functionality relies on performance metrics. From a practical standpoint, it means that autoscaling doesn't happen instantly, because it takes some time for the performance metrics to reach the configured threshold.
The Kubernetes feature in question is called HPA (Horizontal Pod Autoscaler) and is described in this document.
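A minimal sketch of an HPA targeting a Deployment (the names and thresholds are placeholders):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app                              # placeholder
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                            # the Deployment to scale
  minReplicas: 1
  maxReplicas: 5
  targetCPUUtilizationPercentage: 70        # add pods above 70% average CPU, remove below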
If you are running your cluster on GCP/GKE, you can go further and automatically start additional nodes for your cluster when you need more computing capacity, and shut nodes down when they no longer run application pods.
More information about this functionality can be found by following the link.
Last but not least, you can use a tool like Ansible to manage all your Kubernetes assets (it can create and manage deployments via playbooks).
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics

How to know how long it takes to create a pod in Kubernetes? Is there any command, please?

I have my Kubernetes cluster and I need to know how long it takes to create a pod. Is there any Kubernetes command that shows me that?
Thanks in advance
What you are asking for does not exist as a built-in command.
I think you should first understand the Pod Overview.
A Pod is the basic building block of Kubernetes–the smallest and simplest unit in the Kubernetes object model that you create or deploy. A Pod represents a running process on your cluster.
A Pod encapsulates an application container (or, in some cases, multiple containers), storage resources, a unique network IP, and options that govern how the container(s) should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources.
While a Pod is being deployed, it goes through the following phases:
Pending
The Pod has been accepted by the Kubernetes system, but one or more of the Container images has not been created. This includes time before being scheduled as well as time spent downloading images over the network, which could take a while.
Running
The Pod has been bound to a node, and all of the Containers have been created. At least one Container is still running, or is in the process of starting or restarting.
Succeeded
All Containers in the Pod have terminated in success, and will not be restarted.
Failed
All Containers in the Pod have terminated, and at least one Container has terminated in failure. That is, the Container either exited with non-zero status or was terminated by the system.
Unknown
For some reason the state of the Pod could not be obtained, typically due to an error in communicating with the host of the Pod.
As for Pod conditions, each has a type, which can have the following values:
PodScheduled: the Pod has been scheduled to a node;
Ready: the Pod is able to serve requests and should be added to the load balancing pools of all matching Services;
Initialized: all init containers have started successfully;
Unschedulable: the scheduler cannot schedule the Pod right now, for example due to a lack of resources or other constraints;
ContainersReady: all containers in the Pod are ready.
Please refer to the documentation regarding Pod Lifecycle for more information.
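That said, you can approximate pod startup time from the lastTransitionTime of those conditions. For example, kubectl get pod <name> -o yaml returns a status section like the following (the timestamps here are made up); the gap between PodScheduled and Ready is roughly how long the pod took to come up:
status:
  conditions:
  - type: PodScheduled
    status: "True"
    lastTransitionTime: "2019-03-01T10:00:02Z"
  - type: Initialized
    status: "True"
    lastTransitionTime: "2019-03-01T10:00:03Z"
  - type: ContainersReady
    status: "True"
    lastTransitionTime: "2019-03-01T10:00:15Z"
  - type: Ready
    status: "True"
    lastTransitionTime: "2019-03-01T10:00:15Z"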
When you are deploying your Pod, you have to consider how many containers will be running in it.
Each image has to be downloaded, and depending on its size, that might take a while. Also, the default pull policy is IfNotPresent, which means Kubernetes will skip the image pull if the image already exists on the node.
More about updating images can be found here.
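For instance, setting the policy explicitly on a container (the image name is a placeholder; note that using a :latest tag flips the default policy to Always):
containers:
- name: app
  image: gcr.io/my-project/my-app:v1.0.0    # placeholder
  imagePullPolicy: IfNotPresent             # skip the pull if the image is already on the node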
You also need to consider how many resources your master and nodes have.

What makes a Kubernetes node unhealthy?

We've experienced 4 AUTO_REPAIR_NODES events (revealed by the command gcloud container operations list) on our GKE cluster during the past month. The consequence of node auto-repair is that the node gets recreated and attached to a new external IP, and the new external IP, which was not whitelisted by third-party services, eventually caused failures of services running on the new node.
I noticed that we have "Automatic node repair" enabled in our Kubernetes cluster and felt tempted to disable that, but before I do that, I need to know more about the situation.
My questions are:
What are some common causes that makes a node unhealthy in the first place? I'm aware of this article https://cloud.google.com/kubernetes-engine/docs/how-to/node-auto-repair#node_repair_process which says, "a node reports a NotReady status on consecutive checks over the given time threshold" would trigger auto repair. But what could cause a node to become NotReady?
I'm also aware of this article https://kubernetes.io/docs/concepts/architecture/nodes/#node-status which mentions the full list of node statuses: {OutOfDisk, Ready, MemoryPressure, PIDPressure, DiskPressure, NetworkUnavailable, ConfigOK}. I wonder, if any of {OutOfDisk, MemoryPressure, PIDPressure, DiskPressure, NetworkUnavailable} becomes true for a node, would that node become NotReady?
What negative consequences could I get after I disable "Automatic node repair" in the cluster? I'm basically wondering whether we could end up in a worse situation than auto-repaired nodes and newly-attached-not-whitelisted IP. Once "Automatic node repair" is disabled, then for the pods that are running on an Unhealthy node that would've been auto-repaired, would Kubernetes create new pods on other nodes?
The confusion here is that the 'Ready' and 'NotReady' states shown when you run kubectl get nodes are reported by the kube-apiserver, but it is unclear from the docs how they relate to the kubelet states described here.
You can also see the kubelet states (in events) when you run kubectl describe nodes.
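For reference, kubectl get node <name> -o yaml shows those kubelet-reported conditions in the status section, something like this (the values are illustrative):
status:
  conditions:
  - type: MemoryPressure
    status: "False"
    reason: KubeletHasSufficientMemory
  - type: DiskPressure
    status: "False"
    reason: KubeletHasNoDiskPressure
  - type: PIDPressure
    status: "False"
    reason: KubeletHasSufficientPID
  - type: Ready
    status: "True"
    reason: KubeletReady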
To answer some parts of the questions:
Common causes of NotReady, as reported by the kube-apiserver:
Kubelet down
docker, containerd, or cri-o down (depending on the runtime shim you are using)
As for the kubelet states (MemoryPressure, DiskPressure, and so on), the kubelet will start evicting pods or stop scheduling new ones for all of them except Ready (https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/). It is unclear from the docs how these get reported to the kube-apiserver.
If you disable auto-repair, you could have unhealthy nodes sitting in your cluster not being used, and you'd still be paying for them.
Yes, Kubernetes will reschedule the pods after a certain number of probe failures (this is configurable). If the kubelet or the node is down, Kubernetes will consider the pods down as well.
If your nodes go down, you could end up with less capacity than you need to schedule your workloads, so Kubernetes would not be able to schedule them anyway.
Hope it helps!
Not my answer, but this answer on Server Fault points in the right direction, suggesting a NAT gateway and whitelisting that IP:
https://serverfault.com/a/930963/429795

Kubernetes automatic shutdown after some idle time

Does Kubernetes or Helm support shutting down pods that are idle for more than a given threshold time?
This would be very useful in a development environment, to free room for other processes and save cost.
Kubernetes offers the ability to autoscale your application in a cluster: it can start additional pods when the load increases and terminate excess pods when the load decreases.
It is possible to downscale the application to zero pods, but, in this case, you will have a delay serving the first request while the pod is starting.
This functionality relies on performance metrics provided by the Heapster application, which must be running in the cluster. From a practical standpoint, it means that autoscaling doesn't happen instantly, because it takes some time for the performance metrics to reach the configured threshold.
The Kubernetes feature in question is called HPA (Horizontal Pod Autoscaler) and is described in this document.
If you are running your cluster on GCP/GKE, you can go further and automatically start additional nodes for your cluster when you need more computing capacity, and shut nodes down when they no longer run application pods.
More information about this functionality can be found by following the link.
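If what you actually want is to shut dev pods down on a schedule rather than by load (Kubernetes has no built-in idle timeout), one common workaround is a CronJob that scales the Deployment to zero. A rough sketch, where the names, schedule, and image are placeholders and the job's service account is assumed to have RBAC permission to scale deployments:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-down-dev                      # placeholder
spec:
  schedule: "0 20 * * *"                    # every day at 20:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: deployment-scaler    # placeholder; needs RBAC to scale deployments
          restartPolicy: OnFailure
          containers:
          - name: kubectl
            image: bitnami/kubectl          # placeholder image that ships kubectl
            command: ["kubectl", "scale", "deployment", "my-app", "--replicas=0"]
A second CronJob with --replicas=1 in the morning would bring the environment back up.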
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics