Is it possible to schedule a pod to run for say 24 hours and then remove deployment/statefulset? or need to use jobs? - kubernetes

We have a bunch of pods running in dev environment. The pods are auto-provisioned by an application on every business action. The problem is that across various namespaces they are accumulating and eating available resources in EKS.
Is there a way without jenkins/k8s jobs to simply put some parameter on the pod manifest to tell it to self destruct say in 24 hours?

Add to your pod.spec:
activeDeadlineSeconds: 86400
After deadline your Pod will be stopped for good with the status DeadlineExceeded

If I understood your situation properly, you would like to scale your cluster down in order to save resources.
Kubernetes is featured with the ability to autoscale your application in a cluster. Literally, it means that Kubernetes can start additional pods when the load is increasing and terminate excessive pods when the load is decreasing.
It is possible to downscale the application to zero pods, but, in this case, you will have a delay serving the first request while the pod is starting.
This functionality relies on performance metrics. From the practical side, it means that autoscaling doesn't happen instantly, because it takes some time to performance metrics reach the configured threshold.
The mentioned Kubernetes feature called HPA(horizontal pod autoscale) is described in this document.
In case you are running your cluster on GCP or GKE, you are able to go further and automatically start additional nodes for your cluster when you need more computing capacity and shut down nodes when they are not running application pods anymore.
More information about this functionality can be found following the link.
Last, but not least, you can use tool like Ansible to manage all your kubernetes assets (it can create/manage deployments via playbooks).
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics

Related

Is it possible to use Kubernetes autoscale on cron job pods

Some context: I have multiple cron jobs running daily, weekly, hourly and some of which require significant processing power.
I would like to add requests and limitations to these container cron pods to try and enable vertical scaling and ensure that the assigned node will have enough capacity when being initialized. This will prevent me from having to have multiple large node available at all times and also letting me modify how many crons I can run in parallel easily.
I would like to try and avoid timed scaling since the cron jobs processing time can increase as the application grows.
Edit - Additional Information :
Currently I am using Digital Ocean and utilizing it's UI for cluster autoscaling. I have it working with HPA's on deployments but not crons. Adding limits to crons does not trigger cluster autoscaling to my knowledge.
I have tried to enable HPA scaling with the cron but with no success. Basically it just sits on a pending status signalling that there is insufficient CPU available and does not generate a new node.
Does HPA scaling work with cron job pods and is there a way to achieve the same type of scaling?
HPA is used to scale more pods when pod loads are high, but this won't increase the resources on your cluster.
I think you're looking for cluster autoscaler (works on AWS, GKE and Azure) and will increase cluster capacity when pods can't be scheduled.
This is a Community Wiki answer so feel free to edit it and add any additional details you consider important.
As Dom already mentioned "this won't increase the resources on your cluster." Saying more specifically, it won't create an additional node as Horizontal Pod Autoscaler doesn't have such capability and in fact it has nothing to do with cluster scaling. It's name is pretty self-exlpanatory. HPA is able only to scale Pods and it scales them horizontaly, in other words it can automatically increase or decrease number of replicas of your "replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics)" as per the docs.
As to cluster autoscaling, as already said by Dom, such solutions are implemented in so called managed kubernetes solutions such as GKE on GCP, EKS on AWS or AKS on Azure and many more. You typically don't need to do anything to enable them as they are available out of the box.
You may wonder how HPA and CA fit together. It's really well explained in FAQ section of the Cluster Autoscaler project:
How does Horizontal Pod Autoscaler work with Cluster Autoscaler?
Horizontal Pod Autoscaler changes the deployment's or replicaset's
number of replicas based on the current CPU load. If the load
increases, HPA will create new replicas, for which there may or may
not be enough space in the cluster. If there are not enough resources,
CA will try to bring up some nodes, so that the HPA-created pods have
a place to run. If the load decreases, HPA will stop some of the
replicas. As a result, some nodes may become underutilized or
completely empty, and then CA will terminate such unneeded nodes.

Kubernetes Deployment with Zero Down Time

As a leaner of Kubernetes concepts, their working, and deployment with it. I have a couple of cases which I don't know how to achieve. I am looking for advice or some guideline to achieve it.
I am using the Google Cloud Platform. The current running flow is described below. A push to the google source repository triggers Cloud Build which creates a docker image and pushes the image to the running cluster nodes.
Case 1: Now I want that when new pods are up and running. Then traffic is routed to the new pods. Kill old pod but after each pod complete their running request. Zero downtime is what I'm looking to achieve.
Case 2: What will happen if the space of running pod reaches 100 and in the Debian case that the inode count reaches full capacity. Will kubernetes create new pods to manage?
Case 3: How to manage pod to database connection limits?
Like the other answer use Liveness and Readiness probes. Basically, a new pod is added to the service pool then it will only serve traffic after the readiness probe has passed. The old pod is removed from the Service pool, then drained and then terminated. This happens on a rolling fashion one pod at a time.
This really depends on the capacity of your cluster and the ability to schedule pods depending on the limits for the containers in them. For more about setting up limits for containers refer to here. In terms of the inode limit, if you reach it on a node, the kubelet won't be able to run any more pods on that node. The kubelet eviction manager also has a mechanism in where evicts some pods using the most inodes. You can also configure your eviction thresholds on the kubelet.
This would be more a limitation at the OS level combined your stateful application configuration. You can keep this configuration in a ConfigMap. And for example in something for MySql the option would be max_connections.
I can answer case 1 since Ive done it myself.
Use Deployments with readinessProbes & livelinessProbes

Kubernetes autoscaler: change min replica on specific day

I'm running a Kubernetes cluster (GCP) with 10 deployments. Each deployment is configured to auto scale on stress.
From my website statistics, I found that Monday is the day with the most load. I want to define Kubernetes deployment to have more min-replicas on this day.
Is this possible?
I read somewhere that I can run a cronjob script before and after this day and change the minimum number of machines. Is that the current way to do it? Is this safe? what if the cronjob wasn't fired? If this is the current way, please link me for some instruction how to do it.
Thanks!
You seem to be talking of two things here.
Pod autoscaling (Add more pods when load on existing pods increases) : HPA will help with this. If your workloads show a spike in CPU or memory and can handle horizontal scaling, then HPA would work fine.
Example : https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/
Now HPA can increase pods only if the cluster has enough nodes to schedule them.
If it is desired to have more nodes with more traffic and reduce them when traffic is low, a cluster autoscaler could be a good option.
https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
Ofcourse, the scaling of nodes is not instantaneous as the autoscaler watches for pods which are in Pending state due to resource level constraints. After this, it requests additional nodes from the cloud provider and once these nodes join the cluster, workloads get scheduled.

GKE Cluster Autoscaler pre creates nodes on time base

I'm running a Kubernetes cluster on Google Cloud. Master version is 1.8.9-gke.1, nodes versions are the same, cluster autoscaler is enabled.
I started to notice that nodes are being created without any pending pods. Those nodes can then sit without running pods for 20-30 minutes, then workload will be allocated to them. This is usually happening before peak hours and looks like cluster is trying to predict load base on load in the past.
Is it something that Google Cloud Cluster Autoscaler managing or is it Kubernetes feature? Is it configurable? I was trying to find any clue in the documentation, but without luck.
It sounds like something the Cluster Autoscaler would do.
Go to Stackdriver Logging and query with advanced filter:
resource.type="k8s_cluster"
resource.labels.cluster_name="<your_cluster_name>"
resource.labels.location="<your_cluster_zone>"
protoPayload.methodName="io.k8s.core.v1.nodes.update"
(The last line alone might suffice). I think this should be the way to get the autoscaler logs. If this does not work, let me know.

Kubernetes automatic shutdown after some idle time

Does kubernetes or Helm support shut down the pods if it is idle for more than a given threshold time?
This would be very useful in the development environment, to provide room for other processes to consume it and save cost.
Kubernetes is featured with the ability to autoscale your application in a cluster. Literally, it means that Kubernetes can start additional pods when the load is increasing and terminate excessive pods when the load is decreasing.
It is possible to downscale the application to zero pods, but, in this case, you will have a delay serving the first request while the pod is starting.
This functionality relies on performance metrics provided by Heapster application, that must be run in the cluster. From the practical side, it means that autoscaling doesn't happen instantly, because it takes some time to performance metrics reach the configured threshold.
The mentioned Kubernetes feature called HPA(horizontal pod autoscale) is described in this document.
In case you are running your cluster on GCP or GKE, you are able to go further and automatically start additional nodes for your cluster when you need more computing capacity and shut down nodes when they are not running application pods anymore.
More information about this functionality can be found following the link.
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics