Preemptible Cloud Run on GKE

Is it possible to create a Cloud Run on GKE (Anthos) Kubernetes cluster with preemptible nodes? If so, can you also enable plugins such as gke-node-pool-shifter and gke-pvm-killer, or will they interfere with Cloud Run actions such as autoscaling pods?
https://hub.helm.sh/charts/rimusz/gke-node-pool-shifter
https://hub.helm.sh/charts/rimusz/gke-pvm-killer

Technically, a Cloud Run on GKE cluster is still a GKE cluster, so it can have preemptible node pools.
However, some Knative Serving components, such as the activator and autoscaler, are in the hot path of serving requests. You need to make sure they don't end up in a preemptible pool. Similarly, the controller and webhook are central to the control plane lifecycle of Knative API objects, so you also need to make sure these pods end up in a non-preemptible node pool.
Secondly, Knative (for now) does not support node selectors or taints/tolerations (see https://knative.tips/pod-config/node-affinity/). It simply doesn't give you a way to specify nodeSelector or other affinity fields in the Pod template of a Knative Service object.
Therefore, you would have to find a way (such as implementing your own mutating admission webhook for Knative-created pods) to add such node selectors to the Pods, which is quite tedious.
However, by combining node taints and pod tolerations, I think you can have Knative system components end up in a non-preemptible pool, and everything else (i.e. Knative-created pods) on other nodes (i.e. preemptible nodes).
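A minimal sketch of that taint/toleration idea, assuming the non-preemptible pool carries a hypothetical label pool=stable and a hypothetical taint dedicated=knative-system:NoSchedule. The fragment below is a patch for the pod template of one Knative system Deployment (for example the activator). The toleration lets the pod onto the tainted pool, and the nodeSelector keeps it from drifting onto preemptible nodes. Knative-created pods carry no such toleration, so the taint keeps them off the stable pool. Whether a managed Cloud Run on GKE installation preserves manual edits to these Deployments is something to verify.

```yaml
# Hypothetical patch fragment for a Knative Serving system Deployment
# (e.g. activator, autoscaler, controller, webhook).
# Assumed node pool setup (not from the original answer):
#   label: pool=stable
#   taint: dedicated=knative-system:NoSchedule
spec:
  template:
    spec:
      # Only consider nodes in the non-preemptible pool.
      nodeSelector:
        pool: stable
      # Allow scheduling onto nodes that carry the pool's taint.
      tolerations:
        - key: dedicated
          operator: Equal
          value: knative-system
          effect: NoSchedule
```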

Related

Why is there no concept of a node pool in Kubernetes?

I can see that GKE, AKS, and EKS all have a built-in node pool concept, but Kubernetes itself doesn't provide one. What could be the reason behind this?
We usually need different node types for different requirements, such as:
Some pods require CPU-optimized or memory-optimized nodes.
Some pods are processing ML/AI algorithms and need GPU-enabled nodes. These GPU-enabled nodes should be used only by certain pods as they are expensive.
Some pods/jobs want to leverage spot/preemptible nodes to reduce the cost.
Is there any specific reason why Kubernetes has no such built-in support?
Node Pools are cloud-provider specific technologies/groupings.
Kubernetes is intended to be deployed on various infrastructures, including on-prem/bare metal. Node Pools would not mean anything in this case.
Node Pools generally are a way to provide Kubernetes with a group of identically configured nodes to use in the cluster.
You would specify the node you want using node selectors and/or taints/tolerations.
So you could taint nodes with a GPU and then require pods to have the matching toleration in order to schedule onto those nodes. Node Pools wouldn't make a difference here. You could join a physical server to the cluster and taint that node in exactly the same way -- Kubernetes would not see that any differently to a Google, Amazon or Azure-based node that was also registered to the cluster, other than some different annotations on the node.
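As a rough illustration of the GPU case, assume the GPU nodes were tainted with gpu=true:NoSchedule and labeled hardware=gpu (both names are made up for this sketch, as are the pod and image names). A pod that should land on those nodes then carries the matching toleration. The toleration only permits scheduling there, so the nodeSelector is what actually steers the pod to the GPU nodes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                # hypothetical name
spec:
  # Steer the pod to nodes labeled hardware=gpu (assumed label).
  nodeSelector:
    hardware: gpu
  # Tolerate the assumed taint gpu=true:NoSchedule on those nodes.
  tolerations:
    - key: gpu
      operator: Equal
      value: "true"
      effect: NoSchedule
  containers:
    - name: trainer
      image: example.com/trainer:latest   # placeholder image
```

Pods without the toleration are kept off the tainted nodes, regardless of whether those nodes came from a cloud node pool or a physical server.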
As Blender Fox mentioned, a node group is a cloud-provider-specific grouping.
In AWS these are node groups (backed by Auto Scaling groups), while in GKE they are node pools (backed by managed instance groups).
You configure the Cluster Autoscaler, and it scales the node count in the node pool or node group up and down.
If you are running Kubernetes on-prem, there may be no node pool option, since a node group is mostly a group of VMs in the cloud, whereas on-prem bare metal machines can also act as worker nodes.
To scale up and down, Kubernetes has the Cluster Autoscaler (CA adds or removes nodes from the cluster by creating/deleting VMs), which uses the cloud provider's node group API; on bare metal it may not simply work.
Each provider has its own implementation and logic, which is selected on the Kubernetes side by the --cloud-provider flag (code link).
So if you are on an on-prem private cloud, you would write your own cloud client and interface.
Node groups are not required; they are more of a cloud-provider-side implementation.
For the scenario you describe:
Some pods require either CPU or Memory intensive and optimized nodes.
Some pods are processing ML/AI algorithms and need GPU-enabled nodes. These GPU-enabled nodes should be used only by certain pods as they are expensive.
Some pods/jobs want to leverage spot/preemptible nodes to reduce the cost.
You can use taints and tolerations, affinity, or node selectors as needed to schedule Pods onto specific types of nodes.
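For the spot/preemptible scenario, node affinity could look roughly like this. The sketch uses the cloud.google.com/gke-preemptible label that GKE puts on preemptible nodes as one example; other providers use different labels, and the pod and image names are placeholders.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker           # hypothetical name
spec:
  affinity:
    nodeAffinity:
      # Hard requirement: only schedule onto preemptible nodes.
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-preemptible
                operator: In
                values:
                  - "true"
  containers:
    - name: worker
      image: example.com/batch-worker:latest   # placeholder image
```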

EKS Fargate pod isolation

In ECS with Fargate, we can manage service isolation via security groups. However, that is no longer the case with EKS on Fargate.
Is there a way to isolate pods on the same cluster from each other, as with a Network Policy? I know this is possible with Kubernetes, but it needs to be implemented by the network plugin. I tried to install the network provider listed here without success, as it needs a DaemonSet (a limitation of EKS Fargate: it cannot run DaemonSets, privileged pods, or pods that use HostNetwork or HostPort).
This is something we are tracking in this roadmap item. There isn't a viable workaround for now. As you pointed out, when using EC2 we'd suggest using the Calico network policy engine, but Fargate has no DaemonSet support, so it can't be used there.
Given that the SG associated with a pod is defined at the cluster level, one way to mitigate this would be to spread like pods across different clusters, each with the pod SG configured for that specific type of workload, but this means more work and higher control plane costs.
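For reference, the kind of isolation the question has in mind would normally be written as a NetworkPolicy like the sketch below (the namespace and policy names are placeholders). It only takes effect where a network plugin enforces policies, which, per the answer above, is exactly what is missing on Fargate.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
  namespace: team-a            # hypothetical namespace
spec:
  # An empty selector matches every pod in the namespace.
  podSelector: {}
  # No ingress rules are listed, so all inbound traffic is denied.
  policyTypes:
    - Ingress
```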

Is it possible to schedule a pod to run for, say, 24 hours and then remove the deployment/statefulset? Or do we need to use jobs?

We have a bunch of pods running in dev environment. The pods are auto-provisioned by an application on every business action. The problem is that across various namespaces they are accumulating and eating available resources in EKS.
Is there a way without jenkins/k8s jobs to simply put some parameter on the pod manifest to tell it to self destruct say in 24 hours?
Add to your pod.spec:
activeDeadlineSeconds: 86400
After the deadline, your Pod is terminated for good and marked as failed with the reason DeadlineExceeded.
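In context, the field sits at the top level of the pod spec. A minimal sketch with a bare Pod follows; the name, image, and command are placeholders. As far as I know, the pod templates of Deployments and ReplicaSets do not accept this field, so check that before relying on it for controller-managed pods.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: short-lived-worker     # hypothetical name
spec:
  # 24 hours * 60 minutes * 60 seconds = 86400 seconds.
  activeDeadlineSeconds: 86400
  restartPolicy: Never
  containers:
    - name: worker
      image: busybox:1.36      # placeholder image
      command: ["sh", "-c", "while true; do sleep 3600; done"]
```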
If I understood your situation properly, you would like to scale your cluster down in order to save resources.
Kubernetes can autoscale your application in a cluster. That is, it can start additional pods when the load increases and terminate excess pods when the load decreases.
It is possible to downscale the application to zero pods, but in that case there will be a delay serving the first request while the pod starts.
This functionality relies on performance metrics. In practice, this means autoscaling doesn't happen instantly, because it takes some time for the metrics to reach the configured threshold.
The Kubernetes feature in question, the HPA (Horizontal Pod Autoscaler), is described in this document.
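A minimal sketch of an HPA, assuming a Deployment named web (hypothetical) and a cluster recent enough to serve the autoscaling/v2 API. It keeps between 1 and 5 replicas and aims for about 70% average CPU utilization; all numbers are illustrative.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                # hypothetical name
spec:
  # The workload whose replica count is adjusted.
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                  # hypothetical Deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    # Scale on average CPU utilization across the pods.
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

The target pods need CPU requests set for a utilization percentage to be computed.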
If you are running your cluster on GCP or GKE, you can go further and automatically start additional nodes for your cluster when you need more computing capacity, and shut down nodes when they are no longer running application pods.
More information about this functionality can be found by following the link.
Last but not least, you can use a tool like Ansible to manage all your Kubernetes assets (it can create and manage deployments via playbooks).
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics

How do managed Kubernetes providers hide the master nodes?

If I run kubectl get nodes on GKE, EKS, or DigitalOcean Kubernetes, I only see the worker nodes. How are these systems architected at the network or application level to create this separation between workers and masters?
You can run the Kubernetes control plane outside Kubernetes as long as the worker nodes have network access to the control plane. This approach is used on most managed Kubernetes solutions.
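One way to picture that network access from the worker side is a kubeconfig of the kind a kubelet uses, pointing at an API server address that is not one of the cluster's own nodes. This is only a sketch: the address, names, and file paths are placeholders. A machine shows up in kubectl get nodes only once a kubelet registers a Node object for it, so control plane machines that never register one stay invisible.

```yaml
apiVersion: v1
kind: Config
clusters:
  - name: managed-cluster                 # hypothetical name
    cluster:
      # Externally hosted API server endpoint (placeholder address).
      server: https://203.0.113.10
      certificate-authority: /etc/kubernetes/pki/ca.crt
users:
  - name: worker-kubelet                  # hypothetical name
    user:
      client-certificate: /var/lib/kubelet/pki/kubelet-client.crt
      client-key: /var/lib/kubelet/pki/kubelet-client.key
contexts:
  - name: default
    context:
      cluster: managed-cluster
      user: worker-kubelet
current-context: default
```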
A Container Engine cluster is a group of Compute Engine instances running Kubernetes. It consists of one or more node instances, and a managed Kubernetes master endpoint.
Every container cluster has a single master endpoint, which is managed by Container Engine. The master provides a unified view into the cluster and, through its publicly-accessible endpoint, is the doorway for interacting with the cluster.
The managed master also runs the Kubernetes API server, which services REST requests, schedules pod creation and deletion on worker nodes, and synchronizes pod information (such as open ports and location) with service information.
More info can be found here

Kubernetes automatic shutdown after some idle time

Does Kubernetes or Helm support shutting down pods that have been idle for longer than a given threshold?
This would be very useful in a development environment, to free up resources for other processes and save cost.
Kubernetes can autoscale your application in a cluster. That is, it can start additional pods when the load increases and terminate excess pods when the load decreases.
It is possible to downscale the application to zero pods, but in that case there will be a delay serving the first request while the pod starts.
This functionality relies on performance metrics provided by the Heapster application, which must be running in the cluster (Heapster has since been retired in favor of metrics-server). In practice, this means autoscaling doesn't happen instantly, because it takes some time for the metrics to reach the configured threshold.
The Kubernetes feature in question, the HPA (Horizontal Pod Autoscaler), is described in this document.
If you are running your cluster on GCP or GKE, you can go further and automatically start additional nodes for your cluster when you need more computing capacity, and shut down nodes when they are no longer running application pods.
More information about this functionality can be found by following the link.
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics