Why is there a max pods per node?

Why is there a pod limit in Kubernetes?
It makes intuitive sense to me that there will be some limitation, but I'm curious to know the specific bottleneck that warrants the limit.

The default limit of 110 pods per node is, I think, merely a compromise in Kubernetes, not a technical limit.
Some vendors have additional limitations.
For example, on Azure, there's a limit on the number of IP addresses you can assign to a node. So if your Kubernetes cluster is configured to assign an IP address from the Azure VNet to each pod, the limit is 30 (see https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni#maximum-pods-per-node).
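With Azure CNI, that per-node pod count is fixed at cluster or node-pool creation time. A minimal sketch with the az CLI, where the resource-group and cluster names are placeholders:

    # Create an AKS cluster with Azure CNI and a higher per-node pod cap.
    # myResourceGroup and myAKSCluster are made-up names.
    az aks create \
      --resource-group myResourceGroup \
      --name myAKSCluster \
      --network-plugin azure \
      --max-pods 50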
On IBM Cloud, if you use IBM Block Storage for persistent volumes, they are mounted as 'secondary volumes' on your node, and you can only have 12 of those per node, so that's a limit of 12 pods with persistent volumes. It sucks when you hit that limit the first time you scale up :-(
With other vendors or other storage classes, this limit is higher: https://kubernetes.io/docs/concepts/storage/storage-limits/
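If you manage the nodes yourself, even the 110 default is just kubelet configuration. A minimal sketch of a KubeletConfiguration fragment, assuming you have a way to roll it out to your nodes (e.g. via kubeadm):

    # Raising maxPods only helps if the node's pod CIDR and resources
    # can actually accommodate the extra pods.
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    maxPods: 250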

Related

GKE Cluster with 0 Node and Autopilot Enabled

I have used GKE for years and I wanted to experiment with GKE in Autopilot mode. My initial expectation was that it would start with 0 worker nodes and, whenever I deploy a workload, automatically scale the nodes based on the requested memory and CPU. However, after I created a GKE cluster, there is nothing related to nodes in the UI, yet the kubectl get nodes output shows 2 nodes. Do you have any idea how to start that cluster with no nodes initially?
The principle of GKE Autopilot is that you do NOT worry about the nodes; they are managed for you. No matter whether there are 1, 2 or 10 nodes in your cluster, you don't pay for them; you pay only when a pod runs in your cluster (CPU and memory time usage).
So you can't control the number of nodes, the number of node pools, or low-level management like that; it's similar to a serverless product (Google prefers to say "nodeless" cluster).
On the flip side, it's great to already have provisioned resources in your cluster that you don't pay for; you will deploy and scale quicker!
EDIT 1
You can have a look at the pricing. There is a flat fee of $74.40 per month ($0.10/hour) for the control plane, and then you pay for your pods (CPU + memory).
You get 1 free cluster per billing account.
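For reference, an Autopilot cluster is created with a single command, and the nodes that show up afterwards are managed for you. A sketch with the gcloud CLI; the cluster name and region are placeholders:

    # Create an Autopilot ("nodeless") cluster.
    gcloud container clusters create-auto my-autopilot-cluster --region us-central1

    # Nodes appear here, but you are billed for the running pods, not the nodes.
    kubectl get nodes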

Kubernetes sizing consideration

I need some understanding of the sizing considerations for the k8s cluster master components: in order to handle a maximum of 1000 pods, how many masters will do the job, especially in multi-master mode with a load balancer in front to route requests to the API server?
Will 3 master nodes (etcd, apiserver, controller, scheduler) be enough to handle the load, or are more required?
There is no strict answer for this. As per the documentation for Kubernetes v1.15, you can create your cluster in many ways, but you must stay within the limits below (a quick way to check your current counts against them is sketched after the list):
No more than 5000 nodes
No more than 150000 total pods
No more than 300000 total containers
No more than 100 pods per node
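As a quick sanity check against those thresholds, you can count what your cluster currently runs with plain kubectl (a sketch; it only assumes kubectl access to the cluster):

    # Current node count
    kubectl get nodes --no-headers | wc -l
    # Total pods across all namespaces
    kubectl get pods --all-namespaces --no-headers | wc -l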
You did not provide any information about your infrastructure, such as whether you want to deploy locally or in the cloud.
One advantage of the cloud is that kube-up automatically configures the proper VM size for your master depending on the number of nodes in your cluster.
You also cannot forget to provide proper quotas for CPU, memory, etc.
Please check this documentation for more detailed information.
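To illustrate the quota point above, here is a minimal ResourceQuota sketch; the namespace and the numbers are made up and should be sized for your own workloads:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: compute-quota
      namespace: team-a        # placeholder namespace
    spec:
      hard:
        pods: "500"            # cap on pods in this namespace
        requests.cpu: "100"
        requests.memory: 200Gi
        limits.cpu: "200"
        limits.memory: 400Gi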

What's the maximum number of Kubernetes namespaces?

Is there a maximum number of namespaces supported by a Kubernetes cluster? My team is designing a system to run user workloads via K8s and we are considering using one namespace per user to offer logical segmentation in the cluster, but we don't want to hit a ceiling with the number of users who can use our service.
We are using Amazon's EKS managed Kubernetes service and Kubernetes v1.11.
This is quite difficult to answer, as it depends on a lot of factors. Here are some figures from the Kubernetes thresholds document, which was based on a k8s 1.7 cluster: the number of namespaces (ns) is 10000, with a few assumptions.
There are no limits from the code point of view, because a namespace is just a Go type that gets instantiated as a variable.
In addition to the link that @SureshVishnoi posted, the limits will depend on your setup, but some of the factors that can contribute to how your namespaces (and resources in a cluster) scale are:
Physical or VM hardware size where your masters are running
Unfortunately, EKS doesn't let you configure that yet (it's a managed service, after all)
The number of nodes your cluster is handling.
The number of pods in each namespace
The number of overall K8s resources (deployments, secrets, service accounts, etc)
The hardware sizing of the machine backing your etcd database:
Storage: how many resources can you persist.
Raw performance: how much memory and CPU you have.
The network connectivity between your master components and etcd store if they are on different nodes.
If they are on the same nodes then you are bound by the server's memory, CPU and storage.
There is no limit on the number of namespaces. You can create as many as you want; a namespace itself doesn't actually consume cluster resources like CPU, memory, etc.
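If you do go with a namespace per user, provisioning is just object creation, and you can keep watching the count. A small sketch; the user ID and the label are hypothetical:

    # Create and label a namespace for one user.
    kubectl create namespace user-12345
    kubectl label namespace user-12345 tenant=user-12345

    # Keep an eye on how many namespaces the cluster carries.
    kubectl get namespaces --no-headers | wc -l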

GCP: Kubernetes engine allocatable resources

According to the documentation, Kubernetes reserves a significant amount of resources on the nodes in the cluster in order to run itself. Are the numbers in the documentation correct or is Google trying to sell me bigger nodes?
Aside: Taking kube-system pods and other reserved resources into account, am I right in saying it's better resource-wise to rent one machine equipped with 15 GB of RAM instead of two with 7.5 GB of RAM each?
Yes, Kubernetes reserves a significant amount of resources on the nodes, so it's better to take that into account before renting the machines.
You can deploy custom machine types in GCP. For the pricing, you can use this calculator by Google.
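You can see the reservations yourself by comparing a node's capacity with its allocatable resources; a sketch, where the node name is a placeholder:

    # Capacity = what the VM has; Allocatable = what is left for your pods
    # after kube-reserved, system-reserved and the eviction threshold.
    kubectl get node my-node-1 -o jsonpath='{.status.capacity}{"\n"}{.status.allocatable}{"\n"}'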

HPA + Cluster Autoscaler + OPA within Federated Kubernetes cluster on GKE

I'm setting up a federated kubernetes cluster with kubefed on the Google Container Engine (GKE) 1.8.3-gke.0.
It seems like, for the HPA and the cluster autoscaler to work well, I have to use Open Policy Agent as a Kubernetes admission controller because of this:
By default, replicas are spread equally in all the underlying
clusters. For example: if you have 3 registered clusters and you
create a Federated Deployment with spec.replicas = 9, then each
Deployment in the 3 clusters will have spec.replicas=3.
But in my case, the load changes dynamically in every region, and every cluster should have a dynamic number of pods.
I can't find (or just can't see) examples or manuals regarding cases like mine. So, the question is:
What should the policy look like if I have three clusters in my federated context, one for each GKE region:
eu (1000 rps, nodes labeled with "region=eu")
us (200 rps, nodes labeled with "region=us")
asia (100 rps, nodes labeled with "region=asia")
It should be a single deployment that dynamically spreads pods across those three clusters.
One pod should:
serve 100 rps
request 2 vCPUs + 2 GB RAM
be placed alone on a node (with anti-affinity)
How can I configure OPA to make that schema work, if this is possible?
Thanks in advance for any links to corresponding manuals.
What you are trying to do should be achievable through "Federated Horizontal Pod Autoscalers"; one of their main use cases is exactly your scenario.
Quoting from the Requirements & Design Document of the Federated Pod Autoscaler:
Users can schedule replicas of same application, across the federated clusters, using replicaset (or deployment). Users however further might need to let the replicas be scaled independently in each cluster, depending on the current usage metrics of the replicas; including the CPU, memory and application defined custom metrics.
And from the actual documentation, this passage from the conclusion describes the behaviour:
The use of federated HPA is to ensure workload replicas move to the cluster(s) where they are needed most, or in other words where the load is beyond expected threshold. The federated HPA feature achieves this by manipulating the min and max replicas on the HPAs it creates in the federated clusters. It actually relies on the in-cluster HPA controllers to monitor the metrics and update relevant fields [...] The federated HPA controller, on the other hand, monitors only the cluster-specific HPA object fields and updates the min replica and max replica fields of those in cluster HPA objects, which have replicas matching thresholds.
Therefore, if I didn't misunderstand your needs, there is no reason to use a third-party product like Open Policy Agent or to create policies.
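For completeness, the per-cluster HPA objects that the federated controller manages look like ordinary HPAs. A minimal sketch using the stable autoscaling/v1 API; the names and thresholds are made up, and the federated HPA controller would adjust the min/max replicas per cluster:

    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      name: web
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: web
      minReplicas: 1
      maxReplicas: 10                      # federated HPA tunes these per cluster
      targetCPUUtilizationPercentage: 70   # in-cluster HPA acts on the metrics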