I need some guidance on sizing the master components of a Kubernetes cluster. To handle a maximum of 1000 pods, how many masters will do the job, especially in a multi-master setup with a load balancer in front routing requests to the API servers?
Will 3 master nodes (etcd, apiserver, controller-manager, scheduler) be enough to handle the load, or are more required?
There is no strict rule for this. As per the Kubernetes v1.15 documentation, you can create your cluster in many ways, but you must stay within the following limits:
No more than 5000 nodes
No more than 150000 total pods
No more than 300000 total containers
No more than 100 pods per node
You did not provide any information about your infrastructure, i.e. whether you want to deploy locally or in the cloud.
One advantage of the cloud is that kube-up automatically configures the proper VM size for your master depending on the number of nodes in your cluster.
You also cannot forget to provide a proper quota for CPU, memory, etc.
Please check this documentation for more detailed information.
Is there a maximum number of namespaces supported by a Kubernetes cluster? My team is designing a system to run user workloads via K8s and we are considering using one namespace per user to offer logical segmentation in the cluster, but we don't want to hit a ceiling with the number of users who can use our service.
We are using Amazon's EKS managed Kubernetes service and Kubernetes v1.11.
This is quite difficult to answer, as it depends on a lot of factors. According to the Kubernetes scalability thresholds document (kubernetes-thresholds, based on a k8s 1.7 cluster), the supported number of namespaces (ns) is 10000, with a few assumptions.
There are no limits from the code point of view, because a namespace is just a Go type that gets instantiated as a variable.
In addition to the link that @SureshVishnoi posted, the limits will depend on your setup, but some of the factors that determine how your namespaces (and resources in a cluster) scale are:
Physical or VM hardware size where your masters are running
Unfortunately, EKS doesn't provide that yet (it's a managed service after all)
The number of nodes your cluster is handling.
The number of pods in each namespace
The number of overall K8s resources (deployments, secrets, service accounts, etc)
The hardware size of your etcd database.
Storage: how many resources can you persist.
Raw performance: how much memory and CPU you have.
The network connectivity between your master components and etcd store if they are on different nodes.
If they are on the same nodes then you are bound by the server's memory, CPU and storage.
There is no limit on the number of namespaces. You can create as many as you want. A namespace itself doesn't actually consume cluster resources like CPU or memory.
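To illustrate how lightweight a namespace is, a complete manifest is little more than a name (the value here is a hypothetical per-user namespace):
apiVersion: v1
kind: Namespace
metadata:
  name: user-12345   # hypothetical per-user namespace
What consumes capacity is the set of resources you create inside each namespace, plus the etcd entries that track them.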
I am exploring how to set up a Kubernetes cluster and deploy into it using Jenkins via a CI/CD pipeline. While exploring, I found that we don't need to specify the worker node where our pods should be deployed; the Kubernetes master takes care of choosing a free worker node for the deployment. We only need to define in the pod definition how much memory the pod needs.
Here is my confusion: we have already assigned and configured the Kubernetes cluster for deployment, and every node has its own memory, fixed when the AWS EC2 instances were created (I am planning to use AWS EC2 with Ubuntu 16.04 LTS).
So why do we need to define memory in the pod again? Is that the proper way to deploy pods?
I have only just started in the CI/CD pipeline world.
Specifying memory and CPU in the pod specification is completely optional. Still, there are a couple of reasons to specify memory and CPU at the pod level:
As explained here, if you don't specify CPU/memory, the pod/container can consume all resources on its node and potentially affect other pods/containers running on that node.
Each application should specify the memory and CPU it needs to run. Kubernetes uses this information when scheduling the pod onto a node in the cluster where enough resources are available, which leads to better scheduling decisions (see the pod spec sketch after this list).
It enables the Horizontal Pod Autoscaler (HPA) to scale the pods when resource consumption goes beyond a certain threshold. The details are explained in this doc. Unless a memory/CPU request is specified, the HPA cannot calculate that the pod is running at 80% of that metric and should be scaled to two replicas.
You can also set a default at the namespace level and then override it only for specific applications; details here (a LimitRange sketch is also shown below).
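For illustration, a minimal pod spec with resource requests and limits might look like this (the name, image and values are hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: demo-app           # hypothetical name
spec:
  containers:
  - name: demo-app
    image: nginx:1.17      # illustrative image
    resources:
      requests:            # used by the scheduler to pick a node, and by the HPA
        memory: "256Mi"
        cpu: "250m"
      limits:              # hard cap enforced on the node
        memory: "512Mi"
        cpu: "500m"
And the namespace-level default can be set with a LimitRange, sketched here with illustrative values:
apiVersion: v1
kind: LimitRange
metadata:
  name: default-resources  # hypothetical name
  namespace: dev           # hypothetical namespace
spec:
  limits:
  - type: Container
    defaultRequest:        # applied when a container specifies no request
      cpu: 100m
      memory: 128Mi
    default:               # applied when a container specifies no limit
      cpu: 500m
      memory: 256Mi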
Does Kubernetes or Helm support shutting down pods that have been idle for more than a given threshold time?
This would be very useful in a development environment, to free up room for other processes and save cost.
Kubernetes has the ability to autoscale your application in a cluster: it can start additional pods when the load increases and terminate excess pods when the load decreases.
It is possible to downscale the application to zero pods, but, in this case, you will have a delay serving the first request while the pod is starting.
This functionality relies on performance metrics provided by the Heapster application, which must be running in the cluster. In practical terms, this means that autoscaling doesn't happen instantly, because it takes some time for the performance metrics to reach the configured threshold.
The Kubernetes feature mentioned above is called HPA (Horizontal Pod Autoscaler) and is described in this document.
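As a rough sketch, an HPA targeting a hypothetical deployment could look like this (names and thresholds are illustrative):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa          # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # hypothetical deployment to scale
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80   # scale out above 80% of the CPU request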
In case you are running your cluster on GCP or GKE, you are able to go further and automatically start additional nodes for your cluster when you need more computing capacity and shut down nodes when they are not running application pods anymore.
More information about this functionality can be found by following the link.
If you decide to give it a try, you might find this information useful:
Creating a Container cluster in GKE
70% cheaper Kubernetes cluster on AWS
How to build a Kubernetes Horizontal Pod Autoscaler using custom metrics
I'm setting up a federated Kubernetes cluster with kubefed on Google Container Engine (GKE) 1.8.3-gke.0.
It seems that for proper HPA and cluster autoscaling I have to use Open Policy Agent as a Kubernetes admission controller, because of this:
By default, replicas are spread equally in all the underlying clusters. For example: if you have 3 registered clusters and you create a Federated Deployment with spec.replicas = 9, then each Deployment in the 3 clusters will have spec.replicas=3.
But in my case, the load changes dynamically in every region, and every cluster should have a dynamic number of pods.
I can't find (or just can't see) examples or manuals regarding cases like mine. So, the question is:
What should such a policy look like, if I have three clusters in my federated context, one for each GKE region:
eu (1000 rps, nodes labeled with "region=eu")
us (200 rps, nodes labeled with "region=us")
asia (100 rps, nodes labeled with "region=asia")
It should be a single deployment that dynamically spreads pods across those three clusters.
One pod should:
serve 100 rps
request 2 vCPUs + 2Gb RAM
be placed alone on a node (via anti-affinity; see the sketch after this list)
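A hedged sketch of a pod template meeting those requirements (the names, labels and image are hypothetical):
apiVersion: v1
kind: Pod
metadata:
  name: web-pod              # hypothetical name
  labels:
    app: web                 # label referenced by the anti-affinity rule
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web
        topologyKey: kubernetes.io/hostname   # at most one such pod per node
  containers:
  - name: web
    image: my-app:latest     # illustrative image
    resources:
      requests:
        cpu: "2"             # 2 vCPUs, as stated above
        memory: 2Gi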
How can I configure OPA to make that scheme work, if it is possible at all?
Thanks in advance for any links to corresponding manuals.
What you are trying to do should be achievable through "Federated Horizontal Pod Autoscalers"; one of their main use cases is exactly your scenario:
Quoting from the Requirements & Design Document of the Federated Pod Autoscaler:
Users can schedule replicas of same application, across the federated clusters, using replicaset (or deployment). Users however further might need to let the replicas be scaled independently in each cluster, depending on the current usage metrics of the replicas; including the CPU, memory and application defined custom metrics.
And from the actual documentation, this passage from the conclusion describes the behaviour:
The use of federated HPA is to ensure workload replicas move to the cluster(s) where they are needed most, or in other words where the load is beyond expected threshold. The federated HPA feature achieves this by manipulating the min and max replicas on the HPAs it creates in the federated clusters. It actually relies on the in-cluster HPA controllers to monitor the metrics and update relevant fields [...] The federated HPA controller, on the other hand, monitors only the cluster-specific HPA object fields and updates the min replica and max replica fields of those in cluster HPA objects, which have replicas matching thresholds.
Therefore, if I haven't misunderstood your needs, there is no reason to use a third-party product like Open Policy Agent or to create policies.
I am working on writing some automation to setup a Kubernetes Cluster. The automation deploys the Kubernetes Master and once that is setup, it starts adding Minions in parallel. What is the most efficient way to determine programmatically if a Minion has joined the Kubernetes Cluster?
Currently I am querying the REST endpoint /api/v1/nodes exposed by the Kubernetes API server. My concern is that as the size of the cluster increases, querying the API server to pull details about all the minions may become compute- and I/O-intensive for it. I also did not find paging support in this API.
Thanks,
Sufian
You should look into kube-register: https://github.com/kelseyhightower/kube-register. It uses fleet to register minions as they spin up. You should probably run it as a systemd unit so it starts on boot. Then, for status, let the API server do its thing with status polling. Most clusters probably wouldn't have more than 9 main nodes (you can have plenty of worker nodes; I recommend looking at CoreOS's etcd docs to learn about clustering) due to etcd's latency constraints in its RAFT quorum, so I wouldn't worry too much about the size of the cluster.
This is a mix between an answer and a comment on the other answer (I cannot comment yet, sorry...).
As far as I know, using the REST endpoint /api/v1/nodes is the best way to check whether nodes are registered. How often do you call that endpoint? I wouldn't expect compute or I/O problems too quickly.
kube-register was a useful tool for registering new CoreOS nodes with the Kubernetes cluster, but it is no longer needed, since the kubelet now registers itself.
I think there is some misunderstanding in the other answer. I think you are talking about 2 different clusters:
the etcd cluster: CoreOS recommends running 3, 5 or 7 etcd instances in a cluster (https://coreos.com/etcd/docs/latest/admin_guide.html#cluster-management). On the remaining nodes you can configure etcd to run as a proxy (https://coreos.com/etcd/docs/latest/proxy.html). This should solve your etcd connection problem.
the kubernetes cluster: here you typically run 1 master and x "worker" nodes, just as you already do.