I want to use the Kubernetes namespace for each user of my application. So potentially I'll need to create thousands of namespaces each with kubernetes resources in them. I want to make sure this is scalable, so I want to ensure that I can have millions of namespaces on a Kubernetes Cluster before I use this construct on a per user basis.
I'm building a web hosting application. So I'm giving resources to each user, but I want them separated by namespaces.
Are there any limitations to the number of Kubernetes namespaces you can create?
"In majority of cases, thresholds are NOT hard limits - crossing the limit results in degraded performance and doesn't mean cluster immediately fails over.
Many of the thresholds (for cluster scope) are given for the largest possible cluster. For smaller clusters, the limits are proportionally lower.
"
#Namespaces = 10000 scope=cluster
source with more data
kube Talk explaining how the data is computed
You'll usually run into limitations with resources and etcd long before you hit a namespace limit.
For scaling, you're probably going to want to scale your clusters which most companies treat as cattle rather than create a giant cluster which will be a Pet, which is not a scenario you want to be dealing with.
Related
I am fairly new to Kubernetes, and I think I understand the basics of provisioning nodes and setting memory limits for pods. Here's the problem I have: my application can require dramatically different amounts of memory, depending on the input (and there is no fool-proof way to predict it). Some jobs require 50MB, some require 50GB. How can I set up my K8s deployment to handle this situation?
I have one strategy that I'd like to try out, but I don't know how to do it: start with small instances (nodes with not a lot of memory), and if the job fails with out-of-memory, then automatically send it to increasingly bigger instances until it succeeds. How hard would this be to implement in Kubernetes?
Thanks!
Natively K8S supports horizontal autoscalling i.e. automatically deplying more replicas of a deployment basing on chosen metric like CPU usage, memory usage etc.: Horizontal Pod Autoscaling
What you are describing here though is vertical scaling. It is not supported out of the box, but there is a subproject that seems to be able to fulfill your requirements: vertical-pod-autoscaler
I have a bunch of pods in a cluster that is almost requesting all (7.35/8) available CPU resources on a node:
even though their actual total usage is almost nothing (0.34/8).
The pod that is currently requesting the most only requests 210m which I guess is not an outrageous amount - also I would like to enforce some sensible minimum request size for all pods in the cluster. Of course that will accumulate when there are lots of pods.
It seems I could easily scale down the request by a factor of 10 and leave the limits where they are to begin with.
But is there something else that I should look into instead before doing that - reducing replica count etc.?
Also it looks a bit strange that the pods are not more evenly distributed between the nodes.
Your request values seems overestimated.
You need time and metrics to find the right request/limit for your workload.
Keep in mind that if you change those values, your pods will restart.
Also, It's normal that you can find some unbalance nodes on your cluster. Kubernetes will never remove a pod if you don't ask.
For example, if your create a cluster with 3 nodes, fill those 3 nodes with pods and then add another 3 nodes. The new nodes will stay empty.
You can setup some HorizontalPodAutoScaler on your cluster to adapt your number of pod to your workload.
Doing that, your workload will spread among nodes and with a correct balance. (if you use the default Scheduling Policy
I suggest following:
Resource Allocation: Based on history value set your request to meaningful value with buffer. Also to have guaranteed pod resource allocation it may be a good idea to set request and limit as same value. But that means you pod cannot burst for new resource. One more thing to note is scheduling only happens based on requested value, so if node has no more resource left, then pod will be killed and rescheduled if you request is trying to burst to limit.
Resource quotas: Check Kubernetes Resource Quotas to have sensible namespace level quotas to control overly provisioned resources by developers
Affinity/AntiAffinity: Check concept of Anti-affinity to have your replicas or different pods scheduled across your cluster. You can ensure for eg., that one host or Avalability zone etc can have only one replica of your pod (helps in HA), spread different pods to different nodes (layer scheduling etc) - Check this video
There are good answers already but I would like to add some more info.
It is very important to have a good strategy when calculating how much resources you would need for each container.
Optimally, your pods should be using exactly the amount of resources you requested but that's almost impossible to achieve. If the usage is lower than your request, you are wasting resources. If it's higher, you are risking performance issues. Consider a 25% margin up and down the request value as a good starting point. Regarding limits, achieving a good setting would depend on trying and adjusting. There is no optimal value that would fit everyone as it depends on many factors related to the application itself, the demand model, the tolerance to errors etc.
Kubernetes best practices: Resource requests and limits is a very good guide explaining the idea behind these mechanisms with a detailed explanation and examples.
Also, Managing Resources for Containers will provide you with the official docs regarding:
Requests and limits
Resource types
Resource requests and limits of Pod and Container
Resource units in Kubernetes
How Pods with resource requests are scheduled
How Pods with resource limits are run, etc
Just in case you'll need a reference.
We are migrating our infrastructure to kubernetes. I am talking about a part of it, that contains of an api for let's say customers (we have this case for many other resources). Let's consider we have a billion customers, each with some data etc. and we decided they deserve a specialized api just for them, with its own db, server, domain, etc.
Kubernetes has the notion of nodes and pods. So we said "ok, we dedicate node X with all its resources to this particular api". And now the question:
Why would I used multiple pods each of them containing the same nginx + fpm and code, and limit it to a part of traffic and resources, and add an internal lb, autoscale, etc., instead of having a single pod, with all node resources?
Since each pod adds a bit of extra memory consumption this seems like a waste to me. The only upside being the fact that if something fails only part of it goes down (so maybe 2 pods would be optimal in this case?).
Obviously, would scale the nodes when needed.
Note: I'm not talking about a case where you have multiple pods with different stuff, I'm talking about that particular case.
Note 2: The db already is outside this node, on it's own pod.
Google fails me on this topic. I find hundreds of post with "how to configure things, but 0 with WHY?".
Why would I used multiple pods each of them containing the same nginx + fpm and code, and limit it to a part of traffic and resources, and add an internal lb, autoscale, etc., instead of having a single pod, with all node resources?
Since each pod adds a bit of extra memory consumption this seems like a waste to me. The only upside being the fact that if something fails only part of it goes down (so maybe 2 pods would be optimal in this case?).
This comes down to the question, should I scale my app vertically (larger instance) or horizontally (more instances).
First, try to avoid using only a single instance since you probably want more redundancy if you e.g. upgrade a Node. A single instance may be a good option if you are OK with some downtime sometimes.
Scale app vertically
To scale an app vertically, by changing the instance to a bigger, is a viable alternative that sometimes is a good option. Especially when the app can not be scaled horizontally, e.g. an app that use leader election pattern - typically listen to a specific event and react. There is however a limit by how much you can scale an app vertically.
Scale app horizontally
For a stateless app, it is usually much easier and cheaper to scale an app horizontally by adding more instances. You typically want more than one instance anyway, since you want to tolerate that a Node goes down for maintenance. This is also possible to do for a large scale app to very many instances - and the cost scales linearly. However, not every app can scale horizontally, e.g. a distributed database (replicated) can typically not scale well horizontally unless you shard the data. You can even use Horizontal Pod Autoscaler to automatically adjust the number of instances depending on how busy the app is.
Trade offs
As described above, horizontal scaling is usually easier and preferred. But there are trade offs - you would probably not want to run thousands of instances when you have low traffic - an instance has some resource overhead costs, also in maintainability. For availability you should run at least 2 pods and make sure that they does not run on the same node, if you have a regional cluster, you want to make sure that they does not run on the same Availability Zone - for availability reasons. Consider 2-3 pods when your traffic is low, and use Horizontal Pod Autoscaler to automatically scale up to more instance when you need. In the end, this is a number game - resources cost money - but you want to provide a good service for your customers as well.
In my cluster there are 30 VMs which are located in 3 different physical servers. I want to deploy different replicas of each workload on different physical server.
I know I can use podAntiAffinity to deploy replicas on different VMs but I cant find any way to guarantee spread replication on different physical server.
I want to know is there any way to solve this challenge?
I believe you gave the answer ;)
I went to the Kubernetes Patterns book (PDF available for free in here) to see if there was something related to that over there, and found exactly that:
To express how Pods should be spread to achieve high availability, or be packed and co-located together to improve latency, Pod affinity and antiaffinity can be used.
Node affinity works at node granularity, but Pod affinity is not limited to nodes and
can express rules at multiple topology levels. Using the topologyKey field, and the
matching labels, it is possible to enforce more fine-grained rules, which combine
rules on domains like node, rack, cloud provider zone, and region [...]
I really like the k8s docs as well, they are super complete and full of examples, so maybe you can get some ideas from here. I think the main idea will be to create your own affinity/antiaffinity rule.
----------------------------------- EDIT -----------------------------------
There is a new feature within k8s version 1.18 that may be a better solution.
It's called: Pod Topology Spread Constraints:
You can use topology spread constraints to control how Pods are spread across your cluster among failure-domains such as regions, zones, nodes, and other user-defined topology domains. This can help to achieve high availability as well as efficient resource utilization.
I would like to define a policy to dynamically assigns resource limits to pods and containers. For example, if there are 4 number of pods scheduled in a specific node, and the memory capacity is 100mi, each pod to be assigned with 25mi memory limit. In other words, the fair share of the node capacity.
So, is it necessary to change the codes in scheduler.go or I need to change other objects as well?
I do agree with Arslanbekov answer, it's contrary to the ideology of scalability used by kubernetes.
The principle is that you define what resources is needed by your application and the cluster do all it can to give this resource to the pod, scalling the resources (pod, nodes) depending on the global consumption of all apps.
What you are asking is the reverse, give resources to the pod depending on the node resources, this way could prove very difficult to allow automatic scallability of the nodes as it would be the resource aim to attain (I may be confusing in my explanation but that shows how difficult it could be).
One way to do what you want would be to size all your pod to the same size to use 80% of the nodes but this would prove wrong if an app need more resources.
I think this is contrary to the ideology of the kubernetes. In this approach, the new application will not be able to get to the node.
At each point in time for the scheduler will be the utilization of 100% each node.