I have encountered a scalability problem while trying out a Kubernetes cluster. To simplify the topology on my test machine, the NodePort type is used to expose each service externally. The bare-metal machine hosting the node and master is RHEL 7 with 24 CPUs and 32 GB of RAM; I don't yet have a dedicated load balancer or cloud-provider-like infrastructure. A snippet of the service definition looks like the one below:
"spec": {
"ports": [{
"port": 10443,
"targetPort": 10443,
"protocol": "TCP",
"nodePort": 30443
} ],
"type": "NodePort",
This way, the application can be accessed externally via https://[node_machine]:30443/[a_service]
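For reference, the complete Service object this snippet comes from looks roughly like the following (the service name and the selector label are placeholders for illustration):

{
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {
        "name": "a-service"
    },
    "spec": {
        "ports": [{
            "port": 10443,
            "targetPort": 10443,
            "protocol": "TCP",
            "nodePort": 30443
        }],
        "selector": {
            "app": "a-app"
        },
        "type": "NodePort"
    }
}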
Each such service is backed by only one Pod. Ideally I would want to have several services deployed on the same node (but using different NodePorts) and running concurrently.
Things were working well until it became evident that, for a similar workload, increasing the number of deployed services (and therefore backend pods as well) makes the applications degrade in performance. Surprisingly, when breaking down the service loading time, I noticed a dramatic degradation in 'Connection Time', which seems to indicate there is a slowdown somewhere in the 'network' layer. Please note that the load isn't high enough to drive much of the CPU on the node yet. I read about the shortcomings in the docs, but I'm not sure whether what I hit is exactly the kube-proxy/Service limitation described there.
The questions are:
Is there any suggestion on how to make this more scalable, i.e. to support more services/Pods without sacrificing the applications' performance? The NodePort type is the easiest way to set up the 'public' address for our services, but is there any limitation on scalability or performance if all services and Pods are set up this way?
Would there be any difference if we change the type to LoadBalancer?
"type": "LoadBalancer"
Furthermore, is there a benefit to having a dedicated load balancer or reverse proxy to improve scalability, e.g. HAProxy or the like, that routes traffic from outside to the backend Pods (or Services)? I noticed there's some work done for Nginx (darkgaro/kubernetes-reverseproxy), but unfortunately the doc seems incomplete and there's no concrete example. In some other threads folks talked about Vulcan; is it the recommended LB tool for Kubernetes?
Your recommendation and help are highly appreciated!
Hello, I'm fairly new to Kubernetes too, but I have similar questions and concerns. I'll try to answer some of them or redirect you to the relevant sections of the user guide.
If you are deploying Kubernetes on a non-cloud-enabled provider, for example Vagrant or a local setup, then some features are not currently offered or automated by the platform for you.
One of those things is the 'LoadBalancer' type of Service. The automatic provisioning and assignment of a PUBLIC IP to the service (acting as an LB) currently happens only on platforms like Google Container Engine.
See issue here and here.
The official documentation states
On cloud providers which support external load balancers, setting the type field to "LoadBalancer" will provision a load balancer for your Service.
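As an illustration only (the name and selector below are placeholders), the entire difference on such a provider is the type field; the cloud then allocates an external IP and records it in the Service's status:

apiVersion: v1
kind: Service
metadata:
  name: a-service          # placeholder name
spec:
  type: LoadBalancer
  selector:
    app: a-app             # placeholder label
  ports:
  - port: 443
    targetPort: 10443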
Currently an alternative is being developed and documented, see here using HAProxy.
Kubernetes may eventually support this kind of feature on all the platforms it can be deployed and operated on, so always check the updated feature list.
What you are referring to as performance degradation is most probably due to the way the PublicIP (NodePort from version 1.0 onwards) feature works. With the NodePort service type, Kubernetes assigns a port on ALL nodes of the cluster for this kind of service, and kube-proxy then intercepts calls to these ports and forwards them to the actual service, etc.
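In other words (node hostnames below are placeholders), the same NodePort answers on every node of the cluster and is proxied to the single backing Pod wherever it happens to run:

curl -k https://node1:30443/a_service
curl -k https://node2:30443/a_service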
An example of using HAProxy to solve this very same problem can be found here.
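Broadly, such a setup boils down to an HAProxy instance in front of the cluster balancing across the NodePort exposed on every node. A minimal sketch, with placeholder hostnames and TLS passed through as plain TCP:

# spread external traffic across the NodePort (30443) on every cluster node
frontend k8s_service_fe
    bind *:443
    mode tcp
    default_backend k8s_service_be

backend k8s_service_be
    mode tcp
    balance roundrobin
    server node1 node1.example.com:30443 check
    server node2 node2.example.com:30443 check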
Hope that helped a bit.
I'm facing the same problems. It seems that the internal kube-proxy is not intended to be an external load balancer. More specifically, we wanted to set up timeouts on kube-proxy, do retries, etc.
I've found this article, which describes similar issues. The author recommends looking at Vulcan, as it internally uses etcd, and the direction of this project is probably to provide a fully featured LB for k8s in the future.
Related
I have read this question which is very similar to what I am asking, but still wanted to write a new question since the accepted answer there seems very incomplete and also potentially wrong.
Basically, it seems like there is some missing or contradictory information regarding built-in load balancing for regular Kubernetes Services (I am not talking about LoadBalancer services). For example, the official Cilium documentation states that "Kubernetes doesn't come with an implementation of Load Balancing". In addition, I couldn't find any information in the official Kubernetes documentation about load balancing for internal services (there was only a section discussing this under Ingresses).
So my question is - how does load balancing or distribution of requests work when we make a request from within a Kubernetes cluster to the internal address of a Kubernetes service?
I know there's a Kubernetes proxy on each node that creates the DNS records for such services, but what about services that span multiple pods and nodes? There's got to be some form of request distribution or load-balancing, or else this just wouldn't work at all, no?
A standard Kubernetes Service provides basic load-balancing. Even for a ClusterIP-type Service, the Service has its own cluster-internal IP address and DNS name, and forwards requests to the collection of Pods specified by its selector:.
In normal use, it is enough to create a multiple-replica Deployment, set a Service to point at its Pods, and send requests only to the Service. All of the replicas will receive requests.
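As a minimal sketch of that pattern (the names and image below are hypothetical), requests sent to the Service's DNS name are spread across the three replicas:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest      # placeholder image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080

Any Pod in the cluster can then call http://my-app (or http://my-app.<namespace>.svc.cluster.local), and the requests are distributed across the replicas.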
The documentation discusses the implementation of internal load balancing in more detail than an application developer normally needs. Unless your cluster administrator has done extra setup, you'll probably get round-robin request routing – the first Pod will receive the first request, the second Pod the second, and so on.
... the official Cilium documentation states ...
This is almost certainly a statement about external load balancing. From a cluster administrator's (not a programmer's) point of view, a "plain" Kubernetes installation doesn't include an external load-balancer implementation, and a LoadBalancer-type Service behaves identically to a NodePort-type Service.
There are obvious deficiencies to round-robin scheduling, most notably if you do wind up with individual network requests that take a long time and a lot of resources to service. As an application developer, the best way to address this is to make these very-long-running requests run asynchronously: return something like an HTTP 201 Created status with a unique per-job URL, and do the actual work in a separate queue-backed worker.
I'm learning Kubernetes at the moment and just had a question I'd like to have clarified regarding exposing Services and making Pods accessible to the public internet.
Let's say I have a Java Spring Boot application with an embedded Tomcat server using JSP, a MySQL Pod, and Memcached (all on separate Pods), and I'd like to expose them as Services, making them publicly available.
I'm confused as to which type of Service each of these Pods would need, and also why. I'm aware of Ingress and of using a single Load Balancer to route requests to Services instead of multiple Load Balancers, but the actual Service type is what I'm finding hard to understand, based on the work each Pod needs to do.
Answering which Service type do you need: it's always ClusterIP.
LoadBalancer and NodePort are reserved for very specific use cases: one requires integration with a cloud provider (to provision load balancers), the other requires your Kubernetes nodes to be exposed to external clients, allowing connections to non-default ports.
When you don't know or you're not sure: just assume you can't use NodePort or LoadBalancers. As a cluster end-user, developer or Kubernetes beginner: ClusterIP is the only Service type you need.
To expose your application to clients outside of your SDN, you want to use Ingresses. Again, while LoadBalancers or NodePorts might be suitable technical solutions on paper, they usually aren't in practice; and when they are, there are security aspects to consider, better dealt with by your cluster administrator.
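As a rough sketch of that layout (hostname, Service name, and port are placeholders, and an Ingress controller is assumed to be installed by the cluster administrator): the Spring Boot app sits behind a ClusterIP Service and is exposed through an Ingress, while MySQL and Memcached keep their own ClusterIP Services and are never exposed outside the cluster:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
  - host: web.example.com        # placeholder hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web            # ClusterIP Service in front of the Spring Boot Pods
            port:
              number: 8080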
I've dockerized a legacy desktop app. This app does resource-intensive graphical rendering from a command line interface.
I'd like to offer this rendering as a service in a "compute farm", and I wondered if Kubernetes could be used for this purpose.
If so, how in Kubernetes would I ensure that each pod only serves one request at a time (this app is resource-intensive and likely not thread-safe)? Should I write a single-threaded wrapper/invoker app in the container and thus serialize requests? Would K8s then be smart enough to route subsequent requests to idle pods rather than letting them pile up on an overloaded pod?
Interesting question.
The built-in default Service object, along with kube-proxy, does route requests to different pods, but only does so in a round-robin fashion, which does not fit your use case.
Your use case would require changes to the kube-proxy setup during cluster setup. This approach is tedious and requires you to run your own cluster (it is not supported by managed cloud services), as described here.
Your best bet would be to set up a service mesh like Istio, which provides these features with little configuration, along with a lot of other useful functionality.
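For example (the Service name and namespace are placeholders, and the exact field values vary between Istio versions), a DestinationRule can switch the routing policy from the default to least-connections, which steers new requests toward the pod with the fewest outstanding ones; keeping it to strictly one request per pod still requires limiting concurrency inside the pod itself:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: renderer
spec:
  host: renderer.default.svc.cluster.local   # placeholder Service
  trafficPolicy:
    loadBalancer:
      simple: LEAST_CONN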
See if this helps.
I have 4 microservices running on my laptop, listening on various ports. Can I use Istio to create a service mesh on my laptop so the services can communicate with each other through Istio? All the results I find on Google about Istio involve Kubernetes, but I want to run Istio without Kubernetes. Thanks for reading.
In practice, not really as of this writing, since pretty much all the Istio runbooks and guides are available for Kubernetes.
In theory, yes. Istio components are designed to be 'platform independent'. Quote from the docs:
While Istio is platform independent, using it with Kubernetes (or infrastructure) network policies, the benefits are even greater, including the ability to secure pod-to-pod or service-to-service communication at the network and application layers.
But unless you know the details of each of the components (Envoy, Mixer, Pilot, Citadel, and Galley) really well, and you are willing to spend a lot of time on it, it is not practically feasible to get it running outside of Kubernetes.
If you want to use something less tied to Kubernetes, you can take a look at Consul; although it doesn't have all the functionality Istio has, it overlaps with some of its features.
I did some googling and found that Istio claims to support apps running outside k8s, for example in VMs, but I have never tried it.
https://istio.io/latest/news/releases/0.x/announcing-0.2/#cross-environment-support
https://jimmysong.io/blog/istio-vm-odysssey/
It's been a long time since I came here, and I hope you're doing fine :)
For now, I have the pleasure of working with Kubernetes! So let's start! :)
[THE EXISTING]
I have an operational Kubernetes cluster that I work with every day. It runs several applications, one of which is of particular interest to us: the web management interface.
My cluster currently has one master and four nodes.
For my web application, each pod contains 3 containers: web / mongo / filebeat, and for technical reasons we decided to assign a maximum of 5 users to each web pod.
[WHAT I WANT]
I want to deploy a web pod on each node (web0, web1, web2, web3), which I can already do, and have each session (1 session = 1 user) distributed as follows:
For now, all HTTP requests are processed by web0.
[QUESTIONS]
Am I forced to go through an external load balancer (HAProxy)?
Can I use an internal load balancer by configuring a Service?
Does anyone have experience on the implementation described above?
I thank in advance those who can help me in this process :)
This generally depends on how and where you've deployed your Kubernetes infrastructure, but you can do this natively with a few options.
Firstly, you'll need to scale your web deployment. This is very simple to do:
kubectl scale --current-replicas=2 --replicas=3 deployment/web
If you've deployed into a cloud provider (such as AWS using kops, or GKE), you can use a Service; just specify the type as LoadBalancer. The Service will spread your users' sessions across the pods.
Another option is to use an Ingress. In order to do this, you'll need to use an Ingress Controller, such as the nginx-ingress-controller which is the most featureful and widely deployed.
Both of these options will automatically load-balance your incoming application sessions, but they may not necessarily do it in the order you've described in your image; it'll be random across the available pods of your web deployment.