I am confusing about kong api gateway and Haproxy/F5 load balancer.
Will kong api gateway handle the load balancing also!
My scenario is, if i have 5 micro-services on kong. 2nd service consumes more load compare with others, the kong will manages load or not.
If not means what needs to be do?
Currently, Kong will not handle the load balancing. Even more, you will have to load balance Kong yourself as well.
There are plans to enable multiple upstream targets in one of the next versions, but right now you will have to load balance your microservice end points in addition to running them past Kong.
Kong is able to secure your microservices and do other nifty things, such as rate limiting and applying OAuth2.0 to them, but load balancing is not one of those things.
Related
Recently I have built several microservices within a k8s cluster with Nginx ingress controller and they are working normally.
When dealing with communications among microservices, I attempted gRPC and it worked. Then I discover when microservice A -> gRPC -> microservice B, all requests were only occurred at 1 pod of microservice B (e.g. total 10 pods available for microservice B). In order to load balance the requests to all pods of microservice B, I attempted linkerd and it worked. However, I realized gRPC sometimes will produce internal error (e.g. 1 error out of 100 requests), making me changed to using the k8s DNS way (e.g. my-svc.my-namespace.svc.cluster-domain.example). Then, the requests never fail. I started to hold up gRPC and linkerd.
Later, I was interested in istio. I successfully deployed it to the cluster. However, I observe it always creates its own load balancer, which is not so matching with the existing Nginx ingress controller.
Furthermore, I attempted prometheus and grafana, as well as k9s. These tools let me have better understanding on cpu and memory usage of the pods.
Here I have several questions that I wish to understand:-
If I need to monitor cluster resources, we have prometheus, grafana and k9s. Are they doing the same monitoring role as service mesh (e.g. linkerd, istio)?
if k8s DNS can already achieve load balancing, do we still need service mesh?
if using k8s without service mesh, is it lag behind the normal practice?
Actually I also want to use service mesh every day.
The simple answer is
Service mesh for a kubernetes server is not necessary
Now to answer your questions
If I need to monitor cluster resources, we have prometheus, grafana and k9s. Are they doing the same monitoring role as service mesh (e.g. linkerd, istio)?
K9s is a cli tool that is just a replacement to the kubectl cli tool. It is not a monitor tool. Prometheus and grafana are monitoring tools that will need use the data provided by applications(pods) and builds the time-series data which can be visualized as charts, graphs etc. However the applications have to provide the monitoring data to Prometheus. Service meshes may use a sidecar and provide some default metrics useful for monitoring such as number of requests handled in a second. Your application doesn't need to have any knowledge or implementation of the metrics. Thus service meshes are optional and it offloads the common things such as monitoring or authorization.
if k8s DNS can already achieve load balancing, do we still need service mesh?
Service meshes are not needed for load balancing. When you have multiple services running in the cluster and want to use a single entry point for all your services to simplify maintenance and to save cost, Ingress controllers such as Nginx, Traefik, HAProxy are used. Also, service meshes such as Istio comes with its own ingress controller.
if using k8s without service mesh, is it lag behind the normal practice?
No, there can be clusters that don't have service meshes today and still use Kubernetes.
In the future, Kubernetes may bring some functionalities from service meshes.
Service mesh is not a silver bullet and it doesn't fit into every use case. Service mesh will not do everything for you, it also have bugs and limited features.
You can use Prometheus without Istio and have a very nice app monitoring. Service mesh can simplify some monitoring tasks for you, but it doesn't mean you cannot do it yourself.
Please don't think of DNS as load balancing solution. Kubernetes have Services and Ingresses to do load balancing. Nginx Ingress today is very powerful and have many advanced features.
It heavily depends on your use case.
We have all our services running with Kubernetes. We want to know what is the best practice to deploy our own API gateway, we thought of 2 solutions:
Deploy API gateways outside the Kubernetes cluster(s), i.e. with Kong. This means the clusters' ingress will connect to the external gateways. The gateway is either VM or physical machines, and you can scale by replicating many gateway instances
Deploy gateway from within Kubernetes (then maybe connect to external L4 load balancer), i.e. Ambassador. However, with this approach, each cluster can only have 1 gateway. The only way to prevent fault-tolerance is to actually replicate the entire K8s cluster
What is the typical setup and what is better?
The typical setup for an api gateway in kubernetes is either using a load balancer service, if the cloud provider that you are using support dynamic provision of load balancers (all major cloud vendors like gcp, aws or azure support it), or even more common to use an ingress controller.
Both of these options can scale horizontally so you have fault tolerance, in fact there is already a solution for ingress controller using kong
https://github.com/Kong/kubernetes-ingress-controller
We have a gRPC application deployed in a cluster (v 1.17.6) with Istio (v 1.6.2) setup. The cluster has istio-ingressgateway setup as the edge LB, with SSL termination. The istio-ingressgateway is fronted by an AWS ELB (classic LB) in passthrough mode. This setup is fully functional and the traffic flows as intended, in general. So the setup looks like:
ELB => istio-ingressgateway => virtual service => app service => [(envoy)pods]
We are running load tests on this setup using GHZ (ghz.sh), running external to the application cluster. From the tests we’ve run, we have observed that each of the app container seems to get about 300 RPS routed to it, no matter the configuration of the GHZ test. For reference, we have tried various combos of --concurrency and --connection settings for the tests. This ~300 RPS is lower than what we expect from the app and, hence, requires a lot more PODs to provide the required throughput.
We are really interested in understanding the details of the physical connection (gRPC/HTTP2) setup in this case, all the way from the ELB to the app/envoy and the details of the load balancing being done. Of particular interest is the the case when the same client, GHZ e.g., opens up multiple connections (specified via the --connection option). We have looked at Kiali and it doesn’t give us the appropriate visibility.
Questions:
How can we get visibility into the physical connections being setup from the ingress gateway to the pod/proxy?
How is the “per request gRPC” load balancing happening?
What options might exist to optimize the various components involved in this setup?
Thanks.
1.How can we get visibility into the physical connections being setup from the ingress gateway to the pod/proxy?
If Kiali doesn't show what exactly you need, maybe you could try with Jaeger?
Jaeger is an open source end to end distributed tracing system, allowing users to monitor and troubleshoot transactions in complex distributed systems.
There is istio documentation about Jaeger.
Additionally Prometheus and Grafana might be helpful here, take a look here.
2.How is the “per request gRPC” load balancing happening?
As mentioned here
By default, the Envoy proxies distribute traffic across each service’s load balancing pool using a round-robin model, where requests are sent to each pool member in turn, returning to the top of the pool once each service instance has received a request.
If you wan't to change the default round-robin model you can use Destination Rule for that. Destination rules let you customize Envoy’s traffic policies when calling the entire destination service or a particular service subset, such as your preferred load balancing model, TLS security mode, or circuit breaker settings.
There is istio documentation about that.
More about load balancing in envoy here.
3.What options might exist to optimize the various components involved in this setup?
I'm not sure if there is anything to optimize in istio components, maybe some custom configuration in Destination Rule?
Additional Resources:
itnext.io
medium.com
programmaticponderings.com
When using on premise (running on my own) api gateway like Kong, should it be run in a node as 1 withing the main kubernetes cluster or should it be ran as separate kubernetes cluster?
Unless you have an amazing reason to do otherwise: run Kong within the cluster. Pretty much the last thing you'd want is for all API requests to bomb because of a severed connection between cluster-A and cluster-B, not to mention the horrible latency as requests hop from one layer of abstraction to another.
Taking a page from the nginx Ingress controller, you also have the opportunity to use the Endpoint API to bypass the iptables-based Service machinery, saving even more latency and system resources -- a trick that would be almost impossible with a multi-cluster configuration.
It is my recollection there are even Kong-based Ingress controllers, which could save you even more heartache if their featureset and your needs align
Recently I started working with Microservices, I wrote a library for service discovery using Redis to store every service's url and port number, along with a TTL value for the entry. It turned out to be an expensive approach since for every cross service call to any other service required one call to Redis. Caching didn't seem to be a good idea, since the services won't be up all the times, there can be possible downtimes as well.
So I wanted to write a separate microservice which could take care of the orchestration part. For this I need to figure out a really low level network protocol to take care of the exchange of heartbeats(which would help me figure out if any of the service instance goes unavailable). How do applications like zookeeperClient, redisClient take care of heartbeats?
Moreover what is the industry's preferred protocol for cross service calls?
I have been calling REST Api's over HTTP and eliminated every possibility of Joins across different collections.
Is there a better way to do this?
Thanks.
I think the term "Orchestration" is not good for what you are asking. From what I've encountered so far in microservices world the term "Orchestration" is used when a complex business process is involved and not for service discovery. What you need is a Service registry combined with a Load balancer. You can find here all the information you need. Here are some relevant extras that great article:
There are two main service discovery patterns: client‑side discovery and server‑side discovery. Let’s first look at client‑side discovery.
The Client‑Side Discovery Pattern
When using client‑side discovery, the client is responsible for determining the network locations of available service instances and load balancing requests across them. The client queries a service registry, which is a database of available service instances. The client then uses a load‑balancing algorithm to select one of the available service instances and makes a request.
The network location of a service instance is registered with the service registry when it starts up. It is removed from the service registry when the instance terminates. The service instance’s registration is typically refreshed periodically using a heartbeat mechanism.
Netflix OSS provides a great example of the client‑side discovery pattern. Netflix Eureka is a service registry. It provides a REST API for managing service‑instance registration and for querying available instances. Netflix Ribbon is an IPC client that works with Eureka to load balance requests across the available service instances. We will discuss Eureka in more depth later in this article.
The client‑side discovery pattern has a variety of benefits and drawbacks. This pattern is relatively straightforward and, except for the service registry, there are no other moving parts. Also, since the client knows about the available services instances, it can make intelligent, application‑specific load‑balancing decisions such as using hashing consistently. One significant drawback of this pattern is that it couples the client with the service registry. You must implement client‑side service discovery logic for each programming language and framework used by your service clients.
The Server‑Side Discovery Pattern
The client makes a request to a service via a load balancer. The load balancer queries the service registry and routes each request to an available service instance. As with client‑side discovery, service instances are registered and deregistered with the service registry.
The AWS Elastic Load Balancer (ELB) is an example of a server-side discovery router. An ELB is commonly used to load balance external traffic from the Internet. However, you can also use an ELB to load balance traffic that is internal to a virtual private cloud (VPC). A client makes requests (HTTP or TCP) via the ELB using its DNS name. The ELB load balances the traffic among a set of registered Elastic Compute Cloud (EC2) instances or EC2 Container Service (ECS) containers. There isn’t a separate service registry. Instead, EC2 instances and ECS containers are registered with the ELB itself.
HTTP servers and load balancers such as NGINX Plus and NGINX can also be used as a server-side discovery load balancer. For example, this blog post describes using Consul Template to dynamically reconfigure NGINX reverse proxying. Consul Template is a tool that periodically regenerates arbitrary configuration files from configuration data stored in the Consul service registry. It runs an arbitrary shell command whenever the files change. In the example described by the blog post, Consul Template generates an nginx.conf file, which configures the reverse proxying, and then runs a command that tells NGINX to reload the configuration. A more sophisticated implementation could dynamically reconfigure NGINX Plus using either its HTTP API or DNS.
Some deployment environments such as Kubernetes and Marathon run a proxy on each host in the cluster. The proxy plays the role of a server‑side discovery load balancer. In order to make a request to a service, a client routes the request via the proxy using the host’s IP address and the service’s assigned port. The proxy then transparently forwards the request to an available service instance running somewhere in the cluster.
The server‑side discovery pattern has several benefits and drawbacks. One great benefit of this pattern is that details of discovery are abstracted away from the client. Clients simply make requests to the load balancer. This eliminates the need to implement discovery logic for each programming language and framework used by your service clients. Also, as mentioned above, some deployment environments provide this functionality for free. This pattern also has some drawbacks, however. Unless the load balancer is provided by the deployment environment, it is yet another highly available system component that you need to set up and manage.