Control/Intercept Load Balancer traffic using Istio - kubernetes

I want to control/intercept the load balancer traffic using Istio. Istio gives you the ability to add a Mixer adapter at the service level, but I want to add some logic at a higher level, just before the request routing rules get executed.
So instead of adding actions per service, I want some actions executed right after the request is received from the load balancer.

As per the official Istio documentation, istio-ingressgateway is the main entry point for exposing services running inside the cluster to the outside world. An Istio Gateway describes the incoming or outgoing HTTP/TCP connections at the edge of the mesh: it specifies the set of ports that should be exposed, the type of protocol to use, and so on. A Gateway configuration is bound to the corresponding gateway Envoy proxy through label selectors.
Keep in mind that an Istio Gateway only configures the L4-L6 properties of load balancing (ports, TLS settings); it is not aware of the upstream network or load balancer provider in front of it.
You can find more information about Istio's traffic management architecture here.
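
For example, a minimal sketch of a Gateway that binds to the default istio-ingressgateway and accepts plain HTTP on port 80 could look like this (the resource name and host are placeholders):

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: my-gateway                 # hypothetical name
spec:
  selector:
    istio: ingressgateway          # binds to the default istio-ingressgateway Envoy
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "example.com"                # placeholder host

A VirtualService bound to this Gateway is then the place to attach routing rules that apply to all traffic entering through the load balancer, before any per-service configuration is reached.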

Related

If 'Destination Rule' on Istio is applied, does load balancing of k8s Service not work?

I understand that the k8s Service basically performs Round Robin on Pods.
If I set the weight of the Pods using Istio's 'Destination Rule', what happens to the RR of the existing k8s Service? Are the load balancing rules of the k8s Service ignored?
I understand that the k8s Service basically performs ROUND ROBIN on Pods.
That's correct. It's explained in the kubernetes documentation here.
If I set the weight of the Pods using Istio's 'Destination Rule', what happens to the RR of the existing k8s Service? Are the load balancing rules of the k8s Service ignored?
I couldn't find exact documentation on how this works, so I will explain how I understand it.
A kubernetes Service uses kube-proxy's iptables rules to distribute requests, and I assume that an Istio DestinationRule can effectively override them with its own rules, applied through the Envoy sidecar: all traffic that your mesh services send and receive (data-plane traffic) is proxied through Envoy, which makes it easy to direct and control traffic around your mesh without making any changes to your services.
So if you want to change the default ROUND_ROBIN to another algorithm (e.g. LEAST_CONN, RANDOM), you can configure that in your DestinationRule via LoadBalancerSettings.SimpleLB. Note that Envoy's default algorithm is also ROUND_ROBIN, the same as with a kubernetes Service.
More about it here.
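As an illustration, a minimal sketch of a DestinationRule that switches the mesh-side algorithm to LEAST_CONN (host and resource names are made up for the example):

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: my-service-lb                             # hypothetical name
spec:
  host: my-service.default.svc.cluster.local      # the kubernetes Service this policy applies to
  trafficPolicy:
    loadBalancer:
      simple: LEAST_CONN                          # instead of the default ROUND_ROBIN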
Additional resources:
https://istio.io/latest/docs/reference/config/networking/destination-rule/#Subset
https://istio.io/latest/docs/concepts/traffic-management/#load-balancing-options

Istio Ingress Gateway - Visibility into gRPC connections and load balancing

We have a gRPC application deployed in a cluster (v 1.17.6) with Istio (v 1.6.2) setup. The cluster has istio-ingressgateway setup as the edge LB, with SSL termination. The istio-ingressgateway is fronted by an AWS ELB (classic LB) in passthrough mode. This setup is fully functional and the traffic flows as intended, in general. So the setup looks like:
ELB => istio-ingressgateway => virtual service => app service => [(envoy)pods]
We are running load tests on this setup using GHZ (ghz.sh), running external to the application cluster. From the tests we've run, we have observed that each of the app containers seems to get about 300 RPS routed to it, no matter the configuration of the GHZ test. For reference, we have tried various combos of --concurrency and --connection settings for the tests. This ~300 RPS is lower than what we expect from the app and hence requires a lot more Pods to provide the required throughput.
We are really interested in understanding the details of the physical connection (gRPC/HTTP2) setup in this case, all the way from the ELB to the app/envoy, and the details of the load balancing being done. Of particular interest is the case when the same client, e.g. GHZ, opens up multiple connections (specified via the --connection option). We have looked at Kiali and it doesn't give us the appropriate visibility.
Questions:
How can we get visibility into the physical connections being set up from the ingress gateway to the pod/proxy?
How is the “per request gRPC” load balancing happening?
What options might exist to optimize the various components involved in this setup?
Thanks.
1. How can we get visibility into the physical connections being set up from the ingress gateway to the pod/proxy?
If Kiali doesn't show what exactly you need, maybe you could try with Jaeger?
Jaeger is an open source end to end distributed tracing system, allowing users to monitor and troubleshoot transactions in complex distributed systems.
There is istio documentation about Jaeger.
Additionally, Prometheus and Grafana might be helpful here; take a look here.
2. How is the “per request gRPC” load balancing happening?
As mentioned here
By default, the Envoy proxies distribute traffic across each service’s load balancing pool using a round-robin model, where requests are sent to each pool member in turn, returning to the top of the pool once each service instance has received a request.
If you want to change the default round-robin model, you can use a DestinationRule for that. Destination rules let you customize Envoy's traffic policies when calling the entire destination service or a particular service subset, such as your preferred load balancing model, TLS security mode, or circuit breaker settings.
There is istio documentation about that.
More about load balancing in envoy here.
3. What options might exist to optimize the various components involved in this setup?
I'm not sure there is much to optimize in the Istio components themselves; maybe some custom configuration in a DestinationRule, as sketched below?
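As a hedged illustration covering points 2 and 3, a DestinationRule could both change the balancing algorithm and raise Envoy's HTTP/2 connection-pool limits for the gRPC service; the host name and the numbers below are placeholders that would need load testing:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: grpc-app-tuning                           # hypothetical name
spec:
  host: grpc-app.default.svc.cluster.local        # placeholder Service name
  trafficPolicy:
    loadBalancer:
      simple: LEAST_CONN                          # spread streams by outstanding requests instead of round-robin
    connectionPool:
      http:
        http2MaxRequests: 1000                    # max concurrent requests to the upstream service
        maxRequestsPerConnection: 0               # 0 = no limit on requests per connection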
Additional Resources:
itnext.io
medium.com
programmaticponderings.com

Openshift - calling another API within same namespace

I have two containers in the same namespace: Service-A and Service-B.
In my case, I want to talk to Service-B from Service-A. Using RestTemplate, I am making a POST call for the communication like below.
public Response fetchData(Request request) {
    return restTemplate.postForEntity("http://Service-B:8080/api", request, Response.class).getBody();
}
It is working fine in my lower region, as I have only one Pod for Service-B. My doubt here is: if I have more Pods (let's say three Pods) in production for handling load, will the load balancing happen between the Pods if I use the service URL instead of the router URL?
If you use a kubernetes Service, then the kube-proxy component provides load balancing at the L4 layer via Linux iptables.
kube-proxy running in IPVS mode provides least-connections, locality-based, weighted and persistence-based load balancing.
kube-proxy running in userspace mode provides round-robin load balancing; in the default iptables mode it picks a backend at random for each connection, which evens out across many requests.
But if you need advanced load balancing at the L7 layer, then you need to use an Ingress or a router.
https://kubernetes.io/docs/concepts/services-networking/#the-gory-details-of-virtual-ips
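
To make that concrete, here is a sketch of what the Service-B Service object could look like (the labels are assumptions, not taken from the question). Calling http://service-b:8080 from Service-A resolves to this Service's cluster IP, and kube-proxy then spreads the connections across every Pod matching the selector, whether there is one Pod or three:

apiVersion: v1
kind: Service
metadata:
  name: service-b              # Service names are lowercase; the DNS lookup of Service-B is case-insensitive
spec:
  selector:
    app: service-b             # assumed Pod label; all matching Pods become endpoints
  ports:
  - port: 8080                 # port clients call, e.g. http://service-b:8080/api
    targetPort: 8080           # container port on the Pods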

Q: Efficient Kubernetes load balancing

I've been looking into Kubernetes networking, more specifically, how to serve HTTPS users the most efficient.
I was watching this talk: https://www.youtube.com/watch?v=0Omvgd7Hg1I and from 22:18 he explains the problem with a load balancer that is not pod-aware. The way Kubernetes solves this is by letting the nodes also act as a 'router', letting a node pass the request on to another node (explained at 22:46). This does not seem very efficient, but looking around, SoundCloud (https://developers.soundcloud.com/blog/how-soundcloud-uses-haproxy-with-kubernetes-for-user-facing-traffic) actually seems to do something similar to this, just with NodePorts. They say that the overhead costs less than creating a better load balancer.
From what I have read, an option might be using an ingress controller: making sure that there is not more than one ingress controller per node, and routing the traffic only to the specific nodes that have an ingress controller. That way there will not be any traffic re-routing needed. However, this does add another layer of routing.
This information is all from 2017, so my question is: is there any pod aware load balancer out there, or is there some other method that does not involve sending the http request and response over the network twice?
Thank you in advance,
Hendrik
EDIT:
A bit more information about my use case:
There is a bare-metal setup with kubernetes. The firewall load balances the incoming traffic between two HAProxy instances. These HAProxy instances do SSL termination and forward the traffic to a few sites, including an Exchange setup, a few internal IIS sites and an nginx server for a static web app. The idea is to move the app servers into kubernetes.
Now my main problem is how to get the requests from HAProxy into kubernetes. I see a few options:
Use the SoundCloud setup. The infrastructure could stay almost the same, and the HAProxy servers can keep operating the way they do now.
I could use an ingress controller on EACH node in the kubernetes cluster and have the firewall load balance between the nodes. I believe it is possible to forward traffic from the ingress controller to servers outside the cluster, e.g. Exchange.
Some magic load balancer that I do not know about that is pod-aware and able to operate outside of the kubernetes cluster.
Options 1 and 2 are relatively simple and quite close in how they work, but they do come with a performance penalty. This is the case when the node that the request gets forwarded to by the firewall does not have the required pod running, or when another pod is doing less work: the request will get forwarded to another node, thus using the network twice.
Is this just the price you pay when using Kubernetes, or is there something that I am missing?
How traffic gets to the Pods depends on whether a managed cluster is used.
Almost all cloud providers can forward traffic in a cloud-native way in their managed K8s clusters. First, you create a managed cluster with some special network settings (e.g. a VPC-native cluster on GKE). Then the only thing you need to do is create a LoadBalancer-typed Service to expose your workload. You can also create Ingresses for your L7 workloads; they will be handled by the provided IngressControllers (e.g. the ALB on AWS).
In an on-premises cluster without any cloud provider (plain OpenStack or vSphere), the main way to expose workloads is a NodePort-typed Service. That doesn't mean you can't improve it.
If your cluster is behind reverse proxies (the SoundCloud case), setting externalTrafficPolicy: Local on the Services stops traffic from being forwarded between worker nodes: when traffic arrives through a NodePort it is forwarded to local Pods, or dropped if no matching Pod runs on that node. The reverse proxy will then mark those NodePorts as unhealthy in its backend health checks and stop forwarding traffic to them. Another choice is topology-aware service routing; in that case local Pods have priority, but traffic is still forwarded to another node when no local Pod matches.
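A sketch of such a Service (name, labels and ports are placeholders): with externalTrafficPolicy: Local, only nodes that actually run a matching Pod answer on the NodePort, so HAProxy's health checks steer traffic to the right nodes and the extra hop disappears.

apiVersion: v1
kind: Service
metadata:
  name: web-app                    # hypothetical workload
spec:
  type: NodePort
  externalTrafficPolicy: Local     # only route to Pods on the receiving node; also preserves the client IP
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080                # the port HAProxy points at on every node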
For an IngressController in on-prem clusters, it is a little different. You may have some worker nodes that have an EIP or public IP. To expose HTTP(S) services, an IngressController is usually deployed on those worker nodes through a DaemonSet with HostNetwork, so that clients reach the IngressController via the well-known ports and the EIPs of those nodes. These worker nodes usually don't accept other workloads (e.g. infra nodes in OpenShift), so one more forward on the Pod network is still needed. You can also deploy the IngressController on all worker nodes alongside the other workloads, so that traffic can be forwarded to a closer Pod, provided the IngressController supports topology-aware service routing.
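For illustration only, the edge-node pattern could look roughly like the sketch below; the image, labels and node selector are assumptions, not a tested manifest (a real deployment also needs RBAC and controller arguments):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ingress-controller
  namespace: ingress-system                       # placeholder namespace
spec:
  selector:
    matchLabels:
      app: ingress-controller
  template:
    metadata:
      labels:
        app: ingress-controller
    spec:
      hostNetwork: true                           # bind directly to ports 80/443 on the node
      nodeSelector:
        node-role.kubernetes.io/edge: ""          # assumed label marking nodes that have a public IP / EIP
      containers:
      - name: controller
        image: k8s.gcr.io/ingress-nginx/controller:v1.1.0   # placeholder image and version
        ports:
        - containerPort: 80
          hostPort: 80
        - containerPort: 443
          hostPort: 443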
Hope it helps!

Can i use a GCP HTTPS Load Balancer to route between a bucket backend and a Kubernetes service?

I wanted to understand what my load balancing options are in a scenario where I want to use a single HTTPS load balancer on GCP to serve some static content from a bucket and dynamic content using a combination of a React front end and an Express backend on Kubernetes.
Additional info:
I have a domain name registered outside of Google Domains
I want to serve all content over https
I'm not starting with anything big. Just getting started with a more or less hobby-type project which will attract very little traffic in the near future.
I don't mind serving my React front end and Express backend from App Engine if that helps simplify this somehow. However, in such a case, I would like to understand: if I still want something on Kubernetes, will I be able to communicate between App Engine and Kubernetes without hassle using internal IPs? And how would I load balance that traffic?
Any kind of network blueprint in the public domain that will guide me will be helpful.
I did quite a bit of reading on NodePort/LoadBalancer/Ingress, which has left me confused. From what I understand, a LoadBalancer Service does not work with HTTP(S) traffic and operates more at the TCP L4 level, so it is probably not suitable for my use case.
Ingress provisions a dedicated load balancer of its own, on which I cannot put my own routes to a backend bucket etc., which means I may need a minimum of two load balancers? And two IPs?
NodePort exposes a port on all nodes, which means I need to handle load balancing myself, even if my HTTPS load balancer routing can somehow help.
Any guidance/pointers will be much appreciated!
EDIT: Found some information on Network Endpoint Groups (NEGs) while researching. Looking promising, will investigate. Any thoughts about taking this route? https://cloud.google.com/kubernetes-engine/docs/how-to/standalone-neg
EDIT: Was able to get this working using a combination of NEGs and Nginx reverse proxies.
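A plausible shape of that solution, sketched here as an assumption rather than the asker's actual config: annotate the Kubernetes Service so GKE creates a standalone NEG, then attach that NEG as a backend service of the HTTPS load balancer next to the backend bucket.

apiVersion: v1
kind: Service
metadata:
  name: express-backend                                       # hypothetical Service name
  annotations:
    cloud.google.com/neg: '{"exposed_ports": {"8080":{}}}'    # ask GKE to create a standalone NEG for port 8080
spec:
  type: ClusterIP
  selector:
    app: express-backend                                      # assumed Pod label
  ports:
  - port: 8080
    targetPort: 8080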
In order to resolve your concerns, please start with:
Choosing the right load balancer:
Network load balancing (Layer 4 load balancing, or a proxy for applications that rely on the TCP/SSL protocol): the load is forwarded to your systems based on incoming IP protocol data, such as address, port, and protocol type.
The network load balancer is a pass-through load balancer, so your backends receive the original client request. The network load balancer doesn't do any Transport Layer Security (TLS) offloading or proxying. Traffic is directly routed to your VMs.
Network load balancers terminate TLS on backends that are located in regions appropriate to your needs.
HTTP(S) load balancing is proxy-based Layer 7 load balancing that enables you to run and scale your services behind a single load-balancing IP address.
HTTPS and SSL Proxy load balancers terminate TLS in locations that are distributed globally.
An HTTP(S) load balancer acts as a proxy between your clients and your application. If you want to accept HTTPS requests from your clients, you have the option to use Google-managed SSL certificates (Beta) or certificates that you manage yourself.
Technical Details
When you create an Ingress object, the GKE Ingress controller configures a GCP HTTP(S) load balancer according to the rules in the Ingress manifest and the associated Service manifests. The client sends a request to the HTTP(S) load balancer. The load balancer is an actual proxy; it chooses a node and forwards the request to that node's NodeIP:NodePort combination. The node uses its iptables NAT table to choose a Pod (kube-proxy manages the iptables rules on the node). Traffic is then routed to a healthy Pod for the Service specified in your rules.
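As a rough sketch (host, Service name and port are placeholders), the Ingress that drives this could look like:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress                       # hypothetical name
  annotations:
    kubernetes.io/ingress.class: "gce"    # use the GKE Ingress controller, i.e. a GCP HTTP(S) load balancer
spec:
  rules:
  - host: app.example.com                 # placeholder domain
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: express-backend         # assumed backend Service on the cluster
            port:
              number: 8080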
Per the backend buckets documentation:
An HTTP(S) load balancer can direct traffic from specified URLs to either a backend bucket or a backend service.
The bucket should be publicly readable while it is used behind the load balancer (see Creating buckets).
During load balancer setup you can choose a backend service and a backend bucket. You can find more information in the docs.
Please also take a look at these two tutorials (here and here) on how to build an application using Cloud Storage.
Hope this helps.
Additional resources:
Loadbalancers, Controllers