Kubernetes and Service Mesh load-balancing misalignments

Kubernetes supports Pod load balancing and session affinity through its kube-proxy. However, kube-proxy is essentially an L4 load balancer, so we cannot rely on it to load balance L7 traffic, e.g. multiple live gRPC connections, or load balancing based on HTTP headers, cookies, etc.
A Service Mesh implementation such as Istio can handle these patterns at the L7 level, including gRPC. But I always thought that a Service Mesh is just another layer on top of Kubernetes with additional capabilities (encrypted traffic, blue/green deployments, etc.). My assumption always was that Kubernetes applications should be able to work on both vanilla Kubernetes without Mesh (e.g. for development/testing) or with a Mesh on. Adding this advanced traffic management on L7 breaks this assumption. I won't be able to work on vanilla Kubernetes anymore; I will be tied to a specific implementation of the Istio data plane (Envoy).
Please let me know whether my assumption is correct, and if not, why not. There's not much information about this type of separation of concerns on the internet.

Let me refer first to the following statement of yours:
My assumption always was that Kubernetes applications should be able
to work on both vanilla Kubernetes without Mesh (e.g. for
development/testing) or with a Mesh on. Adding this advanced traffic
management on L7 breaks this assumption.
I have a different view on that. Service Meshes are transparent to the application, so they don't break anything in it; they just add extra (network, security, monitoring) functions at no cost (OK, the cost is a fairly complex configuration from the Mesh operator's perspective). A Service Mesh like Istio doesn't need to occupy all K8s namespaces, so you can still have mixed types of workloads in your cluster (with and without proxies). In Istio's case, to enable full interoperability between them (mixed workloads), you can combine two of its features:
Peer authentication set to PERMISSIVE, so that workloads with sidecars accept both mutual TLS and plain-text traffic, which lets workloads without sidecars (proxies) keep talking to them.
Manual protocol selection, e.g. if you prefer your app to speak raw TCP instead of the application protocol determined by Envoy itself (e.g. HTTP), to avoid the extra decorations the proxy injects into intercepted requests.
Alternatively, you can write your own custom tcp_proxy EnvoyFilter to use Envoy as an L4 network proxy.
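To make the two features above a bit more concrete, here is a minimal sketch that builds both manifests as Python dictionaries and prints them as JSON, which `kubectl apply -f -` accepts. The namespace `mixed-workloads`, the Service name `legacy-app` and port 9000 are hypothetical placeholders, not something from the question; treat it as an illustration rather than a ready-made configuration.

```python
import json

# Hypothetical namespace and workload names, chosen only for illustration.
NAMESPACE = "mixed-workloads"

# Feature 1: PeerAuthentication in PERMISSIVE mode, so sidecars in this
# namespace accept both mutual TLS and plain-text traffic.
peer_authentication = {
    "apiVersion": "security.istio.io/v1beta1",
    "kind": "PeerAuthentication",
    "metadata": {"name": "default", "namespace": NAMESPACE},
    "spec": {"mtls": {"mode": "PERMISSIVE"}},
}

# Feature 2: manual protocol selection. Istio reads the protocol from the
# port name prefix ("tcp-..."), so Envoy treats the traffic as opaque TCP.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "legacy-app", "namespace": NAMESPACE},
    "spec": {
        "selector": {"app": "legacy-app"},
        "ports": [{"name": "tcp-legacy", "port": 9000, "targetPort": 9000}],
    },
}


def render(manifests):
    """Serialise the manifests as a single JSON List object."""
    return json.dumps(
        {"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2
    )


if __name__ == "__main__":
    print(render([peer_authentication, service]))
```

Piping the output into `kubectl apply -f -` would be one way to try it in a test namespace.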


Kubernetes - is Service Mesh a must?

Recently I have built several microservices within a k8s cluster with Nginx ingress controller and they are working normally.
When dealing with communication among microservices, I tried gRPC and it worked. Then I discovered that when microservice A calls microservice B over gRPC, all requests landed on only 1 pod of microservice B (out of, e.g., 10 pods available for microservice B). In order to load balance the requests across all pods of microservice B, I tried linkerd and it worked. However, I realized gRPC would sometimes produce an internal error (e.g. 1 error out of 100 requests), which made me switch to the k8s DNS way (e.g. my-svc.my-namespace.svc.cluster-domain.example). Then the requests never failed. I put gRPC and linkerd on hold.
Later, I became interested in istio and successfully deployed it to the cluster. However, I observed that it always creates its own load balancer, which does not fit well with the existing Nginx ingress controller.
Furthermore, I attempted prometheus and grafana, as well as k9s. These tools let me have better understanding on cpu and memory usage of the pods.
Here I have several questions that I wish to understand:-
If I need to monitor cluster resources, we have prometheus, grafana and k9s. Are they doing the same monitoring role as service mesh (e.g. linkerd, istio)?
if k8s DNS can already achieve load balancing, do we still need service mesh?
if using k8s without service mesh, does it lag behind the normal practice?
Actually I also want to use service mesh every day.
The simple answer is:
A service mesh is not necessary for a Kubernetes cluster.
Now to answer your questions:
If I need to monitor cluster resources, we have prometheus, grafana and k9s. Are they doing the same monitoring role as service mesh (e.g. linkerd, istio)?
K9s is a terminal UI for working with a cluster, essentially an alternative to the kubectl CLI. It is not a monitoring tool. Prometheus and Grafana are monitoring tools: Prometheus collects the data exposed by applications (pods) and stores it as time series, which can then be visualized as charts, graphs, etc. However, the applications have to expose that monitoring data to Prometheus. Service meshes may use a sidecar to provide some default metrics useful for monitoring, such as the number of requests handled per second, without your application needing any knowledge or implementation of the metrics. Thus service meshes are optional; they offload common concerns such as monitoring or authorization.
if k8s DNS can already achieve load balancing, do we still need service mesh?
Service meshes are not needed for load balancing. When you have multiple services running in the cluster and want a single entry point for all of them, to simplify maintenance and save cost, ingress controllers such as Nginx, Traefik or HAProxy are used. Also, service meshes such as Istio come with their own ingress controller.
if using k8s without service mesh, does it lag behind the normal practice?
No, there can be clusters that don't have service meshes today and still use Kubernetes.
In the future, Kubernetes may bring some functionalities from service meshes.
A service mesh is not a silver bullet and it doesn't fit every use case. A service mesh will not do everything for you; it also has bugs and limited features.
You can use Prometheus without Istio and have very nice app monitoring. A service mesh can simplify some monitoring tasks for you, but that doesn't mean you cannot do it yourself.
Please don't think of DNS as a load balancing solution. Kubernetes has Services and Ingresses to do load balancing. Nginx Ingress today is very powerful and has many advanced features.
It heavily depends on your use case.
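The single-pod gRPC behaviour described in the question comes down to where the balancing decision is made: once per connection (L4) or once per request (L7). Here is a minimal Python simulation of that difference. The pod names are hypothetical, and the random pick is only an assumed stand-in for whatever policy the real balancer applies.

```python
import random
from collections import Counter

# Hypothetical pod names for "microservice B".
PODS = [f"service-b-{i}" for i in range(10)]


def connection_level(num_connections, requests_per_connection, rng):
    """L4-style balancing: a backend is picked once per TCP connection."""
    hits = Counter()
    for _ in range(num_connections):
        pod = rng.choice(PODS)  # decided when the connection is opened
        # Every request multiplexed on this connection reaches the same pod.
        hits[pod] += requests_per_connection
    return hits


def request_level(num_connections, requests_per_connection, rng):
    """L7-style balancing: a backend is picked for every single request."""
    hits = Counter()
    for _ in range(num_connections * requests_per_connection):
        hits[rng.choice(PODS)] += 1
    return hits


if __name__ == "__main__":
    rng = random.Random(0)
    # One long-lived gRPC connection carrying 1000 requests.
    print("per connection:", dict(connection_level(1, 1000, rng)))
    print("per request:   ", dict(request_level(1, 1000, rng)))
```

With one long-lived connection, the per-connection variant can only ever hit one pod, which matches what was observed before linkerd was added.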

Istio Ingress Gateway - Visibility into gRPC connections and load balancing

We have a gRPC application deployed in a cluster (v 1.17.6) with Istio (v 1.6.2) setup. The cluster has istio-ingressgateway setup as the edge LB, with SSL termination. The istio-ingressgateway is fronted by an AWS ELB (classic LB) in passthrough mode. This setup is fully functional and the traffic flows as intended, in general. So the setup looks like:
ELB => istio-ingressgateway => virtual service => app service => [(envoy)pods]
We are running load tests on this setup using GHZ (ghz.sh), running external to the application cluster. From the tests we’ve run, we have observed that each of the app containers seems to get about 300 RPS routed to it, no matter the configuration of the GHZ test. For reference, we have tried various combos of --concurrency and --connection settings for the tests. This ~300 RPS is lower than what we expect from the app and, hence, requires a lot more pods to provide the required throughput.
We are really interested in understanding the details of the physical connection (gRPC/HTTP2) setup in this case, all the way from the ELB to the app/envoy, and the details of the load balancing being done. Of particular interest is the case when the same client, e.g. GHZ, opens up multiple connections (specified via the --connection option). We have looked at Kiali and it doesn’t give us the appropriate visibility.
How can we get visibility into the physical connections being setup from the ingress gateway to the pod/proxy?
How is the “per request gRPC” load balancing happening?
What options might exist to optimize the various components involved in this setup?
1. How can we get visibility into the physical connections being setup from the ingress gateway to the pod/proxy?
If Kiali doesn't show exactly what you need, maybe you could try Jaeger?
Jaeger is an open source end to end distributed tracing system, allowing users to monitor and troubleshoot transactions in complex distributed systems.
The Istio documentation covers Jaeger.
Additionally, Prometheus and Grafana might be helpful here.
2. How is the “per request gRPC” load balancing happening?
As mentioned in the Istio documentation:
By default, the Envoy proxies distribute traffic across each service’s load balancing pool using a round-robin model, where requests are sent to each pool member in turn, returning to the top of the pool once each service instance has received a request.
If you want to change the default round-robin model, you can use a Destination Rule for that. Destination rules let you customize Envoy’s traffic policies when calling the entire destination service or a particular service subset, such as your preferred load balancing model, TLS security mode, or circuit breaker settings.
The Istio documentation covers destination rules in more detail.
The Envoy documentation has more about load balancing.
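The round-robin model quoted above can be sketched in a few lines of Python. This is only an illustration of the idea (each request goes to the next pool member, wrapping around at the end), not Envoy's actual implementation; the pod names are hypothetical.

```python
from collections import Counter
from itertools import cycle


def round_robin(endpoints, num_requests):
    """Assign requests to endpoints in turn, wrapping around the pool."""
    pool = cycle(endpoints)
    return [next(pool) for _ in range(num_requests)]


if __name__ == "__main__":
    # Hypothetical pool of three app pods behind one service.
    assignments = round_robin(["pod-a", "pod-b", "pod-c"], 7)
    print(assignments)
    print(Counter(assignments))
```

Because the choice is made per request rather than per connection, the number of client connections should matter much less to the final distribution than it would with an L4 balancer.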
3. What options might exist to optimize the various components involved in this setup?
I'm not sure if there is anything to optimize in the Istio components; maybe some custom configuration in a Destination Rule?

Is that possible to deploy an openshift or kubernetes in DMZ zone?

Is it possible to deploy OpenShift in a DMZ (restricted zone)? What challenges will I face? What do I have to do in the DMZ network?
You can deploy Kubernetes and OpenShift in a DMZ.
You can also add a DMZ in front of Kubernetes and OpenShift.
The Kubernetes and OpenShift network model is a flat SDN model. All pods get IP addresses from the same network CIDR and live in the same logical network regardless of which node they reside on.
We have ways to control network traffic within the SDN using the NetworkPolicy API. NetworkPolicies in OpenShift represent firewall rules and the NetworkPolicy API allows for a great deal of flexibility when defining these rules.
With NetworkPolicies it is possible to create zones, but one can also be much more granular in the definition of the firewall rules. Separate firewall rules per pod are possible and this concept is also known as microsegmentation (see this post for more details on NetworkPolicy to achieve microsegmentation).
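As a rough illustration of zone-style rules with NetworkPolicy, here is a sketch that builds two policies as Python dictionaries and prints them as JSON: one that denies all ingress in a namespace by default, and one that lets pods carrying a given zone label reach a specific app on one port. The namespace `internal`, the labels `app=backend` and `zone=dmz`, and port 8080 are all hypothetical.

```python
import json


def default_deny_ingress(namespace):
    """Select every pod in the namespace and allow no ingress at all."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_from_zone(namespace, target_app, source_zone, port):
    """Allow pods labelled zone=<source_zone> to reach app=<target_app>."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {
            "name": f"allow-{source_zone}-to-{target_app}",
            "namespace": namespace,
        },
        "spec": {
            "podSelector": {"matchLabels": {"app": target_app}},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {"podSelector": {"matchLabels": {"zone": source_zone}}}
                    ],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical namespace, labels and port, for illustration only.
    policies = [
        default_deny_ingress("internal"),
        allow_from_zone("internal", "backend", "dmz", 8080),
    ]
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": policies}, indent=2))
```

Note that these rules only take effect if the cluster's network plugin enforces NetworkPolicy.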
The DMZ is in certain aspects a special zone. This is the only zone exposed to inbound traffic coming from outside the organization. It usually contains software such as IDS (intrusion detection systems), WAFs (Web Application Firewalls), secure reverse proxies, static web content servers, firewalls and load balancers. Some of this software is normally installed as an appliance and may not be easy to containerize and thus would not generally be hosted within OpenShift.
Regardless of the zone, communication internal to a specific zone is generally unrestricted.
Variations on this architecture are common and large enterprises tend to have several dedicated networks. But the principle of purpose-specific networks protected by firewall rules always applies.
In general, traffic is supposed to flow only in one direction between two networks (as in an osmotic membrane), but often exceptions to this rule are necessary to support special use cases.
Useful article: openshift-and-network-security-zones-coexistence-approache.
It's very secure if you follow standard security practices for your cluster. But nothing is 100% secure. So adding a DMZ would help reduce your attack vectors.
In terms of protecting your Ingress from outside, you can limit your access for your external load balancer just to HTTPS, and most people do that but note that HTTPS and your application itself can also have vulnerabilities.
As for pods and workloads, you can increase security (at some performance cost) using things like a well-crafted seccomp profile and/or adding the right capabilities in your pod security context. You can also add more security with AppArmor or SELinux, but lots of people don't, since it can get very complicated.
There are also other alternatives to Docker in order to more easily sandbox your pods (still early in their lifecycle as of this writing): Kata Containers, Nabla Containers and gVisor.
Take a look at: dmz-kubernetes.
Here is a similar problem: dmz.

Best way to go between private on-premises network and kubernetes

I have set up an on-premises Kubernetes cluster, and I want to ensure that my services that are not in Kubernetes, but exist on a separate class B, are able to consume the services that have migrated to Kubernetes. There are a number of ways of doing this by all accounts, and I'm looking for the simplest one.
Ingress + controller seems to be the one favoured - and it's interesting because of the virtual hosts and HAProxy implementation. But where I'm getting confused is how to set up the Kubernetes service:
We've not a great deal of choice - ClusterIP won't be sufficient to expose it to the outside, or NodePort. LoadBalancer seems to be a simpler, cut-down way of switching between network zones - and although there are on-prem solutions (MetalLB), it seems to be far more geared towards cloud solutions.
But if I stick with NodePort, then my entry into the network is going to be on a non-standard port number, and I would prefer it to be over a standard port, particularly if running a percentage of traffic for that service over non-kube and the rest over Kubernetes (for testing purposes, I'd like to monitor the traffic over a period of time before I bite the bullet and move 100% of traffic for the given microservice to Kubernetes). In that case it would be better if those services were available on the same port (almost always 80, because they're standard REST microservices). More than that, if I have to re-create the service for whatever reason, I'm pretty sure the port will change, and then no traffic will be able to enter the Kubernetes cluster, and that's a frightening proposition.
What are the suggested ways of handling communication between existing on-prem and Kubernetes cluster (also on prem, different IP/subnet)?
Is there any way to get traffic coming in without changing the network parameters (the class B's the respective networks are on), and without being forced to use NodePort?
The NodePort service type may be fine for stage or dev environments, but I recommend you go with a LoadBalancer-type service (for example, one that fronts an Nginx ingress controller). The advantages of this over other service types are:
You can use a standard port (rather than a random NodePort generated by Kubernetes).
Your service is load balanced (load balancing is taken care of by the ingress controller).
The port is fixed (it will not change unless you modify something in the ingress object).

How to configure Kubernetes to encrypt the traffic between nodes, and pods?

In preparation for HIPAA compliance, we are transitioning our Kubernetes cluster to use secure endpoints across the fleet (between all pods). Since the cluster is composed of about 8-10 services currently using HTTP connections, it would be super useful to have this taken care of by Kubernetes.
The specific attack vector we'd like to address with this is packet sniffing between nodes (physical servers).
This question breaks down into two parts:
Does Kubernetes encrypt the traffic between pods & nodes by default?
If not, is there a way to configure it such?
Many thanks!
Actually the correct answer is "it depends". I would split the cluster into 2 separate networks.
Control Plane Network
This network is that of the physical network or the underlay network in other words.
k8s control-plane elements - kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy, kubelet - talk to each other in various ways. Except for a few endpoints (eg. metrics), it is possible to configure encryption on all endpoints.
If you're also pentesting, then kubelet authn/authz should be switched on too. Otherwise, the encryption doesn't prevent unauthorized access to the kubelet. This endpoint (at port 10250) can be hijacked with ease.
Cluster Network
The cluster network is the one used by the Pods, which is also referred to as the overlay network. Encryption is left to the 3rd-party overlay plugin to implement; failing that, the app has to implement it.
The Weave overlay supports encryption. The service mesh linkerd that @lukas-eichler suggested can also achieve this, but on a different networking layer.
The replies here seem to be outdated. As of 2021-04-28, at least the following components seem to be able to provide an encrypted networking layer to Kubernetes:
Calico (via WireGuard)
(the list above was compiled by consulting the respective projects' home pages)
Does Kubernetes encrypt the traffic between pods & nodes by default?
Kubernetes does not encrypt any traffic.
There are service meshes like linkerd that allow you to easily introduce HTTPS communication between your HTTP services.
You would run an instance of the service mesh on each node, and all services would talk to the service mesh. The communication inside the service mesh would be encrypted.
your service -http-> localhost to service mesh node -https-> remote node -http-> localhost to remote service.
When you run the service mesh node in the same pod as your service, the localhost communication would run on a private virtual network device that no other pod can access.
No, Kubernetes does not encrypt traffic by default.
I haven't personally tried it, but the description of the Calico software-defined network seems oriented toward what you are describing, with the additional benefit of already being Kubernetes friendly.
I thought that Calico did native encryption, but based on this GitHub issue it seems they recommend using a solution like IPsec to encrypt, just like you would on a traditional host.