GKE: no load balancing between backend pods with session affinity (sticky sessions) - Kubernetes

I have 2 backend applications running on the same GKE cluster, A and B. A has 1 pod and B has 2 pods. A is exposed to the outside world and receives client IP addresses, which it then forwards to B in an HTTP request header.
B has a Kubernetes Service object configured like this:
apiVersion: v1
kind: Service
metadata:
  name: svc-{{ .Values.component_name }}
  namespace: {{ include "namespace" . }}
spec:
  ports:
    - port: 80
      targetPort: {{ .Values.app_port }}
      protocol: TCP
  selector:
    app: pod-{{ .Values.component_name }}
  type: ClusterIP
With that configuration, HTTP requests from A are balanced equally between the 2 pods of application B, but when I add sessionAffinity: ClientIP to the configuration, every HTTP request is sent to the same B pod, even though I expected a round-robin type of interaction.
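For completeness, the change presumably amounts to a single extra field on the Service above (a sketch reusing the same Helm values):

apiVersion: v1
kind: Service
metadata:
  name: svc-{{ .Values.component_name }}
  namespace: {{ include "namespace" . }}
spec:
  sessionAffinity: ClientIP   # the added field: pin each client to one backend pod
  ports:
    - port: 80
      targetPort: {{ .Values.app_port }}
      protocol: TCP
  selector:
    app: pod-{{ .Values.component_name }}
  type: ClusterIP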
To be clear, I have the client IP address stored in the X-Forwarded-For header, so the Service should look at it to decide which B pod to send the request to, as the documentation says: https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws
In my test I tried to create as much load as possible on one of the B pods, to try to reach the second pod, without any success. I made sure that I had different IPs in my headers and that it wasn't because of some sort of proxy in my environment. The IPs had not been used in previous tests, so it is not because of already existing stickiness.
I am stuck now because I don't know how to test it further; I have been reading the docs and am probably missing something. My guess was that sessionAffinity disables load balancing for the ClusterIP type, but this seems highly unlikely...
My questions are:
Is the behaviour I am observing normal? What am I doing wrong?
This might help to understand if it is still unclear what I'm trying to say: https://stackoverflow.com/a/59109265/12298812
EDIT: I did test from the upstream client and saw at least some of the requests reach the second pod of B, but that load test was performed from the same IP for every request, so this time only one pod should have received the traffic...

The behaviour suggests that the X-Forwarded-For header is not respected by a ClusterIP Service.
To be sure, I would suggest load testing from the upstream client service that consumes the above Service and seeing what kind of behaviour you get. Chances are you will see the same incorrect behaviour there, which will affect the scaling of your service.
That said, using session affinity for an internal service is highly unusual, as client IP addresses do not vary much. Session affinity limits the scaling ability of your application. Typically you would use memcached or Redis as a session store, which is likely to be more scalable than session-affinity-based solutions.
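If you go that route, the cluster side of it is just another Deployment and Service (a minimal, hypothetical sketch; the name and image tag are assumptions, and the application itself would still need to be changed to read and write session state in Redis):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: session-store        # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: session-store
  template:
    metadata:
      labels:
        app: session-store
    spec:
      containers:
        - name: redis
          image: redis:7     # assumed image tag
          ports:
            - containerPort: 6379
---
apiVersion: v1
kind: Service
metadata:
  name: session-store
spec:
  selector:
    app: session-store
  ports:
    - port: 6379
      targetPort: 6379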

Related

Within a k8s cluster, should I always call the Ingress rule or the NodePort service name?

I have a number of RESTful services within our system.
Some are within the Kubernetes cluster.
Others are on legacy infrastructure and are hosted on VMs.
Many of our RESTful services make synchronous calls to each other (not asynchronously via message queues).
We also have a number of UIs (fat clients or web apps) that make use of these services.
We might define a simple k8s manifest file like this, containing a Pod, a Service, and an Ingress:
apiVersion: v1
kind: Pod
metadata:
  name: "orderManager"
  labels:
    app: "orderManager"   # so the Service selector below matches this Pod
spec:
  containers:
    - name: "orderManager"
      image: "gitlab-prem.com:5050/image-repo/orderManager:orderManager_1.10.22"
---
apiVersion: v1
kind: Service
metadata:
  name: "orderManager-service"
spec:
  type: NodePort
  selector:
    app: "orderManager"
  ports:
    - protocol: TCP
      port: 50588
      targetPort: 50588
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orderManager-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - http:
        paths:
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: "orderManager-service"
                port:
                  number: 50588
I am really not sure what the best way is for RESTful services on the cluster to talk to each other.
It seems like there is only one good route for callers outside the cluster, which is to use the URL built by the ingress rule.
There are two options within the cluster.
This might illustrate it further with an example:

| Caller | Receiver | Example URL | Notes |
| --- | --- | --- | --- |
| UI | On cluster | http://clusterip/orders | The UI would use the cluster IP and the ingress rule to reach the order manager |
| Service off cluster | On cluster | http://clusterip/orders | Just like the UI |
| On cluster | On cluster | http://clusterip/orders | Could use the ingress rule, like the approach above |
| On cluster | On cluster | http://orderManager-service:50588/ | Could use the service name and port directly |
I write "clusterip" a few times above, but in real life we put something on top so there is a friendly name, like http://mycluster/orders.
So when the caller and receiver are both on the cluster, is it either:
Use the ingress rule, which is also used by services and apps outside the cluster
Use the NodePort service name, which is used in the ingress rule
Or perhaps something else!
One benefit of using the NodePort service name is that you do not have to change your base URL.
The ingress rule appends an extra element to the route (in the above case, orders).
When I move a RESTful service from legacy infrastructure to the k8s cluster, that will increase the complexity.
It depends on whether you want requests to be routed through your ingress controller or not.
Requests sent to the full URL configured in your Ingress resource will be processed by your ingress controller. The controller itself — NGINX in this case — will proxy the request to the Service. The request will then be routed to a Pod.
Sending the request directly to the Service’s URL simply skips your ingress controller. The request is directly routed to a Pod.
The trade-offs between the two options depend on your setup.
Sending requests through your ingress controller will increase request latency and resource consumption. If your ingress controller does nothing other than route requests, I would recommend sending requests directly to the Service.
However, if you use your ingress controller for other purposes, like authentication, monitoring, logging, or tracing, then you may prefer that the controller process internal requests.
For example, on some of my clusters I use the NGINX ingress controller to measure request latency and track HTTP response statuses. I route requests between apps running in the same cluster through the ingress controller in order to have that information available. I pay the cost of increased latency and resource usage in order to have improved observability.
Whether the trade-offs are worth it in your case depends on you. If your ingress controller does nothing more than basic routing, then my recommendation is to skip it entirely. If it does more, then you need to weigh the pros and cons of routing requests through it.
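To make the difference concrete, a caller running inside the cluster is simply configured with one base URL or the other (a hypothetical Deployment snippet; the caller name, env var, image, and hostname are illustrative, not from the question):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: basket-service                       # hypothetical caller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: basket-service
  template:
    metadata:
      labels:
        app: basket-service
    spec:
      containers:
        - name: basket-service
          image: example/basket-service:1.0  # placeholder image
          env:
            # Option 1: route through the ingress controller (same URL as external callers,
            # with the extra path element added by the ingress rule):
            # - name: ORDER_MANAGER_BASE_URL
            #   value: "http://mycluster/orders"
            # Option 2: call the Service directly, skipping the ingress controller:
            - name: ORDER_MANAGER_BASE_URL
              value: "http://orderManager-service:50588"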

Kubernetes pods cannot make HTTPS requests after deploying the Istio service mesh

I am exploring the Istio service mesh on my k8s cluster hosted on EKS (Amazon).
I tried deploying istio-1.2.2 on a new k8s cluster with the demo.yml file used for the bookapp demonstration, and I understand most of the use cases properly.
Then I deployed Istio using the Helm default profile (recommended for production) on my existing dev cluster, which has 100s of microservices running, and what I noticed is that my services can call HTTP endpoints but are not able to call external secure endpoints (https://www.google.com, etc.).
I am getting:
curl: (35) error:1400410B:SSL routines:CONNECT_CR_SRVR_HELLO:wrong version number
Though I am able to call external https endpoints from my testing cluster.
To verify, I checked the egress policy, and it is mode: ALLOW_ANY in both clusters.
Now I have removed Istio completely from my dev cluster and installed the demo.yml to test, but this is also not working now.
I tried to relate my issue to this, but didn't have any success:
https://discuss.istio.io/t/serviceentry-for-https-on-httpbin-org-resulting-in-connect-cr-srvr-hello-using-curl/2044
I don't understand what I am missing or what I am doing wrong.
Note: I am referring to this setup: https://istio.io/docs/setup/kubernetes/install/helm/
This is most likely a bug in Istio (see for example istio/istio#14520): if you have any Kubernetes Service object, anywhere in your cluster, that listens on port 443 but whose port name starts with http (not https), it will break all outbound HTTPS connections.
The instance of this I've hit involves configuring an AWS load balancer to do TLS termination. The Kubernetes Service needs to expose port 443 to configure the load balancer, but it receives plain unencrypted HTTP.
apiVersion: v1
kind: Service
metadata:
  name: breaks-istio
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:...
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
spec:
  selector: ...
  ports:
    - name: http-ssl # <<<< THIS NAME MATTERS
      port: 443
      targetPort: http
When I've experimented with this, changing that name: to either https or tcp-https seems to work. Those name prefixes are significant to Istio, but I haven't immediately found any functional difference between telling Istio the port is HTTPS (even though it doesn't actually serve TLS) vs. plain uninterpreted TCP.
You do need to search your cluster and find every Service that listens on port 443, and make sure the port name doesn't start with http-....
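A minimal sketch of the same Service with only the port renamed, following the workaround described above (assuming the tcp- prefix is used; nothing else needs to change):

apiVersion: v1
kind: Service
metadata:
  name: breaks-istio
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:...
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: http
spec:
  selector: ...
  ports:
    - name: tcp-https   # renamed so Istio no longer treats the port as plain HTTP
      port: 443
      targetPort: http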

Exposed Service and Replica Set Relation in Kubernetes

I have a question about how Kubernetes decides which pod serves a request when there are several replicas of the pod.
For instance, let's assume I have a web application running on a k8s cluster as multiple pod replicas, and they are exposed by a Service.
When a client sends a request, it goes to the Service and kube-proxy. But where and when does Kubernetes make the decision about which pod should serve the request?
I want to know the internals of Kubernetes in this matter. Can we control this? Can we decide which pod should serve based on client requests and custom conditions?
can we decide which pod should serve based on client requests and custom conditions?
kube-proxy works at L4 (load balancing), so you can control the session based on the client IP; it does not read the headers of the client request.
You can control the session with the service.spec.sessionAffinityConfig field in the Service object.
The following command provides the explanation:
kubectl explain service.spec.sessionAffinityConfig
The following paragraph from the documentation provides a more detailed answer.
Client-IP based session affinity can be selected by setting service.spec.sessionAffinity to “ClientIP” (the default is “None”), and you can set the max session sticky time by setting the field service.spec.sessionAffinityConfig.clientIP.timeoutSeconds if you have already set service.spec.sessionAffinity to “ClientIP” (the default is “10800”).
The Service object would be like this:
kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10000
A Kubernetes Service creates a load balancer (and an Endpoints object for it) and will use round robin by default to distribute requests among pods.
You can alter this behaviour.
As Suresh said, you can also use sessionAffinity to ensure that requests for a particular session value always go to the same pod.

HAProxy with Kubernetes in a DR setup

We have a Kubernetes setup hosted on premises and are trying to allow clients outside of K8s to connect to services hosted in the K8s cluster.
In order to make this work using HAProxy (which runs outside K8s), we have the following HAProxy backend configuration:
backend vault-backend
    ...
    ...
    server k8s-worker-1 worker1:32200 check
    server k8s-worker-2 worker2:32200 check
    server k8s-worker-3 worker3:32200 check
Now, this solution works, but the worker names and the corresponding nodePorts are hard-coded in this config, which obviously is inconvenient as and when more workers are added (or removed/changed).
We came across the HAProxy Ingress Controller (https://www.haproxy.com/blog/haproxy_ingress_controller_for_kubernetes/), which sounds promising, but (we feel) it effectively adds another HAProxy layer to the mix, and thus another failure point.
Is there a better solution to implement this requirement?
Now, this solution works, but the worker names and the corresponding nodePorts are hard-coded in this config, which obviously is inconvenient as and when more workers are added (or removed/changed).
You can explicitly configure the NodePort for your Kubernetes Service so it doesn't pick a random port and you always use the same port on your external HAProxy:
apiVersion: v1
kind: Service
metadata:
  name: <my-nodeport-service>
  labels:
    <my-label-key>: <my-label-value>
spec:
  selector:
    <my-selector-key>: <my-selector-value>
  type: NodePort
  ports:
    - port: <service-port>
      nodePort: 32200
We came across the HAProxy Ingress Controller (https://www.haproxy.com/blog/haproxy_ingress_controller_for_kubernetes/), which sounds promising, but (we feel) it effectively adds another HAProxy layer to the mix, and thus another failure point.
You could run the HAProxy ingress controller inside the cluster and remove the HAProxy outside the cluster, but this really depends on what type of service you are running. The Kubernetes Ingress is a Layer 7 resource, for example. DR here would be handled by having multiple replicas of your HAProxy ingress controller.
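A rough sketch of only the shape of that on the cluster side (hypothetical names; in practice the controller is normally installed from its Helm chart, which also sets up the RBAC, ConfigMap, and controller arguments this sketch omits):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: haproxy-ingress                      # hypothetical name
spec:
  replicas: 2                                # multiple replicas for DR / high availability
  selector:
    matchLabels:
      app: haproxy-ingress
  template:
    metadata:
      labels:
        app: haproxy-ingress
    spec:
      containers:
        - name: haproxy-ingress
          image: haproxytech/kubernetes-ingress   # placeholder; pin a version in practice
          ports:
            - containerPort: 80
            - containerPort: 443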

Is it possible to route traffic to a specific Pod?

Say I am running my app in GKE, and this is a multi-tenant application.
I create multiple Pods that host my application.
Now I want:
Customers 1-1000 to use Pod1
Customers 1001-2000 to use Pod2
etc.
If I have a gcloud global IP that points to my cluster, is it possible to route a request, based on the incoming IP address/domain, to the correct Pod that contains the customer's data?
You can guarantee session affinity with Services, but not as you are describing. Your customers 1-1000 won't use pod-1; they will use all the pods (as a Service does simple load balancing), but each customer, when they come back to hit your Service, will be redirected to the same pod.
Note: only within the time specified by the following field (default 10800 seconds):
service.spec.sessionAffinityConfig.clientIP.timeoutSeconds
This would be the YAML file of the Service:
kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: my-app
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
  sessionAffinity: ClientIP
If you want to specify the time as well, this is what needs to be added under spec:
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10
Note that the example above would work when hitting a ClusterIP type Service directly (which is quite uncommon) or with a LoadBalancer type Service, but won't work with an Ingress behind a NodePort type Service. This is because with an Ingress, the requests come from many, randomly chosen source IP addresses.
Not with Pods by themselves, but you should be able to with Services.
Pods are intended to be stateless and indistinguishable from one another.
But you should be able to create a Deployment per customer group, and a Service per Deployment. The NGINX Ingress can then be told to map incoming requests, by whatever attributes are relevant, to the specific customer-group Services.
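For example, if the customer group can be derived from the request host (an assumption; the hostnames, Service names, and port below are hypothetical), the Ingress rules might look roughly like this:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: customer-routing              # hypothetical
spec:
  ingressClassName: nginx
  rules:
    - host: group1.example.com        # customers 1-1000
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-group1      # Service in front of the group-1 Deployment
                port:
                  number: 80
    - host: group2.example.com        # customers 1001-2000
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-group2      # Service in front of the group-2 Deployment
                port:
                  number: 80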