Is there a way to isolate deployments in Istio service mesh? - kubernetes

I am trying to learn about the microservice architecture and different microservices interacting with each other. I have written a simple microservice based web app and had a doubt regarding it. If a service has multiple versions running, load balancing is easily managed by the envoy siedcar in Istio. My question is that in case there is some vulnerability detected in one of the versions, is there a way to isolate the pod from receiving any more traffic.
We can manually do this with the help of a virtual service and the appropriate routing rule. But can it be dynamically performed based on some trigger event?
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: VirtualServiceName
spec:
hosts:
- SomeHost
http:
- route:
- destination:
host: SomeHost
subset: v1
weight: 0
- destination:
host: SomeHost
subset: v2
weight: 100
Any help is appreciated

According to istio documentation you can configure failover with LocalityLoadBalancerSetting.
If the goal of the operator is not to distribute load across zones and regions but rather to restrict the regionality of failover to meet other operational requirements an operator can set a ‘failover’ policy instead of a ‘distribute’ policy.
The following example sets up a locality failover policy for regions. Assume a service resides in zones within us-east, us-west & eu-west this example specifies that when endpoints within us-east become unhealthy traffic should failover to endpoints in any zone or sub-zone within eu-west and similarly us-west should failover to us-east.
failover:
- from: us-east
to: eu-west
- from: us-west
to: us-east
Failover requires outlier detection to be in place for it to work.
But it's rather for regions/zones not pods.
If it's about pods you could take a look at this istio documentation
While Istio failure recovery features improve the reliability and availability of services in the mesh, applications must handle the failure or errors and take appropriate fallback actions. For example, when all instances in a load balancing pool have failed, Envoy returns an HTTP 503 code. The application must implement any fallback logic needed to handle the HTTP 503 error code..
And take a look at this and this github issues.
During HTTP health checking Envoy will send an HTTP request to the upstream host. By default, it expects a 200 response if the host is healthy. Expected response codes are configurable. The upstream host can return 503 if it wants to immediately notify downstream hosts to no longer forward traffic to it.
I hope you find this useful.

Related

Availability with Kubernetes

We run an internal a healthcheck of the service every 5 seconds. And we run Kubernetes liveness probes every 1 second. So in the worst scenario the Kubernetes loadbalancer has up-to-date information every 6 seconds.
My question is what happens when a client request hits a pod which is broken but not seen by the loadbalancer as unhealthy? Should the client implement a logic with retries? Or should we implement backend logic to handle the cases when a request hits a pod which is not yet seen as unhealthy by the loadbalancer?
Not sure how your architecture is however LoadBalancers are generally set with the ingress controller like Nginx and etc.
Load Balancer backed by the ingress controller forwards the traffic to the K8s service, and the K8s service mostly manages the request routing to PODs, not LB.
Based on the Readiness K8s service route the request to PODs, so if your POD is NotReady, the request won't reach there. Due to any delay if the request reaches to that POD there could be a chance you get internal error or so in return.
Retries
yes, you implement the retries at the client side but if you are on K8s, you can offload the retries part to the service mesh. it would be easy to maintain and integrate retries logic with the K8s and service mesh.
You can use the service mesh like Istio and implement the retries policy at virtual service level
retries:
attempts: 5
retryOn: 5xx
Virtual service
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: ratings
spec:
hosts:
- ratings
http:
- route:
- destination:
host: ratings
subset: v1
retries:
attempts: 3
perTryTimeout: 2s
Read more at : https://istio.io/latest/docs/concepts/traffic-management/#retries

How to replicate some traffic on Kubernetes and divert it to the Service for investigation

I'm using istio and I know that I can define weights in Virtual Service and divert traffic to different services.
My question is: how to amplify some of the traffic and direct the amplified traffic to the validation service? This amplified traffic will not go back to the original source but will be closed within the cluster. In other words, it does not bother the user.
I'm not even sure if there is an ecosystem, feature or application that provides this kind of mechanism. I don't even know if there is an ecosystem or application that provides such a mechanism and I don't know what it is called, so I'm having trouble finding it.
Thanks.
OP found a soultion by themselves, in the comments, hence the CW.
In this scenario, the best solution is to use Istio Mirroring - also called shadowing.
When traffic gets mirrored, the requests are sent to the mirrored service with their Host/Authority headers appended with -shadow. For example, cluster-1 becomes cluster-1-shadow.
Also, it is important to note that these requests are mirrored as “fire and forget”, which means that the responses are discarded.
To create mirroring rule, you have to create VirtualService with mirror and mirrorPercentage fields.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: httpbin
spec:
hosts:
- httpbin
http:
- route:
- destination:
host: httpbin
subset: v1
weight: 100
mirror:
host: httpbin
subset: v2
mirrorPercentage:
value: 100.0
This route rule sends 100% of the traffic to v1. The last stanza specifies that you want to mirror (i.e., also send) 100% of the same traffic to the httpbin:v2 service.
[source]

istio-proxy closing long running TCP connection after 1 hour

TL;DR: How can we configure istio sidecar injection/istio-proxy/envoy-proxy/istio egressgateway to allow long living (>3 hours), possibly idle, TCP connections?
Some details:
We're trying to perform a database migration to PostgreSQL which is being triggered by one application which has Spring Boot + Flyway configured, this migration is expected to last ~3 hours.
Our application is deployed inside our kubernetes cluster, which has configured istio sidecar injection. After exactly one hour of running the migration, the connection is always getting closed.
We're sure it's istio-proxy closing the connection as we attempted the migration from a pod without istio sidecar injection and it was running for longer than one hour, however this is not an option going forward as this may imply some downtime in production which we can't consider.
We suspect this should be configurable in istio proxy setting the parameter idle_timeout - which was implemented here. However this isn't working, or we are not configuring it properly, we're trying to configure this during istio installation by adding --set gateways.istio-ingressgateway.env.ISTIO_META_IDLE_TIMEOUT=5s to our helm template.
If you use istio version higher than 1.7 you might try use envoy filter to make it work. There is answer and example on github provided by #ryant1986.
We ran into the same problem on 1.7, but we noticed that the ISTIO_META_IDLE_TIMEOUT setting was only getting picked up on the OUTBOUND side of things, not the INBOUND. By adding an additional filter that applied to the INBOUND side of the request, we were able to successfully increase the timeout (we used 24 hours)
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
name: listener-timeout-tcp
namespace: istio-system
spec:
configPatches:
- applyTo: NETWORK_FILTER
match:
context: SIDECAR_INBOUND
listener:
filterChain:
filter:
name: envoy.filters.network.tcp_proxy
patch:
operation: MERGE
value:
name: envoy.filters.network.tcp_proxy
typed_config:
'#type': type.googleapis.com/envoy.config.filter.network.tcp_proxy.v2.TcpProxy
idle_timeout: 24h
We also created a similar filter to apply to the passthrough cluster (so that timeouts still apply to external traffic that we don't have service entries for), since the config wasn't being picked up there either.
for ingress gateway, we use env.ISTIO_META_IDLE_TIMEOUT to set the idle-timeout for TCP or HTTP protocol.
for sidecar, you can use the similar envoyfilter (listener-timeout-tcp) to configure INBOUND direction or OUTBOUND direction.

GKE No load balancing between backends pod with session affinity(sticky sessions)

I have 2 backend application running on the same cluster on gke. Applications A and B. A has 1 pod and B has 2 pods. A is exposed to the outside world and receives IP address that he then sends to B via http requests in the header.
B has a Kubernetes service object that is configured like that.
apiVersion: v1
kind: Service
metadata:
name: svc-{{ .Values.component_name }}
namespace: {{ include "namespace" .}}
spec:
ports:
- port: 80
targetPort: {{.Values.app_port}}
protocol: TCP
selector:
app: pod-{{ .Values.component_name }}
type: ClusterIP
In that configuration, The http requests from A are equally balanced between the 2 pods of application B, but when I add sessionAffinity: ClientIP to the configuration, every http requests are sent to the same B pod even though I thought it should be a round robin type of interaction.
To be clear, I have the IP adress stored in the header X-Forwarded-For so the service should look at it to be sure to which B pod to send the request as the documentation says https://kubernetes.io/docs/concepts/services-networking/service/#ssl-support-on-aws
In my test I tried to create has much load has possible to one of the B pod to try to contact the second pod without any success. I made sure that I had different IPs in my headers and that it wasn't because some sort of proxy in my environment. The IPs were not previously used for test so it is not because of already existing stickiness.
I am stuck now because I don't know how to test it further and have been reading the doc and probably missing something. My guess was that sessionAffinity disable load balancing for ClusterIp type but this seems highly unlikely...
My questions are :
Is the comportment I am observing normal? What am I doing wrong?
This might help to understand if it is still unclear what I'm trying to say : https://stackoverflow.com/a/59109265/12298812
EDIT : I did test on the client upstream and I saw at least a little bit of the requests get to the second pod of B, but this load test was performed from the same IP for every request. So this time I should have seen only a pod get the traffic...
The behaviour suggests that x-forward-for header is not respected by cluster-ip service.
To be sure I would suggest to load test from upstream client service which consumes the above service and see what kind of behaviour you get. Chances are you will see the same incorrect behaviour there which will affect scaling your service.
That said, using session affinity for internal service is highly unusual as client IP addresses do not vary as much. Session affinity limits scaling ability of your application. Typically you use memcached or redis as session store which is likely to be more scalable than session affinity based solutions.

Weighted routing over kubernetes services

I have one master service and multiple slave services. The master service continuously polls a topic using subscriber from Google PubSub. The Slave services are REST APIs. Once the master service receives a message, it delegates the message to a slave service. Currently I'm using ClusterIP service in Kubernetes. Some of my requests are long running and some are pretty short.
I happen to observe that sometimes if there's a short running request while a long running request is in process, it has to wait until the long running request to finish even though many pods are available without serving any traffic. I think it's due to the round robin load balancing. I have been trying to find a solution and looked into approaches like setting up external HTTP load balancer with ingress and internal HTTP load balancer. But I'm really confused about the difference between these two and which one applies for my use case. Can you suggest which of the approaches would solve my use case?
TL;DR
assuming you want 20% of the traffic to go to x service and the rest 80% to y service. create 2 ingress files for each of the 2 targets, with same host name, the only difference is that one of them will carry the following ingress annotations: docs
nginx.ingress.kubernetes.io/canary: "true" #--> tell the controller to not create a new vhost
nginx.ingress.kubernetes.io/canary-weight: "20" #--> route here 20% of the traffic from the existing vhost
WHY & HOW TO
weighted routing is a bit beyond the ClusterIP. as you said yourself, its time for a new player to enter the game - an ingress controller.
this is a k8s abstraction for a load balancer - a powerful server sitting in front of your app and routing the traffic between the ClusterIPs.
install ingress controller on gcp cluster
once you have it installed and running, use its canary feature to perform a weighted routing. this is done using the following annotations:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: http-svc
annotations:
kubernetes.io/ingress.class: nginx
nginx.ingress.kubernetes.io/canary: "true"
nginx.ingress.kubernetes.io/canary-weight: "20"
spec:
rules:
- host: echo.com
http:
paths:
- backend:
serviceName: http-svc
servicePort: 80
here is the full guide.
External vs internal load balancing
(this is the relevant definition from google cloud docs but the concept is similar among other cloud providers)
GCP's load balancers can be divided into external and internal load
balancers. External load balancers distribute traffic coming from the
internet to your GCP network. Internal load balancers distribute
traffic within your GCP network.
https://cloud.google.com/load-balancing/docs/load-balancing-overview