Does a K8s service re-send requests when readinessProbe fails?

A service dispatches to multiple replicas of a deployment (default round-robin).
The backend instances temporarily have to go offline, i.e. they will close their port 80, take some time, and then open the port again.
deployment.yaml is using readinessProbe to regularly check which backend instances are ready to serve requests.
But what happens in the following scenario?
1) readiness check backend A: OK
2) backend A goes offline
3) a request to the service is forwarded to backend A
4) readiness check backend A: fail
Will the service send the request again but to another backend instance, or is it lost?

It depends on the type of Service.
If the Service is a ClusterIP or NodePort, it is implemented as iptables rules (by kube-proxy). Packets destined for the now-offline pod will be undeliverable, causing the request to time out.
If the Service is a LoadBalancer, the implementation is an application such as nginx or an equivalent. It will watch for timeouts and, generally speaking (though this depends on configuration), will retry, allowing the request to reach an online pod.
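For reference, here is a minimal sketch of what such a readinessProbe might look like in the deployment.yaml. The /healthz path, image name, and timing values are assumptions, not taken from the question:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend                  # placeholder name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: backend:latest    # placeholder image
        ports:
        - containerPort: 80
        readinessProbe:
          httpGet:
            path: /healthz       # assumed health endpoint on the backend
            port: 80
          periodSeconds: 5       # probes run only periodically...
          failureThreshold: 1    # ...so a pod can be offline for up to ~periodSeconds
                                 # before it is removed from the Service's endpoints

Because probes are periodic, there is always a window between steps 2 and 4 in which the endpoints still list the offline pod; a request forwarded in that window behaves exactly as the answer above describes: it times out on a ClusterIP/NodePort Service, or depends on the load balancer's retry behaviour for a LoadBalancer Service.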

Related

Google Kubernetes Engine inter-cluster session affinity (Sticky Session)

The situation is that I have 2 applications, A and B, in the same namespace of a cluster on GKE. A runs on 1 pod and B on 2 pods.
Every time a client communicates with our service, it connects first to A with websockets. A then sends an HTTP request to B. Since there are 2 pods of B, I would like session affinity between the outside client and my application B, so that every time a client connects to A, its requests are always processed by the same pod of B.
Every session affinity option I have seen is based on an Ingress gateway or services, but since I'm already inside the cluster, I don't need an Ingress.
I also saw that there are some services that provide support for HTTP cookies. That would be good, but it is always an external service like Nginx or Istio, and since I'm working in a highly restricted development environment it is kind of a pain to add those services to the cluster.
Is there anything native to GKE that can provide me with HTTP cookie session affinity or something similar?
In a GKE cluster, when you create a Kubernetes Ingress object, the GKE ingress controller wakes up and creates a Google Cloud Platform HTTP(S) load balancer. The ingress controller configures the load balancer and also configures one or more backend services that are associated with the load balancer.
Beginning with GKE version 1.11.3-gke.18, you can use an Ingress to configure these properties of a backend service:
Timeout
Connection draining timeout
Session affinity
This will be useful for you, and it's native to GKE Ingress.
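For reference, a hedged sketch of how those three properties can be set through a BackendConfig attached to the backing Service. The names below (app-b, app-b-config) are placeholders, and the exact apiVersion (cloud.google.com/v1 vs v1beta1) and annotation key depend on your GKE version:

apiVersion: cloud.google.com/v1          # v1beta1 on older GKE versions
kind: BackendConfig
metadata:
  name: app-b-config                     # placeholder name
spec:
  timeoutSec: 40                         # backend service timeout
  connectionDraining:
    drainingTimeoutSec: 60               # connection draining timeout
  sessionAffinity:
    affinityType: "GENERATED_COOKIE"     # cookie-based session affinity
    affinityCookieTtlSec: 3600
---
apiVersion: v1
kind: Service
metadata:
  name: app-b                            # placeholder name
  annotations:
    cloud.google.com/backend-config: '{"default": "app-b-config"}'
spec:
  type: NodePort                         # GKE Ingress backends are typically NodePort (or NEG-backed)
  selector:
    app: app-b
  ports:
  - port: 80
    targetPort: 8080

The Ingress that routes to this Service then picks up the session affinity settings from the associated BackendConfig.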
You could also set client IP based session affinity, which happens at the Service level, like this one:
apiVersion: v1
kind: Service
metadata:
  name: svc-sa
spec:
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    app: nginx
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
So, this Service would route requests to the same backend (Pod), depending on the source IP address of the request.
You need to set a Service in front of application B targeting the two pods anyway. Now, the problem here is that your application A is acting as a proxy, so all the requests will appear to come from application A.
I know this is not a complete answer, but you might be able to do something in application A, header-wise (X-Forwarded-For), to pass along the IP address of the original request instead of Pod A's IP address.
Every time a client communicates with our service, it connects first to A with websockets. A then sends an HTTP request to B. Since there are 2 pods of B, I would like session affinity between the outside client and my application B, so that every time a client connects to A, its requests are always processed by the same pod of B.
A communicates with B. A should connect to a specific instance of B depending on which end user is connected.
This is sharding rather than session affinity, but I understand what you mean. It means that your service B needs a stable network identity.
Sharding by User identity
Service B needs to be deployed as a StatefulSet to get a stable network identity. Then service A can do sharding, e.g. based on username or a range of IP addresses, so that requests for user X are always handled by instance Y.
With service B deployed as a StatefulSet, the instances will be named e.g. app-b-0 and app-b-1, so each instance can be addressed from service A with a stable identity.
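A minimal sketch of that setup, assuming application B serves HTTP on port 80; the names (app-b) and the image are placeholders:

apiVersion: v1
kind: Service
metadata:
  name: app-b                # headless Service that gives each pod a stable DNS name
spec:
  clusterIP: None
  selector:
    app: app-b
  ports:
  - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-b
spec:
  serviceName: app-b         # must reference the headless Service above
  replicas: 2
  selector:
    matchLabels:
      app: app-b
  template:
    metadata:
      labels:
        app: app-b
    spec:
      containers:
      - name: app-b
        image: app-b:latest  # placeholder image
        ports:
        - containerPort: 80

Application A can then address each instance directly as app-b-0.app-b and app-b-1.app-b (pod-name.service-name within the same namespace) and pick the instance based on its sharding key, e.g. a hash of the username.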

Traefik health checks via kubernetes annotation

I want to set up a Traefik backend health check via a Kubernetes annotation, but it looks like Kubernetes Ingress does not support that functionality according to the official documentation.
Is there any particular reason why Traefik does not support that functionality for Kubernetes Ingress? I'm wondering because Mesos supports health checks for a backend.
I know that in Kubernetes you can configure readiness/liveness probes for the pods, but I have a leader/follower style service, so Traefik should route traffic only to the leader.
UPD:
Only the leader can accept connections from Traefik; a follower will refuse the connection.
I have two readiness checks in mind:
Service is up and running, and ready to be elected as a leader (kubernetes readiness probe)
Service is up and running and promoted as a leader (traefik health check)
Traefik relies on Kubernetes to provide an indication of the health of the underlying pods to ascertain whether they are ready to provide service. Kubernetes exposes two mechanisms in a pod to communicate information to the orchestration layer:
Liveness checks to provide an indication to Kubernetes when the process(es) running in the pod have transitioned to a broken state. A failing liveness check will cause Kubernetes to destroy the pod and recreate it.
Readiness checks to determine when a pod is ready to provide service. A failing readiness check will cause the Endpoint Controller to remove the pod from the list of endpoints of any services it provides. However, it will remain running.
In this instance, you would expose information to Traefik via a readiness check. Configure your pods with a readiness check which fails if they are in a state in which they should not receive any traffic. When the readiness state changes, Kubernetes will update the list of endpoints against any services which route traffic to the pod to add or remove the pod. Traefik will accordingly update its view of the world to add or remove the pod from the list of endpoints backing the Ingress.
There is no reason for this model to be incompatible with your master/follower architecture, provided each pod can ascertain whether it is the master or follower and provide an appropriate indication in its readiness check. However, without taking special care, there will be races between the master/follower state changing and Kubernetes updating its endpoints, as readiness probes are only made periodically. I recommend assuming this will be the case and building in logic to reject requests received by non-master pods.
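One way to express that is a readiness probe pointed at a leader-status endpoint, so that only the current leader reports ready. The /leader and /healthz paths, image, and timings below are assumptions about your application, not Traefik or Kubernetes requirements:

apiVersion: v1
kind: Pod
metadata:
  name: leader-candidate       # placeholder; in practice this would be a Deployment/StatefulSet template
  labels:
    app: my-app
spec:
  containers:
  - name: app
    image: my-app:latest       # placeholder image
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /leader          # assumed endpoint: returns 200 only while this instance is the leader
        port: 8080
      periodSeconds: 2         # a short period narrows the race window described above
      failureThreshold: 1
    livenessProbe:
      httpGet:
        path: /healthz         # assumed endpoint: returns 200 whenever the process is alive, leader or not
        port: 8080
      periodSeconds: 10

This keeps the two checks from the question separate: liveness says "the process is healthy", readiness says "this instance is currently the leader and may receive traffic".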
As a future consideration to increase robustness, you might split the ingress layer of your service from the business logic implementing the master/follower system, allowing all instances to communicate with Traefik and enqueue work for consideration by whatever is the "master" node at this point.

Kubernetes - Sending message to all pods from service or from Ingress

Is it possible to send requests to all pods behind a service/ingress controller based on the requests?
My requirement is to send requests to all the pods if the request is /send/all.
Thanks
It's not possible because the ingress controller can't do this (for sure nginx and GLBC based ingresses can't do it, but due to the way HTTP works I assume this is the case for all ingress controllers).
Depending on what your exact case is, you have a few options.
If your case is just monitoring and you can live without control over how many requests are sent to your pods, you can just set an HTTP liveness probe for your pods. Then you can be sure that if a pod doesn't return a correct response, k8s won't send traffic to it.
If you need to trigger some action on all pods, you have a few options:
Use messaging - for example, you can use the rabbitmq chart to deploy RabbitMQ and write an application that will handle your traffic.
Use a DB - create an app that sets a flag in the DB and add some logic to your app to monitor the flag, or create a cron job that monitors the flag and triggers the required actions on the pods (in this case you can use a service account to give your cron job pod access to the k8s API to list pods; see the sketch below).
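If you go the cron job route, the plumbing that lets the job's pod list pods through the Kubernetes API could look roughly like this; all names and the image are placeholders, and the flag-checking logic itself is up to your application:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: pod-lister                 # placeholder name
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-lister
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-lister
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-lister
subjects:
- kind: ServiceAccount
  name: pod-lister
  namespace: default               # adjust to the namespace you deploy into
---
apiVersion: batch/v1               # batch/v1beta1 on clusters older than 1.21
kind: CronJob
metadata:
  name: flag-watcher               # placeholder name
spec:
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: pod-lister
          restartPolicy: OnFailure
          containers:
          - name: watcher
            image: flag-watcher:latest   # placeholder: your app that checks the DB flag
                                         # and calls each pod it finds via the API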

What does Traefik do to connections to deleted Pods?

Imagine you have a k8s cluster set up with Traefik as an Ingress Controller.
An HTTP API application is deployed to the cluster (with an ingress resource) that is able to handle SIGTERM and does not exit until all active requests are handled.
Let's say you deploy the application with 10 replicas, get some traffic to it and scale the deployment down to 5 replicas. Those 5 Pods will be pulled out from the matching Service resource.
For those 5 Pods, the application will receive SIGTERM and start the graceful shutdown.
The question is, what will Traefik do with those active connections to the pulled out 5 Pods?
Will it wait until all the responses are received from the 5 Pods and not send any traffic to them during and after that?
Will it terminate the ongoing connections to those 5 Pods and forget about them?
Traefik will do the first: it will gracefully let those pending, in-flight requests finish but not forward any further requests to the terminating pods.
To add some technical background: once a pod is deemed to terminate from Kubernetes' perspective, the Endpoints controller (also part of the Kubernetes control plane) will remove its IP address from the associated Endpoints object. Traefik watches for updates on the endpoints, receives a notification, and updates its forwarding rules accordingly. So technically, it is no longer able to forward any further traffic to those pods, while the final in-flight requests continue to be served (by previously established goroutines from Go's http package).
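For completeness, a hedged sketch of deployment settings that give this hand-off time to complete; the names and values are illustrative, and the preStop sleep assumes a sleep binary exists in the image:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-api                          # placeholder name
spec:
  replicas: 10
  selector:
    matchLabels:
      app: http-api
  template:
    metadata:
      labels:
        app: http-api
    spec:
      terminationGracePeriodSeconds: 60   # should exceed your longest expected request
      containers:
      - name: api
        image: http-api:latest            # placeholder image
        ports:
        - containerPort: 8080
        lifecycle:
          preStop:
            exec:
              command: ["sleep", "5"]     # small delay so the endpoint removal can propagate
                                          # to Traefik before SIGTERM is delivered to the app

The preStop delay is optional belt-and-braces: it simply buys a few seconds between the endpoint removal and the application beginning its graceful shutdown.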

Openshift + Readiness Check

On OpenShift/Kubernetes, when a readiness check is configured with an HTTP GET on a path (for example for a Spring Boot app with a service and route), is the HTTP GET request calling the OpenShift service, the route, or something else, and expecting 200-399?
Thanks,
B.
The kubernetes documentation on readiness and liveness probes states that
For an HTTP probe, the kubelet sends an HTTP request to the specified path and port to perform the check. The kubelet sends the probe to the pod’s IP address, unless the address is overridden by the optional host field in httpGet. [...] In most scenarios, you do not want to set the host field. Here’s one scenario where you would set it. Suppose the Container listens on 127.0.0.1 and the Pod’s hostNetwork field is true. Then host, under httpGet, should be set to 127.0.0.1. If your pod relies on virtual hosts, which is probably the more common case, you should not use host, but rather set the Host header in httpHeaders.
So it uses the Pod's IP unless you set the host field on the probe. The service or ingress route are not used here because the readiness and liveness probes are used to decide if the service or ingress route should send traffic to the Pod.
The HTTP request comes from the Kubelet. Each kubernetes node runs the Kubelet process, which is responsible for node registration, and management of pods. The Kubelet is also responsible for watching the set of Pods that are bound to its node and making sure those Pods are running. It then reports back status as things change with respect to those Pods.
When talking about the HTTP probe, the docs say that
Any code greater than or equal to 200 and less than 400 indicates success. Any other code indicates failure.
Correct, it uses an HTTP check to determine whether the container is ready to serve requests or not. By default the request is made to the Pod IP directly; when the probe fails, the container's IP is removed from the endpoints of all services that include it. This can be overridden by the host field in the probe definition.
Any response code from 200-399 is considered a success as you have mentioned.
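For illustration, a sketch of such an HTTP readiness probe on a Spring Boot style container; the path, port, image, and Host header value are assumptions:

apiVersion: v1
kind: Pod
metadata:
  name: spring-app                       # placeholder name
spec:
  containers:
  - name: app
    image: my-spring-app:latest          # placeholder image
    ports:
    - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /actuator/health           # assumed Spring Boot health endpoint
        port: 8080
        httpHeaders:
        - name: Host                     # only needed if the app relies on virtual hosts
          value: myapp.example.com
      initialDelaySeconds: 10
      periodSeconds: 5

The kubelet on the node sends this GET to the pod's IP on port 8080 and treats any 200-399 response as success, exactly as described above.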