Is it possible to take pods directly offline in a Kubernetes load balancer?

I have an app running on three pods behind a load balancer, all set up with Kubernetes. My problem is that when I take pods down or update them, I get a couple of 503s before the load balancer notices the pod is unavailable and stops sending traffic to it. Is there any way to inform the load balancer directly that it should stop sending traffic to a pod, so we can avoid the 503s during pod updates?

Keep in mind that if the pods are down, the load balancer will still redirect traffic to the designated service ports, even though no pod is serving on those ports.
Hence you should use the rolling update mechanism in Kubernetes, which gives zero-downtime deployments. See the Kubernetes documentation on rolling updates.

Since there are 3 pods running behind a load balancer, I believe you must be using a Deployment/StatefulSet to manage them.
If by updating the pods you mean updating the Docker image version running in the pod, then you can make use of the update strategies of a Deployment to do a rolling update. This will update your pods with zero downtime.
Additionally, you can make use of startup, readiness and liveness probes to only direct traffic to a pod when it is ready/live to serve traffic.
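For example, a Deployment can be configured so that a replacement pod must be ready before an old one is taken down. This is a minimal sketch; the app name, image and probe endpoint are placeholder assumptions:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                 # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0        # never remove a pod before its replacement is ready
      maxSurge: 1              # create at most one extra pod during the update
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: myrepo/my-app:v2    # hypothetical image
        readinessProbe:
          httpGet:
            path: /healthz         # assumed health endpoint of the app
            port: 80
With maxUnavailable: 0, the old pod keeps serving traffic until the new pod passes its readinessProbe, which avoids the 503s described in the question.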

You should implement probes for pods. Read Configure Liveness, Readiness and Startup Probes.

There are readinessProbes and livenessProbes which you can make use of. In your case I think you can make use of a readinessProbe: only when the readinessProbe passes will Kubernetes start sending traffic to your Pod.
For example
apiVersion: v1
kind: Pod
metadata:
  name: my-nginx-pod
spec:
  containers:
  - name: my-web-server
    image: nginx
    readinessProbe:
      httpGet:
        path: /        # must be a path the container actually serves
        port: 80       # the stock nginx image listens on port 80
In the above example, the nginx Pod will only receive traffic once it passes the readinessProbe.
You can find more about probes here: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

Related

What's the default Kubernetes policy to distribute requests for an internal ClusterIP service?

I have been wondering how an internal Kubernetes service distributes the load of requests made from within the cluster to all Pods associated with the service. For example, given the following simple service from the K8s docs:
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  selector:
    app.kubernetes.io/name: MyApp
  ports:
  - protocol: TCP
    port: 80
    targetPort: 9376
I understand that the default service type is ClusterIP when the type property is not specified. But I couldn't find any docs clearly stating how requests for this kind of service are distributed across the selected Pods.
The farthest I've got was this post, where there's a comment from Tim Hockin stating the following:
The load is random, but the distribution should be approximately equal for non-trivial loads. E.g. when we run tests for 1000 requests you can see it is close to equal.
Is this the policy followed by ClusterIP services? Can someone give more clarity on this topic?
The request load distribution depends on what proxy mode is configured in your cluster on the kube-proxy. Often the chosen configuration is iptables. According to the documentation on it:
In this mode, kube-proxy watches the Kubernetes control plane for the addition and removal of Service and EndpointSlice objects. For each Service, it installs iptables rules, which capture traffic to the Service's clusterIP and port, and redirect that traffic to one of the Service's backend sets. For each endpoint, it installs iptables rules which select a backend Pod.
By default, kube-proxy in iptables mode chooses a backend at random.
Usually this configuration is fine, as probability will spread your requests somewhat evenly across pods. But if you need more control over that, you can change the configuration to IPVS mode, where you can use round robin, least connections, among other options. More information on it can be found in the kube-proxy documentation.
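For illustration, switching kube-proxy to IPVS with a round-robin scheduler is done through its configuration object. This is a minimal sketch; how the config is delivered to kube-proxy (e.g. via its ConfigMap) depends on how your cluster was set up:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  scheduler: "rr"    # round robin; "lc" selects least connections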
I hope this helps.

Can a deployment be completed even when readiness probe is failing

I have an application running in Kubernetes as a StatefulSet that starts 2 pods. It has a liveness probe and a readiness probe configured.
The liveness probe calls a simple /health endpoint that responds when the server is done loading.
The readiness probe waits for a start-up job to complete. The job can take several minutes in some cases, and only when it finishes is the API of the application ready to start accepting requests.
Even when the API is not available, my app also runs side jobs that don't depend on it, and I expect them to be done while the startup is happening too.
Is it possible to force the Kubernetes deployment to complete and deploy 2 pods, even when the readiness probe is still not passing?
From the docs I get that the only effect of a readiness probe not passing is that the current pod won't be included as available in the load balancer service (which is actually the only effect that I want).
If the readiness probe fails, the endpoints controller removes the Pod's IP address from the endpoints of all Services that match the Pod.
However, I am also seeing that the deployment never finishes, since the first pod's readiness probe is not passing and the second pod is never created.
kubectl rollout restart statefulset/pod
kubectl get pods
NAME READY STATUS RESTARTS AGE
pod-0 1/2 Running 0 28m
If a readiness probe failure always prevents the deployment, is there another way to selectively expose only ready pods in the load balancer, while not marking them as Unready during the deployment?
Thanks in advance!
StatefulSet deployment
Is it possible to force kubernetes deployment to complete and deploy 2 pods, even when the readiness probe is still not passing?
Assuming statefulSet is meant instead of deployment as the object, the answer is no, it's not possible by design; the most important is the second point:
For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.
Before a scaling operation is applied to a Pod, all of its predecessors must be Running and Ready.
When Pods are being deleted, they are terminated in reverse order, from {N-1..0}.
When the nginx example above is created, three Pods will be deployed in the order web-0, web-1, web-2. web-1 will not be deployed before web-0 is Running and Ready, and web-2 will not be deployed until web-1 is Running and Ready.
StatefulSets - Deployment and Scaling Guarantees
Readiness probe, endpoints and potential workaround
If a readiness probe failure always prevents the deployment, is there another way to selectively expose only ready pods in the load balancer, while not marking them as Unready during the deployment?
This is by design; pods are added to service endpoints only once they are in the ready state.
A potential workaround can be used; at least in a simple example it does work. However, you should try it and evaluate whether this approach suits your case; it is fine to use as an initial deployment.
The statefulSet can be started without the readiness probe included; this way the statefulSet will start the pods one by one, each when the previous one is running and ready. The liveness probe may need initialDelaySeconds set so Kubernetes won't restart the pod thinking it's unhealthy. Once the statefulSet is fully up and ready, you can add the readiness probe to the statefulSet.
When the readiness probe is added, Kubernetes will recreate all pods again, starting from the last one, and your application will need to start again.
The idea is that all pods start and become able to serve requests at roughly the same time, whereas with the readiness probe applied from the start, only one pod starts in, for instance, 5 minutes, the next pod takes 5 minutes more, and so on.
Example
A simple example to see what's going on, based on the nginx webserver and a sleep 30 command which, when the readiness probe is set up, makes Kubernetes think the pod is not ready:
Apply the headless service
Comment out the readiness probe in the statefulSet and apply the manifest
Observe that each pod is created right after the previous pod is running and ready
Uncomment the readiness probe and apply the manifest
Kubernetes will recreate all pods, starting from the last one, this time waiting for the readiness probe to complete before flagging a pod as running and ready.
It is very convenient to use this command to watch the progress:
watch -n1 kubectl get pods -o wide
nginx-headless-svc.yaml:
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
nginx-statefulset.yaml:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        command: ["/bin/bash", "-c"]
        args: ["sleep 30 ; echo sleep completed ; nginx -g \"daemon off;\""]
        readinessProbe:
          tcpSocket:
            port: 80
          initialDelaySeconds: 1
          periodSeconds: 5
Update
Thanks to @jesantana for this much easier solution.
If all pods have to be scheduled at once and it's not necessary to wait for pod readiness, .spec.podManagementPolicy can be set to Parallel. See Pod Management Policies.
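For example, based on the StatefulSet manifest above, the relevant change is a single field (a minimal sketch):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
spec:
  podManagementPolicy: Parallel   # launch/terminate all pods in parallel instead of one by one
  serviceName: "nginx"
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
Note that Parallel only affects scale-up and scale-down; pods that are not Ready are still excluded from Service endpoints until their readiness probe passes.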
Useful links:
Kubernetes StatefulSets
Kubernetes liveness, readiness and startup probes

Kubernetes local cluster Pod hostPort - application not accessible

I am trying to access a web API deployed to my local Kubernetes cluster running on my laptop (Docker -> Settings -> Enable Kubernetes). Below is my Pod spec YAML.
kind: Pod
apiVersion: v1
metadata:
  name: test-api
  labels:
    app: test-api
spec:
  containers:
  - name: testapicontainer
    image: myprivaterepo/testapi:latest
    ports:
    - name: web
      hostPort: 55555
      containerPort: 80
      protocol: TCP
kubectl get pods shows the test-api pod running. However, when I try to connect to it using http://localhost:55555/testapi/index from my laptop, I do not get a response. But I can access the application from a container in a different pod within the cluster (I did a kubectl exec -it into a different container), using the URL http://<test-api pod cluster IP>/testapi/index. Why can't I access the application using the localhost:hostPort URL?
I'd say that this is strongly not recommended.
According to k8s docs: https://kubernetes.io/docs/concepts/configuration/overview/#services
Don't specify a hostPort for a Pod unless it is absolutely necessary. When you bind a Pod to a hostPort, it limits the number of places the Pod can be scheduled, because each <hostIP, hostPort, protocol> combination must be unique. If you don't specify the hostIP and protocol explicitly, Kubernetes will use 0.0.0.0 as the default hostIP and TCP as the default protocol.
If you only need access to the port for debugging purposes, you can use the apiserver proxy or kubectl port-forward.
If you explicitly need to expose a Pod's port on the node, consider using a NodePort Service before resorting to hostPort.
So... Is the hostPort really necessary in your case? Or would a NodePort Service solve it?
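For comparison, a NodePort Service for this pod could look like the sketch below. The nodePort value 30555 is an arbitrary pick from the default 30000-32767 range; with Docker Desktop such a port is typically also reachable via localhost:
apiVersion: v1
kind: Service
metadata:
  name: test-api
spec:
  type: NodePort
  selector:
    app: test-api        # matches the label of the Pod above
  ports:
  - protocol: TCP
    port: 80             # service port inside the cluster
    targetPort: 80       # containerPort of the Pod
    nodePort: 30555      # exposed on every node's IP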
If the hostPort is really necessary, then you could try using the IP returned by the command:
kubectl get nodes -o wide
http://ip-from-the-command:55555/testapi/index
Also, another test that may help your troubleshooting is checking whether your app is accessible on the Pod IP.
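One way to run that test (a sketch; the curlimages/curl image and the /testapi/index path are taken as assumptions from the question):
kubectl get pod test-api -o wide    # shows the Pod IP and the node it runs on
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- curl http://<pod-ip>/testapi/index
Replace <pod-ip> with the IP from the first command; if this works but localhost:55555 does not, the problem is in the hostPort exposure, not the app.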
UPDATE
I've done some tests locally and understood better what the documentation is trying to explain. Let me go through my test:
First I created a Pod with hostPort: 55555; I did that with a simple nginx.
Then I listed my Pods and saw that this one was running on one specific Node.
Afterwards I tried to access the Pod on port 55555 through my master node's IP and another node's IP without success, but when trying to access it through the IP of the Node where the Pod was actually running, it worked.
So the "issue" (and actually that's why this approach is not recommended) is that the Pod is accessible only through that specific Node's IP. If it restarts and starts on a different Node, the IP will also change.

Same node affinity on Kubernetes

I have nginx deployment pods as a frontend that communicate with uwsgi deployment pods as a backend through a ClusterIP service.
I want the nginx pods to prefer the uwsgi pod that's running on their own node.
Is it possible to do that with node affinity without naming nodes?
If you want to run the nginx pod on the same node as a uwsgi pod, use pod affinity.
podAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchExpressions:
      - key: app
        operator: In
        values:
        - uwsgi
    topologyKey: kubernetes.io/hostname   # required field; "same node" topology
For more details about pod affinity and anti-affinity, see the Kubernetes documentation.
The provisioning of pods and their ordering/scheduling on the same node can be achieved via node affinity. However, if you want Kubernetes to decide it for you, you will have to use inter-pod affinity.
Also, just to verify that you are doing everything the right way, please refer to this pod-affinity example.
From what I understand, you have nginx pods and uwsgi pods, and the nginx pods proxy traffic to the uwsgi pods. You are trying to make the nginx pods proxy traffic to uwsgi pods that are on the same node.
The previously posted answers are only partially valid. Let me explain why.
Using podAffinity will indeed schedule nginx and uwsgi pods together, but it won't affect load balancing. The nginx <-> uwsgi load balancing will stay unchanged (it will be random).
The easiest thing you can do is to run the nginx container and the uwsgi container in the same pod and make them communicate over localhost. That way you are making sure that:
nginx and uwsgi always get scheduled on the same node
the connection over localhost forces traffic to stay inside the pod
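A minimal sketch of such a two-container pod; the uwsgi image name and port are assumptions, and nginx would be configured to proxy_pass to http://127.0.0.1:8000:
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx                # proxies to 127.0.0.1:8000
    ports:
    - containerPort: 80
  - name: uwsgi
    image: myrepo/uwsgi-app     # hypothetical application image
    ports:
    - containerPort: 8000       # containers in a pod share one network namespace
In practice you would use this as the pod template of a Deployment rather than a bare Pod, so both containers scale together.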
Let me know if this approach solves your problem, or whether for some reason we should try a different approach.

traefik kubernetes crd health check

I am using Traefik 2.1 with Kubernetes CRDs. My setup is very similar to the user guide. In my application, I have defined a livenessProbe and a readinessProbe on the deployment. I assumed that Traefik would route requests to the Kubernetes load balancer, and Kubernetes would know if the pod was ready or not. Kubernetes would also restart the container if the livenessProbe failed. Is there a default healthCheck for Kubernetes CRDs? Does Traefik use the load balancer provided by the Kubernetes service, or does it get the IPs of the pods underneath the service and route directly to them? Is it recommended to use a healthCheck with Traefik CRDs? Is there a way to not have to repeat the config for the readinessProbe and the Traefik CRD healthCheck? Thank you
Is there a default healthCheck for Kubernetes CRDs?
No
Does Traefik use the load balancer provided by the Kubernetes service, or does it get the IPs of the pods underneath the service and route directly to them?
No. It directly gets the IPs from the Endpoints object.
Is it recommended to use a healthCheck with Traefik CRDs?
Traefik will update its configuration when it sees that the Endpoints object does not have IPs, which happens when the liveness/readiness probe fails. So you can configure readiness and liveness probes on your pods and expect Traefik to honour that.
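For example, the pod template of the deployment could carry both probes (a minimal sketch; the /health path, port, and names are assumptions):
containers:
- name: my-app                 # hypothetical container
  image: myrepo/my-app
  ports:
  - containerPort: 8080
  livenessProbe:
    httpGet:
      path: /health            # failing here makes the kubelet restart the container
      port: 8080
  readinessProbe:
    httpGet:
      path: /health            # failing here removes the pod's IP from the Endpoints
      port: 8080               # object, which Traefik watches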
Is there a way to not have to repeat the config for the readinessProbe and Traefik CRD healthCheck?
The benefit of using the CRD approach is its dynamic nature. Even if you are using the CRD along with the health check mechanism provided by the CRD, the liveness and readiness probes on the pods are still necessary, both for Kubernetes to restart failing pods and to avoid sending traffic to unready pods from other pods that use the corresponding Kubernetes service.