Azure Traffic Manager and Kubernetes Service showing Degraded - kubernetes

We're trying to implement a Traffic Manager on top of our Azure Kubernetes services so we can run a cluster in two regions (UK West and UK South) and balance across both regions.
The actual Traffic Manager seems to be working OK, but in the Azure portal it's showing as degraded, and in the ingress controller logs on the k8s cluster I can see a request that looks like this:
[18/Sep/2019:10:40:58 +0000] "GET / HTTP/1.1" 404 153 "-" "Azure Traffic Manager Endpoint Monitor" 407 0.000 [-]
So the Traffic Manager is firing off a request, it's hitting the ingress controller, but it obviously can't resolve that path, so it's returning a 404.
I've had a play about with the Custom host header setting to point it to a health check endpoint in one of the pods; it did kind of work for a bit, but then it seemed to go back to doing a GET on /, so it went into degraded again (yeah, I know, sounds odd).
Even if that worked, I don't really want to have to point it at a specific pod endpoint in case that pod is actually down for some reason. Is there something we can do in the ingress controller config to make it respond with a 200 so the Traffic Manager knows that it's up?
Cheers

I would suggest switching to TCP-based probing for a quick fix. You can change the monitoring protocol to TCP and choose the port your AKS ingress is listening on.
If the TCP three-way handshake to that port fails, the probe is considered failed.
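For what it's worth, a TCP probe boils down to a plain connection attempt; here's a minimal Go sketch of an equivalent check (the address and timeout are placeholders, not your real endpoint):

package main

import (
	"fmt"
	"net"
	"time"
)

// tcpProbe mimics what a TCP health probe does: it only checks that the
// TCP three-way handshake completes, then closes the connection.
func tcpProbe(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false // handshake failed -> probe considered failed
	}
	conn.Close()
	return true
}

func main() {
	// Placeholder endpoint; substitute the public IP/port your AKS ingress listens on.
	fmt.Println(tcpProbe("203.0.113.10:443", 5*time.Second))
}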

Why not expose a simple health check endpoint on the same pod where the app is hosted, rather than on a different pod? If you deploy a workaround that returns HTTP 200 from the ingress controller and the backend is down, traffic will still be routed to it, which defeats the purpose of having a probe.
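A minimal sketch of that idea in Go, assuming a hypothetical /healthz path and port 8080 served by the same process as the app, so a 200 genuinely means the backend is up:

package main

import (
	"log"
	"net/http"
)

func main() {
	// Hypothetical health endpoint served by the same process as the app,
	// so a 200 here actually means this backend is up.
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		w.Write([]byte("ok"))
	})

	// ... the rest of the app's routes would be registered here ...

	log.Fatal(http.ListenAndServe(":8080", nil))
}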

Related

Service-to-Service Communication in Kubernetes

I have deployed my Kubernetes cluster on EKS. I have an ingress-nginx which is exposed via a load balancer to route traffic to different services. In ingress-nginx, the first request goes to an auth service for authentication, and if it is a valid request then I allow it to move forward.
Let's say the request is in Service 1 and from there it wants to communicate with Service 2. I want my request to go directly to the ingress, not via the load balancer, and then from the ingress to Service 2.
Is it possible to do so?
Will it help in improving performance, since I bypassed the load balancer?
As the request is not moving through the load balancer, load balancing won't take place; is that a serious concern?
1/ Is it possible: short answer, no.
There are edge cases that would require someone to create another Ingress object exposing Service 2 in the first place. Then you could trick the Ingress into routing you to some service that might not otherwise be reachable (if the DNS doesn't exist, some VIP was not yet exposed, ...).
There's no real issue with external clients bypassing the ELB, as long as they cannot reach all ports on your nodes, just the ones bound by your ingress controller.
2/ Bypassing the load balancer: it won't change much in terms of performance.
If we're talking about a TCP load balancer, getting rid of it would help with tracking real client IPs, though. Figuring out how to swap it for an HTTP load balancer may be better -- though not always easy.
3/ Removing the LoadBalancer: if you have several nodes hosting replicas of your ingress controller, then you would still be able to do some kind of DNS-based load balancing. Though for sure, it's not the same as having a real LB.
In AWS, you could find a middle ground by setting up health-check-based Route 53 records: set one for each node hosting an ingress controller, create another regrouping all healthy ingress nodes, then change your existing ingress FQDN records so they'd all point to your new Route 53 name. You'd be able to do TCP/HTTP checks against EC2 instance IPs, which is usually good enough. But again: DNS load balancing can suffer from outdated browser caches, some ISPs not refreshing zones, ... an LB is the real thing.

GCP kubernetes - Ingress reporting: All backend services are in UNHEALTHY state

I'm fairly new to K8s but not so new that I haven't got a couple of running stacks and even a production site :)
I've noticed in a new deployment the ingress as below:
Type: Ingress
Load balancer: External HTTP(S) LB
It is reporting "All backend services are in UNHEALTHY state", which is odd since the service is working and traffic has been served from it for a week.
Now, on closer inspection Backend services: k8s-be-32460--etcetc is what's unhappy. So using the GUI I click that...
Then I start to see the frontend with a funnel for ASIA, Europe, & America. Which seems to be funneling all traffic to Europe. Presumably, this is normal for the distributed external load balancer service (as per the docs) and my cluster resides in Europe. Cool. Except...
k8s-ig--etcetc europe-west1-b 1 of 3 instances healthy
1 out of 3 instances you say? eh?
And this is about as far as I've got so far. Can anyone shed any light?
Edit:
OK, so one of the nodes reporting as unhealthy was in fact a node from the default node pool. I have now scaled that pool back to 0 nodes, since as far as I'm aware the preference is to manage nodes explicitly. That leaves us with just two nodes, one of which is unhealthy according to the ingress, despite both being in the same zone.
Digging even further, the GUI somehow reports that only one of the instance group's instances is healthy, yet these instances are auto-created by GCP; I don't manage them.
Any ideas?
Edit 2:
I followed this right the way through, SSHed to each of the VMs in the instance group, and executed the health check on each node. One does indeed fail.
It's just a simple curl localhost:32460; one routes and the other doesn't, though there is something listening on 32460, as shown here:
tcp6 0 0 :::32460 :::* LISTEN -
The health check is HTTP against / on port 32460.
Any ideas why a single node would have stopped working? As I say, I'm not savvy with how this underlying VM has been configured.
I'm now wondering whether it's just some sort of straightforward routing issue, but it's extremely convoluted at this point.
This works for me:
In my case, I was exposing an API that didn't have a default route; that is to say, if I browsed to my IP, it returned a 404 error (not found). So, as a test, I added a "default" route in my Startup.cs, like this:
app.UseEndpoints(endpoints =>
{
    endpoints.MapGet("/", async context =>
    {
        await context.Response.WriteAsync("Hola mundillo");
    });
    endpoints.MapControllers();
});
Then the status changed from unhealthy to OK. Maybe that isn't a definitive solution, but it might help someone find the error.

Disable health check logs in ingress

I would like to disable the logging of the health checks produced by my Ingress on my pods.
I have a GCE ingress distributing traffic to two pods, and I would like to clean up the logs I get from them.
Do you have any ideas?
Thanks,
(It's not clear what you mean by disabling logs, so I'll make an assumption.)
If your application logs something when it gets a request, you can check the user agent of the request to filter out requests coming from Google Load Balancer health checking.
When you provision a GCE ingress, your app will get a Google Cloud HTTP Load Balancer (L7). This LB will make health requests with header:
User-agent: GoogleHC/1.0
I recommend checking for the header case-insensitively ("user-agent") and then doing another case-insensitive check to see if its value starts with "googlehc".
This way, you can distinguish Google HTTP (L7) load balancer health requests and leave them out of your logs.
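A minimal sketch of that filtering as Go HTTP middleware, assuming your app does its own request logging (the port and routes are placeholders):

package main

import (
	"log"
	"net/http"
	"strings"
)

// skipHealthCheckLogs wraps a handler and only logs requests whose
// User-Agent does not start with "googlehc" (case-insensitive).
func skipHealthCheckLogs(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ua := strings.ToLower(r.Header.Get("User-Agent"))
		if !strings.HasPrefix(ua, "googlehc") {
			log.Printf("%s %s %s", r.RemoteAddr, r.Method, r.URL.Path)
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	log.Fatal(http.ListenAndServe(":8080", skipHealthCheckLogs(mux)))
}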

How to expose redirection between pods outside k8s cluster

I'm trying to set up a Druid cluster with k8s and I'm seeking help on how to expose redirection between pods outside the k8s cluster.
Say I have two ClusterIP services to expose pods outside k8s.
service(10.0.0.1:8080) -> pod(hostname: coordinator)
service1(10.0.0.2:8080) -> pod1(hostname: coordinator1)
pod and pod1 are Druid coordinators in a group, communicating via ZooKeeper. Since pod is the leader, every request to pod1 will be redirected to pod.
In this setup, I'm fine with service, but I'm facing redirection issues while visiting service1.
When I visit service1 (10.0.0.2:8080) via a browser, I'm redirected to pod via its hostname, i.e. coordinator:8081.
However, coordinator is unknown outside the k8s cluster and is thus unreachable.
Could you please give me some suggestions on how to deal with this situation? Any tips are appreciated.
Here is the output after running wget -S -O - 10.0.0.1:8081:
--2017-07-21 16:36:18-- http://10.0.0.1:8081/
Connecting to 10.0.0.1:8081... connected.
HTTP request sent, awaiting response...
HTTP/1.1 307 Temporary Redirect
Date: Fri, 21 Jul 2017 08:36:25 GMT
Location: http://coordinator:8081/
Content-Length: 0
Server: Jetty(9.3.16.v20170120)
Location: http://coordinator:8081/ [following]
--2017-07-21 16:36:18-- http://coordinator:8081/
Resolving coordinator (coordinator)... failed: Temporary failure in name resolution.
wget: unable to resolve host address 'coordinator'
One solution that comes to my mind (although a bit over-engineered) is to make sure you always hit only the leader.
If you can create a readiness check against your pods that returns OK only when the pod is the leader, and link them to a common Service, that Service will always direct traffic to the active leader.
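A rough sketch of such a readiness endpoint in Go, with isLeader() as a placeholder for however your Druid version exposes leader status; the pod's readinessProbe would then point at /ready so only the leader stays in the Service's endpoints:

package main

import (
	"log"
	"net/http"
)

// isLeader is a placeholder for however you determine leadership,
// e.g. by querying the local coordinator's own leader-status endpoint.
func isLeader() bool {
	// ... hypothetical check against the local Druid coordinator ...
	return true
}

func main() {
	http.HandleFunc("/ready", func(w http.ResponseWriter, r *http.Request) {
		if isLeader() {
			w.WriteHeader(http.StatusOK) // pod stays in the Service endpoints
			return
		}
		w.WriteHeader(http.StatusServiceUnavailable) // non-leader is taken out
	})
	log.Fatal(http.ListenAndServe(":8081", nil))
}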
The problem is not with kubernetes. The application server (Jetty) issues the redirect (to 'coordinator'; not 'coordinator1/2' which is another problem), not kube-proxy.
Probably the simplest solution is to set up an nginx reverse proxy inside your cluster to handle that redirect.
(You also don't need two Services like you have now.)
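To illustrate the mechanics (in Go rather than nginx), here is a sketch of a small reverse proxy that forwards to the coordinator Service and rewrites the internal hostname in any Location header back to an externally reachable name; the service name, hostnames, and ports are placeholders:

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
	"strings"
)

func main() {
	// Internal Service the coordinators sit behind (placeholder name).
	target, _ := url.Parse("http://coordinator-service:8081")
	proxy := httputil.NewSingleHostReverseProxy(target)

	// Rewrite redirects so external clients never see the internal hostname.
	proxy.ModifyResponse = func(resp *http.Response) error {
		if loc := resp.Header.Get("Location"); loc != "" {
			resp.Header.Set("Location",
				strings.Replace(loc, "coordinator:8081", "druid.example.com", 1))
		}
		return nil
	}

	log.Fatal(http.ListenAndServe(":8080", proxy))
}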
nginx + confd
I had exactly the same problem and found the best solution to be installing an nginx reverse proxy that forwards all traffic to the current leader. The nginx config gets updated by confd, which watches ZooKeeper for changes at the overlord/coordinator discovery keys.
To set things up, I found this article quite helpful.

How to get the real ip in the request in the pod in kubernetes

I need to get the real IP from the request for my business logic. Currently I get 10.2.100.1 every time in my test environment. Is there any way to do this?
This is the same question as GCE + K8S - Accessing referral IP address and How to read client IP addresses from HTTP requests behind Kubernetes services?.
The answer, copied from them, is that this isn't yet possible in the released versions of Kubernetes.
Services go through kube-proxy, which answers the client connection and proxies through to the backend (your web server). The address that you'd see would be the IP of whichever kube-proxy the connection went through.
Work is being actively done on a solution that uses iptables as the proxy, which will cause your server to see the real client IP.
Try to get the IP of the Service that is associated with those pods.
One very roundabout way right now is to set up an HTTP liveness probe and watch the IP it originates from. Just be sure to also respond to it appropriately or it'll assume your pod is down.
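A sketch of that trick in Go: log whatever source address the probe arrives from while still returning 200 so the kubelet doesn't consider the pod down (the path and port are placeholders):

package main

import (
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		// Log where the request appears to come from. Behind kube-proxy this
		// is typically the proxy's IP; X-Forwarded-For may carry the real one
		// if an HTTP load balancer in front sets it.
		log.Printf("probe from RemoteAddr=%s X-Forwarded-For=%q",
			r.RemoteAddr, r.Header.Get("X-Forwarded-For"))
		w.WriteHeader(http.StatusOK) // always answer, or the probe will fail
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}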