I'm fairly new to K8s but not so new that I haven't got a couple of running stacks and even a production site :)
I've noticed that in a new deployment the ingress below:
Type: Ingress
Load balancer: External HTTP(S) LB
is reporting "All backend services are in UNHEALTHY state", which is odd since the service is working and traffic has been served from it for a week.
Now, on closer inspection, Backend services: k8s-be-32460--etcetc is what's unhappy. So, using the GUI, I click that...
Then I start to see the frontend with a funnel for Asia, Europe, and America, which seems to be funneling all traffic to Europe. Presumably this is normal for the distributed external load balancer service (as per the docs), since my cluster resides in Europe. Cool. Except...
k8s-ig--etcetc europe-west1-b 1 of 3 instances healthy
1 out of 3 instances you say? eh?
And this is about as far as I've got so far. Can anyone shed any light?
Edit:
OK, so one of the nodes reporting as unhealthy was in fact a node from the default-node-pool. I have now scaled that pool back to 0 nodes since, as far as I'm aware, the preference is to manage node pools explicitly. That leaves just 2 nodes, 1 of which is unhealthy according to the ingress, despite both being in the same zone.
Digging even further, the GUI somehow reports that only one of the instance group's instances is healthy, yet these instances are auto-created by GCP; I don't manage them.
Any ideas?
Edit 2:
I followed this right the way through: SSH'd to each of the VMs in the instance group and executed the health check on each node. One does indeed fail.
It's just a simple curl localhost:32460: one routes and the other doesn't, though there is something listening on 32460, as shown here:
tcp6 0 0 :::32460 :::* LISTEN -
The health check is HTTP, on path /, against port 32460.
Any ideas why a single node would have stopped working? As I say, I'm not savvy with how the underlying VM has been configured.
I'm now wondering whether it's just some sort of straightforward routing issue, but it's extremely convoluted at this point.
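For reference, a minimal version of the per-node comparison described above might look like this (the port comes from this question; the kube-proxy checks are assumptions about how the GKE VMs are set up):
# Run on each VM in the instance group and compare the results:
curl -sv http://localhost:32460/            # the health check's path and port
sudo iptables-save -t nat | grep 32460      # NodePort rules kube-proxy should have installed
ps aux | grep [k]ube-proxy                  # is kube-proxy running on this node at all?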
This works for me:
In my case, I was exposing an API that didn't have a default route; that is to say, if I hit myIp, it returned a 404 (not found) error. So, as a test, I put a "default" route in my Startup.cs, like this:
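// In Startup.cs, inside the Configure method: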
app.UseEndpoints(endpoints =>
{
endpoints.MapGet("/", async context =>
{
await context.Response.WriteAsync("Hola mundillo");
});
endpoints.MapControllers();
});
Then the status changed from unhealthy to OK. Maybe that isn't a definitive solution, but it might help someone find the error.
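For the GKE ingress specifically, an alternative that avoids adding an endpoint to the app: the ingress controller can derive its backend health check from the pod's readinessProbe, so pointing the probe at any existing path that returns 200 may achieve the same effect. A sketch; the deployment name, path, and port here are hypothetical:
# Probe an existing 200-returning path instead of adding a "/" route.
kubectl patch deployment myapi --type=json -p '[
  {"op": "add",
   "path": "/spec/template/spec/containers/0/readinessProbe",
   "value": {"httpGet": {"path": "/swagger/index.html", "port": 80}}}
]'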
Related
I was using NodePort to host a web app on Google Container Engine (GKE). It allows you to point your domains directly at the node IP address, instead of at an expensive Google load balancer. Unfortunately, instances are created with HTTP ports blocked by default, and an update locked down manually changing the nodes, as they are now created using an Instance Group and an immutable Instance Template.
I need to open port 443 on my nodes; how do I do that with Kubernetes or GCE? Preferably in an update-resistant way.
Related github question: https://github.com/nginxinc/kubernetes-ingress/issues/502
Using port 443 on your Kubernetes nodes is not standard practice. If you look at the docs you can see the --service-node-port-range option (a kube-apiserver flag, not a kubelet one), which defaults to 30000-32767. You could change it to 443-32767 or something. Note that every port under 1024 is restricted to root.
In summary, it's not a good idea/practice to run your Kubernetes services on port 443. A more typical scenario would be an external nginx/haproxy proxy that sends traffic to the NodePorts of your service. The other option you mentioned is using a cloud load balancer but you'd like to avoid that due to costs.
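For illustration, here is a sketch of what changing that range looks like on a self-managed control plane (this flag is not adjustable on GKE's managed masters):
# Sketch: widen the NodePort range so a Service may claim port 443.
# Set on the API server of a self-managed cluster; not possible on GKE.
kube-apiserver --service-node-port-range=443-32767 ...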
Update: A DaemonSet with a NodePort can handle the port opening for you. nginx/k8s-ingress has a NodePort on 443 which gets exposed by a custom firewall rule. The GCE UI will not show "Allow HTTPS traffic" as checked, because it's not using the default rule.
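The custom firewall rule mentioned above might look something like this (a sketch; the rule name and target tag are hypothetical and should match whatever tag your node instances carry):
# Allow HTTPS straight to the tagged nodes, bypassing the default rules.
gcloud compute firewall-rules create allow-nodeport-443 \
  --allow tcp:443 \
  --target-tags https-server \
  --description "Allow HTTPS directly to GKE nodes"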
You can do everything you can do in the Google Cloud Console GUI using the Cloud SDK, most easily through the Google Cloud Shell. Here is the command for adding a network tag to a running instance. This works even though the GUI disabled the ability to do so:
gcloud compute instances add-tags gke-clusty-pool-0-7696af58-52nf --zone=us-central1-b --tags https-server,http-server
This also works on the beta, meaning it should continue to work for a bit.
See https://cloud.google.com/sdk/docs/scripting-gcloud for examples on how to automate this. Perhaps consider running on a webhook when downtime is detected. Obviously none of this is ideal.
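A sketch of that kind of automation (the name filter and zone are assumptions matching the example instance above):
# Re-tag every node in the pool, e.g. from a cron job or a webhook handler.
for instance in $(gcloud compute instances list \
    --filter='name~^gke-clusty' --format='value(name)'); do
  gcloud compute instances add-tags "$instance" \
    --zone=us-central1-b --tags=https-server,http-server
done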
Alternatively, you can change the templates themselves. With this method you can also add a startup script to new nodes, which allows you to do things like fire a webhook with the new IP address for round-robin, low-downtime dynamic DNS.
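A sketch of that startup-script idea (the webhook URL is a placeholder; the metadata URL is GCE's standard endpoint for an instance to learn its own external IP):
#! /bin/bash
# Instance startup script: report this node's new external IP to a webhook.
EXTERNAL_IP=$(curl -s -H "Metadata-Flavor: Google" \
  "http://metadata.google.internal/computeMetadata/v1/instance/network-interfaces/0/access-configs/0/external-ip")
curl -s -X POST "https://dns-updater.example.com/hook" -d "ip=${EXTERNAL_IP}"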
Source (he had the opposite problem, his problem is our solution): https://stackoverflow.com/a/51866195/370238
If I understand correctly: if nodes can be destroyed and recreated at any time, how can you rest assured that a given service behind a port is reliably available in production without some sort of load balancer that takes care of diverting port traffic to the new node(s)?
So I'm setting up a NATS cluster at work in OpenShift. I can easily get things to work by having each NATS server instance broadcast its Pod IP to the cluster. The guy I talked to at work strongly advised against using the Pod IP and suggested using the Pod name instead. In his email, he said something about what happens if a pod restarts. But when I tried deleting the pod, the new Pod IP showed up in the list of connect URLs for NATS and it worked fine. I know Kubernetes has DNS and you can use a headless service, but that seems somewhat flaky to me. The Pod IP works.
I believe "the guy at work" has a point, to a certain extent, but it's hard to tell to which extent it's cargo-culting and what is half knowledge. The point being: the pod IPs are not stable, that is, every time a pod gets re-launched (on the same node or somewhere else, doesn't matter) it will get a new IP from the pod CIDR-range assigned.
Now, services provide stability by introducing a virtual IP (VIP): this acts as a cluster-internal mini load balancer sitting in front of pods, and yes, the recommended way to talk to pods, in the general case, is via services. Otherwise, you'd need to keep track of the pod IPs out-of-band; no bueno.
Bottom line: if NATS manages that for you, that is, keeps track of and maps pod IPs, then fine, use it; no harm done.
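You can see that VIP-versus-pod-IP split directly with kubectl (a sketch, assuming a hypothetical Service named nats):
kubectl get svc nats        # one stable virtual IP (the ClusterIP)
kubectl get endpoints nats  # the current pod IPs behind it, which churn as pods restart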
While the answer from Michael is mostly true, it is important to understand that there is no 100% guarantee that a Service IP (aka ClusterIP) will never change. There is a specific case, service recreation (delete/create), that will cause the Service IP to change.
That said, the situation is somewhat different for software that has its own means of autodiscovery and/or clustering. For such software a single regular service is usually not enough: the instances need to connect to a seed, or discover all nodes, etc. One of the means you might use here is a headless Service, which returns, under a given name, the full list of direct pod IPs.
Mind that using a headless Service has its tiny quirks as well; e.g., not all software re-resolves DNS after initial startup, so you might end up with cached endpoints that become obsolete over time.
You might also want to leverage the StatefulSet's capability to retain a deterministic name (aka network identity) for each pod (i.e. mypod-1, mypod-2, etc.), which, combined with a headless Service, gives you static per-pod names to use, as in the sketch below.
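A minimal sketch of that combination (all names are hypothetical): a headless Service (clusterIP: None) plus a StatefulSet gives each pod a stable DNS name such as nats-0.nats.default.svc.cluster.local, regardless of IP churn.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nats
spec:
  clusterIP: None        # headless: DNS returns the pod IPs directly
  selector:
    app: nats
  ports:
  - port: 4222
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: nats      # ties per-pod DNS names to the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
      - name: nats
        image: nats:latest
        ports:
        - containerPort: 4222
EOF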
I do think that using only pod IPs will probably lead to issues in one edge case or another, so you should at least use one of the above solutions for cluster discovery/registration. For actual communication during and after the pod has registered in the cluster, using pod IPs can actually be for the best.
I have a StatefulSet with pods server-0, server-1, etc. I want to expose them directly to the internet with URLs like server-0.mydomain.com or like mydomain.com/server-0.
I want to be able to scale the StatefulSet and automatically be able to access the new pods from the internet. For example, if I scale it up to include a server-2, I want mydomain.com/server-2 to route requests to the new pod when it's ready. I don't want to have to also scale some other resource or create another Service to achieve that effect.
I could achieve this with a custom proxy service that just checks the request path and forwards to the correct pod internally, but this seems error-prone and wasteful.
Is there a way to cause an Ingress to automatically route to different pods within a StatefulSet, or some other built-in technique that would avoid custom code?
I don't think you can do it. Being part of the same StatefulSet, all pods up to pod-x are targeted by a Service. As you can't define which pod is going to get a request, you can't force "pod-1.yourapp.com" or "yourapp.com/pod-1" to be sent to pod-1. It will be sent to the Service, and the Service might send it to pod-4.
Even if you could, you would need to dynamically update your ingress rules, which can easily cause minutes of downtime.
With the custom proxy, I see it as impossible too. Note that it would basically need to replace the Service behind the pods. If your ingress controller knows it needs to deliver a packet to a Service, you now have to force it to deliver to your proxy instead. But how?
A Kubernetes Service is a set of iptables (or IPVS) rules that redirect a packet with the Service IP as its destination address to ONE OF THE PODS that have the same label.
From the Kubernetes Services documentation:
The service installs iptables rules which select a backend Pod. By default, the choice of backend is random.
Which refers to the fact that a service is not able to distinguish between different pods in the same set.
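You can see this random selection on any node where kube-proxy runs in iptables mode (a sketch; the exact chain names are generated per service):
# Each KUBE-SVC chain jumps to one backend (KUBE-SEP) chain at random:
sudo iptables-save -t nat | grep KUBE-SVC
# typical rule: -A KUBE-SVC-... -m statistic --mode random --probability 0.5 -j KUBE-SEP-...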
Forcing the selection of a specific Pod out of the set, whether by changing the iptables rules (fairly simple) or by adding any type of proxy, is problematic:
Let's say you configured pod-1 and pod-2 (1.1.1.1 and 1.1.1.2 respectively), and you configured iptables rules to DNAT requests with destination pod-1.myserver.com to 1.1.1.1, and the same for pod-2. (You may ask why the IP; it's simply because it's the only way to distinguish between these pods.)
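A rough sketch of that kind of rule (the external address 203.0.113.10 is a hypothetical placeholder standing in for pod-1.myserver.com):
# Send traffic arriving for pod-1's published address to pod-1's current IP.
iptables -t nat -A PREROUTING -p tcp -d 203.0.113.10 --dport 80 \
  -j DNAT --to-destination 1.1.1.1:80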
This approach will fail whenever a pod restarts. Say pod-1 dies: Kubernetes won't recreate it with the same IP (nor, outside a StatefulSet, even the same name; you'll get a pod-3 instead), and while the Service's own iptables are updated accordingly, your custom rules are not. As a result, all packets going toward 1.1.1.1 will be dropped until you update the proxy or the iptables again.
In fact, that's one of the reasons we use a Service to access pods instead of accessing them directly: the Pod IP can change, but the Service IP won't.
However, since this very specific part of Kubernetes was my work for the last 4 months, I developed a Python script to edit the iptables and select a specific pod. My conclusion from that work is that it's costly and time-consuming, and it forces the server to go offline for a couple of seconds whenever the pods change. You can take a look at the code; it definitely works, but it's not recommended.
This problem is a Kubernetes problem, and the solution is changing the source code of kube-proxy, which is my current work.
I suggest you read my answer explaining exactly how Kubernetes Services work in this question: Which service is doing load balancing between kubernetes nodes?
We have a bunch of OWIN-based Web API services hosted in an Azure Service Fabric cluster. All of those services have been mapped to different ports in the associated load balancer. There are 2 out-of-the-box probes created with the cluster: FabricGatewayProbe and FabricHttpGatewayProbe. We added our port rules and used FabricGatewayProbe in them.
For some reason, these service endpoints seem to be going to sleep after a period of inactivity, because clients of those services are timing out. We tried adjusting the load balancer idle timeout period to 30 minutes (the maximum). It seemed to help immediately, but only for a brief period, and then we were back to timeout errors.
Where else should I be looking for resolution of this problem?
So further to our comments I agree that the documentation is open to interpretation, but after doing some testing I can confirm the following:
When creating a new cluster via the portal, it will give you a 1:1 relation of rule to probe, and I have also been able to reproduce your issue by modifying one of my existing ARM templates to reuse an existing probe, as you have.
On reflection this makes sense: a probe is effectively bound to a service. If you attempt to share a probe across rules on different ports, how will the load balancer know whether each of the services is actually up? Also, Service Fabric (depending on your instance count settings) will move the services between nodes.
So if you had two services on different ports using the same probe, on different nodes, the service not using the probe's port would receive the error that the request took too long to respond.
A little long-winded, so hopefully a quick illustration will help show what I mean.
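As a rough illustration of the one-probe-per-rule setup with the Azure CLI (the resource group, load balancer, rule names, and port are hypothetical):
# One dedicated probe per load-balancing rule, instead of reusing FabricGatewayProbe.
az network lb probe create -g MyRG --lb-name MyLB -n probe-8081 \
  --protocol Tcp --port 8081
az network lb rule create -g MyRG --lb-name MyLB -n rule-8081 \
  --protocol Tcp --frontend-port 8081 --backend-port 8081 \
  --probe-name probe-8081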
One of the kubelet's start parameters is:
--api-servers=[]: List of Kubernetes API servers for publishing events, and reading pods and services. (ip:port), comma separated.
It appears to be designed for API server HA: as long as at least one API server is alive, everything should work.
But I found that the kubelet only ever chooses the first API server, even if I give it 3 API servers. If the first API server is stopped, all the services become unavailable.
The version I used is:
Kubernetes v1.2.1
So is there any way to avoid this issue? Hopefully I'm just using it the wrong way; otherwise I may fix it in the kubelet.
Any comments are appreciated.
This is expected.
In short, the current model for HA expects load balancing (e.g., GCLB/ELB/nginx/haproxy) in front of the apiservers, so that node components don't have to be aware of multiple apiservers. However, it's recognized that there is a need to pass multiple apiserver endpoints to Kubernetes components, and this is slotted to be fixed in Kubernetes v1.4.
See the detailed discussions in https://github.com/kubernetes/kubernetes/issues/18174
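For illustration, the load-balanced setup described in that answer might be sketched like this (all addresses are hypothetical, and port 8080 assumes the insecure apiserver port common in that era):
# Sketch: haproxy in front of three apiservers; every kubelet then points at
# the single balanced endpoint rather than a list of apiservers.
cat >/etc/haproxy/haproxy.cfg <<'EOF'
listen kube-apiserver
    bind 10.0.0.100:8080
    mode tcp
    balance roundrobin
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    server apiserver1 10.0.0.1:8080 check
    server apiserver2 10.0.0.2:8080 check
    server apiserver3 10.0.0.3:8080 check
EOF
systemctl restart haproxy
# Each kubelet: --api-servers=http://10.0.0.100:8080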