Multiple Host Kubernetes Ingress Controller - kubernetes

I've been studying Kubernetes for a few weeks now, and using the kube-lego NGINX examples (https://github.com/jetstack/kube-lego) have successfully deployed services to Kubernetes cluster using Rancher on DigitalOcean.
I've deployed sample static sites, Wordpress, Laravel, Craft CMS, etc. All of which use custom Namespaces, Deployment, Secrets, Containers with external registries, Services, and Ingress Definitions.
Using the example (lego) NGINX Ingress Controller setup, I'm able to apply DNS to the exposed IP address of my K8s cluster, and have the resulting sites appear.
What I don't know, though, is how to allow for multiple hosts to have Ingress Controllers service the same deployments, and thus provide HA Ingress to the cluster. (by applying an external load balancer service, or geo-ip, or what-have-you).
Rancher (stable) allows me to add multiple hosts, I've spun up 3 to 5 at a time, and Kubernetes is configured and deployed across all Hosts. Furthermore, I'll define many replicas and/or deployments (listed above) and they will be spread over the cluster and accessible as would be expected. I've even specified multiple replicas of the Ingress Controller, but of course they all get scheduled on the same host, giving me only one IP address of Ingress.
So how do I allow multiple hosts (each with their own public facing IP address) to allow ingress into the cluster? I've also read about setting up multiple Ingress Controllers, but then you must specify what deployment/services are being serviced by what Ingress Controller, which then totally defeats the purpose.
Maybe I'm missing something, but if K8s multi-host is supposed to provide HA, and the Host with the Ingress Controller goes down, then the service will be rescheduled on the other Hosts, but the IP address that everything is pointing to will be dead, and thus an outage. Any way to have multiple IP Addresses to the same set of deployment/services?

I investigated my setup a bit more today, and I think I found out why I was having difficulty. The "LoadBalancer" is often mentioned as for use with Cloud Providers (in both docs, and what #fiunchinho describes). I was using it with a Rancher setup, which auto creates an HA-Proxy LoadBalancer ingress for you on the hosts.
By default, it will just schedule it on one of the hosts. You can specify that you want it scheduled globally buy providing an 'annotation' of io.rancher.scheduler.global: "true".
Like so:
annotations:
# Create load balancers on every host in the environment
io.rancher.scheduler.global: "true"
http://rancher.com/docs/rancher/v1.6/en/rancher-services/load-balancer/
I preferred LoadBalancer over NodePort because I wanted the ability to send port 80 (and in the future port 443) to any of the Nodes, and have them successfully fulfil my request by inspecting the Host header, and directing as-needed.
These LBs can also be setup in the Rancher UI under the "Infrastructure Stack" menu. I have successfully removed the single LB, and re-added one with an "Always run one instance of this container on every host" option enabled.
After this was configured, I could make a request to any of the Hosts for any of the Ingresses, and get a response, no matter what host the container was scheduled on.
https://rancher.com/docs/rancher/v1.6/en/rancher-services/load-balancer/
So cool!

The ingress controller is deployed like any regular pod. That means that you can have as many replicas as you'd like, which will be spread among all your nodes.
You need a Service object that group all the pods for the ingress controller.
Then you just need to expose that Service to outside the cluster. You can do that using a LoadBalancer service if you are on a cloud provider. Or you can use just a NodePort service.
The point is that the service will balance the traffic that your ingress controller receives between all the pods that are running on different kubernetes nodes. If one of the nodes goes down, it doesn't really matter, because there are other nodes containing ingress controller pods.

Related

How to expose app deployed on Rancher k3s to the Internet

I've different deployments over different namespaces and I would like to expose some of them to the Internet, even if I don't have a static and public IP available.
The different services are deployed on Rancher k3s and every service which should be publicly accessible has an Ingress defined in the same namespace.
I was trying to follow Rancher - How to expose my services publicly?, but I didn't really get what I've to do and, moreover:
Why do we need to define a LoadBalancer? It seems to me that the IngressController used by k3s (Traefik?) already creates one. If this is a must (or a good way to go), how it should the service defined exactly?
I don't have any Rancher UI in my environment. Therefore, is there a way to achieve what described in that link in a declarative way?
Is there a way to use services like No-IP or FreeDNS for the final hostname?
If I get it right, you deployed Kubernetes manually on barebone/vms nodes and now you want to reach you deployments running inside that cluster.
There is two level of loadbalancing in this setup, the one managed by your ingress controller, sounds like it is traefik in your case, and it is recommanded to run a second L4 load balancer in front of your workers to reach the ingress pods that are usually deployed on multiple/all nodes. Traefik, or other lb controllers, will load balancer traffic inside the k8s cluster without issue even if you don't have a L4 load balancer, but it is not recommanded as if you loose this node, no traffic can reach the kubernetes cluster anymore. You "just" need to have your dns resolution pointing at your public ip and routed to one of your worker, or the LB in front of it. However, if you don't have a L4 LB, you'll need to have your ingress pods listening on ports 80 and/or 443.
Most things that you do in Rancher UI is just an easier way to see your k8s objects, all ingress configuration can be achieved via kubectl, k9s (strongly recommand thatone!), lens or other methods. However k8s objects are still k8s objects. In this case, you need to have your services exposed with ClusterIP that are then reachable by the ingress pods.
I've never used such a solution natively from k8s, but when I had too the internet router was able to do this part, once you're there, it is internal routing.
I hope this helps. Ingress can definitely be a tough one to grasp!

Real IP (Domains and Subtomains) on Bare Metal Cluster with MatalLB and Ingress

help me figure it out.
I have a Bare Metal Kubernetes cluster with three nodes, each node has a public ip.
I have installed MetalLB and IngressController.
It is not clear to me which IP should I redirect domains and subdomains to so that they can be resolved by the Ingress Controller?
I need to initially define on which node the Ingress Controller will be launched?
I need to install the Ingress Controller, and then look at the worker node, on which it will be installed and send all domains or subdomains there?
What happens if, after restarting the cluster, the ingress controller will be deployed on another node?
All the tutorials I've seen show how it works locally or with a cloud load balancer.
Help me understand how this should work correctly.
Usually, when you install MetalLB, you configure a pool of addresses which can be used to assign new IPs at LoadBalancer services whenever they are created. Such IP addresses need to be available, they cannot be created out of nothing of course.. they could be in lease from your hosting provider for example.
If instead you have a private Bare Metal cluster which serves only your LAN network, you could just select a private range of IP addresses which are not used.
Then, once MetalLB is running, what happens is the following:
Someone / something creates a LoadBalancer services (an HELM Chart, a user with a definition, with commands, etc)
The newly created service needs an external IP. MetalLB will select one address from the configured selected range and assign it to that service
MetalLb will start to announce using standard protocol that the IP address can now be reached by contacting the cluster, it can work either in Layer2 mode (one node of the cluster holds that additional IP address) or BGP (true load balancing across all nodes of the cluster)
From that point, you can just reach the new service by contacting this newly assigned IP address (which is NOT the ip of any of the cluster nodes)
Usually, the Ingress Controller will just bring a LoadBalancer service (which will grab an external IP address from MetalLb) and then, you can reach hte Ingress Controller from that IP.
As for your other questions, you don't need to worry about where the Ingress Controller is running or similar, it will be automatically handled.
The only thing you may want to do is to make the domain names which you want to serve point to the external IP address assigned to the Ingress Controller.
Some docs:
MetalLB explanations
Bitnami MetalLB chart
LoadBalancer service docs
As an alternative (especially when you want "static" ip addresses) I should mention HAProxy, installed external to kubernetes cluster in a bare_server/vm/lxc_container/etc. and configured to send all incoming 80/433 traffic to the NodePort of ingress controller on all kubernetes workers (if no ingress pod is running on that worker traffic will be forwarded by kubernetes).
Of course, nowadays ip addresses are also "cattle", not "pets" anymore, so MetalLB is more of a "kubernetish" solution, but who knows ...
This is the link describing HAProxy solution (I am not affiliated with the author):
https://itnext.io/bare-metal-kubernetes-with-kubeadm-nginx-ingress-controller-and-haproxy-bb0a7ef29d4e

GKE: ingres with sub-domain

I used to work with Openshift/OKD cluster deployed in AWS and there it was possible to connect cluster to some domain name from Route53. Then as soon as I was deploying ingress with some hosts mappings (and the hosts defined in ingres were subdomains of the basis domain) all necessary lb rules (Routes in Openshift) and subdomain itself were created by Openshift and were directly available. For example: Openshift is connected to domain "somedomain.com" which is registered in Route53. In ingress I have the host mapping like:
hosts:
- host: sub1.somedomain.com
paths:
- path
After deployment I can reach sub1.somedomain.com. Is this kind of functionality available in GKE?
So far I have seen only mapping to static IP.
Also I red here https://cloud.google.com/kubernetes-engine/docs/how-to/ingress-http2 that if I need to connect service with ingress, the service have to be of type NodePort. Is it realy so? In Openshift it was not required any normal ClusterIP service could be connected to ingress.
Thanks in advance!
I think you should consider the other Ingress Controllers for your use cases.
I'm not an expert of the GKE, but as I can see Best practices for enterprise multi-tenancy as follows,
you need to consider how to route the multiple Ingress hostnames through wildcard subdomain like the OpenShift additionally.
Set up HTTP(S) Load Balancing with Ingress
:
You can create and configure an HTTP(S) load balancer by creating a Kubernetes Ingress resource,
which defines how traffic reaches your Services and how the traffic is routed to your tenant's application.
By registering Services with the Ingress resource, the Services' naming convention becomes consistent,
showing a single ingress, such as tenanta.example.com and tenantb.example.com.
The routing feature depends on the Ingress Controllers basically.
In my finding, the default Ingress Controllers of the GKE just creates a Google Cloud HTTP(S) Load Balancer, but it does not consider multi-tenancy by default like the OpenShift.
In contrast, in the OpenShift, the Ingress Controller was implemented using HAProxy with dynamic configuration feature as follows.
LB -tenanta.example.com--> HAProxy(directly forward the tenanta.example.com traffic to the target pod IPs) ---> Target Pods
The type of service exposition depends on the K8S implementation on each cloud provider.
If the ingress controller is a component inside your cluster, a ClusterIP is enough to have your service reachable (internally from inside the cluster itself)
If the ingress definition configure an external element (in case of GKE, a load balancer), this element isn't a part of the cluster and can't know the ClusterIP (because it is only accessible internally). A node port is required in this case.
So, in your case, either you expose your service in NodePort, or you configure GKE with another Ingress controller, locally installed in the cluster, instead of using this one by default.
So far GKE does not provide the possibility to dynamically create subdomains. The wished situation would be if GKE cluster can be set some DNS zone managed in GCP and there is a mimik of OpenShift Routes using for example ingress annotations.
But the reality tight now - you have to create subdomain or domain youself as well as IP address wich you connect this domain to. And this particular GCP IP address (using name) can be connected to ingress using annotations. Or it can be used in loadbalancer service.

How does Traefik / Ngnix - (Ingress Controllers) forwards request to two different services having configured with same port number.?

Basically I have Following Hdfs Cluster setup using docker-compose:
Node 1 with IP: 192.168.1.1 having service deployed as below:
Namenode1:9000
HMaster1: 8300
ZooKeeper1:1291
Node 2 with IP: 192.168.1.2 having service deployed as below:
Namenode2:9000
ZooKeeper2:1291
How does Traefik / Ngnix - (Ingress Controllers) forwards request to two different services having configured with same port number?
There are several great tutorials on how ingress and load balancing works in kubernetes, e.g. this one by Mark Betz. As a general rule, it helps to think in terms of services and workloads instead of specific nodes where your workloads are running on.
A workload deployed in Kubernetes (a so called Pod) has its own internal IP address, called a ClusterIP. That pod can have one or more ports open, just on that pod-owned ip address.
If you now have several pods to distribute the load, e.g. like 5 web server processes or backend logic, it would be hard for a client (inside the cluster) to keep track of all those pod IPs, because they also change when a pod is updated or just restarted due to a crash. This is why Kubernetes has a so called concept of services. Those provide a stable DNS name and IP which then transparently "forwards" to one of the healthy pods. So your client only needs to know the DNS name and not keep track of the specific pod IPs.
If you now want to expose such a service to the public, there are different ways. Either you set your service to type: LoadBalancer which then sets up some load balancer infrastructure on your cloud provider and routes traffic to the nodes and then to the pods - or - you already have an ingress controller in place and just define the routing based on host names and paths. An ingress controller itself is such a loadbalanced service with an attached cloud load balancer and also has some pods (with e.g. a traefik or nginx container) which then route your packets accordingly.
So coming back to your initial question: If you want to expose a service with several pods of the same kind, then you would first create a Service resource that matches your Pods using the selector and then you create one single ingress resource that provides a hostname/path and references this service. The ingress controller will pick up those ingress resources and configure the traefik or nginx accordingly. The ingress controller doesn't really care about the host IPs and port numbers, because it acts on the internal kubernetes ClusterIPs, so you even don't need (and shouldn't) expose such a service directly when you have an ingress in place.
I hope this answers your question regarding exposing two workloads over an ingress controller. For details, check the Kubernetes docs on Ingresses. Based on the services you named (zookeeper, hdfs) load balancing and ingresses might not be what you need for that case. Zookeeper instances should be internal in most cases and need to be adressed individually, so you might want to check out headless services, for this use case. Also check the Kubernetes docs for a way to run zookeeper.

Why do we need a load balancer to expose kubernetes services using ingress?

For a sample microservice based architecture deployed on Google kubernetes engine, I need help to validate my understanding :
We know services are supposed to load balance traffic for pod replicaset.
When we create an nginx ingress controller and ingress definitions to route to each service, a loadbalancer is also setup automatically.
had read somewhere that creating nginx ingress controller means an nginx controller (deployment) and a loadbalancer type service getting created behind the scene. I am not sure if this is true.
It seems loadbalancing is being done by services. URL based routing is
being done by ingress controller.
Why do we need a loadbalancer? It is not meant to load balance across multiple instances. It will just
forward all the traffic to nginx reverse proxy created and it will
route requests based on URL.
Please correct if I am wrong in my understanding.
A Service type LoadBalancer and the Ingress is the way to reach your application externally, although they work in a different way.
Service:
In Kubernetes, a Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service). The set of Pods targeted by a Service is usually determined by a selector (see below for why you might want a Service without a selector).
There are some types of Services, and of them is the LoadBalancer type that permit you to expose your application externally assigning a externa IP for your service. For each LoadBalancer service a new external IP will be assign to it.
The load balancing will be handled by kube-proxy.
Ingress:
An API object that manages external access to the services in a cluster, typically HTTP.
Ingress may provide load balancing, SSL termination and name-based virtual hosting.
When you setup an ingress (i.e.: nginx-ingress), a Service type LoadBalancer is created for the ingress-controller pods and a Load Balancer in you cloud provider is automatically created and a public IP will be assigned for the nginx-ingress service.
This load balancer/public ip will be used for incoming connection for all your services, and nginx-ingress will be the responsible to handle the incoming connections.
For example:
Supose you have 10 services of LoadBalancer type: This will result in 10 new publics ips created and you need to use the correspondent ip for the service you want to reach.
But if you use a ingress, only 1 IP will be created and the ingress will be the responsible to handle the incoming connection for the correct service based on PATH/URL you defined in the ingress configuration. With ingress you can:
Use regex in path to define the service to redirect;
Use SSL/TLS
Inject custom headers;
Redirect requests for a default service if one of the service failed (default-backend);
Create whitelists based on IPs
Etc...
A important note about Ingress Load balancing in ingress:
GCE/AWS load balancers do not provide weights for their target pools. This was not an issue with the old LB kube-proxy rules which would correctly balance across all endpoints.
With the new functionality, the external traffic is not equally load balanced across pods, but rather equally balanced at the node level (because GCE/AWS and other external LB implementations do not have the ability for specifying the weight per node, they balance equally across all target nodes, disregarding the number of pods on each node).
An ingress controller(nginx for example) pods needs to be exposed outside the kubernetes cluster as an entry point of all north-south traffic coming into the kubernetes cluster. One way to do that is via a LoadBalancer. You could use NodePort as well but it's not recommended for production or you could just deploy the ingress controller directly on the host network on a host with a public ip. Having a load balancer also gives ability to load balance the traffic across multiple replicas of ingress controller pods.
When you use ingress controller the traffic comes from the loadBalancer to the ingress controller and then gets to backend POD IPs based on the rules defined in ingress resource. This bypasses the kubernetes service and load balancing(by kube-proxy at layer 4) offered by kubernetes service.Internally the ingress controller discovers all the POD IPs from the kubernetes service's endpoints and directly route traffic to the pods.
It seems loadbalancing is being done by services. URL based routing is being done by ingress controller.
Services do balance the traffic between pods. But they aren't accessible outside the kubernetes in Google Kubernetes Engine by default (ClusterIP type). You can create services with LoadBalancer type, but each service will get its own IP address (Network Load Balancer) so it can get expensive. Also if you have one application that has different services it's much better to use Ingress objects that provides single entry point. When you create an Ingress object, the Ingress controller (e.g. nginx one) creates a Google Cloud HTTP(S) load balancer. An Ingress object, in turn, can be associated with one or more Service objects.
Then you can get the assigned load balancer IP from ingress object:
kubectl get ingress ingress-name --output yaml
As a result your application in pods become accessible outside the kubernetes cluster:
LoadBalancerIP/url1 -> service1 -> pods
LoadBalancerIP/url2 -> service2 -> pods