How does traffic get routed to an ingress controller? - kubernetes

The following questions are about an on-prem K3S setup.
1] How does HTTP/S traffic reach an ingress controller in say K3S?
When I hit any of my nodes on HTTPS port 443 I get the traefik ingress controller. This must be "magic" though because:
There is no process on the host listening on 443 (according to lsof)
The actual nodePort on the traefik service (of type LoadBalancer) is 30492
2] Where is the traefik config located inside the ingress controller pod?
When I shell into my traefik pods I cannot find the config anywhere - /etc/traefik does not even exist. Is everything done via API (from Ingress resource definitions) and not persisted?
3] Is ingress possible without any service of type LoadBalancer? I.e. can I use a nodePort service instead by using an external load balancer (like F5) to balance traffic between nodes and these nodeports?
4] Finally, how do the traefik controller pods "know" when a node is down and stop sending/balancing traffic to pods which no longer exist?

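Port forwarding is what maps traffic arriving on port 443 to the Traefik ingress controller. In K3s, the bundled ServiceLB sets up forwarding rules (iptables) on each node for services of type LoadBalancer, so no host process has to listen on 443 and lsof shows nothing. The nodePort is a separate entry point: by default, NodePorts are only allocated from the range 30000-32767.
Refer to this documentation for more information on port forwarding.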
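To make the distinction between the two entry points concrete, here is a minimal Python sketch that checks whether a port falls in the default NodePort range quoted above. The helper name is_default_node_port is hypothetical, and the range is assumed to be the unmodified default.

```python
# Default NodePort range quoted above; a cluster can be configured differently.
NODE_PORT_MIN = 30000
NODE_PORT_MAX = 32767


def is_default_node_port(port: int) -> bool:
    """Return True if `port` lies in the default NodePort range (inclusive)."""
    return NODE_PORT_MIN <= port <= NODE_PORT_MAX


# Ports from the question: 30492 is the Traefik nodePort, 443 is the forwarded port.
for port in (30492, 443):
    print(port, is_default_node_port(port))
```

Port 443 is outside that range, which is consistent with it being reached through forwarding rules rather than through a NodePort.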
Yes. An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.
Refer to this documentation for more information on ingress.
Kubernetes has a health check mechanism to remove unhealthy pods from Kubernetes services (cf readiness probe). As unhealthy pods have no Kubernetes endpoints, Traefik will not forward traffic to them. Therefore, Traefik health check is not available for kubernetesCRD and kubernetesIngress providers.
Refer to this documentation for more information on health checks.

Related

Exposing Service from a BareMetal(Kubeadm) Kubernetes Cluster to outside world

I want to expose a Service from a bare-metal (kubeadm-built) Kubernetes cluster to the outside world. I am trying to access my NGINX service from outside the cluster so that I can see the NGINX output in a web browser.
For that, I have created a deployment and service for NGINX as shown below,
From my searching, I found the following options for exposing it to the outside world:
MetalLb
Ingress NGINX
Some HELM resources
I would like to understand all three of these approaches, and any others, so that I can learn something new.
GOAL
Exposing a Service from a bare-metal (kubeadm-built) Kubernetes cluster to the outside world.
How can I give my service its own public IP so that it can be accessed from outside the cluster?
You need to set up MetalLB to get an external IP address for the LoadBalancer type services. It will give a local network IP address to the service.
Then you can do port mapping (configuration in the router) of incoming traffic of port 80 and port 443 to your external service IP address.
I have done a similar setup; you can check it here in detail:
https://developerdiary.me/lets-build-low-budget-aws-at-home/
You need to deploy an ingress controller in your cluster so that it gives you an entrypoint where your applications can be accessed. Traditionally, in a cloud native environment it would automatically provision a LoadBalancer for you that will read the rules you define inside your Ingress object and route your request to the appropriate service.
One of the most commonly used ingress controllers is the Nginx Ingress Controller. There are multiple ways to deploy it (manifests, helm, operators). In the case of bare-metal clusters, there are multiple considerations which you can read here.
MetalLB is still in beta, so it is your choice whether to use it. If you don't have a hard requirement to expose the ingress controller as a LoadBalancer, you can expose it as a NodePort Service, which will be accessible on all the nodes in your cluster. You can then point your DNS at that NodePort Service so that the ingress rules are evaluated.

Why does Ingress need a NodePort service?

I was reading the book "Kubernetes in Action", where it is mentioned that "Ingress controllers on cloud providers (in GKE, for example) require the Ingress to point to a NodePort service".
Since the Ingress controller fetches the Pod IPs from the service itself and routes requests directly to that IP and port, why does it need a NodePort service? And what are the node's IP and port (provided by the NodePort service) used for?
An ingress controller is typically used to route traffic from outside a cluster to services inside the cluster. A NodePort is an open port on every node of your cluster. Kubernetes transparently routes incoming traffic on the NodePort to your service, even if your application is running on a different node.
A NodePort exposes the application on the same port on each of your nodes. The purpose of the NodePort Service behind the Ingress is to let external users reach the internal pods without being inside the cluster.
Follow this doc for more information.

Why do we need a load balancer to expose kubernetes services using ingress?

For a sample microservice based architecture deployed on Google kubernetes engine, I need help to validate my understanding :
We know services are supposed to load balance traffic for pod replicaset.
When we create an nginx ingress controller and ingress definitions to route to each service, a load balancer is also set up automatically.
I had read somewhere that creating an nginx ingress controller means an nginx controller (deployment) and a LoadBalancer type service get created behind the scenes. I am not sure if this is true.
It seems load balancing is being done by services. URL based routing is being done by the ingress controller.
Why do we need a load balancer? It is not meant to load balance across multiple instances. It will just forward all the traffic to the nginx reverse proxy that was created, which will route requests based on URL.
Please correct if I am wrong in my understanding.
A Service of type LoadBalancer and an Ingress are both ways to reach your application externally, although they work differently.
Service:
In Kubernetes, a Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service). The set of Pods targeted by a Service is usually determined by a selector (see below for why you might want a Service without a selector).
There are several types of Services, and one of them is the LoadBalancer type, which lets you expose your application externally by assigning an external IP to your service. Each LoadBalancer service is assigned its own new external IP.
The load balancing will be handled by kube-proxy.
Ingress:
An API object that manages external access to the services in a cluster, typically HTTP.
Ingress may provide load balancing, SSL termination and name-based virtual hosting.
When you set up an ingress controller (e.g. nginx-ingress), a Service of type LoadBalancer is created for the ingress-controller pods, a load balancer is automatically created in your cloud provider, and a public IP is assigned to the nginx-ingress service.
This load balancer/public IP is used for incoming connections to all your services, and nginx-ingress is responsible for handling those incoming connections.
For example:
Suppose you have 10 services of type LoadBalancer: this results in 10 new public IPs being created, and you need to use the corresponding IP for the service you want to reach.
But if you use an ingress, only 1 IP is created, and the ingress is responsible for directing each incoming connection to the correct service based on the PATH/URL you defined in the ingress configuration. With ingress you can:
Use regex in the path to define the service to route to;
Use SSL/TLS;
Inject custom headers;
Redirect requests to a default service if one of the services fails (default-backend);
Create whitelists based on IPs;
Etc...
An important note about load balancing with Ingress:
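As an illustration of the path-based routing described above, here is a minimal Python sketch of how a single entry point can pick a backend service from the request path, with a default backend as fallback. It is a simplified model under stated assumptions, not how any particular ingress controller is implemented: the rule table, the service names (api-svc, web-svc, default-backend) and the function names are all hypothetical.

```python
def segments(path):
    """Split a URL path into its non-empty elements, e.g. '/api/users' -> ['api', 'users']."""
    return [s for s in path.split("/") if s]


def matches_prefix(path, prefix):
    """Element-wise prefix match: '/api' matches '/api/users' but not '/apiv2'."""
    p, q = segments(path), segments(prefix)
    return p[:len(q)] == q


# Hypothetical rules: (path prefix, backend service name).
RULES = [("/api", "api-svc"), ("/web", "web-svc")]
DEFAULT_BACKEND = "default-backend"


def route(path, rules=RULES, default=DEFAULT_BACKEND):
    """Return the service for the longest matching prefix, or the default backend."""
    best = None
    for prefix, service in rules:
        if matches_prefix(path, prefix):
            depth = len(segments(prefix))
            if best is None or depth > best[0]:
                best = (depth, service)
    return best[1] if best else default


for path in ("/api/users", "/web", "/apiv2"):
    print(path, "->", route(path))
```

All three requests arrive on the same IP; only the path decides which service receives them.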
GCE/AWS load balancers do not provide weights for their target pools. This was not an issue with the old LB kube-proxy rules which would correctly balance across all endpoints.
With the new functionality, the external traffic is not equally load balanced across pods, but rather equally balanced at the node level (because GCE/AWS and other external LB implementations do not have the ability for specifying the weight per node, they balance equally across all target nodes, disregarding the number of pods on each node).
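The effect of balancing per node rather than per pod can be worked through with a little arithmetic. The following Python sketch assumes the external load balancer splits traffic equally across nodes that host at least one pod, and that each node then splits its share equally among its local pods. The node names and pod counts are hypothetical.

```python
from fractions import Fraction


def per_pod_share(pods_per_node):
    """Return the fraction of total traffic each pod on a node receives.

    Assumes equal weight per node (no per-node weights) and an equal
    split among the pods local to that node.
    """
    nodes_with_pods = {n: c for n, c in pods_per_node.items() if c > 0}
    if not nodes_with_pods:
        return {}
    node_share = Fraction(1, len(nodes_with_pods))
    return {node: node_share / count for node, count in nodes_with_pods.items()}


# Hypothetical layout: one pod on node-a, three pods on node-b.
shares = per_pod_share({"node-a": 1, "node-b": 3})
for node, share in shares.items():
    print(node, share)
```

With this layout, the single pod on node-a handles half of all traffic while each pod on node-b handles one sixth, which is the imbalance the note describes.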
The pods of an ingress controller (nginx, for example) need to be exposed outside the Kubernetes cluster as the entry point for all north-south traffic coming into the cluster. One way to do that is via a LoadBalancer. You could use a NodePort as well, but that is not recommended for production, or you could deploy the ingress controller directly on the host network of a host with a public IP. Having a load balancer also makes it possible to balance traffic across multiple replicas of the ingress controller pods.
When you use an ingress controller, traffic comes from the load balancer to the ingress controller and then reaches the backend pod IPs based on the rules defined in the Ingress resource. This bypasses the Kubernetes Service and the layer 4 load balancing (by kube-proxy) that it offers. Internally, the ingress controller discovers all the pod IPs from the Service's endpoints and routes traffic directly to the pods.
It seems loadbalancing is being done by services. URL based routing is being done by ingress controller.
Services do balance the traffic between pods. But by default (ClusterIP type) they aren't accessible from outside the cluster in Google Kubernetes Engine. You can create services of type LoadBalancer, but each service will get its own IP address (Network Load Balancer), so it can get expensive. Also, if you have one application made up of several services, it's much better to use Ingress objects, which provide a single entry point. When you create an Ingress object, the GKE Ingress controller creates a Google Cloud HTTP(S) load balancer (with the nginx controller, the load balancer instead comes from the controller's own LoadBalancer service). An Ingress object, in turn, can be associated with one or more Service objects.
Then you can get the assigned load balancer IP from the Ingress object:
kubectl get ingress ingress-name --output yaml
As a result, your application in the pods becomes accessible from outside the Kubernetes cluster:
LoadBalancerIP/url1 -> service1 -> pods
LoadBalancerIP/url2 -> service2 -> pods
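The assigned address lives under status.loadBalancer.ingress in the Ingress object. As a minimal Python sketch, assuming the object has already been loaded into a dictionary (for example from JSON output), the address can be extracted like this. The sample object, its name and the IP 203.0.113.10 are made-up illustration data, and the function name is hypothetical.

```python
def load_balancer_addresses(ingress):
    """Return the IPs or hostnames listed under status.loadBalancer.ingress."""
    entries = ingress.get("status", {}).get("loadBalancer", {}).get("ingress", [])
    addresses = []
    for entry in entries:
        # A provider reports either an IP or a hostname for each entry.
        address = entry.get("ip") or entry.get("hostname")
        if address:
            addresses.append(address)
    return addresses


# Hypothetical Ingress object, trimmed to the fields used here.
sample_ingress = {
    "kind": "Ingress",
    "metadata": {"name": "ingress-name"},
    "status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}},
}

print(load_balancer_addresses(sample_ingress))
```

An empty list means no address has been assigned yet.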

How does k8 traffic flow internally?

I have an Ingress and a Service with a load balancer. When traffic comes in from outside, it hits the Ingress first. Does it then go to the pods directly via the Ingress load balancer, or does it go to the Service, get the pod IP via the selector, and then go to the pods? If it is the first way, what is the use of Services? And which one, Services or Ingress, uses the readinessProbe in the deployment?
All the setup is in GCP
I am new to K8 networks.
A Service of type LoadBalancer is backed by an external resource provided by your cloud, which is NOT inside the Kubernetes cluster. It can forward requests to your pods using the Service's selector, but you cannot, for example, define path rules, redirects or rewrites, because those are provided by an Ingress.
Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service). The set of Pods targeted by a Service is usually determined by a selector (see below for why you might want a Service without a selector).
         Internet
             |
     [ LoadBalancer ]
     --|---------|--
       [ Services ]
     --|---------|--
   [ Pod1 ]   [ Pod2 ]
An Ingress is a resource handled by an ingress controller, which is basically a pod configured to apply the rules you defined.
To use an Ingress, you need to configure a Service for your path, and that Service then reaches the pods matched by its configured selectors. You can define rules based on path and hostname and then route to the Service you want. Like this:
         Internet
             |
        [ Ingress ]
     --|---------|--
       [ Services ]
     --|---------|--
   [ Pod1 ]   [ Pod2 ]
Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
This article has a good explanation of all the ways to expose your service.
The readinessProbe is configured in your pod/deployment specs, and the kubelet is responsible for evaluating your container's health.
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
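The readiness rule above can be modelled in a few lines. This Python sketch treats a Pod as ready only when all of its containers are ready and keeps only ready Pods as Service backends. It is a simplified model, not Kubernetes code: the data layout, pod names and IPs are hypothetical.

```python
def pod_is_ready(pod):
    """A Pod is considered ready when all of its containers are ready."""
    return all(container["ready"] for container in pod["containers"])


def service_backends(pods):
    """Return the IPs of Pods that may receive traffic from the Service."""
    return [pod["ip"] for pod in pods if pod_is_ready(pod)]


# Hypothetical Pods selected by one Service.
pods = [
    {"name": "web-1", "ip": "10.42.0.11", "containers": [{"ready": True}]},
    {"name": "web-2", "ip": "10.42.0.12", "containers": [{"ready": True}, {"ready": False}]},
    {"name": "web-3", "ip": "10.42.0.13", "containers": [{"ready": True}, {"ready": True}]},
]

print(service_backends(pods))
```

Here web-2 is left out because one of its two containers is not ready, so neither a Service nor an ingress controller reading the Service's endpoints would send traffic to it.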
kube-proxy is responsible for forwarding requests to the pods.
For example, if you have 2 pods in different nodes, kube-proxy will handle the firewall rules (iptables) and distribute the traffic between your nodes. Each node in your cluster has a kube-proxy running.
kube-proxy can be configured in 3 ways: userspace mode, iptables mode and ipvs mode.
If kube-proxy is running in iptables mode and the first Pod that’s selected does not respond, the connection fails. This is different from userspace mode: in that scenario, kube-proxy would detect that the connection to the first Pod had failed and would automatically retry with a different backend Pod.
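The difference between the two behaviours can be sketched as a toy model in Python. This is not kube-proxy code: the function names, the backend names and the idea of passing in the set of responsive backends are all assumptions made for illustration.

```python
def connect_iptables_mode(chosen, alive):
    """One backend is selected up front; if it does not respond, the connection fails."""
    return chosen if chosen in alive else None


def connect_userspace_mode(candidates, alive):
    """Backends are tried in order until one responds; fail only if none do."""
    for backend in candidates:
        if backend in alive:
            return backend
    return None


# Hypothetical state: pod-a is selected first but only pod-b is responding.
alive = {"pod-b"}
print(connect_iptables_mode("pod-a", alive))
print(connect_userspace_mode(["pod-a", "pod-b"], alive))
```

In this model the iptables-style call returns None (a failed connection), while the userspace-style call falls through to pod-b. Readiness probes matter in iptables mode for exactly this reason: they keep unresponsive pods out of the candidate set in the first place.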
References:
https://kubernetes.io/docs/concepts/services-networking/service/
https://kubernetes.io/docs/concepts/services-networking/ingress/
It depends on whether your LoadBalancer service exposes the Ingress controller or your application Pods (the first is the correct approach).
The usual way to use Services and Ingresses is like this:
LoadBalancer Service -> Ingress -> ClusterIP Service -> Pods
In that case, the traffic from the Internet first hits the load balancer of your cloud provider (created by the LoadBalancer Service), which forwards it to the Ingress controller (one or more Pods running NGINX in your cluster), which in turn forwards it to your application Pods (by getting the Pods' IP addresses from the ClusterIP Service).
I'm not sure if you currently have this constellation:
Ingress -> LoadBalancer Service -> Pods
In that case, you don't need a LoadBalancer Service there. You need only a ClusterIP Service behind an Ingress, and then you typically expose the Ingress with a LoadBalancer Service.

Accessing a webpage hosting on a pod

I have a deployment that hosts a website on port 9001 and a service attached to it. I want to allow anyone (from outside the cluster) to be able to connect to that site.
Any help would be appreciated.
I want to allow anyone (from outside cluster) to be able to connect to that site
There are many ways to do this using kubernetes services to expose port 9001 of the website to the outside world:
Service type LoadBalancer if you have an external, cloud-provider's load-balancer.
ExternalIPs. The website can be hit at ExternalIP:Port.
Service type NodePort if the cluster's nodes are reachable from the users. The website can be hit at NodeIP:NodePort.
Ingress controller and ingress resource.
As you wrote that this is not a cloud deployment, you need to consider how to expose this to the world properly. First and foremost, create a NodePort type service for your deployment. With this, your nodes will expose that service on a high port.
Depending on your network, at this point you either need to configure a load balancer in your network to forward traffic for some IP:80 to the high NodePort on your node(s), or, for example, deploy HAProxy in a DaemonSet with hostNetwork: true that will proxy port 80 to your NodePort.
A bit more complexity can be added by deploying the Nginx Ingress Controller (exposed as above) and using Ingress resources, so that the Ingress Controller exposes all your services without the need to fiddle with NodePort/LB/HAProxy for each of them individually any more.