I have a question about design and architecture rather than a specific issue. We have a Kubernetes cluster that handles our production workload, and we need to secure external traffic to it, so we designed this approach:
Create a worker node that runs an ingress controller and no other workload.
Place this worker node in a DMZ so that it handles external traffic to the ClusterIP services of our applications.
Is that a good idea for securing our workloads?
If we instead place an HAProxy in the DMZ (as an L4 load balancer that only spreads traffic across the workers, to be handled by ingress-nginx for example), it will not give us another level of security (a protocol break).
Note that we don't have a WAF.
Any ideas, please?
I agree with using two dedicated nodes, for high availability, as the entry point for external traffic.
I would use the HAProxy ingress controller; see "Announcing HAProxy Kubernetes Ingress Controller 1.6" and "Evolving Kubernetes networking with the Gateway API".
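One way to keep ordinary workloads off the dedicated entry nodes is to taint and label those nodes, then let only the ingress controller pods tolerate the taint. The sketch below builds just the relevant pod-spec fields; the label key, taint key and taint value are hypothetical names chosen for illustration, and the nodes themselves would still need to be labeled and tainted separately (for example with kubectl label and kubectl taint).

```python
import json

# Hypothetical label and taint; neither name comes from the thread.
EDGE_LABEL = {"node-role.example.com/edge": "true"}
EDGE_TAINT = {"key": "dedicated", "value": "ingress", "effect": "NoSchedule"}


def edge_scheduling(label, taint):
    """Return the pod-spec fields that pin a pod to tainted edge nodes."""
    return {
        # Only schedule onto nodes carrying the edge label.
        "nodeSelector": dict(label),
        # Tolerate the taint that keeps every other workload off those nodes.
        "tolerations": [
            {
                "key": taint["key"],
                "operator": "Equal",
                "value": taint["value"],
                "effect": taint["effect"],
            }
        ],
    }


pod_spec_patch = edge_scheduling(EDGE_LABEL, EDGE_TAINT)
print(json.dumps(pod_spec_patch, indent=2))
```

These fields would be merged into the pod template of the ingress controller's Deployment or DaemonSet. With two such nodes, running at least two controller replicas gives the high availability mentioned above.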
In all the tutorials about Kubernetes clusters I have read, I didn't see any mention of two load balancers, only one for the ingress pods.
However, in a proper production environment, shouldn't we have two different load balancers?
to balance between the master nodes for requests to the ApiServer.
to balance between the Ingress pods to control the external traffic.
to balance between the master nodes for requests to the ApiServer.
For all production environments, it's advised to have a load balancer for the API server. This is the first step in creating a Kubernetes HA setup. More details are in the Kubernetes documentation.
to balance between the Ingress pods to control the external traffic.
You are correct about this as well: it is needed to handle external traffic. Ingress controller implementations typically expose their Service as type LoadBalancer.
Can anyone please help me understand the ingress traffic flow to a pod in Kubernetes? Any web links or documents are much appreciated.
My application has intermittent connection timeouts, so I want to understand how traffic flows into the cluster and where I need to run tcpdump to see what is happening when a timeout occurs.
Your question does not contain enough information to give you a detailed answer. There are different types of ingress controllers, and load balancers as well.
So, suppose:
you are using Azure Kubernetes Service
you are using Azure Load Balancer
you have two types of backend pods, each has its own dedicated service
you are using Nginx as the ingress controller, which can do layer 7 (OSI) load balancing
Nginx also has its own pods, with a Service sitting in front of them. This Service has a service IP that is reachable only within the AKS cluster. Because of this, you can additionally use Azure Load Balancer (ALB) to make your backend pods available to the public. ALB is a layer 4 load balancer that sends incoming traffic to the worker nodes.
Kube-proxy runs on every worker node and recognizes that the traffic from the ALB is destined for the Nginx Service.
See the flow on the image below:
Can someone help me to understand if service mesh itself is a type of ingress or if there is any difference between service mesh and ingress?
An "Ingress" is responsible for routing traffic into your cluster (from the docs: "An API object that manages external access to the services in a cluster, typically HTTP.").
A service mesh, on the other hand, is a tool that adds proxy containers as sidecars to your pods and routes traffic between your pods through those proxy containers.
Use cases for service meshes include, for example:
distributed tracing
secure (SSL) connections between pods
resilience (service-mesh can reroute traffic from failed requests)
network-performance-monitoring
When using an external load balancer with Istio ingress gateways (multiple replicas spread across different nodes), how does it identify which Istio ingress gateway it can hit? I can manually access nodeip:nodeport/endpoint for any node, but how is an external load balancer expected to know all the nodes?
Is this manually configured, or does the load balancer consume this information from an API?
Is there a recommended strategy for bypassing an external load balancer, e.g. round-robin DNS that is aware of the node IPs and ports?
The root of this question is: how do we avoid a single point of failure? Using multiple Istio ingress gateway replicas achieves this within Istio, but then the external load balancer (or load balancer cluster) needs to know the replicas. Is this automated, a manual configuration, or is there a single virtual endpoint that the external load balancer hits?
External load balancers are generally configured to health-check your set of nodes (over a /healthz endpoint or some other method) and to balance incoming traffic using an LB algorithm, sending the packets they receive to one of the healthy nodes on the service's NodePort.
In fact, that's mostly the reason why NodePort type services exist in the first place: they don't have much use by themselves, but they are the intermediate step between the LoadBalancer and ClusterIP types.
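The health-check-then-balance behaviour described above can be illustrated with a small simulation. This is only a sketch of the idea, not how any particular load balancer is implemented: the node IPs, the NodePort number and the probe function are all made up, and the probe stands in for a real HTTP request to /healthz.

```python
from itertools import cycle

NODE_PORT = 30080  # hypothetical NodePort of the ingress gateway Service


def healthy_nodes(nodes, probe):
    """Keep only the nodes whose health probe succeeds."""
    return [n for n in nodes if probe(n)]


def balance(nodes, probe, requests):
    """Assign each request to a healthy node in round-robin order."""
    pool = healthy_nodes(nodes, probe)
    if not pool:
        raise RuntimeError("no healthy nodes")
    targets = cycle(pool)
    return [(req, f"{next(targets)}:{NODE_PORT}") for req in requests]


# Stand-in for an HTTP GET on /healthz: pretend one node is down.
down = {"10.0.0.12"}
nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
plan = balance(nodes, lambda n: n not in down, ["r1", "r2", "r3"])
for req, target in plan:
    print(req, "->", target)
```

A real load balancer repeats the probe periodically and updates its pool as nodes fail or recover. The sketch probes once to keep the example short.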
How does the load balancer know about the nodes? That depends heavily on the load balancer. As an example, if you use MetalLB in BGP mode, you need to add your nodes as peers to your external BGP router (either manually or in an automated way). MetalLB takes care of advertising the IPs of the LoadBalancer type services to the router. This means that the router effectively becomes the load balancer of your cluster.
There are also a number of enterprise-grade commercial Kubernetes load balancers out there, such as F5 BIG-IP.
Use a ClusterIP Service rather than a NodePort. Any LB can be used along with the ingress, but it depends on the platform you are using: bare metal, OpenShift, IBM Cloud, Google Cloud. Once the ingress controller (MetalLB, nginx, Traefik) is able to communicate, any LB such as F5 GTM or LTM can be set up in front.
I have a question related to Kubernetes networking.
I have a microservice (say numcruncherpod) running in a pod which is serving requests via port 9000, and I have created a corresponding Service of type NodePort (numcrunchersvc) and node port which this service is exposed is 30900.
My cluster has 3 nodes with following IPs:
192.168.201.70
192.168.201.71
192.168.201.72
I will be routing the traffic to my cluster via reverse proxy (nginx). As I understand in nginx I need to specify IPs of all these cluster nodes to route the traffic to the cluster, is my understanding correct ?
My worry is since nginx won't have knowledge of cluster it might not be a good judge to decide the cluster node to which the traffic should be sent to. So is there a better way to route the traffic to my kubernetes cluster ?
PS: I am not running the cluster on any cloud platform.
This answer is a little late, and a little long, so I ask for forgiveness before I begin. :)
For people not running kubernetes clusters on Cloud Providers there are 4 distinct options for exposing services running inside the cluster to the world outside.
1. Service of type: NodePort. This is the simplest option. Kubernetes assigns your service a port from the NodePort range (30000-32767 by default). Every node in the cluster listens for traffic on this port and forwards it to one of the pods backing that service. This is usually handled by kube-proxy, which leverages iptables and spreads connections roughly evenly across the backing pods (random selection in iptables mode, round-robin by default in IPVS mode). Since the UX for this setup is not pretty, people often add an external "proxy" server, such as HAProxy, Nginx or httpd, to listen for traffic on a single IP and forward it to one of these backends. This is the setup you, OP, described.
2. A step up from this would be a Service with external IPs (externalIPs is a field of the Service spec rather than a separate Service type). This is identical to the NodePort service, except it also gets Kubernetes to add an additional rule on all Kubernetes nodes that says "All traffic that arrives for destination IP == <external IP> must also be forwarded to the pods". This basically allows you to specify any arbitrary IP as the "external IP" for the service. As long as traffic destined for that IP reaches one of the nodes in the cluster, it will be routed to the correct pod. Getting that traffic to any of the nodes, however, is your responsibility as the cluster administrator. The advantage here is that you no longer have to run an HAProxy/Nginx setup if you specify the IP of one of the physical interfaces of one of your nodes (for example, one of your master nodes). Additionally, you cut down the number of hops by one.
3. Service of type: LoadBalancer. This service type brings bare-metal clusters to parity with cloud providers. A fully functioning LoadBalancer provider is able to select an IP from a pre-defined pool, automatically assign it to your service and advertise it to the network, assuming it is configured correctly. This is the most "seamless" experience you'll have when it comes to Kubernetes networking on bare metal. Most LoadBalancer provider implementations use BGP to talk to, and advertise to, an upstream L3 router. MetalLB and kube-router are the two FOSS projects that fit this niche.
4. Kubernetes Ingress. If your requirement is limited to L7 applications, such as REST APIs, HTTP microservices etc., you can set up a single Ingress provider (nginx is one such provider) and then configure Ingress resources for all your microservices, instead of exposing each Service individually. You deploy your Ingress provider and make sure it has an externally available and routable IP (you can pin it to a master node and use the physical interface IP of that node, for example). The advantage of using Ingress over Services is that Ingress objects understand HTTP microservices natively, so you can do smarter health checking, routing and management.
Often people combine one of (1), (2), (3) with (4), since the first three are L4 (TCP/UDP) and (4) is L7. So things like URL path/domain-based routing and SSL termination are handled by the Ingress provider, while IP lifecycle management and routing are taken care of by the Service layer.
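As an illustration of option (2), here is a sketch of a Service manifest that sets externalIPs, built as a Python dict and printed as JSON (which kubectl also accepts). The service name and port come from the question; the pod label app=numcruncher is an assumption, and the external IP is simply one of the node IPs from the question, as the answer suggests.

```python
import json

# Sketch of a Service using externalIPs (option 2 above).
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "numcrunchersvc"},
    "spec": {
        # Assumed pod label; the question does not state the labels used.
        "selector": {"app": "numcruncher"},
        # The microservice listens on port 9000 according to the question.
        "ports": [{"protocol": "TCP", "port": 9000, "targetPort": 9000}],
        # Traffic arriving at a node for this IP is forwarded to the pods.
        "externalIPs": ["192.168.201.70"],
    },
}

manifest = json.dumps(service, indent=2)
print(manifest)
```

Routing traffic for that IP to a cluster node remains the administrator's job, as noted above. The manifest only tells the nodes what to do once it arrives.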
For your use case, the ideal setup would involve:
A deployment for your microservice, with health endpoints on your pod
An Ingress provider, so that you can tweak/customize your routing and load balancing, and use it for SSL termination, domain matching etc.
(optional): Use a LoadBalancer provider to front your Ingress provider, so that you don't have to manually configure your Ingress's networking.
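The Ingress part of this setup might look roughly like the sketch below, which builds a networking.k8s.io/v1 Ingress manifest as a Python dict. The host name and the ingress class name "nginx" are assumptions for illustration; the service name and port are taken from the question.

```python
import json


def ingress_for(service_name, service_port, host, path="/"):
    """Build an Ingress manifest that routes host+path to one Service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": f"{service_name}-ingress"},
        "spec": {
            # Assumed class name; it must match the installed controller.
            "ingressClassName": "nginx",
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": path,
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


# Hypothetical host name; service name and port come from the question.
ingress = ingress_for("numcrunchersvc", 9000, "numcruncher.example.com")
print(json.dumps(ingress, indent=2))
```

With an Ingress in place, the backing Service can be a plain ClusterIP Service, since only the Ingress provider needs to be reachable from outside the cluster.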
Correct. You can route traffic to any or all of the K8s minions (nodes). The K8s network layer will forward to the appropriate minion if necessary.
If you are running only a single pod, for example, nginx will most likely round-robin the requests across the minions. When a request hits a minion that does not have the pod running on it, the request is forwarded to the minion that does.
If you run 3 pods, one on each minion, the request will be handled by whichever minion gets the request from nginx.
If you run more than one pod on each minion, the requests will be round-robined to each minion, and then round-robined to each pod on that minion.
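For the reverse-proxy setup from the question, the nginx side amounts to an upstream block listing every node on the NodePort. The sketch below renders such a fragment from the node IPs and port given in the question. The upstream name and listen port are arbitrary choices, and the output is meant to sit inside the http context of an nginx configuration.

```python
# Node IPs and NodePort as given in the question.
NODES = ["192.168.201.70", "192.168.201.71", "192.168.201.72"]
NODE_PORT = 30900


def render_upstream(name, nodes, port):
    """Render an nginx upstream block with one server line per node."""
    lines = [f"upstream {name} {{"]
    lines += [f"    server {ip}:{port};" for ip in nodes]
    lines.append("}")
    return "\n".join(lines)


def render_server(upstream, listen=80):
    """Render a server block that proxies all requests to the upstream."""
    return "\n".join(
        [
            "server {",
            f"    listen {listen};",
            "    location / {",
            f"        proxy_pass http://{upstream};",
            "    }",
            "}",
        ]
    )


# "numcruncher" is an arbitrary upstream name chosen for this example.
config = render_upstream("numcruncher", NODES, NODE_PORT)
config += "\n\n" + render_server("numcruncher")
print(config)
```

nginx round-robins across the listed servers by default, so it does not need to know where the pod runs: whichever node receives the request forwards it to a pod backing the service.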