Preserving the original source IP address of the client - Kubernetes

We are trying to understand the behavior of "preserving the original source IP address of the client".
We have referred to the following doc:
https://istio.io/latest/docs/tasks/security/authorization/authz-ingress/#source-ip-address-of-the-original-client
We have the following test setup:
(1) one Kubernetes cluster with two nodes, node1 and node2
(2) istio ingress-gateway pod running on node1
(3) httpbin pod running on node2
(4) gateway and virtual-service created to route traffic from the istio-ingressgateway service to the httpbin pod
We have the following flow :
external traffic -> istio ingress-gateway service (load balancer) -> istio ingress-gateway pod -> httpbin service -> httpbin pod
We have been testing with Azure and GCP (Network Load Balancer)
We have set externalTrafficPolicy to Local (the patch we used is sketched at the end of this question)
We are firing the following call :
curl --connect-to httpbin.example.com:80:<istio_ingress_gateway_external_ip>:80 http://httpbin.example.com/get?show_env=1
Now, in the response, we are getting an "X-Forwarded-For" header which contains the client source IP address (which is different from the node IP address)
We have the following queries:
(1) According to the Istio docs, if an istio ingress-gateway pod is not running on every node, then traffic arriving at a node without one will be dropped. In our test setup there is no istio ingress-gateway pod on node2, yet the traffic is not being dropped.
(2) Setting externalTrafficPolicy to Local should prevent traffic from being sent to other nodes. In our test setup, we observe that the traffic is sent to node2.
Can anyone please help us understand whether we are missing something, or whether the Istio docs need further explanation?
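For reference, this is roughly how we applied the traffic-policy setting (a sketch; it assumes the default istio-ingressgateway Service in the istio-system namespace):
kubectl patch svc istio-ingressgateway -n istio-system \
  -p '{"spec":{"externalTrafficPolicy":"Local"}}'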

Related

How to set DNS entries & network configuration for a Kubernetes cluster at home (noob here)

I am currently running a Kubernetes cluster on my own home server (in Proxmox CTs; it was kinda difficult to get working because I am using ZFS too, but it runs now), and the setup is as follows:
lb01: haproxy & keepalived
lb02: haproxy & keepalived
etcd01: etcd node 1
etcd02: etcd node 2
etcd03: etcd node 3
master-01: k3s in server mode with a taint for not accepting any jobs
master-02: same as above, just joining with the token from master-01
master-03: same as master-02
worker-01 - worker-03: k3s agents
If I understand it correctly, k3s ships with Flannel pre-installed as the CNI, as well as Traefik as an Ingress Controller.
I've set up Rancher on my cluster as well as Longhorn; the volumes are just ZFS volumes mounted inside the agents though, and as they aren't on different HDDs I've set the replicas to 1. I have a friend running the same setup (we set them up together, just yesterday) and we are planning on joining our networks through VPN tunnels and then providing storage nodes for each other as an offsite backup.
So far I've hopefully got everything correct.
Now to my question: I've got both a static IP at home and a domain, and I've pointed that domain at my static IP.
Something like this (I don't know exactly how DNS entries are written, this is just off the top of my head for your reference; the entries are working well):
A example.com. [[my-ip]]
CNAME *.example.com. example.com
I've currently set up a port-forward to one of my master nodes for ports 80 & 443, but I am not quite sure how you would actually configure that with HA in mind. Also, my Rancher is throwing a 503 after visiting global settings, but I have not changed anything.
So now my question: how would one actually configure the port-forward? As far as I know k3s has a load balancer pre-installed, but how would one configure those port-forwards for HA? The one master node they're pointing to could, theoretically, just stop working, and then all services would not be reachable from outside anymore.
Assuming your apps are served on ports 80 and 443, your ingress should give you a service with an external IP, and you would point your DNS at that. Read below for more info.
Seems like you are not a noob! You've got a lot going on with your cluster setup. What you are asking is a bit complicated to answer and I will have to make some assumptions about your setup, but I will do my best to give you at least some initial info.
This tutorial has a ton of great info and may help you with what you are doing. They use kubeadm instead of k3s, but you can skip that section if you want and still use k3s.
https://www.debontonline.com/p/kubernetes.html
If you are setting up and installing etcd on your own, you don't need to do that; k3s can run an embedded etcd cluster for you on the server nodes.
Load Balancing your master nodes
The haproxy + keepalived nodes would be configured to point to the IPs of your master nodes at port 6443 (TCP). keepalived will give you a virtual IP, and you would configure your kubeconfig (that you get from k3s) to talk to that IP. On your router you will want to reserve an IP for this (make sure not to assign it to any computers).
This is a good video that explains how to do it with a nodejs server but concepts are the same for your master nodes:
https://www.youtube.com/watch?v=NizRDkTvxZo
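As a rough sketch (assuming a hypothetical keepalived virtual IP of 192.168.1.50), the only thing you'd change in the kubeconfig you copy from a k3s server node (/etc/rancher/k3s/k3s.yaml) is the server field under clusters:
clusters:
- name: default
  cluster:
    server: https://192.168.1.50:6443   # hypothetical keepalived virtual IP instead of a single master's IP
    certificate-authority-data: <leave as generated by k3s>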
Load Balancing your applications running in the cluster
Use a K8s Service; read more about it here: https://kubernetes.io/docs/concepts/services-networking/service/
Essentially you need an external IP; I prefer to do this with MetalLB.
MetalLB gives you a Service of type LoadBalancer with an external IP.
Add this flag to k3s when creating the initial master node:
https://metallb.universe.tf/configuration/k3s/
Configure MetalLB:
https://metallb.universe.tf/configuration/#layer-2-configuration
You will want to reserve more IPs on your router and put them under the addresses section in the YAML below. In this example you have 11 IPs in the range 192.168.1.240 to 192.168.1.250.
Create this as a file, for example metallb-cm.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
kubectl apply -f metallb-cm.yaml
Install with these yaml files:
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.12.1/manifests/namespace.yaml
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.12.1/manifests/metallb.yaml
source - https://metallb.universe.tf/installation/#installation-by-manifest
Ingress
Your ingress controller will need a Service of type LoadBalancer; use its external IP as the IP you point your DNS at.
Run kubectl get service -A, look for your ingress service, and check that it has an external IP and does not say pending.
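For reference, a healthy ingress service in that output would look roughly like this (illustrative only; the name, namespace, and IPs are examples and will differ in your cluster):
NAMESPACE     NAME      TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                      AGE
kube-system   traefik   LoadBalancer   10.43.122.17   192.168.1.240   80:31672/TCP,443:30443/TCP   5d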
I will do my best to answer any of your follow-up questions. Good luck!

How to pass incoming traffic into a bare-metal cluster with MetalLB and Ingress Controllers?

I'm trying to wrap my head around exposing internal load balancing to the outside world on a bare-metal k8s cluster.
Let's say we have a basic cluster:
Some master nodes and some worker nodes, each of which has two interfaces: one public-facing (eth0) and one local (eth1) with an IP within the 192.168.0.0/16 network
MetalLB deployed and configured with the 192.168.200.200-192.168.200.254 range for its internal IPs
An ingress controller with its Service of type LoadBalancer
As I currently understand it, MetalLB should now assign one of the IPs from 192.168.200.200-192.168.200.254 to the ingress service.
But I have some following questions:
Can I curl the ingress controller's external IP on every node (as long as it is reachable on eth1) with a Host header attached and get a response from the service that's configured in the corresponding Ingress resource, or does this work only on the node where the Ingress pods are currently placed?
What are my options to pass external traffic coming in on eth0 to an ingress listening on the eth1 network?
Is it possible to forward requests while preserving the source IP address, or is attaching an X-Forwarded-For header the only option?
Assuming that we are talking about MetalLB in layer 2 mode.
Addressing the following questions:
Can I curl the ingress controller's external IP on every node (as long as it is reachable on eth1) with a Host header attached and get a response from the service that's configured in the corresponding Ingress resource, or does this work only on the node where the Ingress pods are currently placed?
Is it possible to forward requests while preserving the source IP address, or is attaching an X-Forwarded-For header the only option?
Splitting the solution on whether you need to preserve the source IP, these questions can be answered both ways:
Preserve the source IP address
To do that you would need to set your Ingress controller's Service of type LoadBalancer to use the "Local" traffic policy by setting (in your YAML manifest):
.spec.externalTrafficPolicy: Local
This setup will work as long as there is a replica of your Ingress controller on each Node, as all of the traffic coming to your controller will be contained within a single Node (a sketch of such a Service follows after the quote below).
Citing the official docs:
With the Local traffic policy, kube-proxy on the node that received the traffic sends it only to the service’s pod(s) that are on the same node. There is no “horizontal” traffic flow between nodes.
Because kube-proxy doesn’t need to send traffic between cluster nodes, your pods can see the real source IP address of incoming connections.
The downside of this policy is that incoming traffic only goes to some pods in the service. Pods that aren’t on the current leader node receive no traffic, they are just there as replicas in case a failover is needed.
Metallb.universe.tf: Usage: Local traffic policy
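As a minimal sketch (the name, namespace, and labels below are hypothetical; adapt them to your actual ingress controller manifests), the controller's Service would look something like this:
apiVersion: v1
kind: Service
metadata:
  name: ingress-controller          # hypothetical name of your ingress controller's Service
  namespace: ingress-system         # hypothetical namespace
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local      # keep traffic on the receiving node and preserve the client source IP
  selector:
    app: ingress-controller         # hypothetical label on the ingress controller pods
  ports:
  - name: http
    port: 80
    targetPort: 80
  - name: https
    port: 443
    targetPort: 443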
Do not preserve the source IP address
If your use case does not require you to preserve the source IP address, you could go with the:
.spec.externalTrafficPolicy: Cluster
This setup won't require a replica of your Ingress controller to be present on each Node.
Citing the official docs:
With the default Cluster traffic policy, kube-proxy on the node that received the traffic does load-balancing, and distributes the traffic to all the pods in your service.
This policy results in uniform traffic distribution across all pods in the service. However, kube-proxy will obscure the source IP address of the connection when it does load-balancing, so your pod logs will show that external traffic appears to be coming from the service’s leader node.
Metallb.universe.tf: Usage: Cluster traffic policy
Addressing the 2nd question:
What are my options to pass incoming external traffic to eth0 to an ingress listening on eth1 network?
MetalLB listens on all interfaces by default; all you need to do is specify the address pool for that interface's network within the MetalLB config.
You can find more reference on this topic by following:
Metallb.universe.tf: FAQ: In layer 2 mode how to specify the host interface for an address pool
An example of such a configuration could be the following:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:     # HERE
    - name: my-ip-space
      protocol: layer2
      addresses:
      - 192.168.1.240/28

How does k8s traffic flow internally?

I have an ingress and a service with an LB. When traffic comes from outside, it hits the ingress first. Then does it go to the pods directly using the ingress LB, or does it go to the service, get the pod IP via the selector, and then go to the pods? If it's the first way, what is the use of services? And which of the two, services or ingress, uses the readinessProbe in the deployment?
All of the setup is in GCP.
I am new to K8s networking.
A Service of type LoadBalancer is an external resource provided by your cloud and is NOT in the Kubernetes cluster. It can forward requests to your pods using selectors, but you can't, for example, define path rules, redirects, or rewrites, because those are provided by an Ingress.
Service is an abstraction which defines a logical set of Pods and a policy by which to access them (sometimes this pattern is called a micro-service). The set of Pods targeted by a Service is usually determined by a selector (see below for why you might want a Service without a selector).
        Internet
            |
     [ LoadBalancer ]
       --|-----|--
       [ Services ]
       --|     |--
    [ Pod1 ] [ Pod2 ]
When you use an Ingress, it is a resource managed by an ingress controller, which is basically a pod configured to handle the rules you define.
To use an Ingress you need to configure a service for your path, and then this service will reach the pods with the configured selectors. You can define rules based on path or hostname and then route to the service you want. Like this:
        Internet
            |
       [ Ingress ]
       --|-----|--
       [ Services ]
       --|     |--
    [ Pod1 ] [ Pod2 ]
Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
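A minimal sketch of such a rule (the host app.example.com and the backend service name my-app are hypothetical):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
  - host: app.example.com           # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app            # ClusterIP Service that selects the app pods
            port:
              number: 80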
This article has a good explanation of all the ways to expose your service.
The readinessProbe is configured in your pod/deployment spec, and the kubelet is responsible for evaluating your container's health.
The kubelet uses readiness probes to know when a Container is ready to start accepting traffic. A Pod is considered ready when all of its Containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.
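For example, a readiness probe is declared in the pod template of a Deployment roughly like this (a sketch; the image, /healthz path, and port 8080 are hypothetical):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: my-app:latest        # hypothetical image
        ports:
        - containerPort: 8080
        readinessProbe:             # the kubelet probes this; the pod only receives Service traffic once it passes
          httpGet:
            path: /healthz          # hypothetical health endpoint
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10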
kube-proxy is responsible for forwarding the requests to the pods.
For example, if you have 2 pods on different nodes, kube-proxy will handle the firewall rules (iptables) and distribute the traffic between your nodes. Each node in your cluster has a kube-proxy running.
kube-proxy can be configured in 3 ways: userspace mode, iptables mode and ipvs mode.
If kube-proxy is running in iptables mode and the first Pod that’s selected does not respond, the connection fails. This is different from userspace mode: in that scenario, kube-proxy would detect that the connection to the first Pod had failed and would automatically retry with a different backend Pod.
References:
https://kubernetes.io/docs/concepts/services-networking/service/
https://kubernetes.io/docs/concepts/services-networking/ingress/
It depends on whether your LoadBalancer service exposes the Ingress controller or your application Pods (the first is the correct approach).
The usual way to use Services and Ingresses is like this:
LoadBalancer Service -> Ingress -> ClusterIP Service -> Pods
In that case, the traffic from the Internet first hits the load balancer of your cloud provider (created by the LoadBalancer Service), which forwards it to the Ingress controller (one or multiple Pods running NGINX in your cluster), which in turn forwards it to your application Pods (by getting the Pods' IP addresses from the ClusterIP Service).
I'm not sure if you currently have this constellation:
Ingress -> LoadBalancer Service -> Pods
In that case, you don't need a LoadBalancer Service there. You need only a ClusterIP Service behind an Ingress, and then you typically expose the Ingress with a LoadBalancer Service.
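To tie this together, the ClusterIP Service behind the Ingress is just an ordinary Service selecting your application Pods (a sketch; the name my-app, its label, and the ports are hypothetical and match the Ingress example above):
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: ClusterIP                   # the default type; only reachable from inside the cluster
  selector:
    app: my-app                     # hypothetical pod label
  ports:
  - port: 80
    targetPort: 8080                # hypothetical container port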

How do Kubernetes NodePort services with Service.spec.externalTrafficPolicy=Local route traffic?

There seem to be two contradictory explanations of how NodePort services route traffic. Services can route traffic to one of the two, not both:
Nodes (through the kube-proxy): According to kubectl explain Service.spec.externalTrafficPolicy and this article that adds more detail, packets incoming to NodePort services with Service.spec.externalTrafficPolicy=Local set get routed to a kube-proxy, which then routes the packets to the corresponding pods running on that node.
This kube-proxy networking documentation further supports this theory, adding that endpoints add a rule in the service's iptables chain that forwards traffic to nodes through the kube-proxy.
Pods: services update their iptables rules from Endpoints, which contain the IP addresses of the pods they can route to. Furthermore, if you remove your service's label selectors and edit the Endpoints, you can change where your traffic is routed.
If one of these is right, then I must be misunderstanding something.
If services route to nodes, then why can I edit endpoints without breaking the iptables rules?
If services route to pods, then why would services go through the trouble of routing to nodes when Service.spec.externalTrafficPolicy is set?
A Service is a virtual address/port managed by kube-proxy. Services forward traffic to their associated endpoints, which are usually pods but, as you mentioned, can be set to any destination IP/port.
A NodePort Service doesn't change the endpoint side of the service; the NodePort allows external traffic into the Service via a port on each node.
Breakdown of a Service
kube-proxy can use 3 methods to implement the forwarding of a service from Node to destination.
a user proxy
iptables
ipvs
Most clusters use iptables, which is what is described below. I use the term "forward" instead of "route" because services use Network Address Translation (or the proxy) to "forward" traffic rather than standard network routing.
The service ClusterIP is a virtual entity managed by kube-proxy. This address/port combination is available on every node in the cluster and forwards any local (pod) service traffic to the endpoint IPs and ports.
                                        / Pod (remote node)
Pod -- ClusterIP/Port -- KUBE-SVC-NAT --- Pod
                                        \ Pod (remote node)
A service with a NodePort is the same as above, with the addition of a way to forward external traffic into the cluster via a Node. kube-proxy manages an additional rule to watch for external traffic and forward it into the same service rules.
Ext -- NodePort       \                 / Pod (remote node)
                        KUBE-SVC-NAT --- Pod
Pod -- ClusterIP/Port /                 \ Pod (remote node)
The externalTrafficPolicy=Local setting makes a NodePort service use only a local Pod to service the incoming traffic. This avoids a network hop, which removes the need to rewrite the source of the packet (via NAT). The result is that the real client IP arrives at the pod servicing the connection, rather than one of the cluster nodes appearing as the source IP (see the sketch after the diagram below).
Ext -- NodePort       \                  Pod (remote node)
                        KUBE-SVC-NAT --- Pod (local)
Pod -- ClusterIP/Port /                  Pod (remote node)
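A minimal sketch of such a Service (the name, label, and ports below are hypothetical):
apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport             # hypothetical name
spec:
  type: NodePort
  externalTrafficPolicy: Local      # only pods on the node that received the traffic are used; the client IP is preserved
  selector:
    app: my-app                     # hypothetical pod label
  ports:
  - port: 80
    targetPort: 8080
    nodePort: 30080                 # hypothetical node port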
iptables
I recommend attempting to trace a connection from source to destination for a service or NodePort on a host. It requires a bit of iptables knowledge, but I think it's worthwhile.
To list all the service IPs/ports that will be forwarded:
iptables -t nat -vnL KUBE-SERVICES
To list all the nodeports that will be forwarded:
iptables -t nat -vnL KUBE-NODEPORTS
Once you have the rule, you can follow the KUBE-SVC-XXX "target" chains in the full output.
iptables -vnL -t nat | less
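For example, a trace for a hypothetical node port 30080 could go like this (the port and the chain name are made up; substitute the ones that appear in your own output):
# find the rule for your node port and note the KUBE-SVC- chain it jumps to
iptables -t nat -vnL KUBE-NODEPORTS | grep 30080
# then list that chain to see the per-endpoint KUBE-SEP- rules it balances across
iptables -t nat -vnL KUBE-SVC-ABCDEFGHIJKLMNOP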
externalTrafficPolicy: Cluster is not used on a ClusterIP service; try removing it and applying again, and it'll work.

Kubernetes Service LoadBalancer "EXTERNAL-IP" remains "<none>" instead of taking worker nodes public IP addresses

I have 5 VPSes, each with a public network interface, for which I have configured a VPN.
3 nodes are Kubernetes masters where I have set the Kubelet --node-ip flag to their private IP address.
One of the 3 nodes has an HAProxy load balancer for the Kubernetes masters, listening on the private IP, so that all the nodes use the private IP address of the load balancer in order to join the cluster.
2 nodes are Kubernetes workers where I didn't set the Kubelet --node-ip flag so that their node IP is the public address.
The cluster is healthy and I have deployed my application and its dependencies.
Now I'd like to access the app from the Internet, so I've deployed an edge router and created a Kubernetes Service of type LoadBalancer.
The service is created fine but never takes the worker nodes' public IP addresses as EXTERNAL-IP.
Assigning the IP addresses manually works, but I obviously want that to be automatic.
I have read about the MetalLB project, but it doesn't seem to fit my case, as it is supposed to have a range of IP addresses to distribute, while here I have one public IP address per node, and they are not in the same range.
So how can I configure Kubernetes so that my Service of type LoadBalancer automatically gets the public IP addresses as EXTERNAL-IP?
I can finally answer my own question, in two parts.
Without an external Load Balancer
Firstly, in order to solve the problem from my question, the only way I found that worked quite well was to set the externalIPs of my LoadBalancer Service to the IP addresses of the Kubernetes worker nodes (see the sketch below).
Those nodes were running Traefik and therefore had it listening on ports 80 and 443.
After that, I created as many DNS A records as I have Kubernetes worker nodes, pointing each to the respective worker node's public IP address. This setup makes the DNS server return the list of IP addresses in a random order, and the web browser then takes care of trying the first IP address, then the second one if the first is down, and so on.
The downside of this is that when you want to drain a node for maintenance, or when it crashes, the web browser will waste time trying to reach it before moving on to the next IP address.
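A minimal sketch of that first approach (the service name, label, and the public IPs 203.0.113.10/203.0.113.11 are hypothetical placeholders for the worker nodes' addresses):
apiVersion: v1
kind: Service
metadata:
  name: traefik
spec:
  type: LoadBalancer
  externalIPs:                      # public IPs of the worker nodes running Traefik (hypothetical values)
  - 203.0.113.10
  - 203.0.113.11
  selector:
    app: traefik                    # hypothetical label on the Traefik pods
  ports:
  - name: web
    port: 80
    targetPort: 80
  - name: websecure
    port: 443
    targetPort: 443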
So here comes the second option: an external load balancer.
With an external Load Balancer
I took another VPS where I installed HAProxy and configured SSL passthrough of the Kubernetes API port so that it load balances the traffic to the master nodes without terminating it.
With this solution, I removed the externalIPs field from my Service and installed MetalLB with a single IP address, configured with this manifest:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: staging-public-ips
      protocol: layer2
      addresses:
      - 1.2.3.4/32
When the LoadBalancer Service is created, MetalLB assigns this IP address and calls the Kubernetes APIs accordingly.
This has solved my issue to integrate my Kubernetes cluster with Gitlab.
WARNING: MetalLB will assign the IP address only once, so if you have a second LoadBalancer Service, it will remain in Pending state forever until you give MetalLB a new IP address.