Expose services inside a Kubernetes cluster behind NAT

I have a GKE cluster set up with Cloud NAT, so outbound traffic from any node/container has the same external IP. (I needed this for whitelisting purposes while working with third-party services.)
Now, if I want to deploy a proxy server onto this cluster that does basic traffic forwarding, how do I expose the proxy server's "endpoint"? Or, more generally, how do I expose a service if I deploy it to this GKE cluster?

A proxy server running behind NAT?
Bad idea, unless it is only for your Kubernetes cluster's workload, but you didn't specify anywhere that it should be reachable only by other Pods running in the same cluster.
As you can read here:
Cloud NAT does not implement unsolicited inbound connections from the
internet. DNAT is only performed for packets that arrive as responses
to outbound packets.
So it is not meant to be reachable from outside.
If you want to expose an application within your cluster, making it available only to other Pods, use a simple ClusterIP Service. This is the default type, and it's what gets created even if you don't specify a type at all.
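For example, a minimal ClusterIP Service for such a proxy might look like the sketch below (the name, label, and ports are hypothetical):

```yaml
# A minimal ClusterIP Service sketch: reachable only from inside the cluster.
# The name, selector label, and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: proxy
spec:
  type: ClusterIP      # the default; this line could be omitted entirely
  selector:
    app: proxy         # matches the labels on the proxy Pods
  ports:
    - port: 3128       # port other Pods connect to
      targetPort: 3128 # port the proxy container listens on
```

Other Pods could then reach the proxy through cluster DNS at proxy.<namespace>.svc.cluster.local:3128.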

Normally, to expose a service endpoint running on a Kubernetes cluster, you have to use one of the Service Types, as Pods have internal IP addresses and are not addressable externally.
The possible service types:
ClusterIP: this also uses an internal IP address, and is therefore not addressable externally.
NodePort: this type opens a port on every node in your Kubernetes cluster and configures iptables to forward traffic arriving at this port to the Pods providing the actual service.
LoadBalancer: this type opens a port on every node as with NodePort, and additionally allocates a Google Cloud Load Balancer that is configured to reach the port opened on the Kubernetes nodes (actually load balancing the incoming traffic across your operational Kubernetes nodes).
ExternalName: this type configures the Kubernetes internal DNS server to return a CNAME record for the specified external DNS name (providing an alias inside the cluster for connecting to external services).
Out of those, NodePort and LoadBalancer are usable for your purposes. With a simple NodePort-type Service you would need publicly accessible node IP addresses, and the allocated port could be used to reach your proxy service through any node of your cluster. Since any one of your nodes may disappear at any time, this kind of access is only good if your proxy clients know how to switch to another node IP address. Alternatively, you can use a LoadBalancer-type Service: your clients connect to the IP address of the provisioned Google Cloud Load Balancer, which forwards the traffic to one of the running nodes of your cluster, which in turn forwards it to one of the Pods providing the service.
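For instance, the LoadBalancer variant could be sketched like this (name, label, and ports are hypothetical):

```yaml
# A minimal LoadBalancer Service sketch; on GKE this provisions a Google
# Cloud Load Balancer with a public IP. Name, label, and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: proxy
spec:
  type: LoadBalancer
  selector:
    app: proxy         # matches the labels on the proxy Pods
  ports:
    - port: 3128       # port exposed by the load balancer
      targetPort: 3128 # port the proxy container listens on
```

Once the balancer is provisioned, kubectl get service proxy shows the external IP your clients should connect to.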
For your proxy server to access the internet as a client, you also need some kind of public IP address. Either you give the Kubernetes nodes public IP addresses (in that case, if you have more than a single node, you'd see multiple source IP addresses, as each node has its own), or, if you use private addresses for your Kubernetes nodes, you need Source NAT functionality, like the one you already use: Cloud NAT.

Related

How to forward traffic to an on-premise Kubernetes cluster

I'm trying to understand how traffic can be forwarded to an on-premise Kubernetes cluster.
It's clear to me that with a public cloud provider, the cloud's underlying infrastructure can automatically manage and forward traffic to a managed Kubernetes distribution, such as EKS, GKE, or AKS, by assigning a LoadBalancer IP to a Kubernetes Service. Then, after a few seconds, this service will receive an external IP and be reachable from the outside world.
On the other hand, in an on-premise Kubernetes cluster, a Service of type LoadBalancer stays in Pending forever, unless you assign a node IP. But what if you want to assign a different IP from a private IP range? To tackle this in my homelab, I've deployed MetalLB inside my K3s cluster. MetalLB is configured to use a private IP range of my network, let's say 10.0.2.0/24. Now, Services of type LoadBalancer can consume an address from this range, e.g. my Ingress Controller can receive 10.0.2.3 as its external IP.
I can't understand what MetalLB is doing under the hood: how it "listens" on an address from the range and forwards traffic to my cluster. Can this be achieved without MetalLB? I've tried setting an external IP directly on a Service of type LoadBalancer, but it never managed to claim that specific IP without it.
In addition, I'm aware that this can also be achieved with a "physical" load-balancer solution, such as NGINX or HAProxy, that sits in front of the cluster. To my understanding, this technically does the same thing as MetalLB: with such a solution configured, an address can be listened on and traffic forwarded to the cluster. But my question here is: can this be achieved without those technologies? Can a Kubernetes cluster listen on an external address and accept traffic without an intermediate solution, maybe through firewall rules and port-forwarding?
Your time is highly appreciated!
This involves some core networking concepts like NAT. You can have two networks, one local and one external CIDR. To expose the services, you can NAT the local CIDR to the external CIDR and configure the required firewall rules so that your cluster can serve the public.
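To illustrate the firewall/port-forwarding route without MetalLB: kube-proxy will answer for any address listed in a Service's spec.externalIPs, provided your router or firewall actually delivers packets for that address to one of the nodes. A minimal sketch (the name, label, and address are hypothetical):

```yaml
# kube-proxy forwards traffic addressed to an IP listed in spec.externalIPs
# on whichever node receives it; getting the packets to a node (a static
# route or DNAT rule on your router) is up to you.
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
spec:
  selector:
    app: ingress-nginx
  ports:
    - port: 80
      targetPort: 80
  externalIPs:
    - 10.0.2.3         # the fixed address the cluster should answer on
```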

MetalLB vs NodePort in Kubernetes

In https://kubernetes.github.io/ingress-nginx/deploy/baremetal/
In metalLB mode, one node attracts all the traffic for the ingress-nginx
With NodePort we can gather all the traffic and load-balance it to Pods via the Service.
What is the difference between NodePort and MetalLB?
It's detailed fairly well in the Kubernetes Service documentation here:
https://kubernetes.io/docs/concepts/services-networking/service/#publishing-services-service-types
To summarise:
NodePort exposes the service on a port, which can then be accessed externally.
LoadBalancer uses a cloud provider's option for exposing a port, e.g. Azure Load Balancers are used and can potentially expose multiple public IP addresses, load balancing them against a larger pool of backend resources (Kubernetes nodes).
A NodePort offers access to a service through a port on the node (hence node + port). A port is allocated through which you can access the service on any node in the cluster.
MetalLB is a load balancer for on-prem clusters. It gives services separate, dedicated IP addresses allocated from a pool. So, if you want to access a service (an ingress controller or something else) on a dedicated IP, then MetalLB allows you to do this.
MetalLB works in two ways: either BGP or Layer 2 ARP. The latter is easier to set up if you're working in a "lab" environment. Basically, MetalLB responds to ARP requests sent by clients trying to connect to a service to which it has allocated an IP.
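As a sketch, recent MetalLB releases configure Layer 2 mode with two CRDs (older releases used a ConfigMap instead); the pool name and address range here are hypothetical:

```yaml
# MetalLB Layer 2 mode: allocate LoadBalancer IPs from a pool and answer ARP
# for them. Assumes a recent MetalLB release with the CRD-based configuration.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: lab-pool
  namespace: metallb-system
spec:
  addresses:
    - 10.0.2.0/24      # range from which LoadBalancer IPs are allocated
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: lab-l2
  namespace: metallb-system
spec:
  ipAddressPools:
    - lab-pool         # respond to ARP for addresses from this pool
```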
I was also struggling a bit to understand why I would need a Service of type LoadBalancer if I can use a NodePort Service, which allows me to access the service through a node port on all the nodes (load-balanced by kube-proxy).
I think the main reason is security: a NodePort Service forces you to expose your Kubernetes nodes' IP addresses to users.
Whereas when using a LoadBalancer Service (either with MetalLB or any cloud provider), as @starfry mentioned, services get separate dedicated IP addresses allocated from a pool.
A NodePort Service also only allows exposing ports in the range 30000-32767, since the port is exposed on the Kubernetes node itself and it would otherwise be possible to trample on real ports used by the node.
So another reason would be for exposing a service outside of this range.
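A small sketch of a NodePort Service with an explicitly chosen port from that range (name, label, and ports are hypothetical):

```yaml
# The nodePort must fall inside the cluster's node-port range (30000-32767
# by default); omit it and Kubernetes picks one for you.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 80         # cluster-internal service port
      targetPort: 8080 # container port
      nodePort: 30080  # opened on every node in the cluster
```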

Wrong IP from GCP kubernetes load balancer to app engine's service

I'm having some trouble with an nginx Pod inside a Kubernetes cluster located on GCP, which should be able to access a service located on App Engine.
I have set firewall rules in App Engine to deny all and only allow some IPs, but the IP which hits my App Engine service isn't the IP of the load balancer in front of my nginx; it's the IP of one of the nodes of the cluster instead.
An image is better than a thousand words, so here's a diagram of our architecture:
The problem is: the IP which hits App Engine's firewall is IP A, whereas I thought it would be IP B. IP A changes every time I kill/create the cluster. If it were IP B, I could easily allow this IP in App Engine's firewall rules, as I've made it static. Does anyone have an idea how to get IP B instead of IP A?
Thanks
The IP address assigned to your nginx "load balancer" is (likely) not an IP owned or managed by your Kubernetes cluster. Services of type LoadBalancer in GKE use Google Cloud Load Balancers. These are an external abstraction which terminates inbound connections in Google's front-end infrastructure and passes traffic to the individual k8s nodes in the cluster for onward delivery to your k8s-hosted service.
Pods in a Kubernetes cluster will, by default, route egress traffic out of the cluster using the configuration of their host node. In GKE, this route corresponds to the gateway of the VPC in which the cluster (and, by extension, Compute Engine instances) exists. The public IP of cluster nodes will change as they are added and removed from the pool.
A workaround uses a dedicated instance with a static external IP to process egress traffic leaving your VPC (i.e. egress from your cluster). Google has a tutorial for this purpose here: https://cloud.google.com/solutions/using-a-nat-gateway-with-kubernetes-engine
There are k8s-native solutions, but these will be unsuitable in a GKE context at present due to the inability to maintain any node with a non-ephemeral public IP.

Kubernetes Networking on Outbound Packet

I have created a k8s Service (type=LoadBalancer) with a number of Pods behind it. To my understanding, all packets initiated from the Pods will have the Pod IP as their source IP, whereas those responding to inbound traffic will have the LoadBalancer IP as their source IP. So my questions are:
Is my claim true, or are there times the source IP will be the node IP instead?
Are there any tricks in k8s by which I can change the source IP in the first scenario from the Pod IP to the LB IP?
Any way to specify a designated Pod IP?
The Pods are running in the internal network while the load balancer is exposed on the Internet, so the addresses of the packets will look more or less like this:
[pod1]              [load balancer]                     [browser]
10.1.0.123  <---->  10.1.0.234 / 201.123.41.53  <---->  217.123.41.53
For pinning a client to a particular Pod, have a look at SessionAffinity.
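Note that SessionAffinity pins a given client to one Pod rather than letting you choose a specific Pod IP. A minimal sketch (name, label, and ports are hypothetical):

```yaml
# With sessionAffinity: ClientIP, requests from the same client IP keep
# being routed to the same Pod.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10800  # the default affinity timeout
  ports:
    - port: 80
      targetPort: 8080
```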
As user315902 said, Azure ACS exposes Kubernetes services to the internet with an Azure load balancer.
Architectural diagram of Kubernetes deployed via Azure Container Service:
Is my claim true, or there are times the source IP will be the node IP
instead?
If we expose the service to the internet, I think the source IP will be the load balancer's public IP address. In ACS, if we expose multiple services to the internet, the Azure LB will add multiple public IP addresses.
Are there any tricks in k8s, which I can change the source IP in the
first scenario from PodIP to LB IP??
Do you mean you want to use the node's public IP address to expose the service to the internet? If so, I think we can't use the node IP to expose a service to the internet. In Azure, we have to use the LB to expose services to the internet.

Difference between NodePort and LoadBalancer?

I have just started with Kubernetes and I am confused about the difference between NodePort and LoadBalancer type of service.
The difference I understand is that LoadBalancer does not support UDP, but apart from that, whenever we create a service, either NodePort or LoadBalancer, we get a service IP and port, a NodePort, and endpoints.
From Kubernetes docs:
NodePort: on top of having a cluster-internal IP, expose the service
on a port on each node of the cluster (the same port on each node).
You'll be able to contact the service on any NodeIP:NodePort
address.
LoadBalancer: on top of having a cluster-internal IP and
exposing service on a NodePort also, ask the cloud provider for a load
balancer which forwards to the Service exposed as a NodeIP:NodePort
for each Node.
So I will always access the service on NodeIP:NodePort.
My understanding is that whenever we access node:NodePort, kube-proxy will intercept the request and forward it to the respective Pod.
The other thing mentioned about LoadBalancer is that we can have an external LB which will load-balance between the nodes. What prevents us from putting an LB in front of services created as NodePort?
I am really confused. Most of the docs and tutorials talk only about the LoadBalancer service, so I couldn't find much on the internet.
Nothing prevents you from placing an external load balancer in front of your nodes and using the NodePort option.
The LoadBalancer option is only used to additionally ask your cloud provider for a new software LB instance, automatically in the background.
I'm not up to date on which cloud providers are supported, but I've seen it working on Compute Engine and OpenStack already.
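As a sketch of how little separates the two types, the manifest is identical except for the type field (name and ports are hypothetical):

```yaml
# Switching type from NodePort to LoadBalancer keeps the node-port behaviour
# and additionally asks the cloud provider for a load balancer instance.
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer   # set to NodePort and no cloud LB is provisioned
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```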
Difference between NodePort and LoadBalancer Services:

| NodePort | LoadBalancer |
| --- | --- |
| Creating a NodePort Service tells Kubernetes to reserve a port on all of its nodes and forward incoming connections on it to the Pods that are part of the Service. | A LoadBalancer Service is built on top of a NodePort, but clients don't connect to a reserved port on a specific node; they connect through the load balancer. |
| The Service can be accessed not only through its internal cluster IP, but also through any node's IP and the reserved node port. | Clients typically access the Service through the load balancer's public IP. |
| Specifying the port isn't mandatory; Kubernetes chooses a random port from the default range 30000-32767 if you omit it. | The load balancer has its own unique, publicly accessible IP address and redirects all connections to your Service. |
| If you point your clients at a single node and that node fails, your clients can't access the service anymore. | A load balancer in front of the nodes spreads requests across all healthy nodes and never sends them to a node that's offline at that moment. |