GCP cluster IP address is not the same as the request's remoteAddr - Kubernetes

I have a node in a Google Cloud Platform Kubernetes public cluster. When I make an HTTP request from my application to an external website, nginx on that website logs an IP address different from the IP address of my Kubernetes cluster. I can't figure out where that IP address comes from. I'm not using NAT in GCP.

I will just add some official terminology to shed some light on GKE networking before providing an answer.
Let's have a look at some GKE networking terminology:
The Kubernetes networking model relies heavily on IP addresses. Services, Pods, containers, and nodes communicate using IP addresses and ports. Kubernetes provides different types of load balancing to direct traffic to the correct Pods. All of these mechanisms are described in more detail later in this topic. Keep the following terms in mind as you read:
ClusterIP: The IP address assigned to a Service. In other documents, it may be called the "Cluster IP". This address is stable for the lifetime of the Service, as discussed in the Services section in this topic.
Pod IP: The IP address assigned to a given Pod. This is ephemeral, as discussed in the Pods section in this topic.
Node IP: The IP address assigned to a given node.
Additionally, you may have a look at the exposing your service documentation, which may give you even more insight.
And to explain why you got your node's IP: GKE uses IP masquerading:
IP masquerading is a form of network address translation (NAT) used to perform many-to-one IP address translations, which allows multiple clients to access a destination using a single IP address. A GKE cluster uses IP masquerading so that destinations outside of the cluster only receive packets from node IP addresses instead of Pod IP addresses.
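On GKE clusters that run the ip-masq-agent, you can see (and tune) which destinations are masqueraded via its ConfigMap. The sketch below is illustrative only; the exact object names and CIDR values may differ in your cluster:

    # Inspect the agent's configuration (if deployed):
    #   kubectl -n kube-system get configmap ip-masq-agent -o yaml
    #
    # A typical configuration looks roughly like this:
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ip-masq-agent
      namespace: kube-system
    data:
      config: |
        nonMasqueradeCIDRs:
          - 10.0.0.0/8        # traffic to these ranges keeps the Pod IP as its source
        masqLinkLocal: false
        resyncInterval: 60s
    # Traffic to any destination outside nonMasqueradeCIDRs is SNATed to the node's IP,
    # which is why the external nginx logged an address you did not recognise.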

Related

Dynamic load balancing with kubernetes

I'm new to Kubernetes.
We have 50 IP addresses, and each IP address has a request limit. The limit is a value kept in a database. We want the load balancer to pick the IP address that has the most remaining limit in the database. Can Kubernetes do that?
First, I advise you to read the official documentation about networking in Kubernetes - you can find it here: kubernetes-networking. Read especially about Services. The built-in load balancing in Kubernetes never checks application-specific databases.
An abstract way to expose an application running on a set of Pods as a network service. With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.
Example of the Service type ClusterIP:
Kubernetes assigns a stable, reliable IP address to each newly-created Service (the ClusterIP) from the cluster's pool of available Service IP addresses. Kubernetes also assigns a hostname to the ClusterIP, by adding a DNS entry. The ClusterIP and hostname are unique within the cluster and do not change throughout the lifecycle of the Service. Kubernetes only releases the ClusterIP and hostname if the Service is deleted from the cluster's configuration. You can reach a healthy Pod running your application using either the ClusterIP or the hostname of the Service.
Take a look at how this works in GKE: GKE-IP-allocation.
You can also specify your own cluster IP address as part of a Service creation request by setting the .spec.clusterIP field.
The IP address that you choose must be a valid IPv4 or IPv6 address from within the service-cluster-ip-range CIDR range that is configured for the API server. If you try to create a Service with an invalid clusterIP address value, the API server will return a 422 HTTP status code to indicate that there's a problem.
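A minimal sketch of such a Service; the name, label, port numbers and the 10.96.0.50 address are assumptions for illustration, and the address must lie inside your cluster's Service CIDR:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-app            # illustrative name
    spec:
      type: ClusterIP
      clusterIP: 10.96.0.50   # must fall inside the configured service-cluster-ip-range
      selector:
        app: my-app           # assumed Pod label
      ports:
        - port: 80            # port the Service exposes inside the cluster
          targetPort: 8080    # port the Pods actually listen on (assumed)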
To sum up: the Kubernetes load balancer never takes a deep dive into your app. To connect to your app you need to create a Service. Kubernetes assigns a stable, reliable IP address to each newly created Service, through which you can access your app from within or from outside the cluster. You can also manually assign an IP per Service.

Expose services inside Kubernetes cluster behind NAT

I have a GKE cluster set up with Cloud NAT, so traffic from any node/container going outward has the same external IP. (I needed this for whitelisting purposes while working with third-party services.)
Now, if I want to deploy a proxy server onto this cluster that does basic traffic forwarding, how do I expose the proxy server "endpoint"? Or more generically, how do I expose a service if I deploy it to this GKE cluster?
A proxy server running behind NAT?
Bad idea, unless it is only for your Kubernetes cluster workload; but you didn't specify anywhere that it should be reachable only by other Pods running in the same cluster.
As you can read here:
Cloud NAT does not implement unsolicited inbound connections from the internet. DNAT is only performed for packets that arrive as responses to outbound packets.
So it is not meant to be reachable from outside.
If you want to expose an application within your cluster, making it available to other Pods, use a simple ClusterIP Service, which is the default type and will be created as such even if you don't specify a type at all.
Normally, to expose a service endpoint running on a Kubernetes cluster, you have to use one of the Service Types, as Pods have internal IP addresses and are not addressable externally.
The possible service types:
ClusterIP: this also uses an internal IP address, and is therefore not addressable externally.
NodePort: this type opens a port on every node in your Kubernetes cluster, and configures iptables to forward traffic arriving at this port to the Pods providing the actual service.
LoadBalancer: this type opens a port on every node as with NodePort, and also allocates a Google Cloud load balancer, configuring it to reach the port opened on the Kubernetes nodes (actually load-balancing the incoming traffic between the operational Kubernetes nodes).
ExternalName: this type configures the Kubernetes internal DNS server to point to the specified IP address (to provide a dynamic DNS entry inside the cluster to connect to external services).
Out of those, NodePort and LoadBalancer are usable for your purposes. With a simple NodePort type Service, you would need publicly accessible node IP addresses, and the port allocated could be used to access your proxy service through any node of your cluster. As any one of your nodes may disappear at any time, this kind of service access is only good if your proxy clients know how to switch to another node IP address. Or you could use the LoadBalancer type Service, in that case you can use the IP address of the configured Google Cloud Load Balancer for your clients to connect to, and expect the load balancer to forward the traffic to any one of the running nodes of your cluster, which would then forward the traffic to one of the Pods providing this service.
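For the LoadBalancer variant, a minimal sketch could look like the one below; the Service name, the app: proxy label and port 3128 are assumptions for illustration:

    apiVersion: v1
    kind: Service
    metadata:
      name: proxy             # illustrative name
    spec:
      type: LoadBalancer      # GKE provisions a Google Cloud load balancer for this Service
      selector:
        app: proxy            # assumed label on the proxy Pods
      ports:
        - port: 3128          # port exposed on the load balancer (assumed proxy port)
          targetPort: 3128    # port the proxy Pods listen on

Once the load balancer is provisioned, kubectl get service proxy shows the allocated external IP in the EXTERNAL-IP column, and that is the address your proxy clients would connect to.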
For your proxy server to access the Internet as a client, you also need some kind of public IP address. Either you give the Kubernetes nodes public IP addresses (in that case, if you have more than a single node, you'd see multiple source IP addresses, as each node has its own), or, if you use private addresses for your Kubernetes nodes, you need source NAT functionality, like the one you already use: Cloud NAT.

What is the use of the cluster IP in Kubernetes

Can someone help me understand the IP address I see for the cluster IP when I list Services?
What is the cluster IP (not the Service type, but the actual IP)?
How is it used?
Where does it come from?
Can I define the range for cluster IPs (like we do for the Pod network)?
Good question to start learning something new (also for me):
Your concerns are related to kube-proxy, which in a Kubernetes cluster works in iptables mode by default.
Every node in a Kubernetes cluster runs a kube-proxy. Kube-proxy is responsible for implementing a form of virtual IP for Services.
In this mode, kube-proxy watches the Kubernetes control plane for the addition and removal of Service and Endpoint objects. For each Service, it installs iptables rules, which capture traffic to the Service’s clusterIP and port, and redirect that traffic to one of the Service’s backend sets. For each Endpoint object, it installs iptables rules which select a backend Pod.
From the node components documentation on kube-proxy:
kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.
kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.
kube-proxy uses the operating system packet filtering layer if there is one and it’s available. Otherwise, kube-proxy forwards the traffic itself.
As described here:
Due to these iptables rules, whenever a packet is destined for a service IP, it’s DNATed (DNAT=Destination Network Address Translation), meaning the destination IP is changed from service IP to one of the endpoints pod IP chosen at random by iptables. This makes sure the load is evenly distributed among the backend pods.
When this DNAT happens, this info is stored in conntrack — the Linux connection tracking table (stores 5-tuple translations iptables has done: protocol, srcIP, srcPort, dstIP, dstPort). This is so that when a reply comes back, it can un-DNAT, meaning change the source IP from the Pod IP to the Service IP. This way, the client is unaware of how the packet flow is handled behind the scenes.
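If you want to observe this on a node of a cluster running in iptables mode, commands along these lines can show it; the ClusterIP 10.96.0.50 is only a placeholder:

    # List the NAT rules kube-proxy installed for Services
    sudo iptables -t nat -L KUBE-SERVICES -n | head
    # List conntrack entries whose original destination is a given ClusterIP,
    # i.e. connections that were DNATed to one of the backend Pods
    sudo conntrack -L -d 10.96.0.50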
There are also other proxy modes; you can find more information here.
During cluster initialization you can use the --service-cidr string parameter (default: "10.96.0.0/12").
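For example, on a kubeadm-bootstrapped cluster the Service range is chosen at init time (the CIDR below just repeats the default explicitly); on GKE, to my knowledge, the comparable setting is the --services-ipv4-cidr flag of gcloud container clusters create:

    # Choose the range that ClusterIPs will be allocated from
    kubeadm init --service-cidr 10.96.0.0/12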
ClusterIP: The IP address assigned to a Service
Kubernetes assigns a stable, reliable IP address to each newly-created Service (the ClusterIP) from the cluster's pool of available Service IP addresses. Kubernetes also assigns a hostname to the ClusterIP, by adding a DNS entry. The ClusterIP and hostname are unique within the cluster and do not change throughout the lifecycle of the Service. Kubernetes only releases the ClusterIP and hostname if the Service is deleted from the cluster's configuration. You can reach a healthy Pod running your application using either the ClusterIP or the hostname of the Service.
Pod IP: The IP address assigned to a given Pod.
Kubernetes assigns an IP address (the Pod IP) to the virtual network interface in the Pod's network namespace from a range of addresses reserved for Pods on the node. This address range is a subset of the IP address range assigned to the cluster for Pods, which you can configure when you create a cluster.
Resources:
Iptables Mode
Network overview
Understanding Kubernetes Kube-Proxy
Hope this helped
The cluster IP is the address at which your Service can be reached from inside the cluster. You won't be able to reach the cluster IP from an external network unless you do some kind of SSH tunneling. This IP is auto-assigned by Kubernetes; the range it is drawn from can be configured (see the --service-cidr note above), although you rarely need to change it.

Google Cloud deployment and Kubernetes node IP address change

We have had our database running on a Kubernetes cluster (deployed to our private network) in Google Cloud for a few months now. Last week we noticed that, for some reason, the IP addresses of all underlying nodes (VMs) changed. This caused an outage. We have been using the NodePort configuration of Kubernetes for the Service that gives access to our database (https://kubernetes.io/docs/concepts/services-networking/service/#nodeport).
We understand that the IP addresses of the Pods within the VMs are dynamic and will eventually change, but we did not know that the IP addresses of the actual nodes (VMs) may also change. Is this normal? Does anyone know what can cause a VM IP address change in a Kubernetes cluster?
From the documentation about ephemeral IP addresses on GCP:
When you create an instance or forwarding rule without specifying an IP address, the resource is automatically assigned an ephemeral external IP address. Ephemeral external IP addresses are released from a resource if you delete the resource. For VM instances, if you stop the instance, the IP address is also released. Once you restart the instance, it is assigned a new ephemeral external IP address.
You can assign static external IP addresses to instances, but as @Notauser mentioned, it is not recommended for Kubernetes nodes. This is because you may configure an autoscaler for your instance groups, and the number of nodes can shrink or grow.
Also, you would need to reserve a static IP address for each node, which is not recommended. Moreover, you would waste static IP address resources, and if the reserved static IP addresses are not in use, you will still be charged for them.
Instead, you can configure an HTTP load balancer using an Ingress and reserve a static IP address for that load balancer. Rather than using NodePort, use ClusterIP type Services and create an Ingress rule forwarding the traffic to those Services.
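A sketch of that setup; the names my-static-ip, web-ingress and web are placeholders, and the annotation shown is the one the GKE Ingress controller uses to pick up a reserved global address:

    # Reserve a global static IP first (name is illustrative):
    #   gcloud compute addresses create my-static-ip --global
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: web-ingress
      annotations:
        kubernetes.io/ingress.global-static-ip-name: my-static-ip
    spec:
      defaultBackend:
        service:
          name: web           # the Service sitting in front of your Pods (assumed name)
          port:
            number: 80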
If you are using a managed Kubernetes Engine (GKE) cluster, this is expected, as nodes are mortal and might be replaced or restarted if, for example, they become unresponsive. When that happens, the node's IP can change. There is currently no way to assign a static (fixed) public IP to nodes. In this case you should expose your DB as a ClusterIP Service instead; its IP remains unchanged for the lifetime of the Service. Here's an example of how to do that.
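A minimal sketch of such a Service, assuming the database Pods carry the label app: db and listen on port 5432 (both assumptions):

    apiVersion: v1
    kind: Service
    metadata:
      name: db                # reachable in-cluster as db.<namespace>.svc.cluster.local
    spec:
      # no type given, so it defaults to ClusterIP
      selector:
        app: db               # assumed Pod label
      ports:
        - port: 5432          # assumed database port
          targetPort: 5432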
Alternatively, if you are using a self-managed Kubernetes cluster on Compute Engine (GCE), you simply have to promote your nodes' IPs to static.

Kubernetes Networking on Outbound Packet

I have created a k8s Service (type=LoadBalancer) with a number of Pods behind it. To my understanding, all packets initiated from the Pods will have the Pod IP as their source IP, whereas those responding to inbound traffic will have the LoadBalancer IP as their source IP. So my questions are:
Is my claim true, or are there times when the source IP will be the node IP instead?
Are there any tricks in k8s by which I can change the source IP in the first scenario from the Pod IP to the LB IP?
Is there any way to specify a designated Pod IP?
The Pods are running in the internal network while the load balancer is exposed on the Internet, so the addresses of the packets will look more or less like this (the load balancer has an internal address towards the Pods and a public address towards the Internet):

    [pod1] <-----> [load balancer] <-----> [browser]
    10.1.0.123     10.1.0.234 / 201.123.41.53     217.123.41.53
For pinning a client's traffic to a particular Pod, have a look at sessionAffinity.
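Note that sessionAffinity does not let you pick a specific Pod IP; it pins a given client to whichever Pod it first lands on. A sketch, with illustrative names and ports:

    apiVersion: v1
    kind: Service
    metadata:
      name: web               # illustrative name
    spec:
      type: LoadBalancer
      sessionAffinity: ClientIP        # keep each client IP on the same backend Pod
      sessionAffinityConfig:
        clientIP:
          timeoutSeconds: 10800        # stickiness window (3 hours)
      selector:
        app: web              # assumed Pod label
      ports:
        - port: 80
          targetPort: 8080    # assumed container port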
As user315902 said, on Azure ACS, Kubernetes exposes services to the Internet with an Azure load balancer.
Architectural diagram of Kubernetes deployed via Azure Container Service:
Is my claim true, or are there times when the source IP will be the node IP instead?
If we expose the service to the Internet, I think the source IP will be the load balancer's public IP address. In ACS, if we expose multiple services to the Internet, the Azure LB will add multiple public IP addresses.
Are there any tricks in k8s by which I can change the source IP in the first scenario from the Pod IP to the LB IP?
Do you mean you want to use the node's public IP address to expose the service to the Internet? If yes, I don't think we can use the node IP to expose a service to the Internet. In Azure, we had to use the LB to expose a service to the Internet.