Getting client IP using Knative and Anthos - Kubernetes

We use Google Cloud Run on our K8s cluster on GCP, which is powered by Knative and Anthos; however, it seems the load balancer doesn't amend the x-forwarded-for header (which is expected, since it is a TCP load balancer), and Istio doesn't do it either.
Do you have the same issue, or is it limited to our deployment?
I understand Istio supports this as part of its upcoming Gateway Network Topology feature, but not in the current GCP version.

I think you are correct in assessing that the current Cloud Run for Anthos setup (unintentionally) does not let you see the origin IP address of the user.
As you said, the gateway created for Istio/Knative in this case is a Cloud Network Load Balancer (TCP), and this LB doesn't preserve the client's IP address on a connection when the traffic is routed to Kubernetes Pods (due to how Kubernetes networking works with iptables etc.). That's why you see an x-forwarded-for header, but it only contains internal hops (e.g. 10.x.x.x).
I am following up with our team on this. It seems that it was not noticed before.
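As a general note, on plain Kubernetes the usual way to preserve the client source IP on a LoadBalancer Service is externalTrafficPolicy: Local (described in the Kubernetes source-IP documentation). I can't say whether Cloud Run for Anthos lets you apply that to the gateway Service it generates, so treat the following only as a sketch of the mechanism, with placeholder names:

```yaml
# Sketch only: a plain LoadBalancer Service that asks kube-proxy to keep the
# client source IP by routing only to pods on the node that received the packet.
# Whether the Istio/Knative gateway Service created by Cloud Run for Anthos can
# be patched like this is not confirmed here.
apiVersion: v1
kind: Service
metadata:
  name: my-gateway               # placeholder name
  namespace: istio-system        # placeholder namespace
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local   # preserve client source IP; may skew load balancing
  selector:
    app: my-gateway              # placeholder selector
  ports:
    - name: http
      port: 80
      targetPort: 8080
```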

Related

Kubernetes IP egress addressing

A question I have trouble finding an answer to is this:
When a K8s pod connects to an external service over the Internet, what IP address does that external service see the pod traffic coming from?
I would like to know the answer in two distinct cases:
there is a site-to-site VPN between the K8s cluster and the remote service
there is no such VPN, the access is over the public Internet.
Let me also add the assumption that the K8s cluster is running on AWS (not EKS; it is customer-managed).
Thanks for answering.
When traffic leaves the pod and goes out, it usually undergoes NAT on the K8s node, so in most cases the traffic will arrive with the node's IP address as the source. You can change this behaviour by (re)configuring the ip-masq-agent, which can be told not to NAT this traffic, but then it is up to you to make sure the traffic can still be routed on the Internet, for example by using a cloud-native NAT solution (Cloud NAT in the case of GCP, NAT Gateway in AWS).
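For example, a minimal ip-masq-agent ConfigMap (assuming the agent is deployed in kube-system and reads the config key, as in the upstream DaemonSet) might look like this; the CIDRs below are placeholders for the ranges you do not want masqueraded:

```yaml
# Sketch of an ip-masq-agent config: traffic to these CIDRs leaves the node
# with the pod IP (no SNAT); everything else is masqueraded to the node IP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
      - 10.0.0.0/8        # placeholder: e.g. your VPN / private ranges
      - 172.16.0.0/12
    masqLinkLocal: false
    resyncInterval: 60s
```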

send request from kubernetes pods through load balancer ip

I have a k8s cluster on DigitalOcean using Traefik 1.7 as the Ingress Controller.
Our domain points to the load balancer IP created by Traefik.
All incoming requests go through the load balancer IP and are routed by Traefik to the proper service.
Now I want to perform HTTP requests from my services to an external system which only accepts registered IPs.
Can I give them the load balancer's IP and make all outbound requests go through it, or do I need to give them all of the nodes' public IPs?
Thanks.
You can do either of them.
But the best solution would be to make all the traffic go through the load balancer, assuming it is some proxy server with tunnelling capabilities, and open communication through the load balancer IP on your external system. Imagine that right now you have a dozen nodes running 100 microservices, and you open your external system's security group to allow traffic from those twelve nodes.
In the next few months you might go from 12 to 100 nodes, and then you have the overhead of updating your external system's security group every time you add a node in DigitalOcean.
You can also try a different approach by adding a standalone proxy server and routing traffic to it from your pods, as shown in the sketch below; something like this: Kubernetes outbound calls to an external endpoint with IP whitelisting.
Just a note: these are not the only options; there are several ways to achieve this. Another approach would be associating a NAT IP with all your nodes and keeping every node behind a private network. It all depends on how you want to set things up and the purpose of the system you are planning to build.
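For the standalone-proxy approach, one common pattern is to point the pods at the proxy via the standard proxy environment variables; a rough sketch (the proxy address, image, and names are hypothetical):

```yaml
# Sketch: route a workload's outbound HTTP(S) calls through a forward proxy by
# setting the standard proxy environment variables. Address and names are
# hypothetical; NO_PROXY support varies by HTTP client.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      containers:
        - name: app
          image: registry.example.com/my-service:latest   # placeholder image
          env:
            - name: HTTP_PROXY
              value: "http://10.0.0.10:3128"               # hypothetical proxy address
            - name: HTTPS_PROXY
              value: "http://10.0.0.10:3128"
            - name: NO_PROXY
              value: "10.0.0.0/8,.svc,.cluster.local"      # skip the proxy for internal traffic
```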
Hope this helps.
Unfortunately, Ingress resources only handle inbound traffic; they can't be used for outbound requests.
So you would need to provide all of the nodes' public IPs.
Another idea: if you use a forward proxy (e.g. nginx or HAProxy), you can limit the nodes on which the forward-proxy pods are scheduled by setting a nodeSelector.
By doing so, I think you can limit the nodes that need to provide public IP addresses.
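A minimal sketch of that idea, assuming you have labelled the chosen nodes with something like egress=true (the label, image, and names are placeholders):

```yaml
# Sketch: pin forward-proxy pods to a known set of nodes so that only those
# nodes' public IPs need to be registered with the external system.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: forward-proxy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: forward-proxy
  template:
    metadata:
      labels:
        app: forward-proxy
    spec:
      nodeSelector:
        egress: "true"         # placeholder label applied to the chosen nodes
      containers:
        - name: haproxy
          image: haproxy:2.4   # any forward proxy works; its config is omitted here
          ports:
            - containerPort: 3128
```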
Egress packets from a k8s cluster to cluster-external services have the node's IP as the source IP. So you can register the k8s nodes' IPs in the external system to allow egress packets from the k8s cluster.
https://kubernetes.io/docs/tutorials/services/source-ip/ says egress packets from k8s get source-NAT'ed with the node's IP:
Source NAT: replacing the source IP on a packet, usually with a node’s IP
The following can be used to manage egress traffic from a k8s cluster:
k8s network policies
calico CNI egress policies
istio egress gateway
nirmata/kube-static-egress-ip GitHub project
kube-static-egress-ip provides a solution with which a cluster operator can define an egress rule where a set of pods' outbound traffic to a specified destination is always SNAT'ed with a configured static egress IP. kube-static-egress-ip provides this functionality in a Kubernetes-native way using custom resources.
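To illustrate the first option: a plain NetworkPolicy controls whether egress traffic is allowed at all (not which source IP it leaves with). A sketch, with the pod label and destination CIDR as placeholders:

```yaml
# Sketch: allow egress from pods labelled app=backend only to a placeholder
# external CIDR (plus DNS); all other egress from those pods is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-egress
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24   # placeholder: the external system's range
      ports:
        - protocol: TCP
          port: 443
    - ports:                       # allow DNS lookups
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```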

Google Kubernetes Engine Service loadBalancerSourceRanges not allowing connection on IP range

I'm exposing an application run on a GKE cluster using a LoadBalancer service. By default, the LoadBalancer creates a rule in the Google VPC firewall with IP range 0.0.0.0/0. With this configuration, I'm able to reach the service in all situations.
I'm using an OpenVPN server inside my default network to prevent outside access to GCE instances on a certain IP range. By modifying the service .yaml file's loadBalancerSourceRanges value to match the IP range of my VPN server, I expected to be able to connect to the Kubernetes application while connected to the VPN, but not otherwise. This updated the Google VPC firewall rule with the range I entered in the .yaml file, but didn't allow me to connect to the service endpoint. The Kubernetes cluster is located in the same network as the OpenVPN server. Is there some additional configuration needed, other than setting loadBalancerSourceRanges to the desired ingress IP range for the service?
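For reference, the relevant part of the Service manifest looks roughly like this (the range and names here are placeholders rather than my actual values):

```yaml
# Placeholder example of the Service in question: only the given source range
# should be allowed through the firewall rule that GKE creates for the LB.
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  loadBalancerSourceRanges:
    - 10.8.0.0/24        # placeholder: the VPN clients' IP range
```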
You didn't mention the version of this GKE cluster; however, it might be helpful to know that, beginning with Kubernetes version 1.9.x, the automatic firewall rules have changed such that workloads in your Google Kubernetes Engine cluster cannot communicate with other Compute Engine VMs that are on the same network but outside the cluster. This change was made for security reasons. You can replicate the behavior of older clusters (1.8.x and earlier) by setting a new firewall rule on your cluster. You can see this notification in the Release Notes published in the official documentation.

Kubernetes on GKE / why is the use of a load balancer enforced?

I made my way into Kubernetes through GKE, and am currently trying it out via kubeadm on bare metal.
In the latter environment, there is no need for any specific load balancer; using nginx-ingress and Ingress resources lets one serve services to the web.
Conversely, on GKE, using the same nginx-ingress, or using the GKE-provided L7, you always end up with a billed load balancer.
What's the reason for that, as it seems not to be ultimately needed?
(Reposting my comment above)
In general, when one is receiving traffic from the outside world, that traffic is being sent to one or more non-ACLd public IP addresses.
If you run k8s on bare metal, those machines can have public IPs, and you can just run ingress on one or more of them.
A managed k8s environment, however, for security reasons, will not permit nodes to have public IPs.
Instead, managed load balancers are allowed to have public IPs. Those are configured to know the private node IPs hosting ingress for your cluster and will direct traffic accordingly.
Kubernetes Services have a few types, each building on the previous one: ClusterIP, NodePort and LoadBalancer. Only the last one will provision a load balancer in a cloud environment, so you can avoid it on GKE without fuss.
The question is, what then? Because, in the best case, you end up with an Ingress (I assume we expose an ingress, as in your question) that is available on volatile IPs (nodes can be rolled at any time and new ones will get new IPs) and on the high ports assigned by the NodePort service. That means not only do you have no fixed IP to use, but you would also need to open something like http://<node-ip>:31978, which is obviously not great.
Hence, in the cloud, you have the simple solution of putting a cloud load balancer in front of it with the LoadBalancer service type. This LB will ingest the traffic on ports 80/443 and forward it to the correct backing service/pods.
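To make that concrete, a rough sketch of an Ingress rule sitting behind that single load balancer (hosts, service names, and the ingress class are placeholders; older clusters used the extensions/v1beta1 API for Ingress instead):

```yaml
# Sketch: many apps share the single billed load balancer by sitting behind one
# ingress controller; each app only needs a ClusterIP Service plus an Ingress rule.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-apps
spec:
  ingressClassName: nginx            # assumes an nginx-ingress class named "nginx"
  rules:
    - host: app1.example.com         # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app1           # placeholder ClusterIP Service
                port:
                  number: 80
```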

Deterministic connection to cloud-internal IP of K8S service or its underlying endpoint?

I have a Kubernetes cluster (1.3.2) in GKE, and I'd like to connect to it from VMs and services in my Google project, which shares the same network as the cluster.
Is there a way for a VM that's internal to the subnet but not internal to the cluster itself to connect to the service without hitting the external IP?
I know there's a ton of things you can do to unambiguously determine the IP and port of services, such as the ENVs and DNS...but the clusterIP is not reachable outside of the cluster (obviously).
Is there something I'm missing? An important component of this is that it is meant to be a service "public" to the project, such that I don't know which VMs in the project will want to connect to the service (this could rule out loadBalancerSourceRanges). I understand the endpoints which the service actually wraps are internal IPs I can hit, but the only good way to get to those IPs is through the Kube API or kubectl, both of which are not prod-ideal ways of hitting my service.
Check out my more thorough answer here, but the most common solution to this is to create bastion routes in your GCP project.
In the simplest form, you can create a single GCE Route to direct all traffic with a destination IP in your cluster's service IP range to land on one of your GKE nodes. If that SPOF scares you, you can create several routes pointing to different nodes, and traffic will round-robin between them.
If that management overhead isn't something you want to do going forward, you could write a simple controller in your GKE cluster to watch the Nodes API endpoint, and make sure that you have a live bastion route to at least N nodes at any given time.
GCP internal load balancing was just released as alpha, so in the future, kube-proxy on GCP could be implemented using that, which would eliminate the need for bastion routes to handle internal services.
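Once that internal load balancing support is usable, the idea is that a Service could request an internal LB via an annotation, roughly like the following (the exact annotation key depends on the GKE release, so check the docs for your version):

```yaml
# Sketch: an internal LoadBalancer Service on GKE, reachable from VMs in the
# same VPC network without a public IP. The annotation key may vary by version.
apiVersion: v1
kind: Service
metadata:
  name: my-internal-service
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: my-app          # placeholder selector
  ports:
    - port: 80
      targetPort: 8080
```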