custom outgoing network path for kubernetes pod

Say I have 2 sites, S1 and S2, each with at least one Kubernetes worker. The two sites are geographically separated and the nodes/workers have different public IPs.
Does Kubernetes offer any existing mechanism to route outgoing internet traffic from a pod/container in S1 via S2?
The goal is to be able to use the public IP(s) in S2 for pods in S1.
If k8s federation is a requisite for a solution, then that is fine.

Kubernetes doesn't have any input on this; it's up to your network design and structure, just like it would be with traditional VMs or whatnot. That said, this sounds like a very bad network design given what you described, so I would be surprised if it were easy to set up. Calico runs normal BGP under the hood, so you can probably set up two ASes and force one to route via the other.

With Calico you can introduce a non-default IPPool which does not do NAT for outgoing traffic. You can then annotate the pods and/or namespaces that should use that IPPool.
At that point you are left with non-working outgoing traffic, since your cluster IPs are "leaking" to the outside world but no upstream router between your cluster and the internet knows a return path.
You have to let your router, which should sit between your cluster and the internet, know about the cluster ranges. You can use Calico's global BGPPeer concept for this; once you've done that, also set up BGP on your router (see [1] for more info).
From there you have all the flexibility on your router to route traffic differently based on the non-default IPPool's subnet, and/or tunnel it first, e.g., to the questioner's 'S2'.
Note that unless you are truly using public IP space, i.e., non-RFC-1918 IPs (plus some others), you now have to introduce NAT somewhere yourself. You can do so in S1 or S2; if you opt for the latter, that site also needs to know the return path back to your router.
This is not really a cloud-native solution, since you're just moving the problem from Kubernetes to the "old-school" domain of policy-based routing on fixed subnets -- which is not exactly what the questioner asked for, since he implied that there is also a Kubernetes process in 'S2'. In the solution above, a Kubernetes process is not needed in S2.
This is what @coderanger's answer was suggesting.
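For illustration, a minimal sketch of the Calico pieces described above (the pool name, CIDR, peer address, and AS number are placeholders, not values from the question):

```yaml
# Non-default IPPool with outgoing NAT disabled.
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: no-nat-pool
spec:
  cidr: 10.100.0.0/24        # illustrative pod range
  natOutgoing: false
---
# Namespace whose pods should get addresses from that pool.
apiVersion: v1
kind: Namespace
metadata:
  name: egress-via-s2
  annotations:
    cni.projectcalico.org/ipv4pools: '["no-nat-pool"]'
---
# Global BGP peering so the upstream router learns the cluster ranges.
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: upstream-router
spec:
  peerIP: 192.0.2.1          # your router's address (assumption)
  asNumber: 64512            # your router's AS (assumption)
```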

Related

Forwarding all Kubernetes traffic through a single node

I have a Kubernetes cluster with multiple nodes in two different subnets (x and y). I have an IPsec VPN tunnel setup between my x subnet and an external network. Now my problem is that the pods that get scheduled in the nodes on the y subnet can't send requests to the external network because they're in nodes not covered by the VPN tunnel. Creating another VPN to cover the y subnet isn't possible right now. Is there a way in k8s to force all pods' traffic to go through a single source? Or any clean solution even if outside of k8s?
Posting this as a community wiki, feel free to edit and expand.
There is no built-in functionality in Kubernetes that can do this. However, there are two available options which can help to achieve the required setup:
Istio
If the destination services are well known, then it's possible to use an Istio egress gateway. We are interested in this use case:
Another use case is a cluster where the application nodes don't have public IPs, so the in-mesh services that run on them cannot access the Internet. Defining an egress gateway, directing all the egress traffic through it, and allocating public IPs to the egress gateway nodes allows the application nodes to access external services in a controlled way.
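A condensed sketch of what that configuration could look like, following the pattern from the Istio egress gateway documentation (the host and ports are illustrative, and depending on your setup a DestinationRule may also be required):

```yaml
# Register the external service with the mesh.
apiVersion: networking.istio.io/v1beta1
kind: ServiceEntry
metadata:
  name: external-svc
spec:
  hosts:
  - external.example.com        # placeholder host
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
---
# Expose that host on the egress gateway.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: egress-gateway
spec:
  selector:
    istio: egressgateway        # label of the default egress gateway pods
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - external.example.com
---
# Route sidecar traffic to the gateway, then from the gateway to the host.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: route-via-egress-gateway
spec:
  hosts:
  - external.example.com
  gateways:
  - mesh
  - egress-gateway
  http:
  - match:
    - gateways:
      - mesh
      port: 80
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        port:
          number: 80
  - match:
    - gateways:
      - egress-gateway
      port: 80
    route:
    - destination:
        host: external.example.com
        port:
          number: 80
```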
Antrea egress
There's another solution which can be used: Antrea Egress. Its use cases are:
You may be interested in using this capability if any of the following apply:
A consistent IP address is desired when specific Pods connect to services outside of the cluster, for source tracing in audit logs, or for filtering by source IP in external firewall, etc.
You want to force outgoing external connections to leave the cluster via certain Nodes, for security controls, or due to network topology restrictions.
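As a rough sketch, an Antrea Egress resource looks like this (the label and IP are placeholders; older Antrea releases use the crd.antrea.io/v1alpha2 API version instead):

```yaml
# SNAT traffic from the selected pods to a fixed egress IP on a chosen node.
apiVersion: crd.antrea.io/v1beta1
kind: Egress
metadata:
  name: egress-static-ip
spec:
  appliedTo:
    podSelector:
      matchLabels:
        app: my-app            # placeholder pod selector
  egressIP: 10.10.0.8          # placeholder IP owned by the egress node
```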

Having 1 outgoing IP for kubernetes egress traffic

Current set-up
Cluster specs: Managed Kubernetes on Digital Ocean
Goal
My pods are accessing some websites but I want to use a proxy first.
Problem
The proxy I need to use only accepts one IP address in its allow-list.
My cluster uses different nodes with a node autoscaler, so I have multiple and changing IP addresses.
Solutions I am thinking about
Setting up a proxy (Squid? nginx?) outside of the cluster (currently not working when I access an HTTPS website)
Istio could let me set up a gateway? (No knowledge of Istio)
Use GCP managed K8s, and follow the answers on Kubernetes cluster outgoing traffic IP. But all our stack is on Digital Ocean and the pricing is better there.
I am curious to know what is the best practice or easiest solution, or if anyone has experienced such a use case before :)
Best
You could set up all your traffic to go through istio-egressgateway.
Then you could configure the istio-egressgateway to always be deployed on the same node of the cluster, and whitelist that node's IP address.
Pros: super easy. BUT. If you are not using Istio already, setting up Istio just for this may be killing a mosquito with a bazooka.
Cons: you need to make sure the node doesn't change its IP address. Otherwise the istio-egressgateway itself might not get deployed (if the new node is missing the required labels), and you will need to reconfigure everything for the new node (new IP address). Another con is that if the traffic goes up and there is an HPA, it will deploy more replicas of the gateway, and all of them will be deployed on the same node. So, if you are going to have lots of traffic, maybe it would be a good idea to isolate one node just for this purpose.
Another option would be, as you are suggesting, a proxy. I would recommend an Envoy proxy directly. I mean, Istio is going to be using Envoy anyway, right? So just get the proxy directly, put it in a pod, and do the same thing as I mentioned before: node affinity, so it will always run on the same node and go out with the same IP.
Pros: you are not installing an entire service mesh control plane for one tiny thing.
Cons: same as before, you still have the issue of the node IP changing if something goes wrong, plus you will need to manage your own Deployment object, HPA, Envoy proxy configuration, etc., instead of using Istio objects (like a Gateway and a VirtualService).
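A minimal sketch of the node-pinning idea for the proxy option (the node label, image tag, and port are assumptions, and the actual Envoy listener/cluster configuration is omitted; it would normally be mounted from a ConfigMap):

```yaml
# Pin the proxy to one labelled node so egress always uses that node's IP,
# e.g. after: kubectl label node <node-name> egress=true
apiVersion: apps/v1
kind: Deployment
metadata:
  name: egress-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: egress-proxy
  template:
    metadata:
      labels:
        app: egress-proxy
    spec:
      nodeSelector:
        egress: "true"
      containers:
      - name: envoy
        image: envoyproxy/envoy:v1.27.0   # version is an assumption
        ports:
        - containerPort: 10000            # placeholder listener port
```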
Finally, I see a third option: to set up a NAT gateway outside the cluster, and configure your traffic to go through it.
Pros: you won't have to configure any Kubernetes objects, therefore there is no need to set up any node affinity, and therefore no node overwhelming or IP change. Plus you can remove the external IP addresses from your cluster, so it will be more secure (unless you have other workloads that need to reach the internet directly). Also, a single node configured as a NAT gateway will probably be more resilient than a Kubernetes pod running on a node.
Cons: maybe a little bit more complicated to set up?
And there is this general con: you can whitelist only 1 IP address, so you will always have a single point of failure. Even a NAT gateway can still fail.
The GCP static IP won't help you. What the other post suggests is to reserve an IP address so you can always re-use it. But that IP address won't automatically be added to a replacement node when one goes down; human intervention is needed. I don't think you can have one specific node keep a static IP address such that, if it goes down, the newly created node picks up the same IP. That service, to my knowledge, doesn't exist.
Now, GCP does offer a very resilient NAT gateway. It is managed by Google, so it shouldn't fail. Not cheap, though.

How to assign a single static source IP address for all pods of a service or deployment in kubernetes?

Consider a microservice X which is containerized and deployed in a kubernetes cluster. X communicates with a Payment Gateway PG. However, the payment gateway requires a static IP for services contacting it as it maintains a whitelist of IP addresses which are authorized to access the payment gateway. One way for X to contact PG is through a third party proxy server like QuotaGuard which will provide a static IP address to service X which can be whitelisted by the Payment Gateway.
However, is there an inbuilt mechanism in kubernetes which can enable a service deployed in a kube-cluster to obtain a static IP address?
There's no mechanism in Kubernetes for this yet.
Other possible solutions:
if the nodes of the cluster are in a private network behind a NAT, then just add your network's default gateway to the PG's whitelist.
if the whitelist can accept a CIDR apart from single IPs (like 86.34.0.0/24 for example), then add your cluster's network CIDR to the whitelist.
If every node of the cluster has a public IP and you can't add a CIDR to the whitelist, then it gets more complicated:
a naive way would be to add every node's IP to the whitelist, but it doesn't scale beyond tiny clusters of just a few nodes.
if you have administrative access to your network, then even though the nodes have public IPs, you can set up NAT for the network anyway that targets only packets with PG's IP as their destination.
if you don't have administrative access to the network, then another way is to allocate a machine with a static IP somewhere and make it act as a proxy using iptables NAT, similarly to the above. This introduces a single point of failure though. In order to make it highly available, you could deploy it on a Kubernetes cluster with a few (2-3) replicas (this can be the same cluster where X is running: see below). Instead of using their nodes' IPs to communicate with PG, the replicas would share a VIP using keepalived that would be added to PG's whitelist (you can have a look at easy-keepalived and either try to use it directly or learn from it how it does things). This requires high privileges on the cluster: you need to be able to grant the pods of your proxy the NET_ADMIN and NET_RAW capabilities in order for them to be able to add iptables rules and set up a VIP.
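As a sketch of the capabilities part only (the image name and the rest of the proxy logic are placeholders):

```yaml
# Proxy replicas that are allowed to manage iptables rules and a keepalived VIP.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: egress-vip-proxy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: egress-vip-proxy
  template:
    metadata:
      labels:
        app: egress-vip-proxy
    spec:
      containers:
      - name: proxy
        image: example/egress-vip-proxy:latest   # hypothetical image
        securityContext:
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]         # needed for iptables and the VIP
```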
update:
While waiting for builds and deployments during last few days, I've polished my old VIP-iptables scripts that I used to use as a replacement for external load-balancers on bare-metal clusters, so now they can be used as well to provide egress VIP as described in the last point of my original answer. You can give them a try: https://github.com/morgwai/kevip
There are two answers to this question. For the pod IP itself, it depends on your CNI plugin; some allow it with special pod annotations. However, most CNI plugins also involve NAT when talking to the internet, so the pod IP being static on the internal network is kind of moot; what you care about is the public IP the connection ends up coming from. So the second answer is "it depends on how your node networking and NAT is set up". This is usually up to the tool you used to deploy Kubernetes (or OpenShift in your case, I guess). With Kops it's pretty easy to tweak the VPC routing table.
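For example, with Calico the "special pod annotation" looks like this (the address is illustrative and must fall inside a configured IPPool managed by Calico IPAM):

```yaml
# Request a fixed pod IP from Calico instead of a random one from the pool.
apiVersion: v1
kind: Pod
metadata:
  name: static-ip-pod
  annotations:
    cni.projectcalico.org/ipAddrs: '["10.100.0.20"]'
spec:
  containers:
  - name: app
    image: nginx            # placeholder image
```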

Why is it bad to use Pod IPs to communicate within a Kubernetes cluster?

So I'm setting up a NATS cluster at work in OpenShift. I can easily get things to work by having each NATS server instance broadcast its Pod IP to the cluster. The guy I talked to at work strongly advised against using the Pod IP and suggested using the Pod name. In the email, he said something about if a pod restarted. But like I tried deleting the pod and the new Pod IP was in the list of connect urls for NATS and it worked fine. I know Kubernetes has DNS and you can use the headless service but it seems somewhat flaky to me. The Pod IP works.
I believe "the guy at work" has a point, to a certain extent, but it's hard to tell to which extent it's cargo-culting and what is half knowledge. The point being: the pod IPs are not stable, that is, every time a pod gets re-launched (on the same node or somewhere else, doesn't matter) it will get a new IP from the pod CIDR-range assigned.
Now, services provide stability by introducing a virtual IP (VIP): this acts as a cluster-internal mini-load balancer sitting in front of pods and yes, the recommended way to talk to pods, in the general case, is via services. Otherwise, you'd need to keep track of the pod IPs out-of-band, no bueno.
Bottom-line: if NATS manages that for you, keeps track and maps pod IPs then fine, use it, no harm done.
While the answer from Michael is mostly true, it is important to understand that there is no 100% guarantee that a Service's IP (aka ClusterIP) will not change. There is a specific case, service re-creation (delete/create), that will cause the service IP to change.
That said, the situation is somewhat different for services that have their own means of autodiscovery and/or clustering. For those it will usually not be fine, or not enough, to have a single regular Service; they need to connect to a seed, or discover all nodes, etc. One of the means you might use here is a headless Service, which returns, under a given name, the full list of direct pod IPs.
Mind that using a headless Service has its tiny quirks as well, i.e., not all software re-resolves DNS after initial startup, so you might end up with cached endpoints that become obsolete over time.
You might also want to leverage the StatefulSet capability of retaining a deterministic name (aka network identity) for each pod (i.e., mypod-0, mypod-1, etc.), which, combined with a headless Service, will give you static per-pod names to use.
I do think that using only pod IPs will probably lead to issues in one edge case or another, so you should at least use one of the above solutions for cluster discovery/registration. For actual communication during and after the pod has registered with the cluster, using pod IPs can actually be for the best.
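A minimal sketch of the headless Service plus StatefulSet combination mentioned above, using NATS as in the question (the image tag is an assumption); each pod gets a stable DNS name such as nats-0.nats.default.svc.cluster.local:

```yaml
# Headless Service: DNS returns the pod IPs directly instead of a ClusterIP.
apiVersion: v1
kind: Service
metadata:
  name: nats
spec:
  clusterIP: None
  selector:
    app: nats
  ports:
  - name: client
    port: 4222
---
# StatefulSet: pods get deterministic names (nats-0, nats-1, nats-2).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: nats
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
      - name: nats
        image: nats:2.9        # version is an assumption
        ports:
        - containerPort: 4222
```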

Kubernetes (on GKE) external connection through NAT for specific kube services?

I currently have two clusters on GKE - one in eu-west1-b and another in us-east1-b. The pods deployed to the nodes in these clusters need to make location-based requests (for latency testing purposes).
I also need to connect to my postgres instance on RDS, which uses IP-based whitelisting for external connections. The nodes in my clusters have ephemeral IPs so I can't use them.
I have done a lot of research, gone through lots of SO answers, docs, and tutorials, and come to the conclusion that routing traffic through a NAT is pretty much the best/only way to do this right now on GKE.
https://serverfault.com/questions/835425/kubernetes-external-connection-through-single-ip
Similar to the question above, I don't want to route all of my traffic through the NAT. My reason is that I need my requests to come from the internet gateway associated with the current node, so that they originate from a particular region.
The above question has some answers that almost get me there, but it doesn't include any kube-specific configuration. This is a great tutorial:
https://docs.tenable.com/pvs/deployment/Content/GoogleCloudInstructionsNatGateway.htm
But again, is not based on kube.
My thinking is that I need to define a service for postgres in my kube cluster, and then tell that to route to the external service through the NAT. Not entirely sure where to start and would appreciate help.
A solution:
1. Tag your instances in different zones/regions with different tags.
2. Create static IP addresses for each zone/region.
3. Create NAT exit nodes (GCE instances or instance groups) using the external addresses from above.
4. Create a route through each of the NAT exit nodes. Restrict each route with a destination IP range of your RDS ingress IP/32 and the network tags from Step 1 (so the instances use the correct gateway).