Kubernetes cluster IP vs FQDN of a service

I'm curious what the benefit of using the ClusterIP in Kubernetes is.
I know that if one app needs to access another app's service, it can directly use the FQDN of that service, without the trouble of dealing with the ClusterIP.
But I still see the ClusterIP being used in lots of places.

In my experience the FQDN is used much more often, although there are plenty of examples that use the ClusterIP directly. Is there a benefit? I don't think so. I think this is more of a philosophical question, and you can do it whichever way you prefer.
Referring to the Connect a Front End to a Back End Using a Service page in the Kubernetes documentation will show you a good way to do this kind of thing.
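For context, here is a minimal sketch (with hypothetical names, backend and prod) showing why the two options are really the same thing under the hood: the FQDN is just a DNS name that resolves to the Service's ClusterIP.

```yaml
# Hypothetical backend Service; all names are illustrative.
apiVersion: v1
kind: Service
metadata:
  name: backend
  namespace: prod
spec:
  selector:
    app: backend
  ports:
    - port: 8080
      targetPort: 8080
# A ClusterIP is allocated automatically when this Service is created.
# Clients in any namespace can use the stable FQDN instead of that IP:
#   http://backend.prod.svc.cluster.local:8080
# (or just http://backend:8080 from within the "prod" namespace);
# the cluster DNS resolves the name to the Service's ClusterIP.
```

So using the FQDN does not avoid the ClusterIP; it simply spares you from hard-coding an IP that you cannot choose and that changes if the Service is recreated.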

Related

kubernetes point domain to a local service

I'm new to Kubernetes and am trying to point all requests for a domain to another local service.
Both applications are running in the same cluster, in different namespaces.
Example domains:
a.domain.com hosts the first app
b.domain.com hosts the second app
When I make a curl request from the first app to the second app (b.domain.com), it travels over the internet to the second app.
Usually what I could do is point b.domain.com to localhost in /etc/hosts.
What do we do in this case in Kubernetes?
I was looking into Network Policies, but I'm not sure if that is the correct approach.
Also, as I understand it, we could just call service-name.namespace:port from the first app, but I would like to keep the full URL.
Let me know if you need more details to help me solve this.
The way to do it is with the Kubernetes Gateway API. It is true that you could deploy your own implementation, since it is an open-source project, but there are already plenty of solutions built on it, and it would be much easier to learn how to use one of those instead.
For what you want, Istio would fit your needs. If your cluster is hosted in a cloud environment, you can take a look at Anthos, which is the managed version of Istio.
Finally, take a look at the blog post Welcome to the service mesh era: traffic management between services is one element of the service mesh paradigm, alongside others like monitoring, logging, etc.
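For illustration, here is a rough sketch of how that routing could look with Istio. The names are hypothetical (the second app exposed as Service app-b in namespace ns-b), and it assumes Istio sidecars are injected, so in-mesh requests to b.domain.com get rerouted to the local Service instead of leaving the cluster:

```yaml
# Hypothetical sketch: keep the full URL (b.domain.com) in the code of
# the first app, but resolve it in-cluster.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: b-domain-internal
spec:
  hosts:
    - b.domain.com                              # requests to this host...
  http:
    - route:
        - destination:
            host: app-b.ns-b.svc.cluster.local  # ...go straight to the local Service
            port:
              number: 80
```

This keeps the full URL in the application code while avoiding the round trip through the public internet.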

Having 1 outgoing IP for kubernetes egress traffic

Current set-up
Cluster specs: Managed Kubernetes on DigitalOcean
Goal
My pods access some websites, but I want the traffic to go through a proxy first.
Problem
The proxy I need to use only accepts one IP address on its allow-list.
My cluster uses multiple nodes with the node autoscaler, so I have several, changing IP addresses.
Solutions I am thinking about
Setting up a proxy (Squid? NGINX?) outside of the cluster (currently not working when I access an HTTPS website)
Istio could let me set up a gateway? (I have no knowledge of Istio)
Using GCP-managed Kubernetes and following the answers on Kubernetes cluster outgoing traffic IP. But all of our stack is on DigitalOcean, and the pricing is better there.
I am curious to know the best practice or easiest solution, or whether anyone has dealt with this use case before :)
Best
You could set up all your traffic to go through istio-egressgateway.
Then you could pin the istio-egressgateway so it is always deployed on the same node of the cluster, and whitelist that node's IP address.
Pros: super easy. BUT: if you are not already using Istio, setting it up just for this may be killing a mosquito with a bazooka.
Cons: you need to make sure the node's IP address doesn't change; otherwise the istio-egressgateway itself might not get deployed (if the new node lacks the required labels) and you will need to reconfigure everything for the new node (new IP address). Another con: if traffic goes up and there is an HPA, it will deploy more replicas of the gateway, and all of them will land on that same node. So if you are going to have lots of traffic, it may be a good idea to isolate one node just for this purpose.
Another option is, as you suggest, a proxy. I would recommend using Envoy directly. After all, Istio is going to use Envoy anyway, right? So just run the proxy itself in a pod and do the same thing as before: node affinity, so it always runs on the same node and therefore goes out with the same IP.
Pros: you are not installing an entire service mesh control plane for one tiny thing.
Cons: same as before, since you still have the issue of the node IP changing if something goes wrong; plus you will need to manage your own Deployment object, HPA, and Envoy configuration, instead of using Istio objects (like a Gateway and a VirtualService). A sketch of the node pinning is below.
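For reference, a rough sketch of the node pinning that both the istio-egressgateway and the plain-Envoy options rely on, assuming the chosen node has been labeled egress=true (the label, names, and image tag are hypothetical, and the actual Envoy configuration is omitted):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: egress-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: egress-proxy
  template:
    metadata:
      labels:
        app: egress-proxy
    spec:
      affinity:
        nodeAffinity:
          # Only schedule onto the node carrying the whitelisted IP.
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: egress
                    operator: In
                    values: ["true"]
      containers:
        - name: envoy
          image: envoyproxy/envoy:v1.27.0  # hypothetical version tag
          ports:
            - containerPort: 10000
```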
Finally, there is a third option: set up a NAT gateway outside the cluster and configure your traffic to go through it.
Pros: you won't have to configure any Kubernetes objects, so there is no need for node affinity, and therefore no risk of node overload or IP changes. Plus you can remove the external IP addresses from your cluster nodes, making the cluster more secure (unless you have other workloads that need to reach the internet directly). Also, a single machine configured as a NAT will probably be more resilient than a Kubernetes pod running on a node.
Cons: it may be a little more complicated to set up.
And there is the general con that you can whitelist only one IP address, so you will always have a single point of failure; even a NAT gateway can fail.
The GCP static IP won't help you. What the other post suggests is reserving an IP address so you can always reuse it, but that reserved IP is not automatically attached to whichever node replaces one that goes down; human intervention is needed. To my knowledge, there is no service that keeps one specific node on a static IP and, if that node goes down, has the newly created node pick up the same address.
Now, GCP does offer a very resilient managed NAT gateway. It is managed by Google, so it shouldn't fail. It's not cheap, though.

Why is it bad to use Pod IPs to communicate within a Kubernetes cluster?

So I'm setting up a NATS cluster at work on OpenShift. I can easily get things working by having each NATS server instance broadcast its pod IP to the cluster. The guy I talked to at work strongly advised against using the pod IP and suggested using the pod name instead. In his email he said something about what happens if a pod restarts, but when I tried deleting a pod, the new pod IP showed up in the list of connect URLs for NATS and it worked fine. I know Kubernetes has DNS and that you can use a headless service, but that seems somewhat flaky to me. The pod IP works.
I believe "the guy at work" has a point, to a certain extent, but it's hard to tell to which extent it's cargo-culting and what is half knowledge. The point being: the pod IPs are not stable, that is, every time a pod gets re-launched (on the same node or somewhere else, doesn't matter) it will get a new IP from the pod CIDR-range assigned.
Now, services provide stability by introducing a virtual IP (VIP): this acts as a cluster-internal mini-load balancer sitting in front of pods and yes, the recommended way to talk to pods, in the general case, is via services. Otherwise, you'd need to keep track of the pod IPs out-of-band, no bueno.
Bottom-line: if NATS manages that for you, keeps track and maps pod IPs then fine, use it, no harm done.
While the answer from Michael is mostly true, it is important to understand that there is no 100% guarantee that a Service IP (aka ClusterIP) will not change. In the specific case of service recreation (delete/create), the service IP will change.
That said, the situation is somewhat different for software that has its own means of autodiscovery and/or clustering. Usually a single regular service will not be fine or enough: the members need to connect to a seed, discover all nodes, etc. One of the tools you might use here is a headless service, which returns, under a given name, the full list of direct pod IPs.
Mind that using a headless service has its tiny quirks as well; for example, not all software re-resolves DNS after initial startup, so you might end up with cached endpoints that become obsolete over time.
You might also want to leverage the StatefulSet capability of retaining a deterministic name (aka network identity) for each pod (i.e. mypod-0, mypod-1, etc.), which, combined with a headless Service, gives you static per-pod names to use.
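A minimal sketch of that combination, with illustrative names and the standard NATS client port:

```yaml
# Headless Service: DNS returns the pod IPs directly instead of a VIP.
apiVersion: v1
kind: Service
metadata:
  name: nats
spec:
  clusterIP: None
  selector:
    app: nats
  ports:
    - port: 4222
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: nats      # gives each pod a stable per-pod DNS name
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
        - name: nats
          image: nats:2  # hypothetical image tag
          ports:
            - containerPort: 4222
# Each replica is now reachable under a stable name, e.g.
#   nats-0.nats.<namespace>.svc.cluster.local
# no matter how often its pod IP changes.
```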
I do think that relying only on pod IPs will probably lead to issues in one edge case or another, so you should at least use one of the above solutions for cluster discovery/registration. For the actual communication during and after pod registration in the cluster, using pod IPs can actually be for the best.

Kubernetes (on GKE) external connection through NAT for specific kube services?

I currently have two clusters on GKE - one in eu-west1-b and another in us-east1-b. The pods deployed to the nodes in these clusters need to make location-based requests (for latency testing purposes).
I also need to connect to my Postgres instance on RDS, which uses IP-based whitelisting for external connections. The nodes in my clusters have ephemeral IPs, so I can't use them.
I have done a lot of research, gone through lots of SO answers, docs, and tutorials, and come to the conclusion that routing traffic through a NAT is pretty much the best/only way to do this on GKE right now.
https://serverfault.com/questions/835425/kubernetes-external-connection-through-single-ip
Similar to the question above, I don't want to route all of my traffic through the NAT. The reason is that I need my requests to come from the internet gateway associated with the current node, so they originate from a particular region.
The question above has some answers that almost get me there, but none include any kube-specific configuration. This is a great tutorial:
https://docs.tenable.com/pvs/deployment/Content/GoogleCloudInstructionsNatGateway.htm
But again, it is not based on kube.
My thinking is that I need to define a service for Postgres in my kube cluster and then tell it to route to the external service through the NAT. I'm not entirely sure where to start and would appreciate help.
A solution:
Tag your instances in different zones/regions with different network tags
Create a static IP address for each zone/region
Create NAT exit nodes (GCE instances or instance groups) using the external addresses from above
Create a route through each of the NAT exit nodes. Restrict each route with a destination IP range of your RDS ingress IP/32 and the network tags from step 1 (so the instances use the correct gateway). A sketch of one such route is below.
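As a rough illustration of step 4, here is what one such route could look like, written as a Deployment Manager style YAML sketch; the names, zone, network tag, and RDS IP are all placeholders:

```yaml
# Hypothetical sketch of one per-region route (repeat per zone/region).
resources:
  - name: nat-route-eu-west1
    type: compute.v1.route
    properties:
      network: global/networks/default
      destRange: 203.0.113.10/32       # your RDS ingress IP only
      priority: 800                    # lower than the default route's 1000, so it wins
      tags:
        - eu-west1-nodes               # network tag from step 1
      nextHopInstance: zones/europe-west1-b/instances/nat-gateway-eu
```

This way only the traffic destined for RDS goes through the NAT, while everything else keeps using the node's own internet gateway, preserving the region-specific egress you need.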

Handling thousands of services in Kubernetes cluster

Some time ago I asked about handling thousands of services in a Kubernetes cluster:
Can Kubernetes handle thousands of services?
At that time Kubernetes used env vars for service discovery, and my question was oriented more toward that. Now that Kubernetes has DNS, it sounds like we no longer have the env var problem; however, the docs still say it won't perform well when handling thousands of services:
https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/services.md#shortcomings
I wanted to know whether the documentation is outdated or whether there are still issues scaling Kubernetes to thousands of services.
The shortcoming mentioned in the documentation has not changed, because Kubernetes still uses the same mechanism (iptables and a userspace proxy) for proxying traffic sent to a service IP to the pods backing the service.
However, I don't believe we actually know how bad it is. A team member briefly tried testing it early this year and didn't see any impact, but didn't do anything rigorous to verify. It's possible that it'll work fine at a couple thousand services. If you try it, we'd love to hear how it goes via IRC or email.