Routing internal traffic in Kubernetes?

We presently have a setup where applications in our Mesos/Marathon cluster need to reach services that may or may not reside in that same cluster. Ingress for external traffic is accomplished via an Amazon ELB sitting in front of a cluster of Traefik instances; Traefik then chooses the appropriate set of container instances to load-balance to by matching the incoming HTTP Host header against an essentially many-to-one mapping of configured host headers to container sets. Internal-to-internal traffic takes this same route, because the DNS record associated with a given service maps to that same ELB both inside and outside our Mesos/Marathon cluster. We also allow multiple DNS records to point at the same container set.
This setup works, but it causes seemingly unnecessary network traffic and load on our ELBs and our Traefik cluster. If the applications in the containers (or some other component) were able to determine on their own that a target service lives inside the same Mesos/Marathon cluster, they could instead call something internal to the cluster fronting that set of containers, or the specific container directly.
From what I understand of Kubernetes, it provides the concept of Services, which can act as a front for a set of pods, based on configuration that determines which pods a Service matches. However, I'm not entirely sure of the mechanism by which applications in a Kubernetes cluster can transparently direct network traffic to the Service IPs. I think some of this can be handled by having Envoy proxy traffic meant for, e.g., <application-name>.<cluster-name>.company.com to the Service name, but if we have a CNAME that maps to that DNS entry (say, <application-name>.company.com), I'm not entirely sure how we can avoid exiting the cluster.
Is there a good way to solve both cases? We are trying to avoid having our applications' logic understand which cluster they are sitting in, and would prefer a component outside the applications to perform the routing appropriately.
If I am fundamentally misunderstanding a particular component, I would gladly accept correction!

For service-to-service communication inside a cluster, you use the Service abstraction, which acts as a stable entry point that routes traffic to the right pods.
A Service endpoint is available only from inside the cluster, via its IP or an internal DNS name provided by the internal Kubernetes DNS server. So, for communication inside a cluster, you can use DNS names like <servicename>.<namespace>.svc.cluster.local.
What is more important, a Service has a static IP address.
So you can add that static IP as a hosts record to the pods inside the cluster to make sure they communicate with each other inside the cluster.
For that, you can use the hostAliases feature. Here is an example configuration:
apiVersion: v1
kind: Pod
metadata:
  name: hostaliases-pod
spec:
  restartPolicy: Never
  hostAliases:
  - ip: "10.0.1.23"
    hostnames:
    - "my.first.internal.service.example.com"
  - ip: "10.1.2.3"
    hostnames:
    - "my.second.internal.service.example.com"
  containers:
  - name: cat-hosts
    image: busybox
    command:
    - cat
    args:
    - "/etc/hosts"
So, if you use the internal Service IP in combination with the service's public FQDN, all traffic from your pod stays 100% inside the cluster, because the application will use the internal IP address.
Alternatively, you can use an upstream DNS server that contains the same aliases; the idea is the same.
With an upstream DNS server for the separate zone, resolution works the same way.
With newer versions of Kubernetes, which use CoreDNS to provide the DNS service and offer more features, this becomes a bit simpler.
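For example, with CoreDNS you can use the rewrite plugin to map the public FQDN directly to the in-cluster Service name, so pods resolve it to the ClusterIP and the traffic never leaves the cluster. A minimal sketch of a Corefile in the CoreDNS ConfigMap; the name my-app.company.com and the Service my-app in the default namespace are hypothetical stand-ins, and the exact Corefile layout varies by distribution:

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        # Rewrite the public FQDN to the in-cluster Service name (hypothetical names)
        rewrite name my-app.company.com my-app.default.svc.cluster.local
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        forward . /etc/resolv.conf
        cache 30
    }

With this in place, no per-pod hostAliases are needed; every pod in the cluster resolves my-app.company.com to the Service's ClusterIP.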

Related

OpenShift/OKD, what is the difference between deployment, service, route, ingress?

Could you please explain the use of each "Kind" in OpenShift in a few short sentences?
I understand that a deployment contains data about the image source, pod counts, limits, etc.
With a route we can determine the URL for each deployment, as we can with an Ingress, but what is the difference, and when should we use a route and when an ingress?
And what is the exact use of a service?
Thanks for your help in advance!
Your question cannot be answered simply in a few words or one-line answers; go through the links and explore more.
Deployment: It is used to manage and modify the state of Pods. A Pod is one or more running containers; a group of identical Pods is managed together as a ReplicaSet.
Service: A Service gives a set of Pods a single, stable IP address and DNS name. The Service provides accessibility and connects to the appropriate Pod automatically; the individual Pod addresses need not be directly known to clients.
Route: Similar to the Kubernetes Ingress resource, OpenShift's Route was developed with a few additional features, including the ability to split traffic between multiple backends.
Ingress: It offers routing rules for controlling external access to the services in a Kubernetes cluster.
Difference between route and ingress?
OpenShift uses HAProxy to get (HTTP) traffic into the cluster. Other Kubernetes distributions use the NGINX Ingress Controller or something similar. You can find more in this doc.
When to use route and when ingress: it depends on your requirements. Compare the features of Ingress and Route and choose according to your needs.
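To make the comparison concrete, here is a minimal sketch of the same hypothetical application exposed both ways; my-app and the hostname are stand-ins for your own names, and the Route example also shows the traffic splitting that a plain Ingress does not offer:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: my-app
spec:
  host: my-app.apps.example.com
  to:
    kind: Service
    name: my-app           # primary backend
    weight: 80
  alternateBackends:
  - kind: Service
    name: my-app-canary    # hypothetical second backend receiving ~20% of traffic
    weight: 20
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  rules:
  - host: my-app.apps.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app   # the Service to route to
            port:
              number: 8080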
Exact use of service:
Each Pod in a Kubernetes cluster has its own unique IP address. However, the IP addresses of the Pods in a Deployment change as Pods are replaced or rescheduled, so using Pod IP addresses directly is impractical. With a Service you always have a single, consistent IP address, even as the IP addresses of the member Pods change.
A Service also provides load balancing: clients call a single, dependable IP address, and the Service distributes their requests evenly across its Pods.
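As a minimal sketch (the names are hypothetical), a Service that gives a set of Pods labeled app=my-app one stable address and spreads requests across them:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app        # matches every Pod carrying this label
  ports:
  - port: 80           # the stable port clients call
    targetPort: 8080   # the containerPort on the Pods

Clients inside the cluster then just call my-app.<namespace>.svc.cluster.local:80, regardless of which Pods are currently alive.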

Expose pods in AKS to internet with existing setup

We have a request to expose certain pods in an AKS environment to the internet for 3rd party use.
Currently we have a private AKS cluster with a managed standard SKU load balancer in front, using the advanced Azure networking (basically Calico), where each Pod gets its own private IP from the VNet IP space. All private IPs currently route through a firewall via a user defined route in order to reach the internet, and vice versa. Traffic to and from on-prem routes over a VPN connection through the Azure Virtual WAN. I don't want to change any existing routing behavior unless 100% necessary.
My question is: how do you expose specific Pods of an existing private AKS cluster to the internet? The entire cluster does not need to be exposed. The issue I foresee is that the ephemeral Pods and ever-changing IPs make simple NATing in the firewalls a non-option. I've also thought about simply making a new AKS cluster with a public load balancer. The issue there, though, is security: traffic must still go through the firewalls, and it likely could with the existing user defined routes.
What is the recommended way to set up an architecture where certain Pods in AKS are accessible over the internet, while those Pods can still reach the other Pods over the private network? I want to avoid exposing all Pods to the internet.
There are a couple of options you can use to expose your application outside your network, using a Service:
NodePort: Exposes the Service on each Node’s IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You’ll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
LoadBalancer: Exposes the Service externally using a cloud provider’s load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
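As a sketch of the second option (the names are hypothetical), a LoadBalancer Service that AKS would front with a public IP on the managed load balancer; uncommenting the azure-load-balancer-internal annotation would instead keep the frontend on a private VNet IP:

apiVersion: v1
kind: Service
metadata:
  name: my-public-app
  # annotations:
  #   service.beta.kubernetes.io/azure-load-balancer-internal: "true"  # uncomment for a private frontend
spec:
  type: LoadBalancer
  selector:
    app: my-public-app   # only Pods with this label are exposed
  ports:
  - port: 80
    targetPort: 8080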
Also, there is another option: use an Ingress. IMO this is the best way to expose HTTP applications externally, because you can create rules by path and host, which gives you much more flexibility than Services. Ingress supports only HTTP/HTTPS; if you need TCP, use Services instead.
I'd recommend you take a look at these links to understand in depth how Services and Ingress work:
Kubernetes Services
Kubernetes Ingress
NGINX Ingress
AKS network concepts
Deploy the NGINX ingress controller and bind the ingress controller Service to a public load balancer. Define Ingress rules for the Kubernetes Services that you want to access from the internet. Note that the ingress controller provides the entry point to the services running inside Kubernetes.
Several years later, and I wanted to post an update.
We did successfully implement a scalable ingress option in our private AKS cluster using NGINX as the ingress. The basic flow was:
Public IP > NAT to frontend private IP of NGINX > NGINX path rules that point to your pod/service
Taking the URL of a microservice as an example, www.example.com/service1: the public DNS entry you create resolves www.example.com to the public IP that you NAT to the private IP of NGINX. Then the rules you create within NGINX take the specific /service1 path of the URL and use it to route to the specific service you pointed it at. It behaves much like URL switching in other load balancers; that is really all NGINX is doing for you. In NGINX syntax, this involves specifying a host name (URL) and an associated rule with a backend path and service name. The service name in this example is service1 and the path is /, because service1 sits just behind the root.
Something like this saves cost by using fewer public IPs. For example, you can use a subdomain to easily NAT traffic to a separate test environment: www.test.example.com and www.example.com can point to separate public IPs, which you can NAT to separate AKS clusters running NGINX. That way your NGINX rules can be identical, because each is only looking for /service1, assuming you've mirrored your test and prod environments.
There are many ways to do this, but a few recommendations from lessons learned:
use subdomains to break out multiple environments
standardize your NGINX private front-end IP across environments (make them all end in .100, as an example)
create a standard NGINX Ingress template where you really only need to modify the serviceName; your hostName should be static within an environment (see the sketch after this list)
have your devs include this and deploy their microservices with Helm, rather than relying on an infrastructure team to update NGINX services; the latter sort of defeats the DevOps mentality and the speed gains
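A sketch of such a template, assuming the NGINX ingress controller and hypothetical names; per microservice you would typically change only the service name and path:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: service1-ingress
spec:
  ingressClassName: nginx      # on older clusters this was the kubernetes.io/ingress.class annotation
  rules:
  - host: www.example.com      # static within an environment
    http:
      paths:
      - path: /service1        # the URL path that selects this microservice
        pathType: Prefix
        backend:
          service:
            name: service1     # typically the only field devs change
            port:
              number: 80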

How does Traefik / NGINX (ingress controllers) forward requests to two different services configured with the same port number?

Basically I have the following HDFS cluster setup using docker-compose:
Node 1 with IP 192.168.1.1 has the following services deployed:
Namenode1:9000
HMaster1:8300
ZooKeeper1:1291
Node 2 with IP 192.168.1.2 has the following services deployed:
Namenode2:9000
ZooKeeper2:1291
How does Traefik / NGINX (ingress controllers) forward requests to two different services configured with the same port number?
There are several great tutorials on how ingress and load balancing works in kubernetes, e.g. this one by Mark Betz. As a general rule, it helps to think in terms of services and workloads instead of specific nodes where your workloads are running on.
A workload deployed in Kubernetes (a so-called Pod) has its own internal IP address. That Pod can have one or more ports open, just on that Pod-owned IP address.
If you have several pods to distribute the load across, e.g. 5 web server processes or backend-logic replicas, it would be hard for a client (inside the cluster) to keep track of all those Pod IPs, which also change whenever a Pod is updated or simply restarted after a crash. This is why Kubernetes has the concept of Services. A Service provides a stable DNS name and IP which transparently "forwards" to one of the healthy Pods, so your client only needs to know the DNS name and doesn't have to track the specific Pod IPs.
If you now want to expose such a Service to the public, there are different ways. Either you set your Service to type: LoadBalancer, which sets up load balancer infrastructure at your cloud provider and routes traffic to the nodes and then to the Pods, or you already have an ingress controller in place and just define routing based on host names and paths. The ingress controller itself is such a load-balanced Service with an attached cloud load balancer, and it also has some Pods (running e.g. a Traefik or NGINX container) which then route your packets accordingly.
So coming back to your initial question: if you want to expose a service with several pods of the same kind, you would first create a Service resource that matches your Pods using a selector, and then create a single Ingress resource that provides a hostname/path and references this Service. The ingress controller picks up those Ingress resources and configures the Traefik or NGINX instance accordingly. The ingress controller doesn't really care about the host IPs and port numbers, because it acts on the internal Kubernetes ClusterIPs; you don't even need to (and shouldn't) expose such a Service directly when you have an ingress in place.
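As a minimal sketch of that idea, using hypothetical hostnames and Services named after your namenodes (and assuming the exposed endpoints speak HTTP): two Services can both listen on port 9000 without any conflict, because the Ingress routes on the Host header to distinct ClusterIPs:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hdfs-ingress
spec:
  ingressClassName: nginx
  rules:
  - host: namenode1.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: namenode1    # Service fronting the first namenode's Pods
            port:
              number: 9000     # the same port on both Services is fine
  - host: namenode2.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: namenode2    # Service fronting the second namenode's Pods
            port:
              number: 9000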
I hope this answers your question regarding exposing two workloads over an ingress controller. For details, check the Kubernetes docs on Ingresses. Based on the services you named (ZooKeeper, HDFS), load balancing and ingresses might not be what you need in this case. ZooKeeper instances should be internal in most cases and need to be addressed individually, so you might want to check out headless services for this use case. Also check the Kubernetes docs for a way to run ZooKeeper.
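A minimal sketch of such a headless Service, with hypothetical names; setting clusterIP: None makes the cluster DNS return the individual Pod IPs instead of a single virtual IP, so each ZooKeeper instance can be addressed individually:

apiVersion: v1
kind: Service
metadata:
  name: zookeeper-headless
spec:
  clusterIP: None        # headless: DNS returns Pod IPs, no load-balanced VIP
  selector:
    app: zookeeper
  ports:
  - name: client
    port: 2181           # default ZooKeeper client port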

What is the reason for creating a LoadBalancer when creating a cluster using kops?

I tried to create a k8s cluster on AWS using kops.
After creating the cluster with the default definition, I saw that a LoadBalancer had been created:
apiVersion: kops/v1alpha2
kind: Cluster
metadata:
  name: bungee.staging.k8s.local
spec:
  api:
    loadBalancer:
      type: Public
  ....
I am just wondering about the reason for creating the LoadBalancer along with the cluster.
Appreciated!
In the type of cluster that kops creates, the apiserver (referred to as api above; a component of the Kubernetes master, aka the control plane) may not have a static IP address. Also, kops can create an HA (replicated) control plane, which means there will be multiple IPs where the apiserver is available.
The apiserver functions as the central connection hub for all other Kubernetes components; for example, all the nodes connect to it, but the human operators also connect to it via kubectl. The Kubernetes configuration files do not support multiple IP addresses for the apiserver (to make use of the HA setup), and updating the configuration files every time the apiserver IP address(es) change would be difficult.
So the load balancer functions as a front for the apiserver(s), with a single static IP address (an anycast IP with AWS/GCP). This load balancer IP is specified in the configuration files of the Kubernetes components instead of the actual apiserver IP(s).
Actually, it is also possible to solve this problem by using a DNS name that resolves to the IP(s) of the apiserver(s), coupled with a mechanism that keeps this record updated. This solution can't react to changes of the underlying IP(s) as fast as a load balancer can, but it saves you a couple of bucks, is slightly less likely to fail, and creates less dependency on the cloud provider. It can be configured like so:
spec:
  api:
    dns: {}
See the specification for more details.

Multiple Host Kubernetes Ingress Controller

I've been studying Kubernetes for a few weeks now, and using the kube-lego NGINX examples (https://github.com/jetstack/kube-lego) I have successfully deployed services to a Kubernetes cluster using Rancher on DigitalOcean.
I've deployed sample static sites, Wordpress, Laravel, Craft CMS, etc., all of which use custom Namespaces, Deployments, Secrets, containers from external registries, Services, and Ingress definitions.
Using the example (lego) NGINX ingress controller setup, I'm able to point DNS at the exposed IP address of my K8s cluster and have the resulting sites appear.
What I don't know, though, is how to allow multiple hosts to run Ingress Controllers servicing the same deployments, and thus provide HA ingress to the cluster (by applying an external load balancer service, or geo-IP, or what have you).
Rancher (stable) allows me to add multiple hosts; I've spun up 3 to 5 at a time, and Kubernetes is configured and deployed across all hosts. Furthermore, I can define many replicas and/or deployments (as listed above) and they will be spread over the cluster and be accessible as expected. I've even specified multiple replicas of the Ingress Controller, but of course they all get scheduled on the same host, giving me only one ingress IP address.
So how do I allow multiple hosts (each with its own public-facing IP address) to provide ingress into the cluster? I've also read about setting up multiple Ingress Controllers, but then you must specify which deployments/services are serviced by which Ingress Controller, which totally defeats the purpose.
Maybe I'm missing something, but if K8s multi-host is supposed to provide HA, and the host with the Ingress Controller goes down, then the service will be rescheduled on the other hosts, but the IP address that everything points to will be dead, and thus an outage. Is there any way to have multiple IP addresses for the same set of deployments/services?
I investigated my setup a bit more today, and I think I found out why I was having difficulty. The LoadBalancer type is often mentioned as being for use with cloud providers (both in the docs and in what @fiunchinho describes). I was using it with a Rancher setup, which auto-creates an HAProxy load balancer ingress for you on the hosts.
By default, it will just be scheduled on one of the hosts. You can specify that you want it scheduled globally by providing an annotation of io.rancher.scheduler.global: "true".
Like so:
annotations:
  # Create load balancers on every host in the environment
  io.rancher.scheduler.global: "true"
http://rancher.com/docs/rancher/v1.6/en/rancher-services/load-balancer/
I preferred LoadBalancer over NodePort because I wanted the ability to send port 80 (and in the future port 443) to any of the nodes, and have them successfully fulfil my request by inspecting the Host header and directing as needed.
These LBs can also be set up in the Rancher UI under the "Infrastructure Stack" menu. I have successfully removed the single LB and re-added one with the "Always run one instance of this container on every host" option enabled.
After this was configured, I could make a request to any of the hosts for any of the Ingresses and get a response, no matter which host the container was scheduled on.
So cool!
The ingress controller is deployed like any regular pod. That means you can have as many replicas as you'd like, which will be spread among all your nodes.
You need a Service object that groups all the pods for the ingress controller.
Then you just need to expose that Service outside the cluster. You can do that using a LoadBalancer Service if you are on a cloud provider, or you can use just a NodePort Service.
The point is that the Service will balance the traffic that your ingress controller receives among all the pods running on different Kubernetes nodes. If one of the nodes goes down, it doesn't really matter, because there are other nodes containing ingress controller pods.
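A minimal sketch of that Service, with hypothetical names and assuming the controller Pods are labeled app=ingress-nginx; with type: NodePort every node answers on the same ports, so any node's IP can take over if another goes down:

apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
spec:
  type: NodePort           # or type: LoadBalancer on a cloud provider
  selector:
    app: ingress-nginx     # matches the ingress controller Pods
  ports:
  - name: http
    port: 80
    nodePort: 30080        # reachable on every node at <NodeIP>:30080
  - name: https
    port: 443
    nodePort: 30443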