Kubernetes over the internet

My VPS provider doesn't offer a private network between VPSes, so the master and nodes are interconnected over the internet. Is this a safe enough practice? Or would it be better to move to AWS?

While the other answer states it is not safe, I strongly disagree.
1: It is perfectly fine to expose the master on the public internet, as you would any other server. It is protected by design with authentication and encryption. Obviously, regular security hardening should be in place, but that is the case for any internet-facing system. Your masters also run components like the scheduler and controller-manager locally, so those are not really an issue.
2: The traffic between pods in a usual Kubernetes setup passes over an overlay network such as Flannel, Calico, or Weave. Speaking from experience, some of them (Weave Net, in my case) explicitly support traffic encryption to make it safer for the overlay to communicate over a public network.
3: The claim that any pods that open ports are public by default is fundamentally wrong. Each pod has its own network namespace, so even if it listens on 0.0.0.0 to capture any traffic, this happens only within that local namespace and is by no means exposed externally, until you configure a Kubernetes Service of NodePort or LoadBalancer type to explicitly expose this service (and its backing pods' ports) to the internet. And you can control this even further by means of NetworkPolicies.
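As a sketch of that last point, a default-deny NetworkPolicy (namespace name is illustrative) locks down all inbound pod traffic in a namespace until other policies explicitly allow it:

```yaml
# Hypothetical default-deny policy: selects every pod in the namespace
# and lists no ingress rules, so all ingress traffic is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: my-app        # illustrative namespace
spec:
  podSelector: {}          # empty selector = all pods in the namespace
  policyTypes:
    - Ingress              # no ingress rules follow, so nothing is allowed in
```

Note that enforcement requires a CNI plugin that implements NetworkPolicy (Calico and Weave Net do; plain Flannel does not).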
So yes, you can run kubernetes cluster over public network in a way that is safe.

Related

Understanding Kubernetes networking and public/private addressing

I'm trying to set up a Kubernetes cluster for production (with kubeadm) on a private cloud (IONOS Cloud). I've been trying to figure it all out and configure it as best as possible for a few weeks now. Still, there is one thing I don't quite understand, or whether it is even possible.
The initial cluster would be 5 nodes, 3 masters and 2 workers. The master servers with a load balancer and the workers with another load balancer. Each server has 2 interfaces, one for the public network and one for the private network (192.168.3.0/24). Each server has a firewall that by default blocks all packets.
In order not to have to create firewall rules on each server, so that they can see each other over the public network, I would like any kind of communication between Kubernetes nodes to be over the private network.
I can think of other reasons to use the private network for inter-node communication, such as: speed, latency, security...
However, I have not been able to configure it and I don't know if it is really possible to create this scenario. The problems I encounter are the following:
In order to access the cluster's API (e.g. with kubectl) from outside, I need to expose the control-plane endpoint on the public IP of the load balancer. If I do this, then the etcd database endpoints are exposed on the public network. The other nodes, in the process of joining the cluster (kubeadm join), need to fetch some information from etcd; therefore, they necessarily need visibility to each other over the public network.
Once all the nodes have joined, the "kubernetes" service has the endpoints of all the control-plane nodes (public). When I try to deploy Calico (or another CNI), it never finishes deploying, because it queries the kubernetes service, and if the nodes don't have visibility to each other, it fails.
It seems that whatever I do, if I publish the API on the public network, I need all the nodes to see each other over the public network.
Maybe I'm making my life too complicated, and the simplest thing to do is to open a firewall rule for each node, but I don't know if this is a good practice.
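For reference, a kubeadm configuration along these lines (all IPs illustrative) would advertise the API server on the private interface and use the private load balancer as the control-plane endpoint, while adding the public load balancer address to the certificate SANs so external kubectl access still works:

```yaml
# kubeadm-config.yaml -- a sketch, assuming a private LB at 192.168.3.100
# and a public LB at 203.0.113.10 (both addresses are made up).
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.3.11    # this master's private IP (illustrative)
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "192.168.3.100:6443"   # private load balancer
apiServer:
  certSANs:
    - "203.0.113.10"    # public LB IP, so external kubectl clients get a valid cert
```

With this, nodes join over the private network, and only the API port needs to be reachable through the public load balancer.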
Architecture diagram (I still cannot embed images)

How to assign a public ip to a pod on dedicated servers

I have applications that need each pod to have its own public IP, with ports exposed on that IP.
I am trying not to use virtual machines.
MetalLB has a similar feature, but it binds an address to a Service, not a pod. And it wastes a lot of bandwidth.
Technically this is up to your CNI plugin, but very few support it. Pods generally live in the internal cluster network and are exposed externally through either NodePort or LoadBalancer Services, for example using MetalLB. Why do you think this "wastes bandwidth"? If you're concerned about internal rerouting, you may want to enable externalTrafficPolicy: Local to reduce internal bounces, but your internal network probably has far more bandwidth available than your internet connection, so that's not usually a reason to worry.
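For illustration (the names are made up), externalTrafficPolicy is set on the Service itself:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app          # illustrative
spec:
  type: LoadBalancer    # e.g. handled by MetalLB on bare metal
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  # Only deliver external traffic to pods on the node that received it,
  # avoiding the extra intra-cluster hop (and preserving the client source IP).
  externalTrafficPolicy: Local
```

The trade-off is that a node without a matching pod will fail the load balancer's health check for this Service, so traffic only reaches nodes actually running the workload.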

Q: Efficient Kubernetes load balancing

I've been looking into Kubernetes networking, more specifically, how to serve HTTPS users most efficiently.
I was watching this talk: https://www.youtube.com/watch?v=0Omvgd7Hg1I and from 22:18 the speaker explains the problem with a load balancer that is not pod-aware. The way Kubernetes solves this is by letting the nodes also act as a 'router', passing the request on to another node (explained at 22:46). This does not seem very efficient, but looking around, SoundCloud (https://developers.soundcloud.com/blog/how-soundcloud-uses-haproxy-with-kubernetes-for-user-facing-traffic) actually seems to do something similar, using NodePorts. They say the overhead costs less than building a better load balancer.
From what I have read an option might be using an ingress controller. Making sure that there is not more than one ingress controller per node, and routing the traffic to the specific nodes that have an ingress controller. That way there will not be any traffic re-routing needed. However, this does add another layer of routing.
This information is all from 2017, so my question is: is there any pod aware load balancer out there, or is there some other method that does not involve sending the http request and response over the network twice?
Thank you in advance,
Hendrik
EDIT:
A bit more information about my use case:
There is a bare-metal setup with Kubernetes. The firewall load-balances the incoming traffic between two HAProxy instances. These HAProxy instances do SSL termination and forward the traffic to a few sites. This includes an Exchange setup, a few internal IIS sites, and an nginx server for a static web app. The idea is to move the app servers into Kubernetes.
Now my main problem is how to get the requests from HAProxy into kubernetes. I see a few options:
Use the SoundCloud setup. The infrastructure could stay almost the same, the HAProxy server can still operate the way they do now.
I could use an ingress controller on EACH node in the Kubernetes cluster and have the firewall load-balance between the nodes. I believe it is possible to forward traffic from the ingress controller to servers outside the cluster, e.g. Exchange.
Some magic load balancer that I do not know about that is pod aware and able to operate outside of the kubernetes cluster.
Options 1 and 2 are relatively simple and quite close in how they work, but they do come with a performance penalty. This happens when the node the firewall forwards the request to does not have the required pod running, or when a pod on another node is less loaded: the request gets forwarded to another node, thus crossing the network twice.
Is this just the price you pay when using Kubernetes, or is there something that I am missing?
How traffic reaches pods depends on whether a managed cluster is used.
Almost all cloud providers can forward traffic in a cloud-native way in their managed K8s clusters. First, you create a managed cluster with some special network settings (e.g. a VPC-native cluster in GKE). Then the only thing you need to do is create a LoadBalancer-type Service to expose your workload. You can also create Ingresses for your L7 workloads; they will be handled by the provider's ingress controllers (e.g. the ALB on AWS).
In an on-premise cluster without any cloud provider (OpenStack or vSphere), the only way to expose workloads is a NodePort-type Service. That doesn't mean you can't improve on it.
If your cluster sits behind reverse proxies (the SoundCloud case), setting externalTrafficPolicy: Local on Services stops traffic from being forwarded between worker nodes. When traffic arrives through a NodePort, it is forwarded to local Pods, or dropped if the Pods reside on other nodes. The reverse proxy will mark those NodePorts as unhealthy in its backend health checks and stop forwarding traffic to them. Another choice is topology-aware service routing: local Pods get priority, but traffic is still forwarded between nodes when no local Pods match.
For an ingress controller in on-prem clusters, it is a little different. You may have some worker nodes that have an EIP or public IP. To expose HTTP(S) services, an ingress controller is usually deployed on those worker nodes through a DaemonSet with HostNetwork, so that clients access the ingress controller via the well-known ports and the EIPs of those nodes. These worker nodes typically don't accept other workloads (e.g. infra nodes in OpenShift), so one more hop on the Pod network is needed. You could instead deploy the ingress controller on all worker nodes alongside the other workloads, so traffic can be forwarded to a closer Pod, provided the ingress controller supports topology-aware service routing.
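A sketch of that DaemonSet-with-HostNetwork pattern (image, labels, and namespace are illustrative): the controller binds ports 80/443 directly on edge nodes that hold the public IPs:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ingress-controller
  namespace: ingress            # illustrative
spec:
  selector:
    matchLabels:
      app: ingress-controller
  template:
    metadata:
      labels:
        app: ingress-controller
    spec:
      hostNetwork: true         # bind :80/:443 directly on the node's interfaces
      nodeSelector:
        node-role/edge: ""      # illustrative label for the nodes holding an EIP
      containers:
        - name: controller
          image: registry.k8s.io/ingress-nginx/controller:v1.9.0   # illustrative tag
          ports:
            - containerPort: 80
            - containerPort: 443
```

Clients then hit the edge nodes' public IPs on the standard ports, and the controller forwards one hop over the Pod network to the backends.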
Hope it helps!

Best way to go between private on-premises network and kubernetes

I have set up an on-premises Kubernetes cluster, and I want to ensure that my services that are not in Kubernetes, but exist on a separate class B network, are able to consume those services that have migrated to Kubernetes. There are a number of ways of doing this by all accounts, and I'm looking for the simplest one.
Ingress + controller seems to be the one favoured - and it's interesting because of the virtual hosts and HAProxy implementation. But where I'm getting confused is how to set up the Kubernetes service:
We don't have a great deal of choice: ClusterIP won't be sufficient to expose it to the outside, nor will NodePort on its own. LoadBalancer seems to be a simpler, cut-down way of switching between network zones, but although there are on-prem implementations (MetalLB), it seems far more geared towards cloud solutions.
But if I stick with NodePort, then my entry into the network is going to be on a non-standard port number, and I would prefer it to be on a standard port; particularly if I'm running a percentage of traffic for that service outside Kubernetes and the rest inside (for testing purposes, I'd like to monitor the traffic over a period of time before I bite the bullet and move 100% of traffic for the given microservice to Kubernetes). In that case it would be better if those services were available on the same port (almost always 80, because they're standard REST microservices). More than that, if I have to re-create the Service for whatever reason, I'm pretty sure the port will change, and then no traffic will be able to enter the Kubernetes cluster, which is a frightening proposition.
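(As an aside on that last worry: a NodePort can be pinned explicitly in the Service spec, so recreating the Service need not change it. A sketch with made-up names and ports:)

```yaml
apiVersion: v1
kind: Service
metadata:
  name: orders            # illustrative
spec:
  type: NodePort
  selector:
    app: orders
  ports:
    - port: 80            # cluster-internal Service port
      targetPort: 8080    # container port
      nodePort: 30080     # pinned explicitly; must fall in the node-port
                          # range (default 30000-32767)
```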
What are the suggested ways of handling communication between existing on-prem and Kubernetes cluster (also on prem, different IP/subnet)?
Is there anyway to get traffic coming in without changing the network parameters (class B's the respective networks are on), and not being forced to use NodePort?
The NodePort Service type may be fine for staging or dev environments, but I recommend you go with a LoadBalancer-type Service plus an ingress controller (the Nginx ingress controller is one). The advantages over the other Service types are:
You can use a standard port (rather than a random NodePort generated by Kubernetes).
Your service is load-balanced (load balancing is taken care of by the ingress controller).
A fixed port (it will not change unless you modify something in the Ingress object).
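For instance, a minimal Ingress (hostname, class, and service name are illustrative) keeps the microservice reachable on standard port 80 through the controller, independent of whatever port the backing Service uses:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orders-ingress            # illustrative
spec:
  ingressClassName: nginx         # assumes an nginx ingress controller is installed
  rules:
    - host: orders.internal.example.com   # made-up internal hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: orders      # assumed existing ClusterIP Service
                port:
                  number: 80
```

The on-prem consumers then only need to resolve the hostname to the ingress controller's address; recreating the backing Service doesn't affect them.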

Joining an external Node to an existing Kubernetes Cluster

I have a custom Kubernetes Cluster (deployed using kubeadm) running on Virtual Machines from an IaaS provider. The Kubernetes Nodes have no internet-facing IP addresses (except for the Master Node, which I also use for Ingress).
I'm now trying to join a Machine to this Cluster that is not hosted by my main IAAS provider. I want to do this because I need specialized computing resources for my application that are not offered by the IAAS.
What is the best way to do this?
Here's what I've tried already:
Run the Cluster on internet-facing IP addresses
I have no trouble joining the Node when I tell kube-apiserver on the Master Node to listen on 0.0.0.0 and use public IP addresses for every Node. However, this approach is non-ideal from a security perspective and also leads to higher cost, because public IP addresses have to be leased for Nodes that normally don't need them.
Create a Tunnel to the Master Node using sshuttle
I've had moderate success by creating a tunnel from the external Machine to the Kubernetes Master Node using sshuttle, which is configured on my external Machine to route 10.0.0.0/8 through the tunnel. This works in principle, but it seems way too hacky and is also a bit unstable (sometimes the external machine can't get a route to the other nodes, I have yet to investigate this problem further).
Here are some ideas that could work, but I haven't tried yet because I don't favor these approaches:
Use a proper VPN
I could try to use a proper VPN tunnel to connect the Machine. I don't favor this solution because it would add an (admittedly quite small) overhead to the Cluster.
Use a cluster federation
It looks like kubefed was made specifically for this purpose. However, I think this is overkill in my case: I'm only trying to join a single external Machine to the Cluster. Using Kubefed would add a ton of overhead (Federation Control Plane on my Main Cluster + Single Host Kubernetes Deployment on the external machine).
I can't think of a better solution than a VPN here. Especially since you have only one isolated node, it should be relatively easy to establish the handshake between this node and your master.
Routing the traffic from "internal" nodes to this isolated node is also trivial. Because all nodes already use the master as their default gateway, modifying the route table on the master is enough to forward traffic from the internal nodes to the isolated node through the tunnel.
You have to be careful with the configuration of your container network though. Depending on the solution you use to deploy it, you may have to assign a different subnet to the Docker bridge on the other side of the VPN.