Joining an external Node to an existing Kubernetes Cluster - kubernetes

I have a custom Kubernetes Cluster (deployed using kubeadm) running on Virtual Machines from an IAAS Provider. The Kubernetes Nodes have no Internet facing IP Adresses (except for the Master Node, which I also use for Ingress).
I'm now trying to join a Machine to this Cluster that is not hosted by my main IAAS provider. I want to do this because I need specialized computing resources for my application that are not offered by the IAAS.
What is the best way to do this?
Here's what I've tried already:
Run the Cluster on Internet facing IP Adresses
I have no trouble joining the Node when I tell kube-apiserver on the Master Node to listen on 0.0.0.0 and use public IP Adresses for every Node. However, this approach is non-ideal from a security perspective and also leads to higher cost because public IP Adresses have to be leased for Nodes that normally don't need them.
Create a Tunnel to the Master Node using sshuttle
I've had moderate success by creating a tunnel from the external Machine to the Kubernetes Master Node using sshuttle, which is configured on my external Machine to route 10.0.0.0/8 through the tunnel. This works in principle, but it seems way too hacky and is also a bit unstable (sometimes the external machine can't get a route to the other nodes, I have yet to investigate this problem further).
Here are some ideas that could work, but I haven't tried yet because I don't favor these approaches:
Use a proper VPN
I could try to use a proper VPN tunnel to connect the Machine. I don't favor this solution because it would add a (admittedly quite small) overhead to the Cluster.
Use a cluster federation
It looks like kubefed was made specifically for this purpose. However, I think this is overkill in my case: I'm only trying to join a single external Machine to the Cluster. Using Kubefed would add a ton of overhead (Federation Control Plane on my Main Cluster + Single Host Kubernetes Deployment on the external machine).

I couldn't think about any better solution than a VPN here. Especially since you have only one isolated node, it should be relatively easy to make the handshake happen between this node and your master.
Routing the traffic from "internal" nodes to this isolated node is also trivial. Because all nodes already use the master as their default gateway, modifying the route table on the master is enough to forward the traffic from internal nodes to the isolated node through the tunnel.
You have to be careful with the configuration of your container network though. Depending on the solution you use to deploy it, you may have to assign a different subnet to the Docker bridge on the other side of the VPN.

Related

How to run a script when a node goes down on Kubernetes?

I have a question about kepping a Kubernetes cluster online as much as possible. Usually, the cluster would be behind some kind of cloud load-balancer which does health-checks and directs traffic to the available nodes.
Now, my hosting provider does not offer managed load balancers. Instead, they have a configurable so-called "failover" IP which can be detached and re-attached to another server by running a command in the command line. it's not a real failover IP in the traditional sense. More like a movable IP.
As a beginner to Kubernetes, I'm not sure how to go about this.
Basically, I'd need to run a script that checks if the cluster is still publically online on the IP. When it goes down, one of the nodes should run the script to detach and re-attach the failover IP to itself or one of the other nodes.
Extra complication: The moving of the failover IP takes around 40-60 seconds to take effect, so we should not run the script too often.
This also means that only 1 node is attached to the public IP and all traffic to the cluster will come in this way. There is no load balancer distributing traffic among the online nodes. Will Kubernetes send the request on its own to the other nodes internally? I imagine so?
The cluster consists of 3 identical servers with 1 master and 2 other workers. I'd setup load balancing in the cluster with Ingress.
The goal is to keep websites running on k8s up as much as possible, while working with the limited options our hosting company offers. The hosting company only offers dedicated bare-metal machines and this movable IP. They don't have managed load balancers like AWS or DigitalOcean have, so I need to find a solution for that. This moveable IP looks like an answer, but if you know a better way, then sure.
All 3 machines have a public IP and a private IP.
There is 1 extra public IP that can be moved to 1 of the 3 nodes (using this I want to achieve failover, unless you know a better way?).
Personally, I don't think I need a multi-master cluster. As I understand it, the master can go down for short periods of time, and during those periods the cluster is more vulnerable, but this is okay as long as we can timely fix the master. Only thing is, that we need to move this IP over to an online node, and I'm not sure how to trigger this.
Thanks

How to make cluster nodes private on Google Kubernetes Engine?

I noticed every node in a cluster has an external IP assigned to it. That seems to be the default behavior of Google Kubernetes Engine.
I thought the nodes in my cluster should be reachable from the local network only (through its virtual IPs), but I could even connect directly to a mongo server running on a pod from my home computer just by connecting to its hosting node (without using a LoadBalancer).
I tried to make Container Engine not to assign external IPs to newly created nodes by changing the cluster instance template settings (changing property "External IP" from "Ephemeral" to "None"). But after I did that GCE was not able to start any pods (Got "Does not have minimum availability" error). The new instances did not even show in the list of nodes in my cluster.
After switching back to the default instance template with external IP everything went fine again. So it seems for some reason Google Kubernetes Engine requires cluster nodes to be public.
Could you explain why is that and whether there is a way to prevent GKE exposing cluster nodes to the Internet? Should I set up a firewall? What rules should I use (since nodes are dynamically created)?
I think Google not allowing private nodes is kind of a security issue... Suppose someone discovers a security hole on a database management system. We'd feel much more comfortable to work on fixing that (applying patches, upgrading versions) if our database nodes are not exposed to the Internet.
GKE recently added a new feature allowing you to create private clusters, which are clusters where nodes do not have public IP addresses.
This is how GKE is designed and there is no way around it that I am aware of. There is no harm in running kubernetes nodes with public IPs, and if these are the IPs used for communication between nodes you can not avoid it.
As for your security concern, if you run that example DB on kubernetes, even if you go for public IP it would not be accessible, as this would be only on the internal pod-to-pod networking, not the nodes them selves.
As described in this article, you can use network tags to identify which GCE VMs or GKE clusters are subject to certain firewall rules and network routes.
For example, if you've created a firewall rule to allow traffic to port 27017, 27018, 27019, which are the default TCP ports used by MongoDB, give the desired instances a tag and then use that tag to apply the firewall rule that allows those ports access to those instances.
Also, it is possible to create GKE cluster with applying the GCE tags on all nodes in the new node pool, so the tags can be used in firewall rules to allow/deny desired/undesired traffic to the nodes. This is described in this article under --tags flag.
Kubernetes Master is running outside your network and it needs to access your nodes. This could the the reason for having public IPs.
When you create your cluster, there are some firewall rules created automatically. These are required by the cluster, and there's e.g. ingress from master and traffic between the cluster nodes.
Network 'default' in GCP has readymade firewall rules in place. These enable all SSH and RDP traffic from internet and enable pinging of your machines. These you can remove without affecting the cluster and your nodes are not visible anymore.

Can nodes on a Kubernetes cluster have dynamic IP's OIC Classic

I'm deploying K8s to the Oracle Cloud Infrastructure where while I can make sure that the public internet facing IP stays static even when the instances are restarted. But for some reason the private IP of the instances always changes. Which brings me to the question - can Kubernetes work with nodes who's IP changes after restarts?
This could be quite a noob question but I did try to read up online and I couldn't find a conclusive answer.
Yes, kubernetes can handle that case easily, and on OCI it works just fine. The individual worker nodes will (using the kubelet on that host) call to the master IP, which we would recommend using a load balancer to front to achieve a static IP and allow you to change, scale, and otherwise adjust your master kubernetes control plane nodes as you wish, without disrupting the workers.
You can get a pretty slick setup currently with the terraform tooling for kubernetes that is published here:
https://github.com/oracle/terraform-kubernetes-installer

OpenShift and hostnetwork=true

I have deployed two POD-s with hostnetwork set to true. When the POD-s are deployed on same OpenShfit node then everything works fine since they can discover each other using node IP.
When the POD-s are deployed on different OpenShift nodes then they cant discover each other, I get no route to host if I want to point one POD to another using node IP. How to fix this?
The uswitch/kiam (https://github.com/uswitch/kiam) service is a good example of a use case.
it has an agent process that runs on the hostnetwork of all worker nodes because it modifies a firewall rule to intercept API requests (from containers running on the host) to the AWS api.
it also has a server process that runs on the hostnetwork to access the AWS api since the AWS api is on a subnet that is only available to the host network.
finally... the agent talks to the server using GRPC which connects directly to one of the IP addresses that are returned when looking up the kiam-server.
so you have pods of the agent deployment running on the hostnetwork of node A trying to connect to kiam server running on the hostnetwork of node B.... which just does not work.
furthermore, this is a private service... it should not be available from outside the network.
If you want the two containers to be share the same physical machine and take advantage of loopback for quick communications, then you would be better off defining them together as a single Pod with two containers.
If the two containers are meant to float over a larger cluster and be more loosely coupled, then I'd recommend taking advantage of the Service construct within Kubernetes (under OpenShift) and using that for the appropriate discovery.
Services are documented at https://kubernetes.io/docs/concepts/services-networking/service/, and along with an internal DNS service (if implemented - common in Kubernetes 1.4 and later) they provide a means to let Kubernetes manage where things are, updating an internal DNS entry in the form of <servicename>.<namespace>.svc.cluster.local. So for example, if you set up a Pod with a service named "backend" in the default namespace, the other Pod could reference it as backend.default.svc.cluster.local. The Kubernetes documentation on the DNS portion of this is available at https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/
This also avoids the "hostnetwork=true" complication, and lets OpenShift (or specifically Kubernetes) manage the networking.
If you have to absolutely use hostnetwork, you should be creating router and then use those routers to have the communication between pods. You can create ha proxy based router in opeshift, reference here --https://docs.openshift.com/enterprise/3.0/install_config/install/deploy_router.html

Deterministic connection to cloud-internal IP of K8S service or its underlying endpoint?

I have a Kubernetes cluster (1.3.2) in the the GKE and I'd like to connect VMs and services from my google project which shares the same network as the cluster.
Is there a way for a VM that's internal to the subnet but not internal to the cluster itself to connect to the service without hitting the external IP?
I know there's a ton of things you can do to unambiguously determine the IP and port of services, such as the ENVs and DNS...but the clusterIP is not reachable outside of the cluster (obviously).
Is there something I'm missing? An important component to this is that this is meant to be a service "public" to the project, such that I don't know which VMs on the project will want to connect to the service (this could rule out loadBalancerSourceRanges). I understand the endpoint which the services actually wraps is the internal IP I can hit, but the only good way to get to that IP is though the Kube API or kubectl, both of which are not prod-ideal ways of hitting my service.
Check out my more thorough answer here, but the most common solution to this is to create bastion routes in your GCP project.
In the simplest form, you can create a single GCE Route to direct all traffic w/ dest_ip in your cluster's service IP range to land on one of your GKE nodes. If that SPOF scares you, you can create several routes pointing to different nodes, and traffic will round-robin between them.
If that management overhead isn't something you want to do going forward, you could write a simple controller in your GKE cluster to watch the Nodes API endpoint, and make sure that you have a live bastion route to at least N nodes at any given time.
GCP internal load balancing was just released as alpha, so in the future, kube-proxy on GCP could be implemented using that, which would eliminate the need for bastion routes to handle internal services.