Can two Kubernetes clusters share the same external etcd and work like master-slave?

We have a requirement to set up a geo-redundant cluster. I am looking at sharing an external etcd cluster between two Kubernetes clusters. It may sound absurd at first, but the requirements have come down to it. I am seeking some direction as to whether this is possible, and if not, what the challenges are.

Yes, it is possible: you can have a single etcd cluster with multiple Kubernetes clusters attached to it. The key to achieving this is the --etcd-prefix flag of the Kubernetes API server. That way each cluster uses a different root path for storing its resources and avoids conflicts with the other cluster in etcd. In addition, you should also set up appropriate RBAC rules and certificates for each Kubernetes cluster. You can find more detailed information in the following article: Multi-tenant external etcd for Kubernetes clusters.
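For illustration, a minimal sketch of the etcd-related API server flags for two clusters sharing one etcd; the etcd endpoint, certificate paths, and prefixes are placeholders, and all other kube-apiserver flags are omitted:

```
# Cluster A's API server (remaining kube-apiserver flags omitted)
kube-apiserver \
  --etcd-servers=https://etcd.example.internal:2379 \
  --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt \
  --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt \
  --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key \
  --etcd-prefix=/registry-cluster-a

# Cluster B's API server points at the same etcd, but under a different prefix
kube-apiserver \
  --etcd-servers=https://etcd.example.internal:2379 \
  --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt \
  --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client-b.crt \
  --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client-b.key \
  --etcd-prefix=/registry-cluster-b
```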
EDIT: Oh wait, I just noticed that you want those two clusters to behave as master-slave. In that case you could achieve it by assigning the slave cluster a read-only role in etcd and changing it to read-write when it has to become master. Theoretically it should work, but I have never tried it, and I think the better option is to use the built-in Kubernetes mechanisms for high availability, such as leader election.
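I have not tested this, but a rough sketch of how the read-only/read-write switch might be expressed with etcdctl, assuming etcd v3 with auth enabled; the role, user, and prefix names are made up:

```
export ETCDCTL_API=3

# Role that can only read the slave cluster's key prefix
etcdctl role add k8s-cluster-b-ro
etcdctl role grant-permission k8s-cluster-b-ro --prefix=true read /registry-cluster-b/

# Bind the slave cluster's apiserver user to the read-only role
etcdctl user add apiserver-cluster-b
etcdctl user grant-role apiserver-cluster-b k8s-cluster-b-ro

# On failover, widen the permission to read-write
etcdctl role grant-permission k8s-cluster-b-ro --prefix=true readwrite /registry-cluster-b/
```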

Related

Will the master know the data on workers/nodes in k8s

I am trying to deploy Kubernetes on the cloud, and there are two options: the masters are entrusted to the cloud provider, or maintained by myself.
So I wonder: if the masters are managed by the provider, will they leak the data on the workers?
In short, will the master know the data on the workers/nodes?
The abstractions in Kubernetes are very well defined with clear boundaries. You have to understand the concept of Volumes first. As defined here,
A Kubernetes volume is essentially a directory accessible to all containers running in a pod. In contrast to the container-local filesystem, the data in volumes is preserved across container restarts.
Volumes are attached to the containers in a pod, and there are several types of volumes.
You can see the layers of abstraction in the linked diagram (source).
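To make the volume concept concrete, here is a minimal sketch of a pod in which two containers share an emptyDir volume; all names and images are illustrative:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh", "-c", "echo hello > /data/greeting && sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: shared-data
      mountPath: /data
  volumes:
  - name: shared-data
    emptyDir: {}   # lives as long as the pod, survives container restarts
EOF
```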
Master to Cluster communication
There are two primary communication paths from the master (apiserver) to the cluster. The first is from the apiserver to the kubelet process which runs on each node in the cluster. The second is from the apiserver to any node, pod, or service through the apiserver’s proxy functionality.
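For example, the second path (the apiserver's proxy functionality) can be exercised directly with kubectl; the pod and service names below are hypothetical:

```
# Reach a pod through the apiserver proxy
kubectl get --raw "/api/v1/namespaces/default/pods/my-pod/proxy/"

# Reach a service the same way
kubectl get --raw "/api/v1/namespaces/default/services/my-service:80/proxy/"
```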
Also, you should check the cloud controller manager (CCM). The CCM concept (not to be confused with the binary) was originally created to allow cloud-specific vendor code and the Kubernetes core to evolve independently of one another. The cloud controller manager runs alongside other master components such as the Kubernetes controller manager, the API server, and the scheduler. It can also be started as a Kubernetes addon, in which case it runs on top of Kubernetes.
Hope this answers all your questions related to Master accessing the data on Workers.
If you are still looking for more secure ways, check 11 Ways (Not) to Get Hacked
Short answer: yes the control plane can access all of your data.
Longer and more realistic answer: probably don't worry about it. It is far more likely that any successful attack against the control plane would be just as successful as if you were running it yourself. The exact internal details of GKE/AKS/EKS are a bit fuzzy, but all three providers have a lot of experience running multi-tenant systems and it wouldn't be negligent to trust that they have enough protections in place against lateral escalations between tenants on the control plane.

Is it possible to join two separate kubernetes clusters?

I have deployments on one Kubernetes cluster that I might want to move to another Kubernetes cluster in the future. Is it possible to combine these two clusters, or must I redeploy everything? And if I must redeploy, what about StatefulSets?
The short answer is no.
You can connect the clusters with something like Kubernetes Federation, or, if you have Calico, you can use something like BGP peering.
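If you go the Calico route, a BGP peering towards the other cluster's network is declared roughly like this; the peer IP and AS number are placeholders:

```
cat <<'EOF' | calicoctl apply -f -
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: peer-to-other-cluster
spec:
  peerIP: 192.0.2.10   # a router reachable from the other cluster (placeholder)
  asNumber: 64512      # placeholder private AS number
EOF
```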
You'll have to redeploy everything, and in the case of StatefulSets it really depends on where you are storing your state. For example:
Is it MySQL? Back up your database and restore it in the new place (see the sketch after this list).
Is it Cassandra? Can you reattach the same physical volumes in the cloud provider? If not, then you'll have to transfer your data.
Is it etcd, Consul, or ZooKeeper? Can you back it up or attach the same physical volumes?
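For the MySQL case, the move can be as simple as piping a dump between the two clusters; the contexts, pod names, and credentials below are placeholders:

```
# Dump from the old cluster
kubectl --context old-cluster exec mysql-0 -- \
  sh -c 'exec mysqldump -uroot -p"$MYSQL_ROOT_PASSWORD" --all-databases' > dump.sql

# Restore into the new cluster
kubectl --context new-cluster exec -i mysql-0 -- \
  sh -c 'exec mysql -uroot -p"$MYSQL_ROOT_PASSWORD"' < dump.sql
```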

How to create a GCP Kubernetes Engine cluster spanning two regions?

I wish to know how to create a GCP Kubernetes Engine cluster spanning two regions. For instance, a cluster has some instances at "us-west1" region, and others at "us-central1" region.
My use case is to verify that the "failure-domain.beta.kubernetes.io/region" topology key is working as expected. I am aware of:
1. cluster federation: not supported yet for Kubernetes Engine
2. multi-cluster ingress: in development, but may not be something I am looking for
3. regional cluster: not applicable as it focuses on replication in only one region
I am aware that my use case is not atypical.
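For context, the kind of spec that exercises this topology key looks roughly like the following (deployment name and image are illustrative); it only spreads pods if nodes from more than one region are actually registered in the cluster:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: region-spread-demo
spec:
  replicas: 2
  selector:
    matchLabels:
      app: region-spread-demo
  template:
    metadata:
      labels:
        app: region-spread-demo
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: region-spread-demo
            topologyKey: failure-domain.beta.kubernetes.io/region
      containers:
      - name: app
        image: nginx
EOF
```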
It is possible, but I cannot say that it will be a stable and fully functional configuration.
There are no standard tools to do what you want, but you can manually connect external nodes from a different region to your cluster. It will not work with kubeadm, but if you set up the kubelet manually it will work, although with many limitations:
No auto-updates.
You should manage the connection between regions manually (you need a private network with direct routing between all your nodes).
You can have problems with logs, monitoring, load balancing, etc.
You will pay for traffic between internal and external nodes at external-traffic rates.
Finally, although it is possible, I cannot recommend it. If you really want a multi-region cluster, set it up yourself with kubeadm and use kubefed to create a federation.
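For reference, if you do go the manual route described above, registering an external node comes down to running the kubelet with a kubeconfig that points at your apiserver; the paths, node name, and kubeconfig contents below are placeholders, and a container runtime plus the cluster CA must already be in place:

```
# On the external node, after installing the kubelet binary
sudo kubelet \
  --kubeconfig=/etc/kubernetes/kubelet.conf \
  --config=/var/lib/kubelet/config.yaml \
  --hostname-override=external-node-1
```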

Use case for etcd inside Kubernetes

I was just wondering why it is useful to run an etcd cluster inside Kubernetes, when Kubernetes itself depends on etcd.
It just does not make sense to me: if I have an HA Kubernetes cluster, I am also forced to have HA etcd outside it, so there is no reason to install it again inside...
I have an external etcd cluster that backs my HA Kubernetes cluster, and I'm not letting any developer apps near it; I would be too concerned about something going wrong and breaking the Kubernetes cluster. It is also a fixed size of three members, which works well for the cluster and its requirements. If the developers need a key/value store for their applications and want etcd, running one inside the cluster would be a great way to provide it. Deployed as a StatefulSet, it is also scalable.
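If you do want an application-facing etcd inside the cluster, a rough sketch of such a StatefulSet is below; the image tag, storage size, and service name are assumptions, and TLS, probes, and resource limits are omitted for brevity:

```
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: etcd
spec:
  clusterIP: None          # headless: gives each pod a stable DNS name
  selector:
    app: etcd
  ports:
  - name: client
    port: 2379
  - name: peer
    port: 2380
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
spec:
  serviceName: etcd
  replicas: 3
  podManagementPolicy: Parallel   # start all members so they can find each other
  selector:
    matchLabels:
      app: etcd
  template:
    metadata:
      labels:
        app: etcd
    spec:
      containers:
      - name: etcd
        image: quay.io/coreos/etcd:v3.5.9   # assumed image/tag
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        command:
        - etcd
        - --name=$(POD_NAME)
        - --data-dir=/var/lib/etcd
        - --listen-client-urls=http://0.0.0.0:2379
        - --advertise-client-urls=http://$(POD_NAME).etcd:2379
        - --listen-peer-urls=http://0.0.0.0:2380
        - --initial-advertise-peer-urls=http://$(POD_NAME).etcd:2380
        - --initial-cluster=etcd-0=http://etcd-0.etcd:2380,etcd-1=http://etcd-1.etcd:2380,etcd-2=http://etcd-2.etcd:2380
        - --initial-cluster-state=new
        volumeMounts:
        - name: data
          mountPath: /var/lib/etcd
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
EOF
```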
If you're using Kubernetes via GKE, the underlying etcd cluster is not exposed in any way.

Does Kubernetes provision new VMs for pods on my cloud platform?

I'm currently learning about Kubernetes and still trying to figure it out. I get the general use of it, but I think there are still plenty of things I'm missing; here's one of them. If I want to run Kubernetes on my public cloud, like GCE or AWS, will Kubernetes spin up new VMs by itself in order to provide more compute for new pods that might be needed? Or will it only use a certain amount of VMs that were pre-configured as the compute pool? I heard Brendan say, in his talk at CoreOS Fest, that Kubernetes sees the VMs as a "sea of compute" and the user doesn't have to worry about which VM is running which pod. I'm interested to know where that pool of compute comes from: is it configured when setting up Kubernetes, or will it scale by itself and create new machines as needed?
I hope I managed to be coherent.
Thanks!
Kubernetes supports scaling, but not auto-scaling. The addition and removal of pods (not VMs) in a Kubernetes cluster is performed by replication controllers. The size of a replication controller can be changed by updating the replicas field. This can be done in a couple of ways:
Using kubectl, you can use the scale command.
Using the Kubernetes API, you can update your config with a new value in the replicas field.
Kubernetes has been designed for auto-scaling to be handled by an external auto-scaler. This is discussed in responsibilities of the replication controller in the Kubernetes docs.
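Both approaches mentioned above are one-liners; the controller name is a placeholder:

```
# Scale via kubectl
kubectl scale rc my-rc --replicas=5

# Or update the replicas field through the API
kubectl patch rc my-rc -p '{"spec": {"replicas": 5}}'
```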