CockroachDB Clustered Across Google Container Engine Clusters, Stateful Sets - kubernetes

CockroachDB has a relatively simple clustering mechanism, you initialize the DB with a command line option pointing at the host name of the other cockroach machines (but, this question is relevant really for any peer to peer clustered db).
One of the benefits of Cockroach is you can cluster across regions within a continent. Cockroach themselves published a good k8s config to standup a cockroach cluster on stateful sets. See this config.
I'm trying to find a way to span the cockroach cluster across two GKE clusters in different regions. DNS and connectivity between the regions isn't really an issue, but I can't figure out how to address the stateful set instances. Internal to the cluster, they're cockroachdb-1.cockroach. Is there any way to allow these to be cross cluster addressable? One option would be to expose as a nodeport and point instances from the second cluster to machines with ports in the first cluster. That seems hacky and if the machine goes down represents a single point of failure. Any other ideas about how to do this? I also explored k8s federation, but I don't think it really addresses this issue either (though I could be wrong).
One final option would be exposing each instance through a load balancer...I don't really like that, but maybe it's the only way?

This is a good question that I've been meaning to play around with soon. You've been checking into a reasonable set of ideas. The core problem, as you allude to, is that every cockroach process needs to be able to individually address every other cockroach process.
I don't know how well cluster federation has developed over the last 12-18 months, but it seems like that's where this really should be solved.
Barring great developments in cluster federation, the "easiest" way that comes to mind is to use host networking for all the cockroachdb pods. You can specify a few known machine IPs as the join addresses for new pods to connect to, and then they'll all be able to talk to each other. I've made this work with StatefulSets before (by setting dnsPolicy: ClusterFirstWithHostNet along with hostNetwork: true), but I'm not sure it's a well-supported use case. You'd probably be better off using a DaemonSet (with a label selector to only run on certain nodes if you don't want it on them all). Something like this: https://gist.github.com/a-robinson/ec2b86783ccbf053c83ba83170673d63
If that doesn't tickle your fancy, then creating a service for each StatefulSet instance unfortunately is probably the next best bet. As of a fairly recent change in Kubernetes, a separate label will be created for each pod, which should make this easier than it used to be: https://github.com/kubernetes/kubernetes/pull/55329
I'd love to see other suggestions for this, though, since it's all kind of manual or infrastructure-specific.

Related

Off-Loading of k8s deployments to different cluster in case of high loads

Since I am unable to find anything on google or the official docs, I have a question.
I have a local minikube cluster with deployment, service and ingress, which is working fine. Now when the load on my local cluster becomes too high I want to automatically switch to a remote cluster.
Is this possible?
How would I achieve this?
Thank you in advance
EDIT:
A remote cluster in my case would be a rancher Kubernetes cluster, but as long as the resources on my local one are sufficient I want to stay there.
So lets say my local cluster has enough resources to run two replicas of my application, but when a third one is needed to distribute the load, it should be deployed to the remote rancher cluster. (I hope that is clearer now)
I imagine it would be doable with kubefed (https://github.com/kubernetes-sigs/kubefed) when using the ReplicaSchedulingPreferences (https://github.com/kubernetes-sigs/kubefed/blob/master/docs/userguide.md#replicaschedulingpreference) and just weighting the local cluster very high and the remote one very low and then setting spec.rebalance to true to distribute it in case of high loads, but that approach seems a bit like a workaround.
Your idea of using Kubefed sounds good but there is an another option: Multicluster-Scheduler.
Multicluster-scheduler is a system of Kubernetes controllers that
intelligently schedules workloads across clusters. It is simple to use
and simple to integrate with other tools.
To be able to make a better choice for your use case you can read through the Comparison with Kubefed (Federation v2).
All the necessary info can be found in the provided GitHub thread.
Please let me know if that helped.

Will the master know the data on workers/nodes in k8s

I try to deploy a set of k8s on the cloud, there are two options:the masters are in trust to the cloud provider or maintained by myself.
so i wonder about that if the masters in trust will leak the data on workers?
Shortly, will the master know the data on workers/nodes?
The abstractions in Kubernetes are very well defined with clear boundaries. You have to understand the concept of Volumes first. As defined here,
A Kubernetes volume is essentially a directory accessible to all
containers running in a pod. In contrast to the container-local
filesystem, the data in volumes is preserved across container
restarts.
Volumes are attached to the containers in a pod and There are several types of volumes
You can see the layers of abstraction source
Master to Cluster communication
There are two primary communication paths from the master (apiserver) to the cluster. The first is from the apiserver to the kubelet process which runs on each node in the cluster. The second is from the apiserver to any node, pod, or service through the apiserver’s proxy functionality.
Also, you should check the CCM - The cloud controller manager (CCM) concept (not to be confused with the binary) was originally created to allow cloud specific vendor code and the Kubernetes core to evolve independent of one another. The cloud controller manager runs alongside other master components such as the Kubernetes controller manager, the API server, and scheduler. It can also be started as a Kubernetes addon, in which case it runs on top of Kubernetes.
Hope this answers all your questions related to Master accessing the data on Workers.
If you are still looking for more secure ways, check 11 Ways (Not) to Get Hacked
Short answer: yes the control plane can access all of your data.
Longer and more realistic answer: probably don't worry about it. It is far more likely that any successful attack against the control plane would be just as successful as if you were running it yourself. The exact internal details of GKE/AKS/EKS are a bit fuzzy, but all three providers have a lot of experience running multi-tenant systems and it wouldn't be negligent to trust that they have enough protections in place against lateral escalations between tenants on the control plane.

Why is it bad to use Pod IPs to communicate within a Kubernetes cluster?

So I'm setting up a NATS cluster at work in OpenShift. I can easily get things to work by having each NATS server instance broadcast its Pod IP to the cluster. The guy I talked to at work strongly advised against using the Pod IP and suggested using the Pod name. In the email, he said something about if a pod restarted. But like I tried deleting the pod and the new Pod IP was in the list of connect urls for NATS and it worked fine. I know Kubernetes has DNS and you can use the headless service but it seems somewhat flaky to me. The Pod IP works.
I believe "the guy at work" has a point, to a certain extent, but it's hard to tell to which extent it's cargo-culting and what is half knowledge. The point being: the pod IPs are not stable, that is, every time a pod gets re-launched (on the same node or somewhere else, doesn't matter) it will get a new IP from the pod CIDR-range assigned.
Now, services provide stability by introducing a virtual IP (VIP): this acts as a cluster-internal mini-load balancer sitting in front of pods and yes, the recommended way to talk to pods, in the general case, is via services. Otherwise, you'd need to keep track of the pod IPs out-of-band, no bueno.
Bottom-line: if NATS manages that for you, keeps track and maps pod IPs then fine, use it, no harm done.
While the answer from Michael is mostly true, it is important to understand there is no 100% guarantee that a service IP (aka ClusterIP) service will not change it's IP. There is a specific case of service recreation (delete/create) that will cause service IP change.
That said, the situation is somewhat different for services that have their own means of autodiscovery and/or clustering. Usually it will not be fine or enough to have a single regular service. They need to connect to seed, or discover all nodes etc. One of the means that you might use here are headless services, which return, under given name a full list of all, direct pod IPs.
Mind that using headles service has its tiny quirks as well, ie. not all software re-resolves DNS over time after initial startup, so you might end up with cached endpoints that become obsolete over time.
You might also want to leverage StatefulSets capability to retain a deterministic name (aka network identity) for each pod (ie. mypod-1, mypod-2 etc.) which, combined with headless Service, will give you static per pod names to use.
I do think that using only pod IPs will probably lead to some issues at one edge case or another, so you should at least use one of the above solutions for cluster discovery/registration. For actual communication during and after the pod was registered in the cluster, use of pod IPs can actually be for the best.

Kubernetes best practices in pods

As I have been using kubernetes more I keep on seeing the reference that a pod can contain 1 container or more and I have even looked at examples.
My question is whether there is a case where this would be best practice and more efficient to create multi container pods since you can scale and replicate your pods coupling it with a service.
Thanks in advance
A Pod can contain multiple containers, but for the most portion of the situations, it makes perfect sense for the Pod to be simply an abstraction over a single running container.
In what situations does it make sense to have a multi-container deployed Pod?
What comes to my mind are the scenarios where you have a primary Pod running, but you need to tightly couple helper processes, such as a log watcher. In those situations, it makes perfect sense to actually have multiple containers running inside a single pod.
Another big example that comes to my mind is from the Istio project, which is a platform made to connect, manage and secure microservices and is generally referred as a Service Mesh.
A huge part of what it does and is able to accomplish to provide a greater control and customization over the deployed microservices network, is due to the fact that it deploys a sidecar proxy, denominated Envoy, throughout the environment intercepting all network communication between microservices.
Here, you can check an example of load balancing in a Istio service mesh. As you can see the Proxy is deployed inside the Pod, intercepting all communication that goes through it.

Clusters and nodes formation in Kubernetes

I am trying to deploy my Docker images using Kubernetes orchestration tools.When I am reading about Kubernetes, I am seeing documentation and many YouTube video tutorial of working with Kubernetes. In there I only found that creation of pods, services and creation of that .yml files. Here I have doubts and I am adding below section,
When I am using Kubernetes, how I can create clusters and nodes ?
Can I deploy my current docker-compose build image directly using pods only? Why I need to create services yml file?
I new to containerizing, Docker and Kubernetes world.
My favorite way to create clusters is kubespray because I find ansible very easy to read and troubleshoot, unlike more monolithic "run this binary" mechanisms for creating clusters. The kubespray repo has a vagrant configuration file, so you can even try out a full cluster on your local machine, to see what it will do "for real"
But with the popularity of kubernetes, I'd bet if you ask 5 people you'll get 10 answers to that question, so ultimately pick the one you find easiest to reason about, because almost without fail you will need to debug those mechanisms when something inevitably goes wrong
The short version, as Hitesh said, is "yes," but the long version is that one will need to be careful because local docker containers and kubernetes clusters are trying to solve different problems, and (as a general rule) one could not easily swap one in place of the other.
As for the second part of your question, a Service in kubernetes is designed to decouple the current provider of some networked functionality from the long-lived "promise" that such functionality will exist and work. That's because in kubernetes, the Pods (and Nodes, for that matter) are disposable and subject to termination at almost any time. It would be severely problematic if the consumer of a networked service needed to constantly update its IP address/ports/etc to account for the coming-and-going of Pods. This is actually the exact same problem that AWS's Elastic Load Balancers are trying to solve, and kubernetes will cheerfully provision an ELB to represent a Service if you indicate that is what you would like (and similar behavior for other cloud providers)
If you are not yet comfortable with containers and docker as concepts, then I would strongly recommend starting with those topics, and moving on to understanding how kubernetes interacts with those two things after you have a solid foundation. Else, a lot of the terminology -- and even the problems kubernetes is trying to solve -- may continue to seem opaque