Advantages of containerized Kubernetes master processes

Both the Kubernetes HA guide and the From Scratch guide recommend running etcd, kube-apiserver, kube-controller-manager, and kube-scheduler in containers. The idea of self-hosting Kubernetes on Kubernetes goes back quite a while (see PR 167 on the Kubernetes GitHub and the issues/PRs linked there), but I haven't found a discussion of why this approach is so beneficial that it should be the 'recommended' way. Here are the benefits and drawbacks as I currently see them:
Benefits:
Potentially easy upgrade path: just update the manifests and have the kubelet pull new images.
"Container advantages": binary environment and the host environment separate, leverage others' existing images, etc.
Follows the whole Kubernetes pattern, so 'fits the brain' once you are using that pattern extensively.
Drawbacks:
Increased installation/configuration complexity in some cases. For example, if your etcd cluster is separate from your Kubernetes nodes, you now have to install Docker (with possible storage changes depending on the Linux distro), the kubelet, and etcd. Without containerized etcd, you have just that one binary to install.
Increased complexity at run time: with more moving parts, any bug in Docker or the kubelet can render critical components non-functional.
I'm new to Kubernetes (and containers) and feel like I might be missing advantages (or underestimating their value) compared to the extra complexity this introduces. But I also have to choose one way to try. Why are containerized master components the recommended way to run Kubernetes despite the extra complexity?

The biggest benefit is streamlined setup for most people. Running a few docker run commands is way easier than downloading binaries, unpacking, fine-tuning init scripts (which differ across distros), running a supervisor, etc. We have a pretty good process manager; relying on it is powerful.
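To make that concrete, here is a minimal sketch of a static pod manifest of the kind the kubelet (that process manager) runs straight from disk; the path, image tag, and flags below are illustrative, not taken from the guides. Upgrading then amounts to bumping the image tag and letting the kubelet pull the new image:

```yaml
# /etc/kubernetes/manifests/kube-apiserver.yaml -- sketch only, flags abbreviated
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  hostNetwork: true                                  # bind directly to the node's network
  containers:
  - name: kube-apiserver
    image: registry.k8s.io/kube-apiserver:v1.28.0    # bump this tag to upgrade
    command:
    - kube-apiserver
    - --etcd-servers=https://127.0.0.1:2379          # illustrative flag
```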
We also don't recommend sharing etcd, so if you're doing that, you're already off the beaten path.
Overall, containerized components are vastly simpler than the alternatives for most people.

Related

Copy-on-write style memory reuse for Kubernetes pods, to make pods spawn faster and more memory-efficient?

Can Kubernetes pods share significant amount of memory?
Does copy-on-write style forking exist for pods?
The purpose is to make pods spawn faster and use less memory.
Our scenario is that we have dedicated game servers to host in Kubernetes. The problem is that one instance of a dedicated game server takes up a few GB of memory upfront (e.g. 3 GB).
We also have a few such Docker images of game servers, one each for game A, game B... Let's call a pod that runs game A's image 'pod A'.
Let's say we now have 3 x pod A and 5 x pod B. Now players rush into game B, so I urgently need, say, another 4 x pod B.
I can surely spawn 4 more pod B; Kubernetes supports this perfectly. However, there are two problems:
The booting of my game server is very slow (30 s to 1 min). Players don't want to wait.
More importantly for us, the cost of running this many pods, each holding so much memory, is very high, because pods do not share memory as far as I know. Whereas on a plain old EC2 machine or on bare metal, processes can share memory because they can fork and then copy-on-write.
Copy-on-write style forking and memory sharing seems to solve both problems.
One of Kubernetes' assumptions is that pods are scheduled on different nodes, which contradicts the idea of sharing common resources (storage is an exception, with many options and plenty of documentation available). The situation is different when it comes to sharing resources between containers in one pod, but that doesn't apply to your issue.
However, it seems that there is some possibility to share memory; it is not well documented and, I guess, very uncommon in Kubernetes. Check my more detailed answers below:
Can Kubernetes pods share significant amount of memory?
What I found is that pods can share a common IPC namespace with the host (node).
You can check Pod Security Policies, especially the hostIPC field:
HostIPC - Controls whether the pod containers can share the host IPC namespace.
Some usage examples and possible security issues can be found here:
Shared /dev/shm directory
Use existing IPC facilities
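For illustration, here is a minimal sketch of a pod spec with hostIPC enabled; the image and names are hypothetical, not taken from the linked examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: game-a
spec:
  hostIPC: true                     # join the node's IPC namespace (System V IPC, POSIX shm)
  containers:
  - name: game
    image: example/game-a:latest    # hypothetical image
```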
Keep in mind that this solution is not common in Kubernetes, and pods with elevated privileges are easily granted broader permissions than needed:
The way PSPs are applied to Pods has proven confusing to nearly everyone that has attempted to use them. It is easy to accidentally grant broader permissions than intended, and difficult to inspect which PSP(s) apply in a given situation.
That's why the Kubernetes team deprecated Pod Security Policies as of Kubernetes v1.21; see this article for more information.
Also, if you are using multiple nodes in your cluster, you should use a nodeSelector to make sure the pods are assigned to the same node, which means they will be able to share the one (host's) IPC namespace.
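A pod-spec fragment of what that could look like (the label value is illustrative; any label that co-locates the sharing pods on one node works):

```yaml
spec:
  nodeSelector:
    kubernetes.io/hostname: node-1   # pin all IPC-sharing pods to the same node
```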
Does copy-on-write style forking exist for pods?
I did some research and didn't find any information about this possibility, so I think it is not possible.
I think the main issue is that your game architecture is not "very suitable" for Kubernetes. Check these articles and websites about dedicated game servers in Kubernetes; maybe you will find them useful:
Agones
Scaling Dedicated Game Servers with Kubernetes: Part 3 – Scaling Up Nodes
Google Cloud - Dedicated Game Server Solution
Google Cloud - Game Servers
A different way to resolve the issue would be to bake some of the initialisation into the image.
As part of the Docker image build, start up the game server and do as much of the 30 s to 1 min initialisation as possible, then dump that part of memory into a file in the image. On game server boot-up, use mmap (with MAP_PRIVATE and possibly even MAP_FIXED) to map the pre-calculated file into memory.
That would solve the problem with the game server boot-up time, and probably also with the memory use; everything in the stack should be doing copy-on-write all the way from the image through to the pod (although you'd have to confirm whether it actually does).
It would also have the benefit that it's plain k8s with no special tricks; no requirements for special permissions or node selection or anything, nothing to break or require reimplementation on upgrades or otherwise get in the way. You will be able to run it on any k8s cluster, whether your own or any of the cloud offerings, as well as in your CI/CD pipeline and dev machine.

When to use Minikube and when to use Kubernetes?

I've found a partial answer here: Difference between Minikube, Kubernetes, Docker Compose, Docker Swarm, etc., but I still do not completely get it.
In my understanding, Kubernetes is a container-orchestration system. However, Minikube looks very similar to me.
Can somebody explain to me when you would use minikube versus when you would use minikube, and why?
I think your question should have been "Can somebody explain to me when you would use Minikube versus when you would use Kubernetes, and why?"
Minikube is a small and easy Kubernetes setup for your work PC. You can install and configure a Kubernetes cluster very easily with it. However, for a production environment it is not the best choice. Minikube normally starts a virtual machine on your PC, which will affect the performance of your cluster, unlike Kubernetes proper, which runs directly against your kernel if you use Linux. Furthermore, as Butuzov already answered, it is only one node, not a "real" cluster.
So you use Kubernetes if you are in a production environment where you need distributed systems and workloads, as well as redundancy and failure safety.
Hope that helps your understanding.
Edit: Use cases
Minikube:
A developer or DevOps engineer who is trying to run a complex distributed system locally for testing purposes, but with deployment via Helm.
A developer or DevOps engineer who is trying to create a deployment with Helm locally.
Kubernetes (standalone):
Running a complex distributed system in production.
Running heavy workloads (multiple products, distributed systems) in production.
Minikube is a one-node cluster, with a master that can also take workloads, and with a lot of issues solved and automated for you. It is designed for testing and learning things from the Kubernetes ecosystem.
Kubernetes itself is an orchestrator that can come to you as a managed service with a lot of problems (PVs, load balancers) already solved, or as a box of Lego that you tune here and there... well, the thing we call production ready.
Minikube is OK for learning (not always, but in 90% of cases) or for experimenting with tiny loads.

In a Kubernetes cluster, does the master node always need to run alone on a cluster node?

I am aware that it is possible to allow the master node to execute pods, and that is my concern, since the default configuration does not allow the master to run pods. Should I change it? What is the reason for the default configuration being as it is?
If the change can be performed in some situations, I would like to ask whether my cluster is one of them. It has only three nodes with exactly the same hardware, and more nodes will probably not be added in the foreseeable future. In my opinion, as I have three equal nodes, it would be a waste of resources to use 1/3 of my cluster's computational power to run the Kubernetes master. Am I right?
[Edit1]
I have found the following reason in the Kubernetes documentation.
Is security the only reason?
Technically, it doesn't need to run on a dedicated node. But for your Kubernetes cluster to run, you need your masters to work properly. And one of the ways to ensure they stay secure, stable, and performant is to use a separate node which runs only the master components and no regular pods. If you share the node with other pods, there are several ways they can impact the master. For example:
The other pods will impact the performance of the masters (network or disk latencies, CPU cache, etc.).
They might be a security risk (if someone manages to hack from some other pod into the master node).
A badly written application can cause stability issues on the node.
While it can be seen as wasting resources, you can also see it as the price to pay for the stability of your master / Kubernetes cluster. However, it doesn't have to be a waste of 1/3 of your resources. Depending on how you deploy your Kubernetes cluster, you can use different hosts for different nodes, for example a small host for the master and bigger ones for the workers.
No, this is not required, but it is strongly recommended. Security is one aspect, but performance is another. etcd usually runs on the control plane nodes, and it tends to chug if it runs out of IOPS. So a rogue pod running application code could destabilize the control plane, which then reduces your ability to fix the problem.
When running small clusters for testing purposes, it is common to run everything (control plane and workloads) on a single node specifically to save money/complexity.
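If you do go that route, the usual mechanism is removing or tolerating the taint that keeps regular pods off the control plane. A pod-spec sketch; note that the taint key varies by Kubernetes version (older clusters use node-role.kubernetes.io/master):

```yaml
tolerations:
- key: node-role.kubernetes.io/control-plane   # taint key; 'master' on older versions
  operator: Exists
  effect: NoSchedule                           # allow scheduling despite the taint
```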

Is Apache Nifi ready to use with Kubernetes in production?

I am planning to set up Apache NiFi on Kubernetes and take it to production. In my research I didn't find anyone using this combination in a production setup.
Is it a good idea to choose this combination? Could you please share your thoughts/experience about it?
https://community.cloudera.com/t5/Support-Questions/NiFi-on-Kubernetes/td-p/203864
As mentioned in the comments, work has been done regarding NiFi on Kubernetes, but currently this is not generally available.
It is good to know that there will be dataflow offerings where NiFi and Kubernetes meet in some shape or form during the coming year.* So I would recommend keeping an eye out for this and discussing it with your local contacts before trying to build it from scratch.
*Disclaimer: Though I am an employee of Cloudera, the main driving force behind NiFi, I am not qualified to make promises, and this is purely my own view.
I would like to invite you to try a Helm chart at https://github.com/Datenworks/apache-nifi-helm
We've been maintaining a 5-node NiFi cluster on GKE (Google Kubernetes Engine) in a production environment without major issues, and it performs pretty well. Please let me know if you find any issues running this chart in your environment.
Regarding any high-volume setup on k8s: be sure to tune your Linux kernel, primarily the connection-tracking (conntrack) subsystem. You should also expect to see non-zero TCP timeouts, retries, out-of-window ACKs, et al. Depending on which container networking implementation is used, additional configuration changes may be required.
I will assume you are using containerd and NOT using Docker networking (except, obviously, for the container(s) within a pod).
The issue applies to ANY heavy-IO pod: Kafka, NiFi, MySQL, PostgreSQL, you name it.
The incidence increases when:
"high" volumes of cross-pod (especially cross-node) TCP connections occur
you have large (megabyte or larger) messages, which produce additional errors
Be aware of any other components using either the pod or VM TCP stack (e.g. PVC software supporting NiFi persisted storage).

Cluster and node formation in Kubernetes

I am trying to deploy my Docker images using Kubernetes orchestration tools. Reading about Kubernetes, I have seen documentation and many YouTube video tutorials on working with it, but all I found there was the creation of pods and services and of their .yml files. I have some doubts, which I am adding below:
When I am using Kubernetes, how can I create clusters and nodes?
Can I deploy my current docker-compose-built image directly using pods only? Why do I need to create a Service .yml file?
I am new to the world of containerization, Docker, and Kubernetes.
My favorite way to create clusters is kubespray, because I find Ansible very easy to read and troubleshoot, unlike more monolithic "run this binary" mechanisms for creating clusters. The kubespray repo has a Vagrant configuration file, so you can even try out a full cluster on your local machine to see what it will do "for real".
But with the popularity of Kubernetes, I'd bet that if you ask 5 people you'll get 10 answers to that question, so ultimately pick the one you find easiest to reason about, because almost without fail you will need to debug those mechanisms when something inevitably goes wrong.
The short version, as Hitesh said, is "yes," but the long version is that one needs to be careful, because local Docker containers and Kubernetes clusters are trying to solve different problems, and (as a general rule) one cannot easily be swapped in place of the other.
As for the second part of your question, a Service in kubernetes is designed to decouple the current provider of some networked functionality from the long-lived "promise" that such functionality will exist and work. That's because in kubernetes, the Pods (and Nodes, for that matter) are disposable and subject to termination at almost any time. It would be severely problematic if the consumer of a networked service needed to constantly update its IP address/ports/etc to account for the coming-and-going of Pods. This is actually the exact same problem that AWS's Elastic Load Balancers are trying to solve, and kubernetes will cheerfully provision an ELB to represent a Service if you indicate that is what you would like (and similar behavior for other cloud providers)
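As a sketch of that decoupling (the names and ports here are hypothetical), a Service gives consumers one stable name and virtual IP while the matching Pods come and go:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  selector:
    app: web            # any Pod carrying this label becomes a backend
  ports:
  - port: 80            # stable port clients connect to
    targetPort: 8080    # port the container actually listens on
  # type: LoadBalancer  # on a cloud provider this provisions an ELB-style LB
```

Consumers then connect to web:80 via cluster DNS, and which Pods answer can change at any moment without the consumer noticing.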
If you are not yet comfortable with containers and docker as concepts, then I would strongly recommend starting with those topics, and moving on to understanding how kubernetes interacts with those two things after you have a solid foundation. Else, a lot of the terminology -- and even the problems kubernetes is trying to solve -- may continue to seem opaque