Docker Swarm and Kubernetes Manager hardware requirements

We are planning to build a small Docker cluster for our application services. We are considering 2 master VMs for HA, 1 Consul instance (if we choose Swarm), and 5-10 hosts for containers. We have not yet decided whether to use Docker Swarm or Kubernetes.
So the question is: what "hardware" requirements (CPU cores, RAM) do the managers need, for both Swarm and Kubernetes, to orchestrate this small cluster?

Just to expand a bit on what Robert wrote about Kubernetes.
If you want to have up to 5 machines for running your applications, even a 1-core virtual machine (n1-standard-1 on GCE) should be enough.
You can handle a 10-node cluster with a 2-core virtual machine, as Robert said. For the official recommendations, please take a look at:
https://kubernetes.io/docs/setup/best-practices/cluster-large/
However, note that the resource usage of the master components is related more to the number of pods (containers) you want to run on your cluster. If you want to run, say, a single-digit number of them, even an n1-standard-1 on GCE should be enough for a 10-node cluster. But it's definitely safer to use an n1-standard-2 for clusters of up to 10 nodes.
As for HA, I agree with Robert that having 3 master VMs is better than 2. Etcd (which is our backing store) requires more than half of all registered replicas to be up to work correctly, so in the case of 2 instances, both of them need to be up (which is generally not your goal). If you have 3 instances, one of them can be down.
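The arithmetic behind that rule, for reference (this is the general etcd majority requirement, not anything specific to this setup):

```latex
% Quorum and fault tolerance for an etcd cluster of n members
\mathrm{quorum}(n) = \left\lfloor n/2 \right\rfloor + 1, \qquad
\mathrm{tolerated\ failures}(n) = n - \mathrm{quorum}(n)
% n = 2: quorum 2, tolerates 0 failures
% n = 3: quorum 2, tolerates 1 failure
% n = 5: quorum 3, tolerates 2 failures
```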
Let me know if you have more questions about Kubernetes.

For Kubernetes, a single 2-core virtual machine (e.g. n1-standard-2 on GCE) can handle 5 nodes and probably 10. If you want to run an HA master configuration, you are likely to want 3 nodes to create a quorum for the etcd instances and you may want to provision slightly larger instances (like an n1-standard-4) to account for the overhead of clustering etcd.
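If you end up provisioning the masters on GCE yourself, that part is just ordinary instance creation; a rough sketch, where the instance names, zone, and image are illustrative placeholders:

```bash
# Hypothetical provisioning of three n1-standard-2 master VMs on GCE
# (adjust the machine type per the sizing discussion above).
for i in 1 2 3; do
  gcloud compute instances create "k8s-master-${i}" \
    --zone us-central1-a \
    --machine-type n1-standard-2 \
    --image-family ubuntu-2004-lts \
    --image-project ubuntu-os-cloud
done
```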

Related

Kubernetes parallel computing

I want to know: does Kubernetes have any parallel computing implementation?
A long time ago I used OpenHPC or OpenMosix for parallel computation cluster systems.
Can Kubernetes replace these services?
If your answer is NO, then what does the word "cluster" mean when you talk about Kubernetes?
Kubernetes and HPC/HTC are not yet integrated, but some attempts can be observed.
In the article Kubernetes, Containers and HPC you can find a comparison between HPC and Kubernetes, with their similarities and differences.
The main differences are the workload types they focus on. While HPC workload managers are focused on running distributed memory jobs and support high-throughput scenarios, Kubernetes is primarily built for orchestrating containerized microservice applications.
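That said, Kubernetes does ship a coarse-grained batch primitive, the Job, whose parallelism and completions fields let it fan work out across pods; it is not an MPI-style HPC scheduler. A minimal sketch, assuming kubectl is configured against a cluster (the name and workload are illustrative):

```bash
# Parallel batch Job: 9 successful completions, at most 3 pods running at once.
kubectl apply -f - <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-parallel            # illustrative name
spec:
  completions: 9               # total successful pod runs required
  parallelism: 3               # pods allowed to run concurrently
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: pi
        image: perl:5.34
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
EOF
```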
If you are eager to find more information, you can read some specialist books like Seamlessly Managing HPC Workloads Through Kubernetes.
Regarding the second part:
If your answer is NO, then what does the word "cluster" mean when you talk about Kubernetes?
You can find many definitions on the internet; however, one of the easiest to understand is in the Red Hat documentation:
A Kubernetes cluster is a set of node machines for running containerized applications. If you’re running Kubernetes, you’re running a cluster.
At a minimum, a cluster contains a control plane and one or more compute machines, or nodes. The control plane is responsible for maintaining the desired state of the cluster, such as which applications are running and which container images they use. Nodes actually run the applications and workloads.
The cluster is the heart of Kubernetes’ key advantage: the ability to schedule and run containers across a group of machines, be they physical or virtual, on premises or in the cloud. Kubernetes containers aren’t tied to individual machines. Rather, they’re abstracted across the cluster.
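To see that split on a live cluster (assuming kubectl is already configured against one):

```bash
# The control plane and the nodes are both visible from kubectl.
kubectl cluster-info        # address of the control plane (API server)
kubectl get nodes -o wide   # the compute machines that actually run the workloads
```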
You can also find useful information in the official Kubernetes documentation, such as What is Kubernetes? and Kubernetes Concepts.

Kubernetes with hybrid containers on one VM?

I have played around a little bit with Docker and Kubernetes. I need some advice here: is it a good idea to have one Pod on a VM, with all of these deployed as multiple (hybrid) containers?
This is our POC plan:
Customers access a public API endpoint (nginx reverse proxy), e.g. abc.xyz.com or def.xyz.com
List of containers that we need:
Identity server, connected to SQL Server
Our API server with Hangfire, connected to SQL Server
The API server that connects to the Redis server
Redis in turn has 3 Hangfire agents, load-balanced (scalable in the future)
Should we set up 1 or 2 VMs?
Is a combination of Windows and Linux containers advisable?
How many Pods per VM? How many containers per Pod?
Should we attach volumes for DB?
Thank you for your help
Cluster size can differ depending on the Kubernetes platform you want to use. For managed solutions like GKE/EKS/AKS you don't need to create a master node, but you have less control over your cluster and you can't always use the latest Kubernetes version.
It is safer to have at least 2 worker nodes. (More is better). In case of node failure, pods will be rescheduled on another healthy node.
I'd say Linux containers are more lightweight and have less overhead, but it's up to you to decide what to use.
The number of pods per VM is determined during scheduling by the kube-scheduler and depends on the pods' requested resources and the amount of resources available on the cluster nodes.
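For example, it is the requests declared on each container that the scheduler counts against a node's allocatable capacity; a hedged sketch (the name, image, and values are illustrative):

```bash
# Pod with explicit resource requests/limits; the scheduler only places it on a
# node that still has at least 250m CPU and 256Mi memory unreserved.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: api-demo               # illustrative name
spec:
  containers:
  - name: api
    image: nginx:1.21          # placeholder image
    resources:
      requests:
        cpu: "250m"
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
EOF
```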
All data inside the running containers of a Pod is lost after pod restart/deletion. You can import/restore DB content during pod startup using init containers (or DB replication), or configure volumes to preserve data across pod restarts.
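A minimal sketch of the volume approach, using a PersistentVolumeClaim so the database files outlive any single pod (names, sizes, and the Redis image are illustrative, and the cluster is assumed to have a default StorageClass):

```bash
# Claim persistent storage and mount it where the database writes its files.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data                # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: redis-demo
spec:
  containers:
  - name: redis
    image: redis:6
    volumeMounts:
    - name: data
      mountPath: /data         # Redis persistence directory
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: db-data
EOF
```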
You can easily decide which containers to put in the same Pod if you look at your application set from the perspective of scaling, updating, and availability.
If you can benefit from scaling and updating application parts independently, or from having several replicas of some crucial parts of your application, it's better to put them in separate Deployments. If the application parts must always run on the same node and it's fine to restart them all at once, you can put them in one Pod.
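For instance, the API and Redis from your list would typically be two independent Deployments (each with its own Service), so they can be scaled and updated separately; a minimal sketch of one of them, with placeholder names and image:

```bash
# One independently scalable piece of the application as its own Deployment.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                    # Redis would be a second, separate Deployment
spec:
  replicas: 3                  # scaled independently of the other components
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: example/api:1.0 # placeholder image
EOF
```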

Kubernetes Architecture - Kubernetes Cluster Management and initializing Nodes

I am trying to change my deployment scenario from Docker to Kubernetes. I have explored the Kubernetes architecture: cluster, nodes, Pods, Services, ReplicaSets/controllers, kubernetes-cni, kubectl, etc. Now I need to start deploying into a Kubernetes cluster. While exploring, I found documentation and discussions about creating a single node and master on the same machine, or in VMs. I also found the Kubespray and Minikube documentation for cluster creation.
Here are my points of confusion about getting hands-on with Kubernetes.
For creating and working with Kubernetes, why is there a variation like single node and master on the same machine or in VMs? Why is there this deviation in how the cluster is built?
How can I decide whether to run a single node and the master on the same machine, or use VMs for different nodes?
How do Minikube and Kubespray provide different methodologies for the Kubernetes architecture, given that Kubernetes is the product of a single source - Google?
If I install kubeadm, kubernetes-cni, and kubelet on my Ubuntu 16.04 machine, can I initialize nodes on the same machine?
How can I clear up these confusions?
The taxonomy of concepts and terms is very complicated, and the documentation is still pretty sparse.
1. For creating and working with Kubernetes, why is there a variation like single node and master on the same machine or in VMs? Why is there this deviation in how the cluster is built?
The deviation is there to support several distinct use cases: container workload developers working on their laptops, who need what amounts to a fake cluster without a lot of operational ceremony; Kubernetes ops folks learning and testing on small but real clusters; and real production workloads for variously sized plants.
For the first case, container workload development, there is a piece of software called Minikube, which is like a distribution of Kubernetes that automates creating a single virtual machine (using VirtualBox or other desktop-class virtual machine tooling) preconfigured to run a combined Kubernetes master and node, sufficient to run real Kubernetes workloads, but on a laptop.
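Assuming Minikube and a supported hypervisor are installed, the whole single-machine cluster comes up with one command (the driver flag name varies by Minikube version):

```bash
# Start a local single-node cluster in a VirtualBox VM, then point kubectl at it.
minikube start --driver=virtualbox   # older releases use --vm-driver=virtualbox
kubectl get nodes                    # should list one combined master/worker node
```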
For the second case, for non production purposes, the master and worker functions can be run on a single machine, or a single master machine can be used with a small number of worker machines.
A production Kubernetes cluster will usually have 3, 5, or 7 master machines (VMs or bare metal). Multiple masters are needed to maintain quorum for etcd, where Kubernetes stores all runtime state, in the case of machine failures. 3 master machines allow 1 master to fail without disrupting the cluster; 5 masters will tolerate 2 master machine failures, etc.
This number of masters can support a large number of worker machines (dozens to hundreds) running the container workloads. In a production environment, one would not want to run client workloads on the master machines.
2. How can I decide whether to run a single node and the master on the same machine, or use VMs for different nodes?
See above- for development, use minikube. For production, plan to use multiple redundant masters if you are running the cluster yourself, or use a cloud provider's managed kubernetes offering.
3. How do Minikube and Kubespray provide different methodologies for the Kubernetes architecture?
Minikube is for development only. Kubespray is one of many tools that provides some automation help when building a production cluster. Kubespray's distinguishing feature is the use of Ansible for machine setup and automation. This may or may not be desirable, depending on your comfort and interest in Ansible and/or its competitors.
4. Why are there so many options when Kubernetes is the product of a single source - Google?
Kubernetes certainly originated in Google, but now there are hundreds or more engineers across many companies, including Microsoft, Amazon, RedHat, Oracle, and tons of tiny companies, actively working on it. It is a remarkable project.
5. If I install kubeadm, kubernetes-cni, and kubelet on my Ubuntu 16.04 machine, can I initialize nodes on the same machine?
Kubeadm is a setup tool, not a production runtime tool, but yes, you can run containers on the same machine as the bits that are needed for a Kubernetes master. In addition to etcd, the kubelet, the apiserver, and the controller manager, you need to run Docker as well: the kubelet talks to Docker to run containers. I would advise NOT running anything else on this machine; improper configuration can cause problems with the machine serving as master/worker, and any other work on it would be lost.
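As a sketch of that single-machine setup with kubeadm (the pod CIDR shown matches the common flannel default, and the exact taint key depends on your Kubernetes version):

```bash
# Bring up a control plane, point kubectl at it, then allow ordinary pods on the
# same machine by removing the control-plane NoSchedule taint.
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
mkdir -p "$HOME/.kube" && sudo cp /etc/kubernetes/admin.conf "$HOME/.kube/config" \
  && sudo chown "$(id -u):$(id -g)" "$HOME/.kube/config"
kubectl taint nodes --all node-role.kubernetes.io/master-           # older versions
# kubectl taint nodes --all node-role.kubernetes.io/control-plane-  # newer versions
```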

Running the same service in a GKE container, compared to a GCE VM

This is a general question about GKE compared to GCE. If one is running a lightweight service on a single small GCE VM, is it a reasonable thing to do to try running that same service from a single GKE container on the same size instance? Or does the overhead of cluster management make this unfeasible?
Specifics: I'm serving a low-traffic website from a tiny (f1-micro) GCE VM. For various reasons I thought I'd try moving it to serve from an apache/nginx container, with the same hardware underneath. In practice though, I find that GKE won't even let you create a cluster of f1-micro instances unless it has at least 3 nodes - the release notes say this is so there will be enough memory to manage pods.
I'd supposed that the same service would take up similar resources whether in a VM or a container, but GKE's 3-node restriction makes it sound like simply managing the cluster eats more memory than serving my site does in the first place. Is that the case, or is the restriction meant for much heavier services than mine? (For reference, you can actually create a 3-node cluster of f1-micro instances and then change the size to 1 node, and it seems to run normally, but I haven't tried actually running a service this way.)
Thanks!
GKE enables logging and monitoring by default, which runs Fluentd and Heapster pods in your cluster. These eat up a good chunk of memory. Even if you disable logging/monitoring, you still have to run Docker, Kubelet, and the DNS pod. That chews through the f1-micro's 600MB pretty quickly.
I'd suggest a 1-node g1-small cluster over a 3-node (or 1-node) f1-micro one. The per-node cluster-management overhead is relatively smaller, so your service would still be able to run in the same (or a larger) footprint. But if the resize-to-1 workaround is working for you, it seems fine to just roll with that.
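For reference, a 1-node g1-small cluster can be created in one command (the cluster name and zone are illustrative):

```bash
# Single-node GKE cluster on a g1-small machine instead of three f1-micros.
gcloud container clusters create tiny-site \
  --zone us-central1-a \
  --machine-type g1-small \
  --num-nodes 1
```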

How to determine efficiently if a Minion has joined a Kubernetes Cluster

I am working on writing some automation to setup a Kubernetes Cluster. The automation deploys the Kubernetes Master and once that is setup, it starts adding Minions in parallel. What is the most efficient way to determine programmatically if a Minion has joined the Kubernetes Cluster?
Currently I am querying the REST endpoint /api/v1/nodes exposed by the Kubernetes API server. My concern is that as the size of the cluster increases, querying the API server to pull details about all the minions may become compute- and I/O-intensive for it. I also did not find paging support in this API.
Thanks,
Sufian
You should look into kube-register: https://github.com/kelseyhightower/kube-register. It uses fleet to register minions as they spin up. You should probably run it as a systemd unit so it starts on boot. Then, for status, let the API server do its thing with the polling status. Most clusters probably wouldn't be larger than 9 main nodes (you can have plenty of worker nodes; I recommend looking at CoreOS's etcd docs to learn about clustering) due to etcd's latency constraints in its Raft-based quorum, so I wouldn't worry too much about the size of the cluster.
This is a mix between an answer and a comment on the other answer (I cannot comment yet, sorry...).
As far as I know, using the REST endpoint /api/v1/nodes is the best way to check whether nodes are registered. How often do you call that endpoint? I wouldn't expect compute or I/O problems that quickly.
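If listing every node feels heavy, your automation can also check (or wait on) a single node's Ready condition; a sketch assuming a reasonably recent kubectl is configured and the node name is known (the name here is a placeholder):

```bash
# Block until a specific node reports Ready.
kubectl wait --for=condition=Ready node/minion-01 --timeout=300s

# Or poll just that node's Ready condition instead of listing the whole cluster.
kubectl get node minion-01 \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
```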
kube-register was a useful tool to register new CoreOS nodes with the Kubernetes cluster, but it is no longer needed, since the kubelet now registers itself.
I think there is some misunderstanding in the other answer; it mixes up 2 different clusters:
the etcd cluster: CoreOS recommends running 3, 5, or 7 etcd instances in a cluster (https://coreos.com/etcd/docs/latest/admin_guide.html#cluster-management). On the remaining nodes you can configure etcd to run as a proxy (https://coreos.com/etcd/docs/latest/proxy.html). This should address the etcd concern.
the kubernetes cluster: here you typically run 1 master and x "worker" nodes, just as you do already.