Kubernetes parallel computing - kubernetes

I want to know: does Kubernetes have any parallel computing implementation?
A long time ago I used OpenHPC or OpenMosix for parallel computation cluster systems.
Can Kubernetes replace these services?
If your answer is NO, what does the word cluster mean when you talk about Kubernetes?

Kubernetes and HPC/HTC are not yet integrated, but some attempts at bringing them together can be observed.
In the article Kubernetes, Containers and HPC you can find a comparison between HPC and Kubernetes, with their similarities and differences.
The main difference lies in the workload types they focus on. While HPC workload managers are focused on running distributed-memory jobs and support high-throughput scenarios, Kubernetes is primarily built for orchestrating containerized microservice applications.
If you are eager to find more information, you can read specialist books like Seamlessly Managing HPC Workloads Through Kubernetes.
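To make that contrast concrete, here is a minimal sketch (using the official Kubernetes Python client; the names and image are placeholders, not anything from the question) of the kind of workload Kubernetes is built around: a Deployment that keeps several replicas of a containerized microservice running, rather than a distributed-memory batch job.

```python
# Minimal sketch: a Deployment keeps N replicas of a microservice running.
# Assumes `pip install kubernetes` and a working kubeconfig.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # Kubernetes keeps 3 copies running and reschedules them on node failure
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(
                containers=[client.V1Container(name="web", image="nginx:1.25")]
            ),
        ),
    ),
)
apps.create_namespaced_deployment(namespace="default", body=deployment)
```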
Regarding second part:
If your answer is NO, what does the word cluster mean when you talk about Kubernetes?
You can find many definitions on the internet; however, one of the easiest to understand is in the Red Hat documentation.
A Kubernetes cluster is a set of node machines for running containerized applications. If you’re running Kubernetes, you’re running a cluster.
At a minimum, a cluster contains a control plane and one or more compute machines, or nodes. The control plane is responsible for maintaining the desired state of the cluster, such as which applications are running and which container images they use. Nodes actually run the applications and workloads.
The cluster is the heart of Kubernetes’ key advantage: the ability to schedule and run containers across a group of machines, be they physical or virtual, on premises or in the cloud. Kubernetes containers aren’t tied to individual machines. Rather, they’re abstracted across the cluster.
In addition, you can also find useful information in Official Kubernetes Documentation like What is Kubernetes? and Kubernetes Concepts.
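If you want to see the control plane/node split for yourself, here is a minimal sketch with the official Kubernetes Python client (assuming a working kubeconfig) that lists the machines making up the cluster:

```python
# Minimal sketch: list the nodes of a cluster with the official Python client.
from kubernetes import client, config

config.load_kube_config()          # read ~/.kube/config
core = client.CoreV1Api()

# The control plane keeps the desired state; the nodes are the machines that
# actually run the Pods. Listing them shows what "cluster" means in practice.
for node in core.list_node().items:
    roles = [
        label.split("/", 1)[1]
        for label in (node.metadata.labels or {})
        if label.startswith("node-role.kubernetes.io/")
    ]
    print(node.metadata.name, roles or ["worker"], node.status.node_info.kubelet_version)
```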

Related

Hybrid nodes on a single Kubernetes cluster

I am currently running two Kubernetes clusters.
The first cluster runs on bare metal, and the second cluster runs on EKS.
Since maintaining EKS costs a lot, I am looking for a way to turn this setup into a single cluster that autoscales on AWS.
I tried to consider several solutions such as RHACM, Rancher and Anthos, but those solutions are for controlling multiple clusters.
I just want to turn this into an "on-premises based cluster that autoscales (on AWS) when it runs out of resources".
I found the "EKS Anywhere" solution, but since the price is too high, I want to build a similar architecture myself.
I need advice on any use cases for an ingress controller, a (physical) load balancer, or any other architecture that could satisfy these conditions.
Cluster API is probably what you need. It is built around the concept of creating Clusters from Machine objects, which are then provisioned using a Provider. This provider can be the Bare Metal Operator provider for your bare-metal nodes and the Cluster API Provider AWS for your AWS nodes, all managed from a single cluster (see the docs below for many other provider types).
You will run a local Kubernetes cluster with Cluster API running in it. This includes the components that allow you to create the different Machine objects and that tell Kubernetes how to provision those machines.
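As a hedged sketch of what that looks like from the management cluster: the Machine objects are plain custom resources, so once the providers are installed you can inspect them with the generic Python client. The group/version below follow the Cluster API project (cluster.x-k8s.io) but may differ between releases.

```python
# Hedged sketch: list Cluster API Machine objects from the management cluster.
from kubernetes import client, config

config.load_kube_config()   # kubeconfig of the management cluster
custom = client.CustomObjectsApi()

machines = custom.list_cluster_custom_object(
    group="cluster.x-k8s.io", version="v1beta1", plural="machines"
)
for m in machines["items"]:
    # Each Machine is backed by a provider-specific resource (bare metal, AWS, ...).
    print(m["metadata"]["name"], m.get("status", {}).get("phase"))
```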
Here is some more reading:
Cluster API Book: Excellent reading on the topic.
Documentation for CAPI Provider - AWS.
Documentation for the Bare Metal Operator. I worked on this project for a couple of years and the community is pretty amazing. This GitHub repository hosts the CAPI provider for bare-metal nodes.
This should definitely get you going. You can start by running the different providers individually to get a taste of how they work, and then work with Cluster API and see it in action.

What do you call the application running in a pod in Kubernetes?

In a typical theoretical system, I would call the different applications that make up the system nodes. However this is confusing in a Kubernetes cluster system for two reasons:
"Node.js" is often shortened to "node", and not all of the nodes in my system are "Node.js" processes.
Kubernetes uses the word "node" to refer to physical components in the cluster.
So the question is, what terminology is used to describe the thing that you would run in a pod? Are they projects? Processes? Application nodes? Applications?
None of the above sound right to me.
Pods contain one (or, more rarely, several) applications:
A Pod models an application-specific "logical host": it contains one or more application containers which are relatively tightly coupled.
Source
In the Kubernetes universe, Nodes are the physical or virtual machines that your cluster is running on. Pods run on the Nodes. I suggest to avoid the term Node for applications.
Usually it would be called a "service" (although that clashes with the Kubernetes Service resource) or a "microservice".
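For illustration, a minimal sketch with the official Kubernetes Python client (the names and image are placeholders): one Pod wrapping a single application container, i.e. one microservice.

```python
# Minimal sketch: a Pod with a single application container.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="orders", labels={"app": "orders"}),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="orders",                   # the application / microservice
                image="example.com/orders:1.0",  # hypothetical image
                ports=[client.V1ContainerPort(container_port=8080)],
            )
        ]
    ),
)
core.create_namespaced_pod(namespace="default", body=pod)
```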

Kubernetes with Slurm, is this a correct setup?

I saw that some people run Kubernetes coexisting with Slurm. I was just curious why you would need Kubernetes with Slurm, and what the main difference between Kubernetes and Slurm is.
Slurm is an open-source job scheduling system for large and small Linux clusters. It is mainly used as a workload manager/job scheduler, mostly in HPC (High Performance Computing) and sometimes in Big Data.
Kubernetes is an orchestration system for Docker containers that uses the concepts of "labels" and "pods" to group containers into logical units. It was mainly created to run microservices and, AFAIK, Kubernetes currently does not support Slurm.
Slurm as a job scheduler has more scheduling options than Kubernetes, but K8s is a container orchestration system, not only a job scheduler. For example, Kubernetes supports array-style jobs, while Slurm supports both parallel and array jobs. If you want to dive into scheduling, check this article.
As I mentioned before, Kubernetes is more focused on container orchestration and Slurm is focused on job/workload scheduling.
The only thing that comes to my mind is that someone needed a heavily customized cluster using WLM-Operator + K8s + Slurm + Singularity to execute HPC/Big Data jobs.
The Slurm Workload Manager is used by many of the world's supercomputers to optimize the locality of task assignments on parallel computers.
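For reference, the closest Kubernetes analog to a Slurm array job is an Indexed Job (Kubernetes 1.21 or newer). A minimal sketch with the official Python client, with a placeholder image and command; each Pod receives its index in the JOB_COMPLETION_INDEX environment variable.

```python
# Minimal sketch: an Indexed Job as a rough analog of a Slurm array job.
from kubernetes import client, config

config.load_kube_config()
batch = client.BatchV1Api()

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="array-job"),
    spec=client.V1JobSpec(
        completions=10,             # 10 tasks, like a 10-element array job
        parallelism=4,              # at most 4 running at once
        completion_mode="Indexed",  # gives each Pod a stable index 0..9
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="task",
                        image="busybox:1.36",
                        command=["sh", "-c", "echo processing chunk $JOB_COMPLETION_INDEX"],
                    )
                ],
            )
        ),
    ),
)
batch.create_namespaced_job(namespace="default", body=job)
```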

Kubernetes - Single Cluster or Multiple Clusters

I'm migrating a number of applications from AWS ECS to Azure AKS and being the first production deployment for me in Kubernetes I'd like to ensure that it's set up correctly from the off.
The applications being moved all use resources at varying degrees with some being more memory intensive and others being more CPU intensive, and all running at different scales.
After some research, I'm not sure which would be the best approach: running a single large cluster with each application in its own namespace, or running one cluster per application with Federation.
I should note that I'll need to monitor resource usage per application for cost management (amongst other things), and communication is needed between most of the applications.
I'm able to set up both layouts and I'm sure both would work, but I'm not sure of the pros and cons of each approach, whether I should be avoiding one altogether, or whether I should be considering other options?
Because you are at the beginning of your Kubernetes journey, I would go with separate clusters for each stage you have (or at least separate dev and prod). You can very easily take a cluster down (I did it several times through resource starvation). Also, if network policies are not set correctly, you might find that services from different stages/namespaces (like test and sandbox) communicate with each other, or that pipelines that should deploy to dev change something in another namespace.
Why risk production being affected by dev work?
Even if you don't have to upgrade the control plane yourself, AKS still has its versions and flags, and it is better to test them on a separate cluster before moving to production.
So my initial decision would be to set some hard boundaries: different clusters. Later, once you get more knowledge of AKS and Kubernetes, you can revisit that decision.
As you said that communication is needed among the applications, I suggest you go with one cluster. Application isolation can be achieved by deploying each application in a separate namespace. You can collect metrics at the namespace level and set resource quotas at the namespace level; that way you can take action at the application level.
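As an illustration of the namespace-level quota idea, here is a minimal sketch using the official Kubernetes Python client; the namespace name and the limits are just examples, and the namespace is assumed to already exist.

```python
# Minimal sketch: cap the CPU/memory that one application's namespace may use.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

quota = client.V1ResourceQuota(
    metadata=client.V1ObjectMeta(name="app-a-quota"),
    spec=client.V1ResourceQuotaSpec(
        hard={
            "requests.cpu": "4",
            "requests.memory": "8Gi",
            "limits.cpu": "8",
            "limits.memory": "16Gi",
        }
    ),
)
# "app-a" is a hypothetical, pre-existing namespace holding one application.
core.create_namespaced_resource_quota(namespace="app-a", body=quota)
```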
A single cluster (with namespaces and RBAC) is easier to set up and manage. A single k8s cluster does support high load.
If you really want multiple clusters, you could also try Istio multi-cluster (the Istio service mesh spanning multiple clusters).
It depends... Be aware that AKS still doesn't support multiple node pools (it is on the short-term roadmap), so you'll need to run those workloads on a single pool VM type. Also, when thinking about multiple clusters, think about multi-tenancy requirements and the blast radius of a single cluster. I typically see users deploying multiple clusters even though there is some management overhead, but good SCM and configuration management practices can help with this overhead.

Kubernetes Architecture - Kubernetes Cluster Management and initializing Nodes [closed]

I am trying to change my deployment scenario from Docker to Kubernetes. I have explored the architecture of Kubernetes: clusters, nodes, pods, services, ReplicaSets/controllers, kubernetes-cni, kubectl, etc. Now I need to start deploying into a Kubernetes cluster. While exploring, I found documentation and discussions about creating a single node and master on the same machine, or alternatively in VMs. I also found the Kubespray and Minikube documentation for cluster creation.
Here are my points of confusion about getting hands-on with Kubernetes.
For creating and working with Kubernetes, why is there a variation like running a single node and the master on the same machine or in VMs? Why is there this deviation in how a cluster is built?
How can I decide whether to run a single node and the master on the same machine, or to use VMs for different nodes?
How do Minikube and Kubespray provide different methodologies for the Kubernetes architecture, given that Kubernetes is the product of one single source, Google?
If I install kubeadm, kubernetes-cni and kubelet on my Ubuntu 16.04 machine, can I initialize nodes on that same machine?
How can I clear up these confusions?
The taxonomy of concepts and terms is very complicated, and the documentation is still pretty sparse.
1. For creating and working with Kubernetes, why is there a variation like running a single node and the master on the same machine or in VMs? Why is there this deviation in how a cluster is built?
The deviation exists to support many distinct use cases: container workload developers working on their laptops, who need what amounts to a fake cluster without a lot of operational ceremony; Kubernetes ops folks learning and testing on small but real clusters; and real production workloads for installations of varying size.
For the first case, container workload development, there is a piece of software called Minikube. It is like a distribution of Kubernetes that automates creating a single virtual machine (using VirtualBox or other desktop-class virtual machine tooling) preconfigured to run a combined Kubernetes master and node, sufficient to run real Kubernetes workloads, but on a laptop.
For the second case, for non-production purposes, the master and worker functions can be run on a single machine, or a single master machine can be used with a small number of worker machines.
A production Kubernetes cluster will usually have 3, 5 or 7 master machines (VMs or bare metal). Multiple masters are needed to maintain quorum for etcd (where Kubernetes stores all runtime state) in the case of machine failures. 3 master machines allow 1 master machine to fail without disrupting the cluster; 5 masters will tolerate 2 master machine failures, and so on.
This number of masters can support a large number of worker machines (dozens to hundreds) running the container workloads. In a production environment, one would not want to run client workloads on the master machines.
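The arithmetic behind those numbers is just the etcd majority rule; a small sketch:

```python
def etcd_fault_tolerance(masters: int) -> int:
    """Number of master (etcd member) failures a cluster can survive.

    etcd needs a majority (quorum) of members to keep accepting writes,
    so quorum = floor(n/2) + 1 and the tolerable failures are n - quorum.
    """
    quorum = masters // 2 + 1
    return masters - quorum

for n in (1, 3, 5, 7):
    print(f"{n} masters -> quorum {n // 2 + 1}, tolerates {etcd_fault_tolerance(n)} failure(s)")
# 1 masters -> quorum 1, tolerates 0 failure(s)
# 3 masters -> quorum 2, tolerates 1 failure(s)
# 5 masters -> quorum 3, tolerates 2 failure(s)
# 7 masters -> quorum 4, tolerates 3 failure(s)
```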
2. How can I decide whether to run a single node and the master on the same machine, or to use VMs for different nodes?
See above: for development, use Minikube. For production, plan to use multiple redundant masters if you are running the cluster yourself, or use a cloud provider's managed Kubernetes offering.
3. How do Minikube and Kubespray provide different methodologies for the Kubernetes architecture?
Minikube is for development only. Kubespray is one of many tools that provides some automation help when building a production cluster. Kubespray's distinguishing feature is the use of Ansible for machine setup and automation. This may or may not be desirable, depending on your comfort and interest in Ansible and/or its competitors.
4. Why are there so many options when Kubernetes is the product of a single source, Google?
Kubernetes certainly originated in Google, but now there are hundreds or more engineers across many companies, including Microsoft, Amazon, RedHat, Oracle, and tons of tiny companies, actively working on it. It is a remarkable project.
5. If I install kubeadm, kubernetes-cni and kubelet on my Ubuntu 16.04 machine, can I initialize nodes on that same machine?
Kubeadm is a setup tool, not a production runtime tool, but yes, you can run containers on the same machine as the pieces needed for a Kubernetes master. In addition to etcd, the kubelet, the API server and the controller manager, you need to run Docker as well: the kubelet talks to Docker to schedule containers. I would advise NOT running anything else on this machine, since improper configuration can cause problems with the machine serving as master/worker, and any other work would be lost.
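One practical detail worth noting: kubeadm taints the master node so that ordinary Pods are not scheduled there by default. A hedged sketch with the official Python client of a Pod that tolerates that taint; the taint key was node-role.kubernetes.io/master on kubeadm versions of the Ubuntu 16.04 era, while newer versions use node-role.kubernetes.io/control-plane, and the image is a placeholder.

```python
# Hedged sketch: a Pod that tolerates the master taint on a single-machine setup.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="test-workload"),
    spec=client.V1PodSpec(
        tolerations=[
            client.V1Toleration(
                key="node-role.kubernetes.io/master",  # control-plane on newer versions
                operator="Exists",
                effect="NoSchedule",
            )
        ],
        containers=[client.V1Container(name="app", image="nginx:1.25")],
    ),
)
core.create_namespaced_pod(namespace="default", body=pod)
```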