Turning multiple end user machines (PCs, Macs, RPi ... etc) into one giant k8s cluster - kubernetes

I am new to kuberenetes.
is it possible to turn every end user machine (PCs, Macs, RPi ... etc) who with full consent downloaded my electron research app that should turn their machines into nodes that ultimately comprise a k8s cluster which then i can run kubeflow.org on to do ML research?
Thanks

Kubernetes relies on some container engine. Usually that's docker, there are efforts to create a common container interface for kubernetes and that's where CRI-O comes in, an abstraction that would allow any container engine to run underneath it.
That being said, containers "don't exist" they are a native abstraction in the linux kernel comprised of cgroups and namespaces and what that means is that the abstraction and isolation doesn't live in the hypervisor (which usually talks to the kernel) as is the case with regular virtual machines, but rather in the actual linux kernel.
MacOS uses its own kernel which, to the extent of my knowledge, doesn't support any sort of containers.
Windows does support containers via Hyper-V and i believe that windows server has a more native built-in support for them. See this link for a better explanation https://learn.microsoft.com/en-us/virtualization/windowscontainers/about/ and also for kubernetes https://kubernetes.io/docs/getting-started-guides/windows/.
As far as Raspberry PI goes there is an ongoing effort that brought k8s to ARM see this link (https://github.com/luxas/kubernetes-on-arm). That being said, you need an entire cluster of raspberry pis to actually make that work, as it would require a lot of resources. One raspberry pi won't get you very far.
How to go about this?
You need linux to run kubernetes. Everywhere.
If you want to create a "giant" kubernetes cluster your best bet is to use a virtualization technology for the PC that is running windows or for the Mac and create virtual machines that you can use as kubernetes nodes.
In short, you create virtual machines where there's no Linux and install kubernetes natively where there is.
Parallels, Veertu or plain Xhyve is a good way of running virtualization on MacOs.
VmWare or VirtualBox are good virtualizations for both windows and mac.
Libvirt and virtualbox are good solutions for linux virtualisation.

Related

Using aws-virtual-gpu-device-plugin on non AWS Kubernetes Cluster

I read about aws virtual gpu device plugin, how this can be used to split one GPU into multiple smaller GPUs and run concurrent jobs on each small part. While this is wonderful for my use case, I want to know if this works only with AWS EKS service or can be used on other cloud provider K8 clusters as well? Has anyone used it on non-aws infrastructure?
Please be aware of its limitations.
Virtual GPU device plugin for Kubernetes
Blockquote
Limitations
This solution is build on top of Volta Multi-Process Service(MPS). You can only use it on instances types with Tesla-V100 or newer. (Only Amazon EC2 P3 Instances and Amazon EC2 G4 Instances now)
Virtual GPU device plugin by default set GPU compute mode to EXCLUSIVE_PROCESS which means GPU is assigned to MPS process, individual process threads can submit work to GPU concurrently via MPS server. This GPU can not be used for other purpose.
Virtual GPU device plugin only on single physical GPU instance like P3.2xlarge if you request k8s.amazonaws.com/vgpu more than 1 in the workloads.
Virtual GPU device plugin can not work with Nvidia device plugin together. You can label nodes and use selector to install Virtual GPU device plugin.

How are operating system containers different from virtual machines?

Everywhere I can see is how Docker can be different from virtual machine but nowhere there is a answer on how basic OS containers are different from virtual machine.
If we consider the basics, it looks like both are same i.e. an operating system is running within a operating system.
Would anybody explain the underlying difference?
Virtual machines
Virtual machines use hardware virtualization. There is an additional layer between the original hardware and the virtual one, that the virtual machine thinks it's real.
This model doesn't reutilize anything from the host's OS. This way, you can run a Windows VM on a Linux host and vice-versa.
System Containers
Systems containers use operating-system-level virtualization. It reutilizes the host kernel from the host OS, and subdivide the real hardware directly to the containers. There isn't an additional layer to access the real hardware and, for this reason, the overhead (or loss of performance) is practically zero.
On the other hand, you can't run a Windows container inside a Linux host OS, since the kernel isn't the same.

OpenStack API Implementations

I have spent the last 6 hours reading through buzzword-riddled, lofty, high-level documents/blogs/articles/slideshares, trying to wrap my head around what OpenStack is, exactly. I understand that:
OpenStack is a free and open-source cloud computing software platform. Users primarily deploy it as an infrastructure as a service (IaaS) solution.
But again, that's a very lofty, high-level, gloss-over-the-details summary that doesn't really have meaning to me as an engineer.
I think I get the basic concept, but would like to bounce my understanding off of SO, and additionally I am having a tough time seeing the "forest through the trees" on the subject of OpenStack's componentry.
My understanding is that OpenStack:
Installs as an executable application on 1+ virtual machines (guest VMs); and
Somehow, all instances of your OpenStack cluster know about each other (that is, all instances running on all VMs you just installed them on) and form a collective pool of resources; and
Each OpenStack instance (again, running inside its own VM) houses the dashboard app ("Horizon") as well as 10 or so other components/modules (Nova, Cinder, Glance, etc.); and
Nova, is the OpenStack component/module that CRUDs VMs/nodes for your tenants, is somehow capable of turning the guest VM that it is running inside of into its own hypervisor, and spin up 1+ VMs inside of it (hence you have a VM inside of a VM) for any particular tenant
So please, if anything I have stated about OpenStack so far is incorrect, please begin by correcting me!
Assuming I am more or less correct, my understanding of the various OpenStack components is that they are really just APIs and require the open source community to provide concrete implementations:
Nova (VM manager)
Keystone (auth provider)
Neutron (networking manager)
Cinder (block storage manager)
etc...
Above, I believe all components are APIs. But these APIs have to have implementations that make sense for the OpenStack deployer/maintainer. So I would imagine that there are, say, multiple Neutron API providers, multipe Nova API providers, etc. However, after reviewing all of the official documentation this morning, I can find no such providers for these APIs. This leaves a sick feeling in my stomach like I am fundamentally mis-understanding OpenStack's componentry. Can someone help connect the dots for me?
Not quite.
Installs as an executable application on 1+ virtual machines (guest VMs); and
OpenStack isn't a single executable, there are many different modules, some required and some optional. You can install OpenStack on a VM (see DevStack, a distro that is friendly to VMs) but that is not the intended usage for production, you would only do that for testing or evaluation purposes.
When you are doing it for real, you install OpenStack on a cluster of physical machines. The OpenStack Install Guide recommends the following minimal structure for your cloud:
A controller node, running the core services
A network node, running the networking service
One or more compute nodes, where instances are created
Zero or more object and/or block storage nodes
But note that this is a minimal structure. For a more robust install you would have more than one controller and network nodes.
Somehow, all instances of your OpenStack cluster know about each other (that is, all instances running on all VMs you just installed them on) and form a collective pool of resources;
The OpenStack nodes (be them VMs or physical machines, it does not make a difference at this point) talk among themselves. Through configuration they all know how to reach the others.
Each OpenStack instance (again, running inside its own VM) houses the dashboard app ("Horizon") as well as 10 or so other components/modules (Nova, Cinder, Glance, etc.); and
No. In OpenStack jargon, the term "instance" is associated with the virtual machines that are created in the compute nodes. Here you meant "controller node", which does include the core services and the dashboard. And once again, these do not necessarily run on VMs.
Nova, is the OpenStack component/module that CRUDs VMs/nodes for your tenants, is somehow capable of turning the guest VM that it is running inside of into its own hypervisor, and spin up 1+ VMs inside of it (hence you have a VM inside of a VM) for any particular tenant
I think this is easier to understand if you forget about the "guest VM". In a production environment OpenStack would be installed on physical machines. The compute nodes are beefy machines that can host many VMs. The nova-compute service runs on these nodes and interfaces to a hypervisor, such as KVM, to allocate virtual machines, which OpenStack calls "instances".
If your compute nodes are hosted on VMs instead of on physical machines things work pretty much in the same way. In this setup typically the hypervisor is QEMU, which can be installed in a VM, and then can create VMs inside the VM just fine, though there is a big performance hit when compared to running the compute nodes on physical hardware.
Assuming I am more or less correct, my understanding of the various OpenStack components is that they are really just APIs
No. These services expose themselves as APIs, but that is not all they are. The APIs are also implemented.
and require the open source community to provide concrete implementations
Most services need to interface with an external service. Nova needs to talk to a hypervisor, neutron to interfaces, bridges, gateways, etc., cinder and swift to storage providers, and so on. This is really a small part of what an OpenStack service does, there is a lot more built on top that is independent of the low level external service. The OpenStack services include the support for the most common external services, and of course anybody who is interested can implement more of these.
Above, I believe all components are APIs. But these APIs have to have implementations that make sense for the OpenStack deployer/maintainer. So I would imagine that there are, say, multiple Neutron API providers, multipe Nova API providers, etc.
No. There is one Nova API implementation, and one Neutron API implementation. Based on configuration you tell each of these services how to interface with lower level services such as the hypervisor the networking stack, etc. And as I said above, support for a range of these is already implemented, so if you are using with ordinary x86 hardware for your nodes, then you should be fine.

Should I use VirtualBox on a production server?

I just completed my vagrant box for a product that made by my company.
I needed that because we're running same product on different
operating systems. I want to serve sites inside virtual machines, I
have questions:
Am I on correct way? Can a virtual machine used as production
server?
If you say yes:
How should I keep virtualbox running? Are there any script or sth
to restart if something crashes?
What happens if somebody accidentally gives "vagrant destroy"
command? What should I do if I don't want to lose my database and user
uploaded files?
We have some import scripts that running every beginning of the
month. sometimes they're using 7gb ram (running 1500 lines of mysql
code with lots of asynchronised instances). Can it be dangerous to run
inside VirtualBox?
Are there any case study blog post about this?
Vagrant is mainly for Development environment. I personally recommend using Type 1 hypervisor (Bare metal), VirtualBox is a desktop virtulization tool (Type 2, running on top of a traditional OS), not recommended for production.
AWS is ok, the VMs are running as Xen guest, Xen is on bare metal;-)
I wouldn't.
The w/ Vagrant + Virtualbox is that these are development instances. I would look at Amazon Web Services for actually deploying your project into the wild.

Bare Metal/Native Hypervisors?

there are two types of hypervisors. As in http://www.tricerat.com/topics/hypervisor. I would like to know some examples for Bare Metal/Native Hypervisors ?
examples for Embedded/Host Hypervisors = vmware , xen , virtual pc
am i correct ?
I'm not sure, but I think Hyper-V is an example of what you're looking for.
Bare metal virtualization can be done using RedHat Vitalization Hypervisor 3.2 or 3.3 latest and open source is oVirt.
They do not require OS to run hypevisor on host and manager usually runs on another machine or guest to save host resources where hypervisor is installed.
Yes Xen and Vmware ESXi are also hypervisors used for bare metal virtualization.