Kubernetes: one pod, more containers on more nodes - kubernetes

somebody could please help me to create a yaml config file for Kubernetes in order to face a situation like: one pod with 3 containers (for example) and these containers have to be deployed on 3 nodes of a cluster (Google GCE).
|P| |Cont1| ----> |Node1|
|O| ---> |Cont2| ----> |Node2| <----> GCE cluster
|D| |Cont3| ----> |Node3|
Thanks

From Kuberenets Concepts,
Pods in a Kubernetes cluster can be used in two main ways: Pods that
run a single container. The “one-container-per-Pod” model is the most
common Kubernetes use case; in this case, you can think of a Pod as a
wrapper around a single container, and Kubernetes manages the Pods
rather than the containers directly. Pods that run multiple containers
that need to work together. A Pod might encapsulate an application
composed of multiple co-located containers that are tightly coupled
and need to share resources. These co-located containers might form a
single cohesive unit of service–one container serving files from a
shared volume to the public, while a separate “sidecar” container
refreshes or updates those files. The Pod wraps these containers and
storage resources together as a single manageable entity.
In short, most likely, you should place each container in a single Pod to truly benefit from the microservices architecture vs the monolithic architecture commonly deployed in VMs. However there are some cases where you want to consider co-locating containers. Namely, as described in this article (Patterns for Composite Containers) some of the composite containers applications are:
Sidecar containers
extend and enhance the "main" container
Ambassador containers
proxy a local connection to the world
Adapter containers
standardize and normalize output
Once you define and run the Deployments, the Scheduler will be responsible to select the most suitable placement for your Pods, unless you manually assign Nodes by defining Labels in Deployment's YAML (not recommended unless you know what you're doing).

You can assign multiple containers to a single pod. You can assign pods to a specific node-pool. But I am not sure whether is it possible to assign multiple containers to multiple nodes running in side a single pod.
What you can do here is to assign each container to different pods (3 containers --> 3 pods) and then assign each pod to a different node-pool by adding this code to your deployment's .yaml file.
nodeSelector:
nodeclass: pool1

Related

Can you make a kubernetes container deployment conditional on whether a configmap variable is set?

If I have a k8s deployment file for a service with multiple containers like api and worker1, can I make it so that there is a configmap with a variable worker1_enabled, such that if my service is restarted, container worker1 only runs if worker1_enabled=true in the configmap?
The short answer is No.
According to k8s docs, Pods in a Kubernetes cluster are used in two main ways:
Pods that run a single container. The "one-container-per-Pod" model is the most common Kubernetes use case; in this case, you can think of a Pod as a wrapper around a single container; Kubernetes manages Pods rather than managing the containers directly.
Pods that run multiple containers that need to work together. A Pod can encapsulate an application composed of multiple co-located containers that are tightly coupled and need to share resources. These co-located containers form a single cohesive unit of service—for example, one container serving data stored in a shared volume to the public, while a separate sidecar container refreshes or updates those files. The Pod wraps these containers, storage resources, and an ephemeral network identity together as a single unit.
Unless your application requires it, it is better to separate the worker and api containers into their own pod. So you may have one deployment for worker and one for api.
As for deploying worker when worker1_enabled=true, that can be done with helm. You have to create a chart such that when the value of worker1_enabled=true is set, worker is deployed.
Last note, a service in kubernetes is an abstract way to expose an application running on a set of Pods as a network service.

Run different replica count for different containers within same pod

I have a pod with 2 closely related services running as containers. I am running as a StatefulSet and have set replicas as 5. So 5 pods are created with each pod having both the containers.
Now My requirement is to have the second container run only in 1 pod. I don't want it to run in 5 pods. But my first service should still run in 5 pods.
Is there a way to define this in the deployment yaml file for Kubernetes? Please help.
a "pod" is the smallest entity that is managed by kubernetes, and one pod can contain multiple containers, but you can only specify one pod per deployment/statefulset, so there is no way to accomplish what you are asking for with only one deployment/statefulset.
however, if you want to be able to scale them independently of each other, you can create two deployments/statefulsets to accomplish this. this is imo the only way to do so.
see https://kubernetes.io/docs/concepts/workloads/pods/ for more information.
Containers are like processes,
Pods are like VMs,
and Statefulsets/Deployments are like the supervisor program controlling the VM's horizontal scaling.
The only way for your scenario is to define the second container in a new deployment's pod template, and set its replicas to 1, while keeping the old statefulset with 5 replicas.
Here are some definitions from documentations (links in the references):
Containers are technologies that allow you to package and isolate applications with their entire runtime environment—all of the files necessary to run. This makes it easy to move the contained application between environments (dev, test, production, etc.) while retaining full functionality. [1]
Pods are the smallest, most basic deployable objects in Kubernetes. A Pod represents a single instance of a running process in your cluster. Pods contain one or more containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod's resources. [2]
A deployment provides declarative updates for Pods and ReplicaSets. [3]
StatefulSet is the workload API object used to manage stateful applications. Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods. [4]
Based on all that information - this is impossible to match your requirements using one deployment/Statefulset.
I advise you to try the idea #David Maze mentioned in a comment under your question:
If it's possible to have 4 of the main application container not having a matching same-pod support container, then they're not so "closely related" they need to run in the same pod. Run the second container in a separate Deployment/StatefulSet (also with a separate Service) and you can independently control the replica counts.
References:
Documentation about Containers
Documentation about Pods
Documentation about Deployments
Documentation about StatefulSet

What's the difference between pod and container from container runtime's perspective?

Kubernetes documentation describes pod as a wrapper around one or more containers. containers running inside of a pod share a set of namespaces (e.g. network) which makes me think namespaces are nested (I kind doubt that). What is the wrapper here from container runtime's perspective?
Since containers are just processes constrained by namespaces, Cgroups e.g. Perhaps, pod is just the first container launched by Kubelet and the rest of containers are started and grouped by namespaces.
The main difference is networking, the network namespace is shared by all containers in the same Pod. Optionally, the process (pid) namespace can also be shared. That means containers in the same Pod all see the same localhost network (which is otherwise hidden from everything else, like normal for localhost) and optionally can send signals to processes in other containers.
The idea is the Pods are groups of related containers, not really a wrapper per se but a set of containers that should always deploy together for whatever reason. Usually that's a primary container and then some sidecars providing support services (mesh routing, log collection, etc).
Pod is just a co-located group of container and an Kubernetes object.
Instead of deploying them separate you can do deploy a pod of containers.
Best practices is that you should not actually run multiple processes via single container and here is the place where pod idea comes to a place. So with running pods you are grouping containers together and orchestrate them as single object.
Containers in a pod runs the same Network namespace (ip address and port space) so you have to be careful no to have the same port space used by two processes.
This differs for example when it comes to filesystem, since the containers fs comes from the image fs. The file systems are isolated unless they will share one Volume.

Can Pods be thought of as Namespaces?

My understanding is since pod is defined as a group of containers which provides shared resources such as storage and network among those containers, can it be thought of as a namespace in a worker node that is to say, different pods are representing different namespaces in a worker node machine?
Or otherwise is pod actually a process which is first started (or run or executed) by the deployment and then it starts the containers inside it?
Can i see it through ps command? (I did try it, there are only docker containers running so I am ruling out pod being a process)
If we start from the basics
What is a namespace (in a generic manner)?
A namespace is a declarative region that provides a scope to the identifiers (the names of types, functions, variables, etc) inside it. Namespaces are used to organize code into logical groups and to prevent name collisions that can occur especially when your code base includes multiple libraries.
What is a Pod (in K8s)?
A pod is a group of one or more containers (such as Docker containers), with shared storage/network, and a specification for how to run the containers. A pod’s contents are always co-located and co-scheduled, and run in a shared context. A pod models an application-specific “logical host” - it contains one or more application containers which are relatively tightly coupled — in a pre-container world, being executed on the same physical or virtual machine would mean being executed on the same logical host.
While Kubernetes supports more container runtimes than just Docker, Docker is the most commonly known runtime, and it helps to describe pods in Docker terms.
The shared context of a pod is a set of Linux namespaces, cgroups, and potentially other facets of isolation - the same things that isolate a Docker container. Within a pod’s context, the individual applications may have further sub-isolations applied.
Some deep dive into Pods
What is a Namespace (in k8s terms)?
Namespaces are intended for use in environments with many users spread across multiple teams, or projects.
Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces.
Namespaces are a way to divide cluster resources between multiple users.
So I think its suffice to say:
Yes Pods have a namespace :
Pods kind of represent a namespace but on a container level (where they share the same context of networks, volumes/storage only among a set of containers)
But namespaces (in terms of K8s) are a bigger level of isolation -- on a cluster level which shared by all the containers (services, deployments, dns-names, IPs, config-maps, secrets, roles, etc).
Also you should see this link
Hope this clears a bit of fog on the issue.
Yes, you could say that a pod is a namespace that is shared by containers. When using the Docker executor, a pause container is created which establishes the network, file-system, and process namespace for subsequent containers to utilise.
This is because Docker doesn't understand pods as a first class primitive, and you won't see the pause container with an other run-time.

Why using pods and not directly containers in an OpenShift V3 environment

Kubernetes is an orchestration tool for the management of containers.
Kubernetes creates pods which are containing containers, instead of managing containers directly.
I read this about pods
I'm working with OpenShift V3 which is using pods. But in my apps, all demo's and all examples I see:
One pod contains one containers (it's possible to contain more and that could be an advantage of using pods). But in an OpenShift environment I don't see the advantage of this pods.
Can some explain me why OpenShift V3 is using kubernetes with pods and containers instead of an orchestration tool which is working with containers immediately (without pods).
There are many cases where our users want to run pods with multiple containers within OpenShift. A common use-case for running multiple containers is where a pod has a 'primary' container that does some job, and a 'side-car' container that does something like write logs to a logging agent.
The motivation for pods is twofold -- to make it easier to share resources between containers, and to enable deploying and replicating groups of containers that share resources. You can read more about them in the user-guide.
The reason we still use a Pod when only a single container is that containers do not have all the notions that are attached to pods. For example, pods have IP addresses. Containers do not -- they share the IP address associated with the pod's network namespace.
Hope that helps. Let me know if you'd like more clarification, or we can discuss on slack.