Running startup-scripts on Kubernetes Clusters

How do I run a startup script on a Kubernetes cluster? I saw that there are ways to run startup scripts on compute instances, but I couldn't find any documentation on doing the same for Kubernetes clusters.

You can use init containers to execute a script before the actual application container starts.
Init containers are specialized containers that run before app containers in a Pod. Init containers can contain utilities or setup scripts not present in an app image.
Because init containers have separate images from app containers, they have some advantages for start-up related code:
Init containers can contain utilities or custom code for setup that are not present in an app image. For example, there is no need to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
The application image builder and deployer roles can work independently without the need to jointly build a single app image.
Init containers can run with a different view of the filesystem than app containers in the same Pod. Consequently, they can be given access to Secrets that app containers cannot access.
Because init containers run to completion before any app containers start, init containers offer a mechanism to block or delay app container startup until a set of preconditions are met. Once preconditions are met, all of the app containers in a Pod can start in parallel.
Init containers can securely run utilities or custom code that would otherwise make an app container image less secure. By keeping unnecessary tools separate you can limit the attack surface of your app container image.
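For instance, here is a minimal sketch of a Pod that runs a startup script in an init container before the app container starts; the image names and the script's contents are placeholders, not anything from the question:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-startup-script
spec:
  initContainers:
    # Runs to completion before the app container is started.
    - name: startup-script
      image: busybox:1.36              # any small image with a shell works
      command:
        - sh
        - -c
        - |
          echo "running startup tasks..."
          # put your setup commands here, e.g. preparing shared files
          mkdir -p /work-dir/ready
      volumeMounts:
        - name: work-dir
          mountPath: /work-dir
  containers:
    - name: app
      image: my-app:latest             # placeholder application image
      volumeMounts:
        - name: work-dir
          mountPath: /work-dir
  volumes:
    - name: work-dir
      emptyDir: {}
```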

Related

Running other non-cluster containers on k8s node

I have a k8s cluster that runs the main workload and has a lot of nodes.
I also have a node (I call it the special node) that is NOT part of the cluster and that runs some special containers. The node has access to some resources that are required by those special containers.
I want to be able to manage containers on the special node along with the cluster, and make it possible to access them from inside the cluster. So the idea is to add the node to the cluster as a worker node, taint it to prevent normal workloads from being scheduled on it, and add tolerations to the pods running the special containers.
The idea looks fine, but there may be a problem. There will be some other containers and non-container daemons and services running on the special node that are not managed by the cluster (they belong to other activities that have to be kept separate from the cluster). I'm not sure whether that will be a problem, but I have not seen non-cluster containers running alongside pod containers on a worker node before, and I could not find a similar question on the web.
So please enlighten me: is it OK to have non-cluster containers and other daemon services on a worker node? Does it require some caution, or am I just worrying too much?
Ahmad, from the above description I understand that you are deploying a Kubernetes cluster using kubeadm, minikube, or a similar solution, and that one of your servers has some special capability such as a GPU. For scheduling your special pods onto that node you can use a node selector, which I hope you are already doing.
Coming to running a separate container runtime on one of these nodes, you mainly need to consider two points:
This can be done, and if you do not integrate the extra container runtime with Kubernetes, it is simply one more piece of software running on your server. Say you used kubeadm on all the nodes and you want to run plain docker containers on one of them: those containers stay separate, provided you have drafted a proper architecture and configured a separate, isolated virtual network accordingly.
Then comes the storage part: you need to create separate storage volumes for Kubernetes and for the other container runtime, because if either piece of software fails or gets corrupted it should not affect the other, and the separation also provides isolation.
If you maintain proper isolation from storage to network, you can run Kubernetes and a separate container runtime side by side; however, it is not a recommended setup for production environments.
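As a sketch of the taint-and-toleration approach described in the question (the node name, taint key, and image are hypothetical): after tainting the node with kubectl taint nodes special-node dedicated=special:NoSchedule, the special pods would carry a matching toleration and a node selector:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: special-workload
spec:
  # Schedule only onto the special node...
  nodeSelector:
    kubernetes.io/hostname: special-node   # hypothetical node name
  # ...and tolerate the taint that keeps normal workloads away.
  tolerations:
    - key: dedicated
      operator: Equal
      value: special
      effect: NoSchedule
  containers:
    - name: special-container
      image: special-image:latest          # placeholder image
```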

Do Kubernetes pods replicas share read-only file system from the underlying image?

Let's say I deployed 2 pods to Kubernetes and they both have the same underlying image which includes some read-only file system.
By default, do the pods share this file system? Or each pod copies the file system and hence has a separate copy of it?
I would appreciate any answer and especially would love to see some documentation or resources I can read in order to delve deeper into this issue.
Thanks in advance!
In short, it depends on where the pods are running. If they are running on the same node, then yes, they share the same read-only copy of the image; if on separate nodes, they each have their own read-only copy. Keep reading if you are interested in more of the technical details.
Inside Kubernetes Pods
A pod can be viewed as a set of containers bound together. It is a construct provided by Kubernetes to be able to have certain benefits out of the box. We can understand your question better if we zoom into a single node that is part of a Kubernetes cluster.
This node will have the kubelet binary running on it, which receives certain "instructions" from the api-server about running pods. These "instructions" are passed on to the container runtime on your node via the CRI (Container Runtime Interface); let's assume the runtime is the docker-engine. The runtime is responsible for actually running the needed containers, and it reports back to the kubelet, which reports back to the api-server, ultimately informing the pod controller that the pod containers are Running.
Now, the question becomes, do multiple pods share the same image? I said the answer is yes for pods on the same node and this is how it works.
Say you run the first pod, the docker daemon running on your k8s node pulls this image from the configured registry and stores it in the local cache of the node. It then starts a container using this image. Note that a container that runs, utilizes the image as simply a read-only file-system, and depending on the storage driver configured in docker, you can have a "writeable layer" on top of this read-only filesystem that is used to allow you to read/write on the file-system of your container. This writeable layer is temporary and vanishes when you delete the container.
When you run the second pod, the daemon finds that the image is already available locally, and simply creates the small writeable layer for your container, on top of an existing image from the cache and provides this as a "writeable file system" to your container. This speeds things up.
Now, in case of docker, these read-only layers of the image (as one 'file-system') are shared across all containers running on the host. This makes sense since there is no need to copy a read-only file system and sharing it with multiple containers is safe. And each container can maintain its uniqueness by storing its data in the thin writeable layer that it has.
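Although the layer sharing itself is handled entirely by the container runtime, the Kubernetes side of this caching is visible in the pod spec; a small sketch (the deployment name and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          # IfNotPresent: reuse the image already cached on the node.
          # Replicas that land on the same node share its read-only
          # layers; each container only adds its own thin writeable layer.
          imagePullPolicy: IfNotPresent
```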
References
For further reading, you can use:
Read about storage drivers in docker. It explains how multiple containers share the r/o layer of the image.
Read details about different storage driver types to see how this "thin writeable layer" is implemented in practice by docker.
Read about container runtimes in Kubernetes to understand that docker isn't the only supported runtime. There are others but more or less, the same will hold true for them as well, as it makes sense to cache images locally and re-use the read-only image file system for multiple containers.
Read more about the kubelet component of Kubernetes to understand how it can support multiple run-times and how it helps the pod-controller setup and manage containers.
And of course, finally you can find more details about pods here. This will make a lot more sense after you've read the material above.
Hope this helps!

Multi-container dependency in pod in kubernetes

We have our application built on Kubernetes, and we have many multi-container pods.
We are facing challenges because many of our containers depend on each other for the application to run.
We first require the database container to come up, and then the application container to run.
Is there an equivalent solution to resolve this dependency, so that our database container comes up first and then our application container?
There's no feature like that in Kubernetes, because each application should be responsible for (re)connecting to its dependencies.
However, you can do a similar thing by using an initContainer, which keeps the other containers in the same pod from starting until the initContainer exits with 0.
For example, if you run a simple shell script in a busybox container that waits until it can connect to your application's dependencies, your application containers will start only after those dependencies are reachable; a sketch follows.
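A minimal sketch of that pattern, assuming a hypothetical database Service named db listening on port 5432:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  initContainers:
    # Blocks until the (hypothetical) db service accepts TCP connections.
    - name: wait-for-db
      image: busybox:1.36
      command:
        - sh
        - -c
        - |
          until nc -z db 5432; do
            echo "waiting for db..."
            sleep 2
          done
  containers:
    - name: app
      image: my-app:latest   # placeholder application image
```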

Why kubernetes does not work directly with containers

Somebody, please, explain to me (or point me to a detailed resource) why Kubernetes uses this wrapper (the pod) to work with containers. Every resource I come across just repeats the same words: "it is the smallest unit in k8s". What I am looking for is the reason for it from an engineering perspective. I do understand that it provides a namespace for storage and networking for the containers inside, but the best practice is to keep a single container in a pod anyway.
I used docker-compose a lot before familiarizing myself with k8s, and I am having a hard time understanding the need for this additional layer (wrapper) around a pretty straightforward entity, the container.
The reason for this decision is simply that a Pod may contain more than one container, doing different things.
First of all, a pod may have an init container, which is responsible for performing some startup operations to ensure that the main container(s) work properly. An init container could load some configuration and prepare it for the main application, or perform basic operations such as restoring a backup or similar things.
This lets me inject a series of operations to execute before starting the main application, without rebuilding the main application's container image.
Second, even if the majority of applications are perfectly fine with only one container per Pod, there are several situations where more than one container in the same Pod may be useful.
An example could be the main application running with a side-car container acting as a proxy in front of it, perhaps responsible for checking JWT tokens; another example could be a secondary application extracting metrics from the main application, or similar things.
Last, let me quote Kubernetes documentation (https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/)
The primary reason that Pods can have multiple containers is to support helper applications that assist a primary application. Typical examples of helper applications are data pullers, data pushers, and proxies. Helper and primary applications often need to communicate with each other. Typically this is done through a shared filesystem, as shown in this exercise, or through the loopback network interface, localhost. An example of this pattern is a web server along with a helper program that polls a Git repository for new updates.
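A minimal sketch of that shared-filesystem pattern; the images, paths, and the helper's logic are placeholders, not part of the quoted docs:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-helper
spec:
  volumes:
    - name: shared-data
      emptyDir: {}
  containers:
    # Primary application: serves whatever is in the shared volume.
    - name: web
      image: nginx:1.25
      volumeMounts:
        - name: shared-data
          mountPath: /usr/share/nginx/html
    # Helper: periodically refreshes the served content via the shared volume.
    - name: content-helper
      image: busybox:1.36
      command:
        - sh
        - -c
        - |
          while true; do
            date > /pod-data/index.html
            sleep 60
          done
      volumeMounts:
        - name: shared-data
          mountPath: /pod-data
```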
Update
Like you said, init containers or multiple containers in the same Pod are not a must; all the functionality I listed can also be obtained in other ways, such as entrypoints, or two separate Pods communicating with each other instead of two containers in the same Pod.
There are several benefits to using those features, though; let me quote the Kubernetes documentation once more (https://kubernetes.io/docs/concepts/workloads/pods/init-containers/)
Because init containers have separate images from app containers, they have some advantages for start-up related code:
Init containers can contain utilities or custom code for setup that are not present in an app image. For example, there is no need to make an image FROM another image just to use a tool like sed, awk, python, or dig during setup.
The application image builder and deployer roles can work independently without the need to jointly build a single app image.
Init containers can run with a different view of the filesystem than app containers in the same Pod. Consequently, they can be given access to Secrets that app containers cannot access.
Because init containers run to completion before any app containers start, init containers offer a mechanism to block or delay app container startup until a set of preconditions are met. Once preconditions are met, all of the app containers in a Pod can start in parallel.
Init containers can securely run utilities or custom code that would otherwise make an app container image less secure. By keeping unnecessary tools separate you can limit the attack surface of your app container image.
The same applies to multiple containers running in the same Pod: they can communicate safely with each other without exposing that communication to others in the cluster, because they keep it local.

sidecar vs init container in kubernetes

I am having trouble distinguishing between a sidecar and an init container. So far, I understand that the real app containers wait for an init container to do something. However, a sidecar could do the same thing, could it not? And vice versa: init containers don't die off, so they also run "on the side". Hence my confusion.
Thanks for the help.
Init containers are used to initialize something inside your Pod. The init containers run and exit. After every init container has exited with code 0, your main containers start.
Examples for init containers are:
Moving some files into your application containers, e.g. themes or configuration. This example is also described in the Kubernetes docs.
Kubernetes itself does not know anything about sidecars. Sidecar containers are a pattern for solving certain use cases. Usually, Kubernetes distinguishes only between init containers and the regular containers running inside your Pod.
Typically, we call sidecars all containers that do not provide a user-focused service. For example, this could be a proxy, or something that makes database access easier. If you're running a Java app, you could use a sidecar to export JVM metrics in Prometheus format.
The difference is that your sidecar containers must run all the time. If one of your non-init containers exits, Kubernetes restarts it (per the Pod's restartPolicy), whereas init containers are expected to exit.
And that's the difference.
Init containers run and exit before your main application starts
Sidecars run side-by-side with your main container(s) and provide some kind of service for them.
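To make the distinction concrete, here is a hedged sketch of a Pod using both; all names and images are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-init-and-sidecar
spec:
  initContainers:
    # Runs once and must exit 0 before the containers below start.
    - name: fetch-config
      image: busybox:1.36
      command: ["sh", "-c", "echo 'fetching config...' && sleep 1"]
  containers:
    # Main, user-facing application.
    - name: app
      image: my-app:latest        # placeholder image
    # Sidecar: runs for the whole Pod lifetime alongside the app,
    # e.g. exporting metrics in Prometheus format.
    - name: metrics-exporter
      image: my-exporter:latest   # placeholder image
```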