What's the benefit of not using ImagePullPolicy set to Always? - kubernetes

In the Kubernetes documentation it mentions that the caching semantic of using ImagePullPolicy: Always makes the ImagePullPolicy quite efficient. What reasons would I want to choose a different ImagePullPolicy?

It heavily depends on your versioning/tagging strategy.
When new replicas of your application are created (because your app has scaled up, or a pod has died and it's been replaced by a new pod), if you use ImagePullPolicy: Always and you have pushed a different version of your application using the same tag (like people do when using latest), the new created replicas might run a totally different version of the application than the rest of the replicas, leading to an inconsistent behavior.
You also may want to use a different value than Always when on a development environment like minikube.

There isn't much of a disadvantage to ImagePullPolicy: Always, but having the control means that:
if you have an underlying image provider that doesn't provide caching (ie you're not using Docker), you have control to make sure that it's not called every time the kubelet wants the image
even with Docker caching, it's still going to be slightly faster to not attempt to pull the image every time. If you know that you never re-use a tag (which is recommended) or you're explicitly specifying the image digest then there's no benefit to ImagePullPolicy: Always
if you have an installation where you're pulling the image onto the node using a separate mechanism, you may want it to not attempt an automatic fetch if something goes wrong and the image doesn't exist.
Note that in fiunchinho's answer it mentions that you can use it to keep various replicas of an application in sync. This would be dangerous as images are pulled per node, so you could end up with different versions of the application running on different nodes.

Related

In a Kubernetes Deployment when should I use deployment strategy Recreate

Kubernetes provides us two deployment strategies. One is Rolling Update and another one is Recreate. I should use Rolling Update when I don't want go off air. But when should I be using Recreate?
There are basically two reasons why one would want/need to use Recreate:
Resource issues. Some clusters simply do not have enough resources to be able to schedule additional Pods, which then results in them being stuck and the update procedure with it. This happens especially for local development clusters and/or applications that consume large amout resources.
Bad applications. Some applications (especially legacy or monolithic setups) simply cannot handle it when new Pods - that do the exact same thing as they do - spin up. There are too many reasons as to why this may happen to cover all of them here but essentially it means that an application is not suitable for scaling.
+1 to F1ko's answer, however let me also add a few more details and some real world examples to what was already said.
In a perfect world, where every application could be easily updated with no downtime we would be fully satisfied having only Rolling Update strategy.
But as the world isn't a perfect place and all things don't go always so smoothly as we could wish, in certain situations there is also a need for using Recreate strategy.
Suppose we have a stateful application, running in a cluster, where individual instances need to comunicate with each other. Imagine our aplication has recently undergone a major refactoring and this new version can't talk any more to instances running the old version. Moreover, we may not even want them to be able to form a cluster together as we can expect that it may cause some unpredicteble mess and in consequence neither old instances nor new ones will work properly when they become available at the same time. So sometimes it's in our best interest to be able to first shutdown every old replica and only once we make sure none of them is runnig, spawn a replica that runs a new version.
It can be the case when there is a major migration, let's say a major change in database structure etc. and we want to make sure that no pod, running the old version of our app is able to write any new data to the db during the migration.
So I would say, that in majority of cases it is very application-specific, individual scenario involving major migrations, legacy applications etc which would require accepting a certain downtime and Recreate all the pods at once, rather then updating them one-by-one like in Rolling Update strategy.
Another example which comes to my mind. Let's say you have an extremely old version of Mongodb running in a replicaset consisting of 3 members and you need to migrate it to a modern, currently supported version. As far as I remember, individual members of the replicaset can form a cluster only if there is 1 major version difference between them. So, if the difference is of 2 or more major versions, old and new instances won't be able to keep running in the same cluster anyway. Imagine that you have enough resources to run only 4 replicas at the same time. So rolling update won't help you much in such case. To have a quorum, so that the master can be elected, you need at least 2 members out of 3 available. If the new one won't be able to form a cluster with the old replicas, it's much better to schedule a maintanance window, shut down all the old replicas and have enough resources to start 3 replicas with a new version once the old ones are removed.

Kubernetes API custom image metadata

I try to use the Kubernetes API to read metadata via annotations from container images. The metadata is applicable to every instance of the respecting image and is needed in order to run any resulting container properly. Following this SO question it is not possible to read Docker image labels from the kubernetes API directly.
My next thought was to use custom annotations added to the image manifest, although this seems to be a pretty hacky solution for such a "simple" task. Anyway if I add the annotations to the manifest using docker, I see no way to read them from the Kubernetes API.
I think I am on the completely wrong track here. This seems to be a rather simple task which other people likely have implemented already...anyway I cannot find any further information regarding this. Is it really that hard to read image metadata via kubernetes before deploying a container of that image?
Thanks in advance for any help!
Edit:
The reason I am asking is because I want to grant the containers of specific images access to specific serial USB devices (e.g. FTDI232) on diverse host systems. Since I have no idea which path (e.g. /dev/ttyUSB0) will be assigned to the USB devices, I wrote a program that is monitoring USB devices and, in case an appropriate device is plugged in or gets plugged in, creates the container and passes it the corresponding path. From inside the container I want to access the serial device via a static, non-changing path (e.g. /dev/FTDI232)
Yes. The K8s API is limited when it comes to this, I believe the abstractions for container image metadata are at lower level and probably left out for a reason. You can always look at the CRI spec to see what's supported (note that the doc is out of date so you might have to look at the code).
If the end goal is to use Kubernetes to run your workloads it sounds like the more feasible route here is just to write a script that reads that image manifest outside Kubernetes and create the manifest files that you need to run your workloads after (based on that metadata) and then finally apply it to your cluster.
If you are using a common container image registry you could also write something that pulls the images from that registry to just pick metadata and metadata changes.

if an application needs several containers running on the same host, why not just make a single container with everything you need?

I started working on kubernetes. I already worked with single container pods. Now i want to working on multiple container pod. i read the statement like
if an application needs several containers running on the same host, why not just make a single container with everything you need?
means two containers with single IP address. My dought is, in which cases two or more containers uses same host?
Could you please anybody explain me above scenario with an example?
This is called "multiple processes per container".
https://docs.docker.com/config/containers/multi-service_container/
t's discussed on the internet many times and it has many gotchas. Basically there's not a lot of benefit of doing it.
Ideally you want container to host 1 process and its threads/subprocesses.
So if your database process is in a crash loop, let it crash and let docker restart it. This should not impact your web container.
Also putting processes in separate containers lets you set separate memory/CPU limits so that you can set different limits for your web container and database container.
That's why Kubernetes exposes POD concept which lets you run multiple containers in the same namespace. Read this page fully: https://kubernetes.io/docs/concepts/workloads/pods/pod/
Typically it is recommended to run one process in a docker container. Although it is very much possible to run multiple processes in a container it is discouraged more on the basis of application architecture and DevOps perspective rather than technical reasons.
Following discussion gives a better insight into this debate: https://devops.stackexchange.com/questions/447/why-it-is-recommended-to-run-only-one-process-in-a-container
When one runs multiple processes from different packages/tools in the same docker it could run into dependency and upgradability issues that docker meant to solve in the first place. Nevertheless, for many applications, it makes much sense to scale and manage application in blocks rather than individual components, sometimes it makes life bit easy. So PODs are basically a balance between the two. It gives the isolation for each process for better service manageability and yet groups them together so that the application as a whole can be better managed.
I would say segregating responsibilities can be a boon. Maintenance is whole lot easier this way. A container can have a single entrypoint and a pod's health is checked via that main process in the entry point. If your front end (say Java over Tomcat ) application is working fine but database is not working (although one must never use production database in container), you will not get the feedback that u might get when they are different containers.
Also, one of the best part of docker images are they are available as different modules and maintenance is super easy that way.

Why should I store kubernetes deployment configuration into source control if kubernetes already keeps track of it?

One of the documented best practices for Kubernetes is to store the configuration in version control. It is mentioned in the official best practices and also summed up in this Stack Overflow question. The reason is that this is supposed to speed-up rollbacks if necessary.
My question is, why do we need to store this configuration if this is already stored by Kubernetes and there are ways with which we can easily go back to a previous version of the configuration using for example kubectl? An example is a command like:
kubectl rollout history deployment/nginx-deployment
Isn't storing the configuration an unnecessary duplication of a piece of information that we will then have to keep synchronized?
The reason I am asking this is that we are building a configuration service on top of Kubernetes. The user will interact with it to configure multiple deployments, I was wondering if we should keep a history of the Kubernetes configuration and the content of configMaps in a database for possible roll backs or if we should just rely on kubernetes to retrieve the current configuration and rolling back to previous versions of the configuration.
You can use Kubernetes as your store of configuration, to your point, it's just that you probably shouldn't want to. By storing configuration as code, you get several benefits:
Configuration changes get regular code reviews.
They get versioned, are diffable, etc.
They can be tested, linted, and whatever else you desired.
They can be refactored, share code, and be documented.
And all this happens before actually being pushed to Kubernetes.
That may seem bad ("but then my configuration is out of date!"), but keep in mind that configuration is actually never in date - just because you told Kubernetes you want 3 replicas running doesn't mean there are, or if there were that 1 isn't temporarily down right now, and so on.
Configuration expresses intent. It takes a different process to actually notice when your intent changes or doesn't match reality, and make it so. For Kubernetes, that storage is etcd and it's up to the master to, in a loop forever, ensure the stored intent matches reality. For you, the storage is source control and whatever process you want, automated or not, can, in a loop forever, ensure your code eventually becomes reflected in Kubernetes.
The rollback command, then, is just a very fast shortcut to "please do this right now!". It's for when your configuration intent was wrong and you don't have time to fix it. As soon as you roll back, you should chase your configuration and update it there as well. In a sense, this is indeed duplication, but it's a rare event compared to the normal flow, and the overall benefits outweigh this downside.
Kubernetes cluster doesn't store your configuration it runs it, as you server runs your application code.

How to organize pods with non-replicatable containers in kubernetes?

I'm trying to get my head around kubernetes.
I understand that pods are a great way to organize containers that are related. I understand that replication controllers are a great way to ensure they are up and running.
However, I do not get how to do it in real life.
Given a webapp with, say a rails app on unicorn, behind nginx with a postgres database.
The nginx and rails app can autoscale horizontally (if they are shared nothing), but postgres can't out of the box.
Does that mean I can't organize the postgres database within the same pod as nginx and rails, when I want to have two servers behind a loadbalancer? Does postgres need an own replication controller and is simply a service within the cluster?
The general question about that is: In common webscenarios, what kind of containers are going into one pod? I know that this can't be answered generally, so the ideas behind it are interesting.
The answer to this really depends on how far down the rabbithole you want to go. You are really describing 3 independent parts of your app - ngingx, rails, postgres. Each of those parts has different needs when it comes to monitoring, scaling, and updating.
You CAN put all 3 into a single pod. That replicates the experience of a VM, but with some advantages for manageability, deployment etc. But you're already calling out one of the major disadvantages - you want to scale (for example) the rails app but not the postgres instance. It's time to decompose.
What if you instead made 2 pods - one for rails+nginx and one for postgres. Now you can scale your frontend without messing up your database deployment. You might go even further and split your rails app and nginx into distinct pods, if that makes sense. Or split your rails app into 5 smaller apps.
This is the whole idea behind microservices (which is a very hyped word, I know). Decompose the problem into smaller and more manageable chunks. This is WHY kubernetes exists - to help you manage the resulting ocean of microservices.
To answer your last question - there is no single recipe. It's all about how your application is structured and what makes sense FOR YOU. Maybe you want to decompose along team boundaries, or along departments in your company, or along admin roles. The questions to ask yourself are things like "if I update or scale this pod, is there any part of it that I don't want updated/sclaed at the same time?"
In some sense it is a data normalization problem. A pod should be a model of one concept or owner or scope. I hope that helps a little.
You should put containers into the same pod when you want to deploy and update them at the same time or if they need to share local state (disk, network, etc). In some edge cases, you may also want to co-locate them for performance reasons.
In your scenario, since nginx and the rails app can scale horizontally, they should be in their own pods so that you can provision the right number of replicas for each tier of your application (unless you would always scale them at the same time). The postgres database would be in a separate pod, accessed via a service.
This allows you to update to a newer version of nginx without changing anything else about your service. Same for the rails app. And they could each scale independently.