Why container restart policy is defined in Pod specification? - kubernetes

Does anyone know why restartPolicy field is defined on the Pod level instead of the container level?
It would seem that this setting is more closely related to the container, not the Pod.
Then how to controll restart policy of single container in multi-container Pod?

I think restart policy is part of the POD spec.
apiVersion: v1
kind: Pod
metadata:
name: test
spec:
containers:
- name: 1st
image: image-1
command: ["./bash", "-test1"]
- name: 2nd
image: image-2
command: ["./bash", "-test2"]
restartPolicy: Never
Restart policy gets set the at POD spec level, and get applied to all the containers in POD even if init container is there.
If there are multi containers inside the POD, we have to consider those as tightly coupled.
Official documents says something like this : link
Pods that run multiple containers that need to work together. A Pod can encapsulate an application composed of multiple co-located
containers that are tightly coupled and need to share resources. These
co-located containers form a single cohesive unit of service—for
example, one container serving data stored in a shared volume to the
public, while a separate sidecar container refreshes or updates those
files. The Pod wraps these containers, storage resources, and an
ephemeral network identity together as a single unit.
Note: Grouping multiple co-located and co-managed containers in a
single Pod is a relatively advanced use case. You should use this
pattern only in specific instances in which your containers are
tightly coupled.
If you want to restart the single container in POD you won't be able to do it, you have keep that container out of POD then by POD design.
Even if you will see the container restart policy it's talk about the : POD spec restart policy only.

Related

Is it possible to have a single deployment in Kubernetes with different scaling for differnet containers?

I have a very simple docker-compose app with nginx and php-fpm services.
I would like to create a Helm chart that would deploy this exact same thing as a unit (a POD I guess?) but I would like to be able to scale the php-fpm service.
How can I achieve that?
My idea is to have these containers in a single deployment so I don't have a lot of deployments scattered.
i.e
App1:
- Container 1 (php-fpm autoscaled or manually scaled)
- Container 2 (nginx)
- Container 3 (Redis)
- Container 4 (something else that this app needs)
App2
- Container 1
- Container 2
- Container 3 etc...
I could do different deployments but then it would be like
app1_php-fpm
app1_nginx
app1_redis
app2_container1
app2_container2
app2_container3
Instead of a single pod.
The "one-container-per-Pod" model is the most common Kubernetes use
case; in this case, you can think of a Pod as a wrapper around a
single container; Kubernetes manages Pods rather than managing the
containers directly.
Nginx, Redis should run as separate services, so that they can scale independent of your application in a decoupled manner.
Nginx can reverse proxy multiple applications so you don't need to have a separate Nginx instance for every app.
https://kubernetes.io/docs/concepts/workloads/pods/#using-pods
It's best practice to keep the single container inside the pod however we can run the multiple containers also.
Keep the Redis & Nginx containers as separate deployment and you can add the HPA on that they can scale up and down based on load.
By keeping the deployment separate would also help in doing any maintenance if required or deployment without downtime.
However, if you don't have an option to keep Nginx aside you can keep that also with fpm container as it's also a light weight container.
For example here I have one fpm and Nginx containers running inside single POD : https://github.com/harsh4870/Kubernetes-wordpress-php-fpm-nginx/blob/master/wordpress-deployment.yaml
if possible divide all the applications as a single deployment so that it will be very easy to manage the deployments and maintenance and scaling also.

Is a Kubernetes StatefulSet the right answer if all we need is a consistent identifier for each Pod in a group of identical Pods?

We are looking at a Kubernetes scenario that requires us to maintain N pods for a given Deployment (let's assume for simplicitly that N is static and N = 3). Currently we are using a Deployment and a ReplicaSet for this.
Within each pod, is there any way (through environment variable injection or similar) for us to get a unique identifier that shows which pod that pod is (i.e. "1", "2", "3" or similar... the exact format is unimportant).
What is especially important (because of the system these pods connect to) is that if pod "2" dies, the replacement pod also reports its identifier as "2", not as something new, e.g. "4"... in other words, the set of identifiers does not change over time unless the size of the set is increased / decreased. Currently we are using the pod name, but that is not stable in this way; the pod name is new and unique every time.
Is this what a StatefulSet is for? The documentation seems to focus in particular on storage volumes, but this is not a priority for us. How would we actually obtain the unique and stable ID inside the container in code?
Yes, Statefulset is the way to go if the pods need to have their identity defined in some way.
Here is the quote from a relevant section from the docs:
Like a Deployment, a StatefulSet manages Pods that are based on an
identical container spec. Unlike a Deployment, a StatefulSet maintains
a sticky identity for each of their Pods. These pods are created from
the same spec, but are not interchangeable: each has a persistent
identifier that it maintains across any rescheduling.
So, if you have a Statefulset object named myapp with 3 replicas then the pods will be named as myapp-0, myapp-1 and myapp-2.
Further, if any of the pods die say myapp-1 then the new pod created as the replacement of that will again be myapp-1.
You can expose the pod name to the containers via environment variables through Downward API and use it inside the scripts:
env:
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
A relevant note that I missed to mention is that with Statefulsets the pods are brought up one by one unlike Deployments. So, for the above example myapp, myapp-1 will only get started after myapp-0 is ready.

Does ReplicaSets replace Pods?

I have a conceptional question, does ReplicaSets use Pod settings?
Before i applied my ReplicaSets i deleted my Pods, so there is no information about my old Pods ?
If I apply now the Replicaset does this reference to the Pod settings, so with all settings like readinessProbe/livenessProbe ... ?
My Questions came up because in my replicaset.yml is a container section where I specified my docker image, but why does it need that information, isn't it a redundant information, because this information is in my pods.yml ?
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
name: test1
spec:
replicas: 2
template:
metadata:
labels:
app: web
spec:
containers:
- name: test1
image: test/test
Pods are the smallest deployable units of computing that can be
created and managed in Kubernetes.
A Pod (as in a pod of whales or pea pod) is a group of one or more
containers (such as Docker containers), with shared storage/network,
and a specification for how to run the containers.
See, https://kubernetes.io/docs/concepts/workloads/pods/pod/.
So, you can specify how your Pod will be scheduled (one or more containers, ports, probes, volumes, etc.).
But in case of the node failure or anything bad that can harm to the Pod, then that Pod won't be rescheduled (you have to rescheduled manually). So, in that case, you need a controller. Kubernetes provides some controllers (each one for different purposes). They are -
ReplicaSet
ReplicationController
Deployment
StatefulSet
DaemonSet
Job
CronJob
All of the above controllers and the Pod together are called as Workload. Because they all have a podTemplate section. And they all create some number of identical Pods ass specified by the spec.replicas field (if this field exists in the corresponding workload manifest). They all are upper-level concept than Pod.
Though the Deployment is more suitable than the ReplicaSet, this answer focuses on ReplicaSet over Pod cause the question is between the Pod and ReplicaSet.
In addition, each one of the above controllers has it's own purpose. Like a ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
A ReplicaSet contains a podTemplate field including selectors to identify and acquire Pod(s). A pod template specifying the configuration of new Pods it should create to meet the number of replicas criteria. It creates and deletes Pod(s) as needed to reach the desired number. When a ReplicaSet needs to create new Pod(s), it uses its Pod template.
The Pod(s) maintained by a ReplicaSet has metadata.ownerReferences field, to tell which resource owns the current Pod(s).
A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference or the OwnerReference is not a controller and it matches a ReplicaSet’s selector, it will be immediately acquired by said ReplicaSet.
Ref: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
**
Now, its time to answer your questions
Since ReplicaSet is one of the Pod controller (listed above), obviously, it needs a podTemplate (using this template, your Pods will be scheduled). All of the Pods the ReplicaSet creates will have the same Pod configuration (same containers, same ports, same readiness/livelinessProbe, volumes, etc.). And having this podTemplate is not redundant info, it's needed. So, if you have a Pod controller like ReplicaSet or other (as your need), you don't need the Pod itself anymore. Because the ReplicaSet (or the other controllers) will create Pod(s).
**
Guess, you got the answer.
Does ReplicaSets replace Pods?
Yes, if you have replicaset.yml you don't need pods.yml.
I have a conceptional question, does ReplicaSets use Pod settings?
Before i applied my ReplicaSets i deleted my Pods, so there is no
information about my old Pods ? If I apply now the Replicaset does
this reference to the Pod settings, so with all settings like
readinessProbe/livenessProbe ... ?
No, the ReplicaSet manifest has to contain the Pod specification in order to determine what is the configuration of the pods that should be deployed.
With the labels, you link the ReplicaSet to running Pods.
You don't link the ReplicaSet.yml manifest to the Pods.yml manifest.
Don’t use naked Pods (that is, Pods not bound to a ReplicaSet or Deployment) if you can avoid it. Naked Pods will not be rescheduled in the event of a node failure.
In 99% of the cases, there isn't a separate pods.yml manifest.
The pods + the ReplicaSet are defined in a single manifest, hence the containers section in the replicaset.yml.

StatefulSet, ReplicaSet or DaemonSet. What is the best for a single Pod?

I want to deploy a single Pod on a Node to host my service (like GitLab for the example). The problem is : a Pod will not be re-created after the Node failure (like a reboot). The solution(s) : Use a StatefulSet, ReplicaSet or DaemonSet to ensure the Pod creation after a Node failure. But what is the best for this case ?
This Pod is stateful (I am using volume hostPath to keep the data) and is deployed using nodeSelector to keep it always on the same Node.
Here is a simple YAML file for the example : https://pastebin.com/WNDYTqSG
It creates 3 Pods (one for each Set) with a volume to keep the data statefully. In practice, all of these solutions can feet my needs, but I don't know if there are best practices for this case.
Can you help me to choose between these solutions to deploy a single stateful Pod please ?
Deployment is the most common option to manage a Pod or set of Pods. These are normally used instead of ReplicaSets as they are more flexible and creating a Deployment results in a ReplicaSet - see https://www.mirantis.com/blog/kubernetes-replication-controller-replica-set-and-deployments-understanding-replication-options/
You would only need a StatefulSet if you had multiple Pods and needed dedicated persistence per Pod or you had multiple Pods and the Pods need individual names because they relate to each other (e.g. one is a leader) - https://stackoverflow.com/a/48006210/9705485
A DaemonSet would be used when you want one Pod/replica per Node

What is the difference between a pod and a deployment?

I have been creating pods with type:deployment but I see that some documentation uses type:pod, more specifically the documentation for multi-container pods:
apiVersion: v1
kind: Pod
metadata:
name: ""
labels:
name: ""
namespace: ""
annotations: []
generateName: ""
spec:
? "// See 'The spec schema' for details."
: ~
But to create pods I can just use a deployment type:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: ""
spec:
replicas: 3
template:
metadata:
labels:
app: ""
spec:
containers:
etc
I noticed the pod documentation says:
The create command can be used to create a pod directly, or it can
create a pod or pods through a Deployment. It is highly recommended
that you use a Deployment to create your pods. It watches for failed
pods and will start up new pods as required to maintain the specified
number. If you don’t want a Deployment to monitor your pod (e.g. your
pod is writing non-persistent data which won’t survive a restart, or
your pod is intended to be very short-lived), you can create a pod
directly with the create command.
Note: We recommend using a Deployment to create pods. You should use
the instructions below only if you don’t want to create a Deployment.
But this raises the question of what kind:pod is good for? Can you somehow reference pods in a deployment? I didn't see a way. It looks like what you get with pods is some extra metadata but none of the deployment options such as replica or a restart policy. What good is a pod that doesn't persist data, survives a restart? I think I'd be able to create a multi-container pod with a deployment as well.
Radek's answer is very good, but I would like to pitch in from my experience, you will almost never use an object with the kind pod, because that doesn't make any sense in practice.
Because you need a deployment object - or other Kubernetes API objects like a replication controller or replicaset - that needs to keep the replicas (pods) alive (that's kind of the point of using kubernetes).
What you will use in practice for a typical application are:
Deployment object (where you will specify your apps container/containers) that will host your app's container with some other specifications.
Service object (that is like a grouping object and gives it a so-called virtual IP (cluster IP) for the pods that have a certain label - and those pods are basically the app containers that you deployed with the former deployment object).
You need to have the service object because the pods from the deployment object can be killed, scaled up and down, and you can't rely on their IP addresses because they will not be persistent.
So you need an object like a service, that gives those pods a stable IP.
Just wanted to give you some context around pods, so you know how things work together.
Hope that clears a few things for you, not long ago I was in your shoes :)
Both Pod and Deployment are full-fledged objects in the Kubernetes API. Deployment manages creating Pods by means of ReplicaSets. What it boils down to is that Deployment will create Pods with spec taken from the template. It is rather unlikely that you will ever need to create Pods directly for a production use-case.
Kubernetes has three Object Types you should know about:
Pods - runs one or more closely related containers
Services - sets up networking in a Kubernetes cluster
Deployment - Maintains a set of identical pods, ensuring that they have the correct config and that the right number of them exist.
Pods:
Runs a single set of containers
Good for one-off dev purposes
Rarely used directly in production
Deployment:
Runs a set of identical pods
Monitors the state of each pod, updating as necessary
Good for dev
Good for production
And I would agree with other answers, forget about Pods and just use Deployment. Why? Look at the second bullet point, it monitors the state of each pod, updating as necessary.
So, instead of struggling with error messages such as this one:
Forbidden: pod updates may not change fields other than spec.containers[*].image
So just refactor or completely recreate your Pod into a Deployment that creates a pod to do what you need done. With Deployment you can change any piece of configuration you want to and you need not worry about seeing that error message.
Pod is container instance.
That is the output of replicas: 3
Think of one deployment can have many running instances(replica).
//deployment.yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: tomcat-deployment222
spec:
selector:
matchLabels:
app: tomcat
replicas: 3
template:
metadata:
labels:
app: tomcat
spec:
containers:
- name: tomcat
image: tomcat:9.0
ports:
- containerPort: 8080
I want to add some informations from Kubernetes In Action book, so you can see all picture and connect relation between Kubernetes resources like Pod, Deployment and ReplicationController(ReplicaSet)
Pods
are the basic deployable unit in Kubernetes. But in real-world use cases, you want your deployments to stay up and running automatically and remain healthy without any manual intervention. For this the recommended approach is to use a Deployment, which under the hood create a ReplicaSet.
A ReplicaSet, as the name implies, is a set of replicas (Pods) maintained with their Revision history.
(ReplicaSet extends an older object called ReplicationController -- which is exactly the same but without the Revision history.)
A ReplicaSet constantly monitors the list of running pods and makes sure the running number of pods matching a certain specification always matches the desired number.
Removing a pod from the scope of the ReplicationController comes in handy
when you want to perform actions on a specific pod. For example, you might
have a bug that causes your pod to start behaving badly after a specific amount
of time or a specific event.
A Deployment
is a higher-level resource meant for deploying applications and updating them declaratively.
When you create a Deployment, a ReplicaSet resource is created underneath (eventually more of them). ReplicaSets replicate and manage pods, as well. When using a Deployment, the actual pods are created and managed by the Deployment’s ReplicaSets, not by the Deployment directly
Let’s think about what has happened. By changing the pod template in your Deployment resource, you’ve updated your app to a newer version—by changing a single field!
Finally, Roll back a Deployment either to the previous revision or to any earlier revision so easy with Deployment resource.
These images are from Kubernetes In Action book, too.
Pod is a collection of containers and basic object of Kuberntes. All containers of pod lie in same node.
Not suitable for production
No rolling updates
Deployment is a kind of controller in Kubernetes.
Controllers use a Pod Template that you provide to create the Pods for which it is responsible.
Deployment creates a ReplicaSet which in turn make sure that,
CurrentReplicas is always same as desiredReplicas .
Advantages :
You can rollout and rollback your changes using deployment
Monitors the state of each pod
Best suitable for production
Supports rolling updates
In Kubernetes we can deploy our workloads using different type of API objects like Pods, Deployment, ReplicaSet, ReplicationController and StatefulSets.
Out of those Pods are the smallest deployable unit in Kubernetes. Any workload/application that runs in Kubernetes, has to run inside a container part of a Pod. A Pod could run multiple containers (meaning multiple applications) within it. A Pod is a wrapper on top of one/many running containers. Using a Pod, kubernetes could control, monitor, operate the containers.
Now using stand alone Pods we can't do lot of things. We can't change configurations, volumes inside Pods. We can't restart the Pod if one is down.
So there is another API Object called Deployment comes into picture which maintains the desired state (how many instances, how much compute resource application uses) of the application. The Deployment maintaines multiple instances of same application by running multiple Pods. Deployments unlike Pods are mutable. Deployments uses another API Object called ReplicaSet to maintain the desired state. Deployments through ReplicaSet spawns another Pod if one is down.
So Pod runs applications in containers. Deployments run Pods and maintains desired state of the application.
Try to avoid Pods and implement Deployments instead for managing containers as objects of kind Pod will not be rescheduled (or self healed) in the event of a node failure or pod termination.
A Deployment is generally preferable because it defines a ReplicaSet to ensure that the desired number of Pods is always available and specifies a strategy to replace Pods, such as RollingUpdate.
May be this example will be helpful for beginners !!
1) Listing PODs
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
mysql 1/1 Running 0 92s
webapp-mysql-75dfdf859f-9c54j 1/1 Running 0 92s
2) Deleting web-app pode - which is created using deployment
controlplane $ kubectl -n my-namespace delete pod webapp-mysql-75dfdf859f-9c54j
pod "webapp-mysql-75dfdf859f-9c54j" deleted
3) Listing PODs ( You can see, it is recreated automatically)
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
mysql 1/1 Running 0 2m42s
webapp-mysql-75dfdf859f-mqrcx 1/1 Running 0 45s
4) Deleting mysql POD whcih is created directly ( with out deployment)
controlplane $ kubectl -n my-namespace delete pod mysql
pod "mysql" deleted
5) Listing PODs ( You can see mysql POD is lost for ever )
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
webapp-mysql-75dfdf859f-mqrcx 1/1 Running 0 76s
In kubernetes Pods are the smallest deployable units. Every time when we create a kubernetes object like Deployments, replica-sets, statefulsets, daemonsets it creates pod.
As mentioned above deployments create pods based on desired state mentioned in your deployment object. So for example you want 5 replicas of a application, you mentioned replicas: 5 in your deployment manifest. Now deployment controller is responsible to create 5 identical replicas (no less, no more) of given application with all metadata like RBAC policy, networks policy, labels, annotations, health check, resource quotas, taint/tolerations and others and associate with each pods it creates.
There are some cases when you wants to create pod, for example if you are running a test sidecar where you don't need to run application forever, you don't need multiple replicas, and you run application when you wants to execute in that case pod is suitable. For example helm test, which is a pod definition that specifies a container with a given command to run.
I am also a beginner in k8s so correct me if I am wrong.
We know that a pod is created when we create a deployment. What I observed is that if you see the YAML file of the deployment, you can see its kind:deployment. But if you see the YAML file of the pod, you see its kind:pod.