What is the difference between Kubernetes Jobs & Deployments?

I see that Kubernetes Jobs & Deployments provide very similar configuration. Both can deploy one or more pods with a certain configuration. So I have a few queries around these:
Is the pod specification .spec.template different in Job & Deployment?
What is the difference between a Job's completions & a Deployment's replicas?
If a command is run in a Deployment's only container and it completes (no server or daemon process containers), the pod would terminate. The same is applicable to a Job as well. So how is the pod lifecycle different between the two resources?

Many resources in Kubernetes use a Pod template. Both Deployments and Jobs use it, because they manage Pods.
Controllers for workload resources create Pods from a pod template and manage those Pods on your behalf.
PodTemplates are specifications for creating Pods, and are included in workload resources such as Deployments, Jobs, and DaemonSets.
The main difference between Deployments and Jobs is how they handle a Pod that is terminated. A Deployment is intended to be a "service", i.e. it should be up and running, so it will try to restart the Pods it manages to match the desired number of replicas, while a Job is intended to execute and successfully terminate.

Regarding .spec.template: both a Job and a Deployment include a similar Pod template definition. See: https://v1-21.docs.kubernetes.io/docs/reference/generated/kubernetes-api/v1.21/#podtemplate-v1-core
A Job's completions and parallelism let you split a task into sub-tasks. See https://kubernetes.io/docs/concepts/workloads/controllers/job/#parallel-jobs and https://kubernetes.io/docs/tasks/job/indexed-parallel-processing-static/ . A Deployment's replicas do not offer this.
In a Deployment, the restartPolicy of your Pods is Always (the default). In a Job it must be Never or OnFailure. A Job is not meant to restart your container once it has exited; a Deployment is not meant to exit.
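As a rough sketch (the resource names, images and commands below are placeholders, not taken from the question), the difference shows up in a Job's completions/parallelism and restartPolicy versus a Deployment's replicas:

# Hypothetical Job: runs a finite task 5 times, at most 2 Pods in parallel,
# then the Job is Complete and its Pods stay terminated.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-job
spec:
  completions: 5
  parallelism: 2
  template:
    spec:
      restartPolicy: Never        # a Job's Pod template must use Never or OnFailure
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo processing one item && sleep 5"]
---
# Hypothetical Deployment: keeps 3 identical Pods up indefinitely and
# replaces any Pod that terminates.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      restartPolicy: Always       # the default; containers are restarted when they exit
      containers:
      - name: server
        image: nginx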

Related

Set retention policy for Pods created by Kubernetes CronJob

I understand that Kubernetes CronJobs create pods that run on the schedule specified inside the CronJob. However, the retention policy seems arbitrary and I don't see a way to retain failed/successful pods for a certain period of time.
I am not sure exactly what you are asking here.
A CronJob does not create pods directly. It creates Jobs (which it also manages), and those Jobs create the pods.
As per the Kubernetes Jobs documentation, if the Jobs are managed directly by a higher-level controller such as a CronJob, the Jobs can be cleaned up by the CronJob based on the specified capacity-based cleanup policy. In short, the pods and jobs will not be deleted until you remove the CronJob. You will be able to inspect Pods/Jobs/CronJobs with kubectl describe, and container output with kubectl logs.
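For example (hypothetical resource names), you can inspect the chain CronJob -> Job -> Pod like this:
kubectl describe cronjob <cronjob-name>   # schedule, last run, history limits
kubectl describe job <job-name>           # pod status and events for one run
kubectl logs <pod-name>                   # container output of that run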
By default, a CronJob keeps the history of 3 successful Jobs and only 1 failed Job. You can change these limits in the CronJob spec with the following parameters:
spec:
  successfulJobsHistoryLimit: 10
  failedJobsHistoryLimit: 0
0 means that the CronJob will not keep any history of failed jobs
10 means that the CronJob will keep the history of 10 succeeded jobs
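For context, a minimal sketch of where these fields sit in a complete CronJob manifest (the name, schedule and image are placeholders; older clusters use apiVersion batch/v1beta1, newer ones batch/v1):

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: example-cronjob
spec:
  schedule: "*/5 * * * *"            # placeholder schedule
  successfulJobsHistoryLimit: 10     # keep the last 10 succeeded Jobs (and their Pods)
  failedJobsHistoryLimit: 0          # keep no failed Jobs
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: task
            image: busybox
            command: ["sh", "-c", "echo hello"]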
You will not be able to retain the pod from a failed job because, when a job fails, it will be restarted until it succeeds or reaches the backoffLimit given in the spec.
Another option you have is to suspend the CronJob:
kubectl patch cronjob <name_of_cronjob> -p '{"spec":{"suspend":true}}'
If the value of suspend is true, the CronJob will not create any new jobs or pods. You will still have access to completed pods and jobs.
If none of the above helped you, could you please give more information about what exactly you expect?
Your CronJob spec would be helpful.

Difference between daemonsets and deployments

In Kelsey Hightower's Kubernetes Up and Running, he gives two commands:
kubectl get daemonSets --namespace=kube-system kube-proxy
and
kubectl get deployments --namespace=kube-system kube-dns
Why does one use daemonSets and the other deployments?
And what's the difference?
Kubernetes Deployments manage stateless services running on your cluster (as opposed to, for example, StatefulSets, which manage stateful services). Their purpose is to keep a set of identical pods running and to upgrade them in a controlled way. For example, you define in the deployment definition how many replicas (pods) of your app you want to run, and Kubernetes will spread that many replicas of your application over the nodes. If you ask for 5 replicas over 3 nodes, then some nodes will have more than one replica of your app running.
DaemonSets also manage groups of replicated Pods. However, DaemonSets attempt to adhere to a one-Pod-per-node model, either across the entire cluster or a subset of nodes. A DaemonSet will not run more than one replica per node. Another advantage of using a DaemonSet is that, if you add a node to the cluster, the DaemonSet will automatically spawn a pod on that node, which a Deployment will not do.
DaemonSets are useful for deploying ongoing background tasks that you need to run on all or certain nodes, and which do not require user intervention. Examples of such tasks include storage daemons like ceph, log collection daemons like fluentd, and node monitoring daemons like collectd.
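As a rough sketch of what such a DaemonSet looks like (the name, namespace, image tag and volume are placeholders loosely modelled on the fluentd example):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: log-collector
  template:
    metadata:
      labels:
        app: log-collector
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.14       # placeholder image
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log

Note that there is no replicas field: the DaemonSet controller places exactly one of these Pods on every eligible node, including nodes added later.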
Let's take the example you mentioned in your question: why is kube-dns a Deployment and kube-proxy a DaemonSet?
The reason is that kube-proxy is needed on every node in the cluster to maintain iptables rules, so that every node can access every pod no matter on which node it resides. Hence, when we make kube-proxy a DaemonSet and another node is added to the cluster at a later time, kube-proxy is automatically spawned on that node.
kube-dns's responsibility is to discover a service IP using its name, and only one replica of kube-dns is enough to resolve a service name to its IP. Hence we make kube-dns a Deployment, because we don't need kube-dns on every node.

Can you tell kubernetes to start one pod before another?

Can I add some config so that my daemon pods start before other pods can be scheduled or nodes are designated as ready?
Adding post edit:
These are two different pods altogether; the DaemonSet is a downstream dependency of any pods that might get scheduled on the host.
There's no such thing as a Pod hierarchy in Kubernetes between separate types of pods, i.e. pods belonging to different Deployments, StatefulSets, DaemonSets, etc. In other words, there is no notion of a master pod and child pods. If you would like to create your own hierarchy, you can build your own tooling around it, for example waiting for all pods in a DaemonSet to be running before you create a new Pod or Kubernetes workload resource.
The closest thing to pod dependency in K8s is StatefulSets.
As per the docs:
For a StatefulSet with N replicas, when Pods are being deployed, they are created sequentially, in order from {0..N-1}.
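As an illustration of that ordering (a minimal sketch; the name and image are placeholders), a StatefulSet with the default OrderedReady policy creates web-0, waits for it to be Running and Ready, then creates web-1, and so on:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web                     # headless Service required by StatefulSets
  replicas: 3
  podManagementPolicy: OrderedReady    # the default: create web-0, then web-1, then web-2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx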

How to deploy a specific pod to all nodes including the master, but only that specific pod

I have a security pod that needs to run everywhere, including the master. I do not want, however, the master to run any other (non-Kubernetes) pods.
I know I can taint master node, and I know I can setup affinity for a pod. Yet (unless I am misunderstanding something) that isn't quite what I want.
What I want is to setup affinity in a way that this security pod runs on every single node including master as a part of same daemon set. It is important that I only have a single definition due to how this security pod gets deployed.
Can this be done?
I am running Kubernetes 1.8
I think this is more or less a duplicate of this question.
What you need is a combination of two features:
A DaemonSet will allow you to schedule a Pod to run on every node.
Tolerations in the DaemonSet's Pods will allow this workload to run even on the node which has the master taint.
That way your security pods will run everywhere, even on the master with the taint, because they can tolerate it. I think there is an example directly in the DaemonSet documentation.
But other pods without this toleration will not be scheduled on the master because they do not tolerate the taint.
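A minimal sketch of that combination, assuming the node-role.kubernetes.io/master:NoSchedule taint that kubeadm applies to masters (the name and image are placeholders; on Kubernetes 1.8 the DaemonSet apiVersion may be extensions/v1beta1 or apps/v1beta2 rather than apps/v1):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: security-agent
spec:
  selector:
    matchLabels:
      app: security-agent
  template:
    metadata:
      labels:
        app: security-agent
    spec:
      tolerations:
      # Tolerate the master taint so this Pod is also scheduled on master nodes.
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: agent
        image: example/security-agent   # placeholder image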

How do I debug kubernetes scheduling?

I have added podAntiAffinity to my DeploymentConfig template.
However, pods are being scheduled on nodes that I expected would be excluded by the rules.
How can I view logs of the kubernetes scheduler to understand why it chose the node it did for a given pod?
podAntiAffinity has more to do with other pods than with nodes specifically. That is, podAntiAffinity specifies which nodes to exclude based on which pods are already scheduled on those nodes. Even here, you can make the rule a hard requirement or just a preference. To directly pick the node on which a pod is or is not scheduled, you want to use nodeAffinity instead. See the affinity and anti-affinity guide in the Kubernetes documentation.
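As a rough sketch of the requirement-versus-preference distinction (the label and values are placeholders), this fragment would go under the Pod template's spec:

affinity:
  podAntiAffinity:
    # Hard rule: never place two Pods labelled app=my-app on the same node;
    # if no node satisfies it, the Pod stays Pending.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-app                     # placeholder label
      topologyKey: kubernetes.io/hostname
    # Soft rule: prefer spreading, but schedule anyway if nothing else fits.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: my-app
        topologyKey: kubernetes.io/hostname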