I am building an application which should execute tasks in a separate container/pods.
this application would be running in a specific namespace the new pods must be created in the same namespace as well.
I understand we can similar via custom CRD and Operators, but I found it is overly complicated and we need Golang knowledge for the same.
Is there any way this could be achived without having to learn Operators and GoLang?
I am ok to use kubctl or api within my container and wanted to connect the host and to the same namespace.
Yes, this is certainly possible using a ServiceAccount and then connecting to the API from within the Pod.
First, create a ServiceAccount in your namespace using
kubectl create serviceaccount my-service-account
For your newly created ServiceAccount, give it the permissions you want using Roles and RoleBindings. The subject would be something like this:
subjects:
- kind: ServiceAccount
name: my-service-account
namespace: my-namespace
Then, add the ServiceAccount to the Pod from where you want to create other Pods from (see documentation). Credentials are automatically mounted inside the Pod using automountServiceAccountToken.
Now from inside the Pod you can either use kubectl or call the API using the credentials inside the Pod. There are libraries for a lot of programming languages to talk to Kubernetes, use those.
Platform: OEL 7.7 + kube 1.15.5 + docker 19.03.1
We're building an erasure-coded object store on k8s using a containerized kubelet approach. We're having a tough time coming up with a viable disk life cycle approach. As it is now, we must provide an "extra_binds" argument to the kubelet which specifies the base mount point where our block devices are mounted. (80 SSDs per node, formatted as ext4)
That all works fine. Creating PV's and deploying apps works fine. Our problem comes when a PVC is deleted and we want to scrub the disk(s) that were used and make the disk(s) available again.
So far the only thing that works is to cordon that node, remove the extra binds from kubelet, bounce the node, reconfigure the block device, re-add the kubelet binds. Obviously this is too clunky for production. For starters, bouncing kubelet is not an option.
Once a PV gets used, something is locking this block device, even though checking lsof on the bare metal system shows non open handles. I can't unmount or create a new filesystem on the device. Merely bouncing kubelet doesn't free up the "lock".
Anyone using a containerized kubernetes control plane with an app using local disks in a similar fashion? Anyone found a viable way to work around this issue?
Our long term plan is to write an operator that manages disks but even with an operator I don't see how it can mitigate this problem.
Thanks for any help,
First look at your Finalizers:
$ kubectl describe pvc <PVC_NAME> | grep Finalizers
$ kubectl describe pv <PV_NAME> | grep Finalizers
if they are set to Finalizers: [kubernetes.io/pvc-protection] (explained here) that mean they are protected and you need to edit that, for example using:
$ kubectl patch pvc <PVC_NAME> -p '{"metadata":{"finalizers":null}}'
As for forcefully removing PersistentVolumes you can try
$ kubectl delete pv <PV_NAME> --force --grace-period=0
Also please check VolumeAttachment do still exist $ kubectl get volumeattachment as they might be blocked.
I also remember there was as issue on stack Kubernetes PV refuses to bind after delete/re-create stating that pv holds uid of pvc that was claimed by.
You can check that by displaying whole yaml of the pv:
$ kubectl get pv <PV_NAME> -o yaml and looking for:
claimRef:
apiVersion: v1
kind: PersistentVolumeClaim
name: packages-pvc
namespace: default
resourceVersion: "10218121"
uid: 1aede3e6-eaa1-11e9-a594-42010a9c0005
You would need to provide more information regarding your k8s cluster and pv, pvc configuration so I could go deeper into to or even test it.
I have followed instructions from this blog post to set up a k3s cluster on a couple of raspberry pi 4:
I'm now trying to get my hands dirty with traefik as front, but I'm having issues with the way it has been deployed as a 'HelmChart' I think.
From the k3s docs
It is also possible to deploy Helm charts. k3s supports a CRD
controller for installing charts. A YAML file specification can look
as following (example taken from
/var/lib/rancher/k3s/server/manifests/traefik.yaml):
So I have been starting up my k3s with the --no-deploy traefik option to manually add it with settings. So I therefore manually apply a yaml like this:
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
name: traefik
namespace: kube-system
spec:
chart: https://%{KUBERNETES_API}%/static/charts/traefik-1.64.0.tgz
set:
rbac.enabled: "true"
ssl.enabled: "true"
kubernetes.ingressEndpoint.useDefaultPublishedService: "true"
dashboard:
enabled: true
domain: "traefik.k3s1.local"
But when trying to iterate over settings to get it working as I want, I'm having trouble tearing it down. If I try kubectl delete -f on this yaml it just hangs indefinitely. And I can't seem to find a clean way to delete all the resources manually either.
I've been resorting now to just reinstall my entire cluster over and over because I can't seem to cleanup properly.
Is there a way to delete all the resources created by a chart like this without the helm cli (which I don't even have)?
Are you sure that kubectl delete -f is hanging?
I had the same issue as you and it seemed like kubectl delete -f was hanging, but it was really just taking a long time.
As far as I can tell, when you issue the kubectl delete -f a pod in the kube-system namespace with a name of helm-delete-* should spin up and try to delete the resources deployed via helm. You can get the full name of that container by running kubectl -n kube-system get pods, find the one with kube-delete-<name of yaml>-<id>. Then use the pod name to look at the logs using kubectl -n kube-system logs kube-delete-<name of yaml>-<id>.
An example of what I did was:
kubectl delete -f jenkins.yaml # seems to hang
kubectl -n kube-system get pods # look at pods in kube-system namespace
kubectl -n kube-system logs helm-delete-jenkins-wkjct # look at the delete logs
I see two options here:
Use the --now flag to delete your yaml file with minimal delay.
Use --grace-period=0 --force flags to force delete the resource.
There are other options but you'll need Helm CLI for them.
Please let me know if that helped.
As a k8s cluster administrator, I want to specify on which nodes (using labels) pods will be scheduled, but without modifying any PodSpec section.
So, nodeSelector, affinity and taints can't be options.
Is there any other solution ?
PS: the reason I can't modify the PodSpec is that deployed applications are available as Helm charts and I don't have hand on those files. Moreover, if I change the PodSpec, it will be lost on next release upgrade.
You can use the PodNodeSelector admission controller for this:
This admission controller has the following behavior:
If the Namespace has an annotation with a key scheduler.kubernetes.io/nodeSelector, use its value as the node selector.
If the namespace lacks such an annotation, use the clusterDefaultNodeSelector defined in the PodNodeSelector plugin configuration file as the node selector.
Evaluate the pod’s node selector against the namespace node selector for conflicts. Conflicts result in rejection.
Evaluate the pod’s node selector against the namespace-specific whitelist defined the plugin configuration file. Conflicts result in rejection.
First of all you will need to enable this admission controller. The way to enable it depends on your environment, but it's done via the parameter kube-apiserver --enable-admission-plugins=PodNodeSelector.
Then create a namespace and annotate it with whatever node label you want all Pods in that namespace to have:
kubectl create ns node-selector-test
kubectl annotate ns node-selector-test \
scheduler.alpha.kubernetes.io/node-selector=mynodelabel=mynodelabelvalue
To test it you could do something like this:
kubectl run busybox \
-n node-selector-test -it --restart=Never --attach=false --image=busybox
kubectl get pod busybox -n node-selector-test -o yaml
It should output something like this:
apiVersion: v1
kind: Pod
metadata:
name: busybox
....
spec:
...
nodeSelector:
mynodelabel: mynodelabelvalue
Now, unless that label exists on some nodes, this Pod will never be scheduled, so put this label on a node to see it scheduled:
kubectl label node myfavoritenode mynodelabel=mynodelabelvalue
No, there are no other ways to specify on which nodes pod will be scheduled, only labels and selectors.
I think the problem with a Helm is related to that issue.
For now, the only way for you is to change a spec, remove release and deploy a new one with updated specs.
UPD
#Janos Lenart provides a way how to manage scheduling per-namespace. That is a good idea if your releases already are split among namespaces and if you don't want to spawn different pods on different nodes in a single release. Otherwise, you will have to create new releases in new namespaces and in that case I highly recommend you to use selectors in the Pods spec.
I have been creating pods with type:deployment but I see that some documentation uses type:pod, more specifically the documentation for multi-container pods:
apiVersion: v1
kind: Pod
metadata:
name: ""
labels:
name: ""
namespace: ""
annotations: []
generateName: ""
spec:
? "// See 'The spec schema' for details."
: ~
But to create pods I can just use a deployment type:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: ""
spec:
replicas: 3
template:
metadata:
labels:
app: ""
spec:
containers:
etc
I noticed the pod documentation says:
The create command can be used to create a pod directly, or it can
create a pod or pods through a Deployment. It is highly recommended
that you use a Deployment to create your pods. It watches for failed
pods and will start up new pods as required to maintain the specified
number. If you don’t want a Deployment to monitor your pod (e.g. your
pod is writing non-persistent data which won’t survive a restart, or
your pod is intended to be very short-lived), you can create a pod
directly with the create command.
Note: We recommend using a Deployment to create pods. You should use
the instructions below only if you don’t want to create a Deployment.
But this raises the question of what kind:pod is good for? Can you somehow reference pods in a deployment? I didn't see a way. It looks like what you get with pods is some extra metadata but none of the deployment options such as replica or a restart policy. What good is a pod that doesn't persist data, survives a restart? I think I'd be able to create a multi-container pod with a deployment as well.
Radek's answer is very good, but I would like to pitch in from my experience, you will almost never use an object with the kind pod, because that doesn't make any sense in practice.
Because you need a deployment object - or other Kubernetes API objects like a replication controller or replicaset - that needs to keep the replicas (pods) alive (that's kind of the point of using kubernetes).
What you will use in practice for a typical application are:
Deployment object (where you will specify your apps container/containers) that will host your app's container with some other specifications.
Service object (that is like a grouping object and gives it a so-called virtual IP (cluster IP) for the pods that have a certain label - and those pods are basically the app containers that you deployed with the former deployment object).
You need to have the service object because the pods from the deployment object can be killed, scaled up and down, and you can't rely on their IP addresses because they will not be persistent.
So you need an object like a service, that gives those pods a stable IP.
Just wanted to give you some context around pods, so you know how things work together.
Hope that clears a few things for you, not long ago I was in your shoes :)
Both Pod and Deployment are full-fledged objects in the Kubernetes API. Deployment manages creating Pods by means of ReplicaSets. What it boils down to is that Deployment will create Pods with spec taken from the template. It is rather unlikely that you will ever need to create Pods directly for a production use-case.
Kubernetes has three Object Types you should know about:
Pods - runs one or more closely related containers
Services - sets up networking in a Kubernetes cluster
Deployment - Maintains a set of identical pods, ensuring that they have the correct config and that the right number of them exist.
Pods:
Runs a single set of containers
Good for one-off dev purposes
Rarely used directly in production
Deployment:
Runs a set of identical pods
Monitors the state of each pod, updating as necessary
Good for dev
Good for production
And I would agree with other answers, forget about Pods and just use Deployment. Why? Look at the second bullet point, it monitors the state of each pod, updating as necessary.
So, instead of struggling with error messages such as this one:
Forbidden: pod updates may not change fields other than spec.containers[*].image
So just refactor or completely recreate your Pod into a Deployment that creates a pod to do what you need done. With Deployment you can change any piece of configuration you want to and you need not worry about seeing that error message.
Pod is container instance.
That is the output of replicas: 3
Think of one deployment can have many running instances(replica).
//deployment.yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: tomcat-deployment222
spec:
selector:
matchLabels:
app: tomcat
replicas: 3
template:
metadata:
labels:
app: tomcat
spec:
containers:
- name: tomcat
image: tomcat:9.0
ports:
- containerPort: 8080
I want to add some informations from Kubernetes In Action book, so you can see all picture and connect relation between Kubernetes resources like Pod, Deployment and ReplicationController(ReplicaSet)
Pods
are the basic deployable unit in Kubernetes. But in real-world use cases, you want your deployments to stay up and running automatically and remain healthy without any manual intervention. For this the recommended approach is to use a Deployment, which under the hood create a ReplicaSet.
A ReplicaSet, as the name implies, is a set of replicas (Pods) maintained with their Revision history.
(ReplicaSet extends an older object called ReplicationController -- which is exactly the same but without the Revision history.)
A ReplicaSet constantly monitors the list of running pods and makes sure the running number of pods matching a certain specification always matches the desired number.
Removing a pod from the scope of the ReplicationController comes in handy
when you want to perform actions on a specific pod. For example, you might
have a bug that causes your pod to start behaving badly after a specific amount
of time or a specific event.
A Deployment
is a higher-level resource meant for deploying applications and updating them declaratively.
When you create a Deployment, a ReplicaSet resource is created underneath (eventually more of them). ReplicaSets replicate and manage pods, as well. When using a Deployment, the actual pods are created and managed by the Deployment’s ReplicaSets, not by the Deployment directly
Let’s think about what has happened. By changing the pod template in your Deployment resource, you’ve updated your app to a newer version—by changing a single field!
Finally, Roll back a Deployment either to the previous revision or to any earlier revision so easy with Deployment resource.
These images are from Kubernetes In Action book, too.
Pod is a collection of containers and basic object of Kuberntes. All containers of pod lie in same node.
Not suitable for production
No rolling updates
Deployment is a kind of controller in Kubernetes.
Controllers use a Pod Template that you provide to create the Pods for which it is responsible.
Deployment creates a ReplicaSet which in turn make sure that,
CurrentReplicas is always same as desiredReplicas .
Advantages :
You can rollout and rollback your changes using deployment
Monitors the state of each pod
Best suitable for production
Supports rolling updates
In Kubernetes we can deploy our workloads using different type of API objects like Pods, Deployment, ReplicaSet, ReplicationController and StatefulSets.
Out of those Pods are the smallest deployable unit in Kubernetes. Any workload/application that runs in Kubernetes, has to run inside a container part of a Pod. A Pod could run multiple containers (meaning multiple applications) within it. A Pod is a wrapper on top of one/many running containers. Using a Pod, kubernetes could control, monitor, operate the containers.
Now using stand alone Pods we can't do lot of things. We can't change configurations, volumes inside Pods. We can't restart the Pod if one is down.
So there is another API Object called Deployment comes into picture which maintains the desired state (how many instances, how much compute resource application uses) of the application. The Deployment maintaines multiple instances of same application by running multiple Pods. Deployments unlike Pods are mutable. Deployments uses another API Object called ReplicaSet to maintain the desired state. Deployments through ReplicaSet spawns another Pod if one is down.
So Pod runs applications in containers. Deployments run Pods and maintains desired state of the application.
Try to avoid Pods and implement Deployments instead for managing containers as objects of kind Pod will not be rescheduled (or self healed) in the event of a node failure or pod termination.
A Deployment is generally preferable because it defines a ReplicaSet to ensure that the desired number of Pods is always available and specifies a strategy to replace Pods, such as RollingUpdate.
May be this example will be helpful for beginners !!
1) Listing PODs
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
mysql 1/1 Running 0 92s
webapp-mysql-75dfdf859f-9c54j 1/1 Running 0 92s
2) Deleting web-app pode - which is created using deployment
controlplane $ kubectl -n my-namespace delete pod webapp-mysql-75dfdf859f-9c54j
pod "webapp-mysql-75dfdf859f-9c54j" deleted
3) Listing PODs ( You can see, it is recreated automatically)
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
mysql 1/1 Running 0 2m42s
webapp-mysql-75dfdf859f-mqrcx 1/1 Running 0 45s
4) Deleting mysql POD whcih is created directly ( with out deployment)
controlplane $ kubectl -n my-namespace delete pod mysql
pod "mysql" deleted
5) Listing PODs ( You can see mysql POD is lost for ever )
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
webapp-mysql-75dfdf859f-mqrcx 1/1 Running 0 76s
In kubernetes Pods are the smallest deployable units. Every time when we create a kubernetes object like Deployments, replica-sets, statefulsets, daemonsets it creates pod.
As mentioned above deployments create pods based on desired state mentioned in your deployment object. So for example you want 5 replicas of a application, you mentioned replicas: 5 in your deployment manifest. Now deployment controller is responsible to create 5 identical replicas (no less, no more) of given application with all metadata like RBAC policy, networks policy, labels, annotations, health check, resource quotas, taint/tolerations and others and associate with each pods it creates.
There are some cases when you wants to create pod, for example if you are running a test sidecar where you don't need to run application forever, you don't need multiple replicas, and you run application when you wants to execute in that case pod is suitable. For example helm test, which is a pod definition that specifies a container with a given command to run.
I am also a beginner in k8s so correct me if I am wrong.
We know that a pod is created when we create a deployment. What I observed is that if you see the YAML file of the deployment, you can see its kind:deployment. But if you see the YAML file of the pod, you see its kind:pod.