Why does deleting a kubernetes namespace take so long? - kubernetes

I'm attempting to write some integration tests that setup a deployment and an ingress and then make web requests, effectively curl commands, against the ingress to test the configuration of the ingress. Backends and services are also created to gaurantee that the ingress is correctly routing and proxying to the backends.
However, tear down of the setup, to run a new set of tests is slow. By 'teardown' here I mean I simply delete the namespace in which all of these deployments live. This can take quite a while. Why is that? And what are the best ways to quickly tear down such a setup?

Kubernetes works largely through controllers, which loop endlessly looking for small pieces of work to do (like schedule a pod somewhere, unschedule a pod, remove an ingress route, etc); this makes it highly reliable but sometimes comes at the cost of relatively high latency for your operations. Namespace deletions require bringing down all the resources in a cluster, which requires a lot of small steps and therefore can take a while to finish.
There is a --force option for kubectl delete, but it comes with some scary-sounding warnings:
--force=false: If true, immediately remove resources from API and
bypass graceful deletion. Note that immediate deletion of some
resources may result in inconsistency or data loss and requires
confirmation.
So, this probably isn't advisable as a regular thing to do (perhaps someone more familiar with its behavior can add on to this).
Another option is to let the delete proceed asynchronously and just not block your CI jobs on it. The --wait=false flag (by default, set to true) will make sure the request is entered successfully but won't block kubectl from exiting while the delete actually happens. Your namespace will enter the Terminating state and eventually get deleted (unless something prevents it from coming down).
kubectl delete namespace my-test-namespace-1 --wait=false
This does mean that your next CI run may find the namespace is still there. To avoid a conflict, you could use a random suffix or incrementing counter for the namespace's name.

Related

Limit/Controlled kubectl exec command

I am still bit of kubernetes newbe. But I am looking for a way to give developers controlled access to kubectl exec command . I want to give them run most of the read-only command but prevent some high risk command and also prevent interactive download/install etc. Also want to log all their action during sessions for audit purpose.
I am not seeing any straight forward way to do it using rbac. Also not seeing those options in rancher either. I am looking for some guidance and direction to achieve such capability.
I am sure some of you have achieved it some way.
Kubernetes RBAC can only validate whenever you can or cannot exec into pods, (by checking create verb on pods/exec resource), after that it switches to SPDY protocol and passes your input and returning back output from analog of docker exec on your container runtime, without actually caring about what's going in and out
With rbac you also have to specify pod name, which might be problematic if you are using Deployments, where each new revision will generate a different pod name. Since pattern matching is not implemented in rbac - you would have to change your role every time new pod name is generated.
So the answer is "No, you can' do it with rbac"
An alternative solution would be to use some kind of CI/CD (jenkins,gitlab-ci etc.) or orchestration tool (rundeck, ansible-tower etc) where you will create some kind of script, where your developers would pass arguments to a job, controlled by you, i.e.
kubectl exec deploy/foo -- /bin/bar baz "$DEV_ARGUMENT"
Which, essentially, means, that you would be responsible for managing access to that job/script, creating and maintaining serviceAccount for that script, etc.
If you are afraid of image mutability, i.e. you don't want your developers to install something in running container, but otherwise are okay with giving them shell on it (remember, they can still read any secrets/env vars/configMaps and even serviceAccount tokens that pod uses of you mount them by default), you should consider the following:
Don't run your containers as root. Try to use images, that support rootles operation, and then either specify correct non-root UID in runAsUser field in securityContext, or configure runAsNonRoot: true flag to deny containers running as root.
Better general solution would be to utilize PodSecurityPolicy (deprecated, removed in 1.25), Pod Security Admission or some 3rd party admission contoller like OPA Gatekeeper to deny containers running as root in your namespace
You can also make your pods immutable by using readOnlyRootFilesystem in security context, which will deny write operation to pod ephemeral storage (but if you mounted any volume as RW - they still will be accessible to write operations). Feasibility of this approach depends on whenever your apps use some kind of temporary files of not
Relevant links:
kubernetes RBAC role verbs to exec to pod
https://github.com/kubernetes/kubernetes/issues/44703#issuecomment-324826356 - issue, discussing current rbac limitations
https://itnext.io/how-it-works-kubectl-exec-e31325daa910
https://erkanerol.github.io/post/how-kubectl-exec-works/ - bot links explaining how exec actually works

Kubernetes rolling deploy: terminate a pod only when there are no containers running

I am trying to deploy updates to pods. However I want the current pods to terminate only when all the containers inside the pod have terminated and their process is complete.
The new pods can keep waiting to start untill all container in the old pods have completed. We have a mechanism to stop old pods from picking up new tasks and therefore they should eventually terminate.
It's okay if twice the pods exist at some instance of time. I tried finding solution for this in kubernetes docs but wan't successful. Pointers on how / if this is possible would be helpful.
well I guess then you may have to create a duplicate kind of deployment with new image as required and change the selector in service to new deployment, which will prevent external traffic from entering pre-existing pods and new calls can go to new pods. Then later you can check for something like -
Kubectl top pods -c containers
and if the load appears to be static and low, then preferrably you can delete the old pods related deployment later.
But for this thing everytime the service selectors have to be updated and likely for keeping track of things you can append the git commit hash to the service selector to keep it unique everytime.
But rollback to previous versions if required from inside Kubernetes cluster will be difficult, so preferably you can trigger the wanted build again.
I hope this makes some sense !!

Kubernetes - deployment initialization - how to ensure it happens only once?

I using Kubernetes 1.12. I have a service (e.g. pod) which may have multiple instances (e.g. replicas > 1)
My goal is to perform a maintenance task (e.g. create\upgrade database, generate certificate, etc) before any of service instances are up.
I was considering to use Init Container, but at least as I understand, Init Container will be executed anytime additional replica (pod) is created and worse - that might happen in parallel. In that case, multiple Init Containers might work in parallel and thus corrupt my database and everything else.
I need a clear solution to perform a bootstrap maintenance task only once per deployment. How you would suggest to do that?
I encountered the same problem running db migrations before each deployment. Here's a solution based on a Job resource:
kubectl apply -f migration-job.yml
kubectl wait --for=condition=complete --timeout=60s job/migration
kubectl delete job/migration
kubectl apply -f deployment.yml
migration-job.yml defines a Job configured with restartPolicy: Never and a reasonably low activeDeadlineSeconds. Using kubectl wait ensures that any errors or timeout in migration-job.yml causes the script to fail and thus prevent applying deployment.yml.
One of the ways you could use to retain startup sequence controll would be to use StatefulSet. With sequential startup, next pod will not start untill previous is done, removing parallel init risk.
Personally I would prefer this init to have its own locking mechanism and stick to regular Deploymants.
Remember that you need to take into account not only first startup on Deployment creation, but also cases for rolling releases, scaling, outages etc.

Is there a way to make kubectl apply restart deployments whose image tag has not changed?

I've got a local deployment system that is mirroring our production system. Both are deployed by calling kubectl apply -f deployments-and-services.yaml
I'm tagging all builds with the current git hash, which means that for clean deploys to GKE, all the services have a new docker image tag which means that apply will restart them, but locally to minikube the tag is often not changing which means that new code is not run. Before I was working around this by calling kubectl delete and then kubectl create for deploying to minikube, but as the number of services I'm deploying has increased, that is starting to stretch the dev cycle too far.
Ideally, I'd like a better way to tell kubectl apply to restart a deployment rather than just depending on the tag?
I'm curious how people have been approaching this problem.
Additionally, I'm building everything with bazel which means that I have to be pretty explicit about setting up my build commands. I'm thinking maybe I should switch to just delete/creating the one service I'm working on and leave the others running.
But in that case, maybe I should just look at telepresence and run the service I'm dev'ing on outside of minikube all together? What are best practices here?
I'm not entirely sure I understood your question but that may very well be my reading comprehension :)
In any case here's a few thoughts that popped up while reading this (again not sure what you're trying to accomplish)
Option 1: maybe what you're looking for is to scale down and back up, i.e. scale your deployment to say 0 and then back up, given you're using configmap and maybe you only want to update that, the command would be kubectl scale --replicas=0 -f foo.yaml and then back to whatever
Option 2: if you want to apply the deployment and not kill any pods for example, you would use the cascade=false (google it)
Option 3: lookup the rollout option to manage deployments, not sure if it works on services though
Finally, and that's only me talking, share some more details like which version of k8s are you using? maybe provide an actual use case example to better describe the issue.
Kubernetes, only triggers a deployment when something has changed, if you have image pull policy to always you can delete your pods to get the new image, if you want kube to handle the deployment you can update the kubernetes yaml file to container a constantly changing metadata field (I use seconds since epoch) which will trigger a change. Ideally you should be tagging your images with unique tags from your CI/CD pipeline with the commit reference they have been built from. this gets around this issue and allows you to take full advantage of the kubernetes rollback feature.

How to update a set of pods running in kubernetes?

What is the preferred way of updating a set of pods (e.g. after making code changes & pushing underlying docker image to docker hub) controlled by a replication controller in kubernetes cluster?
I can see 2 ways:
Deleting & re-creating replication controller manually
Using kubectl rolling-update
With the rolling-update I have to change the replication controller name. Since I'm storing replication controller definition in YAML file and not generating it manually, having to change the file to push out a code update seems to bring about bad habits like alternating between 2 names for the replication controller (e.g. controllerA and controllerB) to avoid name conflict.
What is the better way?
Update: kubectl rolling-update has been deprecated and the replacement command is kubectl rollout. Also note that since I wrote the original answer the Deployment resource has been added and is a better choice than ReplicaSets as the rolling update is performed server side instead of by the client.
You should use kubectl rolling-update. We recently added a feature to do a "simple rolling update" which will update the image in a replication controller without renaming it. It's the last example shown in the kubectl help rolling-update output:
// Update the pods of frontend by just changing the image, and keeping the old name
$ kubectl rolling-update frontend --image=image:v2
This command also supports recovery -- if you cancel your update and restart it later, it will resume from where it left off. Even though it creates a new replication controller behind the scenes, at the end of the update the new replication controller takes the name of the old replication controller so it appears as pure update rather than switching to an entirely new replication controller.
The best option I've found so far is Skaffold, which automatically builds the image, pushes it the image registry and updates the corresponding pods/controllers. It can even watch for code changes and rebuild the image as soon as changes are saved with skaffold dev command. This only requires adding a simple skaffold.yaml that specifies the image on the registry and path to the Kubernetes manifests. This workflow is described in details in the Getting Started guide.
The following explanations are from Kubernetes In Action's book
Deleting & re-creating replication controller manually
Doing a rolling update manually is laborious and error-prone. Depending on the number of replicas, you’d need to run a dozen or more commands in the proper order to perform the update process.Luckily, Kubernetes allows you to perform the rolling update with a single command.
Using kubectl rolling-update
Instead of performing rolling updates using ReplicationControllers manually, you can have kubectl perform them. Using kubectl to perform the update makes the process much easier, but, this is now an out dated way of updating apps.
Why performing an update like this isn’t as good as it could be is because it’s imperative. How Kubernetes is about you telling it the desired state of the system and having Kubernetes achieve that state on its own, by figuring out the best way to do it.
Using Deployments for updating apps declaratively --THE BEST ALTERNATIVE--
A Deployment is a higher-level resource meant for deploying applications and updating them declaratively, instead of doing it through a ReplicationController or a ReplicaSet, which are both considered lower-level concepts.
Using a Deployment instead of the lower-level constructs makes updating an app much easier, because you’re defining the desired state through the single Deployment resource and letting Kubernetes take care of the rest.
One more thing, Rolling back a rollout is possible because Deployments.