We constantly have issues with our OpenShift Deployments: credentials suddenly go missing (or suddenly the wrong credentials are configured), deployments are suddenly scaled up or down, etc.
Nobody on the team is aware of having changed anything. However, from my recent experience I am quite sure this is happening unknowingly.
Is there any way to check the history of modifications to a resource? E.g. the last "oc/kubectl apply -f" - ideally with the contents that were modified and the user who made the change?
For a one-off issue, you can also look at the ReplicaSets present in that namespace and examine them for differences. Depending on how much revision history you keep, it may have already been lost, if it was present to begin with.
Try:
kubectl get rs -n my-namespace
Or, if you are dealing with DeploymentConfigs, ReplicationControllers:
oc get rc -n my-namespace
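If there are multiple ReplicaSets or ReplicationControllers left around, a quick way to spot what changed is to dump two of them and diff the output. A rough sketch (the ReplicaSet names and the label below are hypothetical):
kubectl get rs -n my-namespace -l app=my-app
kubectl get rs my-app-5d4f8c7b9 -n my-namespace -o yaml > old-rs.yaml
kubectl get rs my-app-7c6d9f4a2 -n my-namespace -o yaml > new-rs.yaml
diff old-rs.yaml new-rs.yaml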
For credentials, assuming those are in a secret and not the deployment itself, you wouldn't have that history without going to audit logs.
You need to configure and enable audit logging; check out the oc manual here.
In addition to logging metadata for all requests, logs request bodies
for every read and write request to the API servers...
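If you are on OpenShift 4.x, my understanding is that the audit profile is set on the cluster's APIServer resource and the resulting logs are read from the control-plane nodes; roughly something like this (check the docs for your exact version):
oc patch apiserver cluster --type=merge -p '{"spec":{"audit":{"profile":"WriteRequestBodies"}}}'
oc adm node-logs --role=master --path=kube-apiserver/
oc adm node-logs <master-node-name> --path=kube-apiserver/audit.log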
K8s offers only scant functionality for tracking changes. Most prominently, I would look at kubectl rollout history for Deployments, DaemonSets and StatefulSets. Still, this will only tell you when and what was changed, but not who did it.
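For example, to list a Deployment's recorded revisions and inspect the pod template of one specific revision (the deployment name is hypothetical):
kubectl rollout history deployment/my-app
kubectl rollout history deployment/my-app --revision=2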
OpenShift does not seem to offer much on top, since audit logging is cumbersome to configure and analyze.
With a problem like yours, the best remedy I see would be to revoke direct production access to K8s by the team and mandate changes to be rolled out via pipeline. That way you can use Git to track who did what.
Is there a way that I can get the logs for a particular release (revision) of a Deployment within my K8s cluster, given that the ReplicaSets related to that revision are no longer serving pods?
For example, kubectl rollout history deployment/pod1-dep would result in:
1
2 <- failed deploy
3 <- Latest deployment successful
If I want to pick the logs related to the events in revision 2, would that be possible, or is there a way to achieve such functionality?
This is a Community Wiki answer, posted for better visibility, so feel free to edit it and add any additional details you consider important.
As David Maze rightly suggested in his comment above:
Once a pod is deleted, its logs are gone with it. If you have some
sort of external log collector that will generally keep historical
logs for you, but you needed to have set that up before you attempted
the update.
So the answer to your particular question is: no, you can't get such logs once those particular pods are deleted.
I'm attempting to write some integration tests that set up a deployment and an ingress and then make web requests, effectively curl commands, against the ingress to test its configuration. Backends and services are also created to guarantee that the ingress is correctly routing and proxying to the backends.
However, tearing down the setup to run a new set of tests is slow. By 'teardown' here I mean I simply delete the namespace in which all of these deployments live. This can take quite a while. Why is that? And what are the best ways to quickly tear down such a setup?
Kubernetes works largely through controllers, which loop endlessly looking for small pieces of work to do (like schedule a pod somewhere, unschedule a pod, remove an ingress route, etc.); this makes it highly reliable but sometimes comes at the cost of relatively high latency for your operations. Deleting a namespace requires bringing down all the resources in that namespace, which involves a lot of small steps and therefore can take a while to finish.
There is a --force option for kubectl delete, but it comes with some scary-sounding warnings:
--force=false: If true, immediately remove resources from API and
bypass graceful deletion. Note that immediate deletion of some
resources may result in inconsistency or data loss and requires
confirmation.
So, this probably isn't advisable as a regular thing to do (perhaps someone more familiar with its behavior can add on to this).
Another option is to let the delete proceed asynchronously and just not block your CI jobs on it. The --wait=false flag (by default, set to true) will make sure the request is entered successfully but won't block kubectl from exiting while the delete actually happens. Your namespace will enter the Terminating state and eventually get deleted (unless something prevents it from coming down).
kubectl delete namespace my-test-namespace-1 --wait=false
This does mean that your next CI run may find the namespace is still there. To avoid a conflict, you could use a random suffix or incrementing counter for the namespace's name.
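A rough sketch of that pattern (the namespace prefix is just an example):
NS="my-test-namespace-$RANDOM"
kubectl create namespace "$NS"
# ... run the tests against resources created in $NS ...
kubectl delete namespace "$NS" --wait=false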
I'm trying to understand what kubectl rollout status <deployment name> does.
I'm using k8s-node-api, and from this thread (https://github.com/kubernetes-client/javascript/issues/536), the maintainer suggests using the k8s-watch api to watch for changes in the deployment, but I'm not sure what to check.
Questions:
How can I tell that the new deployment succeeded?
How can I tell that the new deployment failed?
Is it safe to assume that if the spec/containers/0/image changes to something different than what I'm expecting, it means there is a new deployment and I should stop watching?
My questions are probably ambiguous because I'm new to k8s.
Any guidance will be great!
I can't use kubectl - I'm writing code that does this based on what kubectl does.
As we discussed in the comment section, to check any object or process in Kubernetes you have to use kubectl - see: kubernetes.io/docs/reference/kubectl/overview.
Take a look at how to execute the proper command to get the required information - kubectl-rollout.
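For example, a typical invocation blocks until the rollout completes or the deadline is exceeded, and exits non-zero on failure (the deployment name is hypothetical):
kubectl rollout status deployment/my-app --timeout=120s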
If you want to check how the rollout process works in the background, look at the source code - src-code-rollout-kubernetes.
Pay attention to the following if you are using node-api:
The node-api group was migrated to a built-in API in the k8s.io/api repo with the v1.14
release. This repo is no longer maintained, and no longer synced
with core kubernetes as of the v1.18 release.
I often use the following 2 commands to check a deployment's status:
kubectl describe deployment <your-deployment-name>
kubectl get deployment <your-deployment-name> -oyaml
The first will show you events about the process of scheduling your deployment.
The second is more detailed. It contains all of your deployment's resource info in YAML format.
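If you only need the fields relevant to rollout progress rather than the whole YAML, jsonpath can pull them out directly; these are essentially the fields kubectl rollout status looks at (the deployment name my-app is hypothetical):
kubectl get deployment my-app -o jsonpath='{.status.updatedReplicas}/{.spec.replicas} updated, {.status.availableReplicas} available{"\n"}'
kubectl get deployment my-app -o jsonpath='{.status.conditions[?(@.type=="Progressing")].reason}{"\n"}'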
Is that enough for your needs?
After digging through the k8s source code, I was able to implement this logic myself in Node.js:
How can I tell that the new deployment succeeded?
How can I tell that the new deployment failed?
https://github.com/stavalfi/era-ci/blob/master/packages/steps/src/k8s/utils.ts#L387
Basically, I'm subscribing to events about a specific deployment (AFTER changing something in it, for example, the image).
Is it safe to assume that if the spec/containers/0/image changes to something different than what I'm expecting, it means there is a new deployment and I should stop watching?
Yes. But https://github.com/stavalfi/era-ci/blob/master/packages/steps/src/k8s/utils.ts#L62 will also help to identify that there is a new deployment going on and that yours is no longer the "latest-deployment".
For More Info
I wrote an answer about how deployment works under the hood: https://stackoverflow.com/a/66092577/806963
Is there any command to revert a resource back to its previous configuration?
For example, if I have a Service kind resource created declaratively, and then I change the ports manually, how can I discard live changes so the original definition that created the resource is reapplied?
Is there any tracking of previously applied configs? It would be even nicer if we could say: reconfigure my service to the current applied config - 2 versions.
EDIT: I know Deployments have rollout options, but I am wondering about a mechanism that works for any resource Kind.
Since you're asking explicitly about the last-applied-configuration annotation...
Very simple:
kubectl apply view-last-applied deployment/foobar-module | kubectl apply -f -
Given how flexibly apply composes via stdin, there's no dedicated kubectl apply revert-to-last-applied subcommand; it would be a redundant reimplementation of the simple pipe above.
One could also suspect that such a revert built-in could never be made perfect (as Nick_Kh notes), for complicated reasons. A subcommand named revert would evoke a lot of expectations from users that it could never fulfill.
So we get a simplified approximation: a spec.bak saved in the resource's annotations, ready to be re-applied.
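Applied to the Service from the question, the same pattern would look like this (the service name is hypothetical):
kubectl apply view-last-applied service/my-service
kubectl apply view-last-applied service/my-service | kubectl apply -f -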
Actually, Kubernetes does not support a rollback option for built-in resources other than Deployments and DaemonSets.
However, you can consider using Helm, which is a well-known package manager for Kubernetes. Helm provides a mechanism for restoring the previous state of a package release, and it covers all of the release's object resources to be reverted.
Helm exposes this feature through the helm rollback command:
helm rollback [flags] [RELEASE] [REVISION]
Full command options you can find in the official Helm Documentation.
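A typical sequence would be to list the release's revisions and then roll back to one of them (the release name and revision number are hypothetical):
helm history my-release
helm rollback my-release 2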
I've got a local deployment system that is mirroring our production system. Both are deployed by calling kubectl apply -f deployments-and-services.yaml
I'm tagging all builds with the current git hash, which means that for clean deploys to GKE all the services have a new docker image tag, so apply will restart them; but locally on minikube the tag is often not changing, so new code is not run. Before, I was working around this by calling kubectl delete and then kubectl create when deploying to minikube, but as the number of services I'm deploying has increased, that is starting to stretch the dev cycle too far.
Ideally, I'd like a better way to tell kubectl apply to restart a deployment rather than just depending on the tag.
I'm curious how people have been approaching this problem.
Additionally, I'm building everything with Bazel, which means that I have to be pretty explicit about setting up my build commands. I'm thinking maybe I should switch to just deleting/creating the one service I'm working on and leave the others running.
But in that case, maybe I should just look at Telepresence and run the service I'm developing outside of minikube altogether? What are best practices here?
I'm not entirely sure I understood your question but that may very well be my reading comprehension :)
In any case here's a few thoughts that popped up while reading this (again not sure what you're trying to accomplish)
Option 1: maybe what you're looking for is to scale down and back up, i.e. scale your deployment to, say, 0 and then back up. Given you're using a ConfigMap and maybe you only want to update that, the command would be kubectl scale --replicas=0 -f foo.yaml, and then scale back up to whatever you had before.
Option 2: if you want to apply the deployment and not kill any pods, for example, you would use --cascade=false (google it).
Option 3: look up the rollout option to manage deployments; not sure if it works on services though.
Finally, and that's only me talking: share some more details, like which version of k8s you are using, and maybe provide an actual use case example to better describe the issue.
Kubernetes only triggers a deployment when something has changed. If you have the image pull policy set to Always, you can delete your pods to get the new image. If you want Kubernetes to handle the deployment, you can update the Kubernetes YAML file to contain a constantly changing metadata field (I use seconds since epoch), which will trigger a change. Ideally you should be tagging your images with unique tags from your CI/CD pipeline, based on the commit reference they were built from. This gets around the issue and allows you to take full advantage of the Kubernetes rollback feature.
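A sketch of the "constantly changing metadata field" approach, done as a patch instead of editing the YAML by hand (the deployment and annotation names are just examples); changing a pod template annotation triggers a new rollout:
kubectl patch deployment my-app -p '{"spec":{"template":{"metadata":{"annotations":{"force-redeploy":"'"$(date +%s)"'"}}}}}'
On newer kubectl versions (1.15+), kubectl rollout restart deployment/my-app achieves the same thing without touching the manifest.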