resetting restartCount of pods with static names - kubernetes

For monitoring purposes, I want to rely on a pod's restartCount. However, I cannot seem to do that for certain apps, as restartCount is not reset even after rebooting the whole node the pod is scheduled to run on.
Usually, restarting a pod resets this, unless the pod name of the restarted pod is the same (e.g. true for etcd, kube-controller-manager, kube-scheduler and kube-apiserver).
For those cases, there is a long-running minor issue as well as the idea to use kubectl patch.
To sum up the info there, kubectl edit will not allow you to change anything in status. Unfortunately, neither does e.g.
kubectl -n kube-system patch pod kube-controller-manager-some.node.name --type='json' -p='[{"op": "replace", "path": "/status/containerStatuses/0/restartCount", "value": 14}]'
The Pod "kube-controller-manager-some.node.name" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
So, I am checking if anyone has found a workaround?
Thanks!
Robert

This seems to be quite an old issue (2017). Take a look here.
I believe the solution to it was supposed to be implementing unique UIDs for static pods. This issue got reopened a few days ago as another github issue and hasn't been implemented to this day.
I have found a workaround for it. You need to change the static pod manifest file, e.g. by adding some random annotation to the pod.
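A minimal sketch of that workaround, assuming a kubeadm-style layout where the static pod manifests live in /etc/kubernetes/manifests (your path may differ):
# on the node running the static pod:
$ sudo vi /etc/kubernetes/manifests/kube-controller-manager.yaml
# add or bump an arbitrary annotation under metadata, e.g.:
#   metadata:
#     annotations:
#       force-recreate: "1"
# when the file is saved, the kubelet recreates the mirror pod and restartCount starts again at 0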
Let me know if it was helpful.

Related

how to take a problematic pod offline to troubleshoot

Hi, I know there's a way I can pull a problematic node out of a load balancer to troubleshoot. But how can I pull a pod out of a service to troubleshoot? What tools or commands can do it?
Change its labels so they no longer match the selector: in the Service; we used to do that all the time. You can even put it back into rotation if you want to test a hypothesis. I don't recall exactly how quickly it takes effect, but I would guess "real quick" is a good approximation. :-)
## for example, remove the label entirely (note the trailing dash):
$ kubectl label pod $the_pod app.kubernetes.io/name-
## or, change it to non-matching
$ kubectl label pod $the_pod app.kubernetes.io/name=i-am-debugging-this-pod
As mentioned on O'Reilly's "Kubernetes recipes: Maintenance and troubleshooting" page here:
Removing a Pod from a Service
Problem
You have a well-defined service (see not available) backed by several
pods. But one of the pods is misbehaving, and you would like to take
it out of the list of endpoints to examine it at a later time.
Solution
Relabel the pod using the --overwrite option—this will allow you to
change the value of the run label on the pod. By overwriting this
label, you can ensure that it will not be selected by the service
selector (not available) and will be removed from the list of
endpoints. At the same time, the replica set watching over your pods
will see that a pod has disappeared and will start a new replica.
To see this in action, start with a straightforward deployment
generated with kubectl run (see not available):
For commands, check the recipes page mentioned above. There is also a section talking about "Debugging Pods" which will be helpful.
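A rough sketch of that recipe, assuming a pod created by kubectl run that carries a run=nginx label (the pod name below is hypothetical):
$ kubectl label pod nginx-3462343234-sgw8g run=debug --overwrite
# the Service selector (run=nginx) no longer matches, so the pod drops out of the endpoints list;
# the ReplicaSet sees a pod "disappear" and starts a fresh replica to take the traffic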

Editing Kubernetes pod on-the-fly

For debugging and testing purposes I'd like to find the most convenient way of launching Kubernetes pods and altering their specification on the fly.
The launching part is quite easy with imperative commands.
Running
kubectl run nginx-test --image nginx --restart=Never
gives me exactly what I want: a single pod not managed by any controller like a Deployment or ReplicaSet. Easy to play with and clean up when needed.
However when I'm trying to edit the spec with
kubectl edit po nginx-test
I'm getting the following warning:
pods "nginx-test" was not valid:
* spec: Forbidden: pod updates may not change fields other than spec.containers[*].image, spec.initContainers[*].image, spec.activeDeadlineSeconds or spec.tolerations (only additions to existing tolerations)
i.e. only a limited set of Pod spec fields is editable at runtime.
OPTIONS FOUND SO FAR:
Getting the Pod spec saved into a file:
kubectl get po nginx-test -oyaml > nginx-test.yaml
then editing it and recreating the pod (deleting the old one first, since apply cannot change the forbidden fields in place) with
kubectl apply -f nginx-test.yaml
A bit heavyweight for changing just one field though.
Creating a Deployment instead of a single Pod and then editing the spec section in the Deployment itself.
The cons are:
an additional API object is needed (the Deployment), which you should not forget to clean up when you are done
the Pod names are autogenerated in the form of nginx-test-xxxxxxxxx-xxxx and less convenient to work with.
So is there any simpler option (or possibly some elegant workaround) of editing arbitrary field in the Pod spec?
I would appreciate any suggestion.
You should absolutely use a Deployment here.
For the use case you're describing, most of the interesting fields on a Pod cannot be updated, so you need to manually delete and recreate the pod yourself. A Deployment manages that for you. If a Deployment owns a Pod, and you delete the Deployment, Kubernetes knows on its own to delete the matching Pod, so there's not really any more work.
(There's not really any reason to want a bare pod; you almost always want one of the higher-level controllers. The one exception I can think of is using kubectl run to start a debugging shell inside the cluster.)
The Pod name being generated can be a minor hassle. One trick that's useful here: as of reasonably recent kubectl, you can give the deployment name to commands like kubectl logs:
kubectl logs deployment/nginx-test
There are also various "dashboard" type tools out there that will let you browse your current set of pods, so you can do things like read logs without having to copy-and-paste the full pod name. You may also be able to set up tab completion for kubectl, and type
kubectl logs nginx-test<TAB>
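As a rough sketch of that workflow (reusing the nginx-test name from the question):
$ kubectl create deployment nginx-test --image=nginx
# edit any field of the pod template; the Deployment rolls out a replacement pod
$ kubectl edit deployment nginx-test
$ kubectl logs deployment/nginx-test
# a single object to clean up when you are done
$ kubectl delete deployment nginx-test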

Best practices for upgrading static pod in kubernetes

We have deployed the etcd of k8s using static pods; there are 3 of them. We want to upgrade the pods to define some labels and a readiness probe for them. I have searched but found no questions/articles mentioning this, so I'd like to know the best practice for upgrading static pods.
For example, I found that modifying the yaml file directly may leave the pod unscheduled for a long time; maybe I should remove the old file and create a new one?
You need to recreate the pod if you want to define a readiness probe for it; for labels, an edit should suffice.
The following error is thrown by Kubernetes when editing the readinessProbe:
# * spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
See also https://stackoverflow.com/a/40363057/499839
Have you considered using DaemonSets? https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
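A rough sketch of recreating a static pod by moving its manifest out of and back into the manifest directory (assuming a kubeadm-style path; adjust file names to your setup):
$ sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/etcd.yaml
# the kubelet notices the manifest is gone and stops the old pod
$ vi /tmp/etcd.yaml   # add the labels / readinessProbe here
$ sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/etcd.yaml
# the kubelet picks the manifest up again and starts the new pod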

pods keep creating themselves even though I deleted all deployments

I am running k8s on AWS, and I updated the nginx deployment - which normally works fine - but this time the nginx deployment won't show up in "kubectl get deployments".
I want to kill all the pods related to nginx, but they keep recreating themselves. I deleted all deployments with "kubectl delete --all deployments"; the other pods got terminated, but not the nginx ones.
I have no idea how to stop the pods from being recreated.
Any idea where to start?
Check the deployments, replication controllers and replica sets and remove them:
kubectl get deploy,rc,rs
In modern Kubernetes, there is also an annotation kubernetes.io/created-by on the Pod showing its "owner", as seen here, but I can't lay my hands on the documentation link right now. However, I found a pastebin containing a concrete example of the contents of the annotation.
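A quick way to check what owns a surviving pod before deleting anything (recent Kubernetes exposes this as ownerReferences; substitute your actual pod name):
$ kubectl get pod <nginx-pod-name> -o jsonpath='{.metadata.ownerReferences}'
# then delete the owning ReplicaSet / Deployment / DaemonSet rather than the pod itself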

Kubernetes deployments: Editing the 'spec' of a pod's YAML file fails

The env element added in spec.containers of a pod using the K8s dashboard's Edit doesn't get saved. Does anyone know what the problem is?
Is there any other way to add environment variables to pods/containers?
I get this error when doing the same by editing the file using nano:
# pods "EXAMPLE" was not valid:
# * spec: Forbidden: pod updates may not change fields other than `containers[*].image` or `spec.activeDeadlineSeconds`
Thanks.
Not all fields can be updated. This fact is sometimes mentioned in the kubectl explain output for the object (and the error you got lists the fields that can be changed, so the others probably cannot):
$ kubectl explain pod.spec.containers.env
RESOURCE: env <[]Object>
DESCRIPTION:
List of environment variables to set in the container. Cannot be updated.
EnvVar represents an environment variable present in a Container.
If you deploy your Pods using a Deployment object, then you can change the environment variables in that object with kubectl edit since the Deployment will roll out updated versions of the Pod(s) that have the variable changes and kill the older Pods that do not. Obviously, that method is not changing the Pod in place, but it is one way to get what you need.
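For example, assuming the Pods are owned by a hypothetical Deployment named my-app, a single command does it:
$ kubectl set env deployment/my-app LOG_LEVEL=debug
# the Deployment rolls out new Pods that carry the variable and terminates the old ones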
Another option for you may be to use ConfigMaps. If you use the volume plugin method for mounting the ConfigMap and your application is written to be aware of changes to the volume and reload itself with new settings on change, it may be an option (or at least give you other ideas that may work for you).
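A minimal sketch of that route (the ConfigMap name and key are made up for illustration):
$ kubectl create configmap app-settings --from-literal=LOG_LEVEL=debug
# reference it from the pod template via envFrom or mount it as a volume;
# a mounted ConfigMap volume is refreshed in place when you later run:
$ kubectl edit configmap app-settings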
We cannot edit the environment variables, resource limits, or service account of a pod that is running live.
But we can definitely edit/update the image name, tolerations, active deadline seconds, etc.
However, a "deployment" can be easily edited, because the "pod" is a child template of the deployment specification.
In order to "edit" the running pod with desired changes, the following approach can be used.
Extract the pod definition to a file, make the necessary changes, delete the existing pod, and create a new pod from the edited file:
kubectl get pod my-pod -o yaml > my-new-pod.yaml
vi my-new-pod.yaml
kubectl delete pod my-pod
kubectl create -f my-new-pod.yaml
Not sure about others, but when I edited the pod YAML from the Google Kubernetes Engine workloads page, the same error came up for me. However, when I retried after some time, it worked.
It feels like some update was going on at the same time earlier; when I edited the YAML again and applied the changes, it worked.