Kubernetes 1.18 on EKS StartupProbe with old ServiceAccountName - kubernetes

I have a deployment that worked fine on Kubernetes 1.17 on EKS. After upgrading to 1.18, I tried the startupProbe feature with a simple deployment, and everything worked as expected. But when I added a startupProbe to my production deployment, it didn't work: the cluster simply drops the startupProbe entry when creating pods, even though the entry exists in the Deployment object stored on the cluster. Interestingly, when I change serviceAccountName in the deployment manifest to default (instead of my application's service account), everything works as expected.
So the question is: why can't pods that use an existing service account have startup probes?
Thanks.

Posting this as a community member answer. Feel free to expand.
Issue
startupProbe is not applied to Pod if serviceAccountName is set
When adding serviceAccountName and startupProbe to the pod template in my deployment, the resulting pods will not have a startup probe.
There is a GitHub issue about this.
Solution
This issue is being addressed here. It is still open, and there is no definitive answer yet.
As mentioned by #mcristina422:
I think this is due to the old version of k8s.io/api being used in the webhook. The API for the startup probe was added more recently. Updating the k8s packages should fix this
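To picture the mechanism suggested in that comment, here is a toy model in Python. It is not the actual webhook code: OLD_CONTAINER_FIELDS and both function names are made up for illustration. It assumes the webhook decodes each container into types that predate startupProbe and then patches the whole container list back, which would silently lose the field.

```python
import json

# Fields a hypothetical webhook built against an older API "knows" about.
# startupProbe is deliberately missing, as it would be in an old client library.
OLD_CONTAINER_FIELDS = {"name", "image", "livenessProbe", "readinessProbe"}


def decode_with_old_schema(container):
    """Mimic decoding into an outdated typed struct: unknown fields are dropped."""
    return {k: v for k, v in container.items() if k in OLD_CONTAINER_FIELDS}


def naive_replace_patch(pod):
    """Mimic a webhook that round-trips containers through its own types and
    writes the whole container list back as a JSON Patch 'replace'."""
    containers = [decode_with_old_schema(c) for c in pod["spec"]["containers"]]
    return [{"op": "replace", "path": "/spec/containers", "value": containers}]


if __name__ == "__main__":
    pod = {"spec": {"containers": [{
        "name": "app",
        "image": "nginx",
        "startupProbe": {"httpGet": {"path": "/", "port": 80}},
    }]}}
    # The printed patch no longer contains the startupProbe key.
    print(json.dumps(naive_replace_patch(pod), indent=2))
```

If this is what happens, it would also fit the observation that the default service account is unaffected, assuming the webhook only mutates pods whose service account it is configured to handle.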

Any way we can add an ENV to a pod or a new pod in kubernetes?

Summarize the problem:
Is there a way to add an environment variable to existing pods and to any new pods in Kubernetes?
For example, I want to add HTTP_PROXY to many pods, and to the new pods they will spawn, in Kubeflow 1.4, so that these pods can access the internet.
Describe what you've tried:
I searched and found that Istio may be able to do that, but it's too complex for me.
Second, there are too many YAML files in Kubeflow, so I cannot modify them one by one to use a ConfigMap or add the ENV to each.
Does anyone know a good, simple way to do this, for example through Kubernetes configuration?
Use a "PodPreset" object to inject common environment variables and other parameters into all matching pods.
See the following article:
https://v1-19.docs.kubernetes.io/docs/tasks/inject-data-application/podpreset/
PodPreset was removed in Kubernetes v1.20, so on newer clusters you need a mutating admission webhook instead.
You will have to run an additional service in your cluster that changes the configuration of the pods.
Here is the example I based my own webhook on. In it, the developer adds a sidecar to the pod, but you can substitute your own logic to inject the required ENV:
https://github.com/morvencao/kube-mutating-webhook-tutorial/blob/master/medium-article.md
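The core of such a webhook is small. Below is a minimal sketch in Python of only the patch-building part, assuming the webhook receives an admission.k8s.io/v1 AdmissionReview for a Pod. The function names and the proxy URL are placeholders of mine, not taken from the tutorial, and the HTTPS server, certificates, and MutatingWebhookConfiguration that a real deployment needs are left out.

```python
import base64
import json


def build_env_patch(pod, name, value):
    """Return JSON Patch operations that add an env var to every container."""
    ops = []
    for i, container in enumerate(pod["spec"].get("containers", [])):
        env = container.get("env")
        if env is None:
            # No env list yet: create it with the new variable.
            ops.append({"op": "add", "path": f"/spec/containers/{i}/env",
                        "value": [{"name": name, "value": value}]})
        elif not any(e.get("name") == name for e in env):
            # Append to the existing list; "-" means end of array in JSON Patch.
            ops.append({"op": "add", "path": f"/spec/containers/{i}/env/-",
                        "value": {"name": name, "value": value}})
    return ops


def admission_response(review, name, value):
    """Wrap the patch in an AdmissionReview response (patch is base64 JSON)."""
    request = review["request"]
    patch = build_env_patch(request["object"], name, value)
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": {
            "uid": request["uid"],
            "allowed": True,
            "patchType": "JSONPatch",
            "patch": base64.b64encode(json.dumps(patch).encode()).decode(),
        },
    }
```

Containers that already define the variable are left alone, so a value set explicitly in a manifest is not overwritten. For example, `admission_response(review, "HTTP_PROXY", "http://proxy.example:3128")` would be the body returned to the API server, where the URL is only a placeholder.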

AKS not deleting orphaned resources

After some time, automatic deletion of orphaned resources stops working on some of our clusters. If I remove a deployment, neither the replicaset nor the pods are removed; if I remove a replicaset, a new one is created but the previous pods are still there.
I can't even update some deployments, because that will create a new replicaset and pods.
This is a real problem for us, as we create and remove resources and rely on automatic removal of child resources.
Destroying and recreating a cluster makes it work perfectly again, and we haven't been able to trace the problem to anything we did.
I tried upgrading both master and agent nodes to a newer version and restarting kubelet on the agent nodes, but that didn't solve anything.
Does anyone know where the problem could be, or which component is in charge of the cascading deletion of orphaned resources?
Has this happened to anyone else? It has already happened to us in 3 different clusters with different Kubernetes versions.
I tested it by creating the test deployment from the Kubernetes documentation and then deleting it:
kubectl apply -f https://k8s.io/examples/application/deployment.yaml
kubectl delete deployments.apps nginx-deployment
But the pods are still there.
Thanks in advance
The problem was caused by a faulty CRD / admission webhook. It may seem strange, but a broken CRD or a faulty pod acting as a webhook will make kube-controller-manager fail for all resources (at least in AKS). After removing the CRDs and the faulty webhook, it started to work again. (Why the webhook was failing is a separate matter.)
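To narrow down which webhook might be involved, one option is to list the admission webhooks that reject requests when their backend is unreachable. The sketch below is a hypothetical helper, not an AKS tool. It assumes you feed it the JSON printed by `kubectl get validatingwebhookconfigurations,mutatingwebhookconfigurations -o json`, and it does not inspect CRDs themselves.

```python
import json


def blocking_webhooks(webhook_list_json):
    """Return (configuration, webhook, target) for webhooks with failurePolicy Fail."""
    found = []
    for item in json.loads(webhook_list_json).get("items", []):
        for hook in item.get("webhooks", []):
            # admissionregistration.k8s.io/v1 defaults failurePolicy to Fail.
            if hook.get("failurePolicy", "Fail") != "Fail":
                continue
            cfg = hook.get("clientConfig", {})
            svc = cfg.get("service")
            # A webhook points either at an in-cluster Service or at a URL.
            target = f'{svc["namespace"]}/{svc["name"]}' if svc else cfg.get("url", "")
            found.append((item["metadata"]["name"], hook["name"], target))
    return found
```

Each returned target is a Service (namespace/name) or URL whose backing pods are worth checking first, since a failure there blocks the requests the webhook intercepts.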

One Traefik Pod in Kubernetes fails with error: 'command traefik error: field not found, node: redirect'

I'm running Traefik on a Kubernetes cluster to manage Ingress, which has been running ok for a long time.
I recently implemented Cluster-Autoscaling, which works fine except that on one Node (newly created by the Autoscaler) Traefik won't start. It sits in CrashLoopBackoff, and when I log the Pod I get: [date] [time] command traefik error: field not found, node: redirect.
Google found no relevant results, and the error itself is not very descriptive, so I'm not sure where to look.
My best guess is that it has something to do with the RedirectRegex Middleware configured in Traefik's config file:
[entryPoints.http.redirect]
regex = "^http://(.+)(:80)?/(.*)"
replacement = "https://$1/$3"
Traefik actually works still - I can still access all of my apps from their urls in my browser, even those which are on the node with the dead Traefik Pod.
The other Traefik Pods on other Nodes still run happily, and the Nodes are (at least in theory) identical.
After further googling, I found this on Reddit. Turns out Traefik updated a few days ago to v2.0, which is not backwards compatible.
Only this pod had the issue, because it was the only one for which a new (v2.0) image was pulled (being the only recently created Node).
I reverted to v1.7 until I have time to fix it properly. I had to update the DaemonSet to use v1.7, then kill the Pod so it could be recreated from the old image.
The devs have a Migration Guide that looks like it may help.
"redirect" is gone, but there are now "RedirectScheme" and "RedirectRegex", which belong to the new concept of "Middlewares".
It looks like they are moving to a pipeline approach, so you can define a chain of "middlewares" to attach to a "router" to decide how to direct a request and what to add/remove/modify on it along that chain. "frontends" are now "routers" and "backends" are now "services", and there is a clearer, modular concept of configuration. It looks like it will offer better organization than earlier versions.

Is it possible to add/modify kubernetes container spec based on clusterwide setting

I have a kubernetes-based application that uses an operator to build and deploy containers in pods. Sometimes I'd like to run containers in privileged mode to enable performance tracing, but since I'm not deploying the pod/containers directly from a manifest, I cannot simply add privileged mode and the debugfs filesystem mount.
That leaves me to fork the operator code, change where it builds the container spec, and redeploy with the modified operator. Doable, but awkward.
So my question is: is it possible to have additional attributes added to container specs based on some cluster-wide setting before pods are deployed by the operator, or to modify the container spec after deployment? I tried the latter with kubectl edit pod mypod, but that didn't work.
This is on a physical cluster installed with kubespray.
There are three things to consider:
1. Your operator can create a controller (e.g. a Deployment) instead of a bare Pod. That allows modifications to the Pod spec, which trigger a rollout of the Deployment (see rolling update strategy).
2. Use a MutatingAdmissionWebhook, so that the Pod's manifest is modified/overwritten on the fly before the Pod is created. More info regarding MutatingAdmissionWebhook can be found here and here.
3. A workaround in the form of modifying the supplied spec and swapping the pod. More about this was discussed here.
Please let me know if any of the above helped.
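For the MutatingAdmissionWebhook option, the change the webhook has to make can be expressed as a JSON Patch. The following Python sketch only builds that patch. The function name, the volume name, and the /sys/kernel/debug path (the usual debugfs mount point) are my assumptions, and the webhook server and its registration are not shown.

```python
def privileged_debugfs_patch(pod, volume_name="debugfs", host_path="/sys/kernel/debug"):
    """Return JSON Patch ops that make every container privileged and mount debugfs."""
    spec = pod["spec"]
    volume = {"name": volume_name, "hostPath": {"path": host_path}}
    mount = {"name": volume_name, "mountPath": host_path}
    ops = []
    # Add the hostPath volume, creating the volumes list if the pod has none.
    if spec.get("volumes"):
        ops.append({"op": "add", "path": "/spec/volumes/-", "value": volume})
    else:
        ops.append({"op": "add", "path": "/spec/volumes", "value": [volume]})
    for i, container in enumerate(spec.get("containers", [])):
        base = f"/spec/containers/{i}"
        # Set privileged inside an existing securityContext, or create one.
        if container.get("securityContext"):
            ops.append({"op": "add", "path": f"{base}/securityContext/privileged",
                        "value": True})
        else:
            ops.append({"op": "add", "path": f"{base}/securityContext",
                        "value": {"privileged": True}})
        # Append the mount to existing volumeMounts, or create the list.
        if container.get("volumeMounts"):
            ops.append({"op": "add", "path": f"{base}/volumeMounts/-", "value": mount})
        else:
            ops.append({"op": "add", "path": f"{base}/volumeMounts", "value": [mount]})
    return ops
```

In a real webhook you would probably apply this only to pods carrying some opt-in label or annotation, so that the cluster-wide setting does not make every pod privileged.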

Spring Cloud Data Flow + Kubernetes, asking for the task pod to be deployed on non-default namespaces

I have a setup with scdf-server on Kubernetes that works fine. It deploys each task in an on-demand pod in the default namespace, the same one that hosts the scdf-server pod.
Now I need to deploy a pod in another namespace, and I can't find the argument/property to use in the SCDF server dashboard so that the pod is created in a given namespace. Does anybody know how to do that? I tried spring.cloud.deployer.kubernetes.namespace, deployer.kubernetes.namespace, spring.cloud.deployer.kubernetes.environmentVariables, deployer.<app>.kubernetes.namespace, spring.cloud.dataflow.task.platform.kubernetes.namespace, scheduler.kubernetes.environmentVariables SPRING_CLOUD_SCHEDULER_KUBERNETES_NAMESPACE... in both the 'properties' and 'arguments' text boxes...
This seems to duplicate a thread posted in the SCDF Gitter channel. The properties were described and pointed out in the comments there; more details here.