Delete successful jobs with Ansible kubernetes.core.k8s module?

I'm trying to delete Kubernetes successful jobs with Ansible kubernetes.core.k8s module.
One of the pods created by these jobs:
apiVersion: v1
kind: Pod
metadata:
  name: helm-install-traefik-crd-n2gbz
  generateName: helm-install-traefik-crd-
  namespace: kube-system
  uid: 8615f527-e6fa-4d48-af5a-8b087d6d229a
  resourceVersion: '2218'
  creationTimestamp: '2023-02-14T02:35:30Z'
  labels:
    controller-uid: 032b353d-24d7-4e8c-a5a6-f77bbf949a36
    helmcharts.helm.cattle.io/chart: traefik-crd
    job-name: helm-install-traefik-crd
  ownerReferences:
    - apiVersion: batch/v1
      kind: Job
      name: helm-install-traefik-crd
      uid: 032b353d-24d7-4e8c-a5a6-f77bbf949a36
      controller: true
      blockOwnerDeletion: true
There are multiple jobs to be deleted, each with different pod names, so I tried:
- name: Get pod info
  kubernetes.core.k8s_info:
    api_version: v1
    kind: Pod
    label_selectors:
      - helmcharts.helm.cattle.io/chart: traefik-crd
      - job-name: helm-install-traefik-crd
    namespace: kube-system
What is the correct format for label_selectors? I could not find any documentation examples.
Ideally, I would like to use kubernetes.core.k8s_info and get the pod names with label_selectors, then use that list of pod names with kubernetes.core.k8s to delete them.

The label key and its value should be separated by =:
kind: Pod
label_selectors:
  - helmcharts.helm.cattle.io/chart=traefik-crd
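Putting it together, here is a minimal sketch of the workflow described in the question: query the pods with kubernetes.core.k8s_info, then delete them with kubernetes.core.k8s (the register variable name job_pods is just illustrative):
- name: Get pod info
  kubernetes.core.k8s_info:
    api_version: v1
    kind: Pod
    namespace: kube-system
    label_selectors:
      # each selector is a plain "key=value" string
      - helmcharts.helm.cattle.io/chart=traefik-crd
      - job-name=helm-install-traefik-crd
  register: job_pods

- name: Delete the matching pods
  kubernetes.core.k8s:
    api_version: v1
    kind: Pod
    namespace: kube-system
    name: "{{ item.metadata.name }}"
    state: absent
  loop: "{{ job_pods.resources }}"
  loop_control:
    label: "{{ item.metadata.name }}"
If the goal is to remove the Jobs themselves rather than just their pods, the same pattern should work with kind: Job and api_version: batch/v1; deleting a Job garbage-collects the pods it owns.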

Related

CronJob Pod with RBAC Role via Serviceaccount Keeps Throwing Forbidden Error

I want to run patching of statefulsets for a specific use case from a Pod via a cronjob. To do so, I created the following plan with a custom service account, role, and rolebinding to permit the Pod access to the apps API group with the patch verb, but I keep running into the following error:
Error from server (Forbidden): statefulsets.apps "test-statefulset" is forbidden: User "system:serviceaccount:test-namespace:test-serviceaccount" cannot get resource "statefulsets" in API group "apps" in the namespace "test-namespace"
my k8s plan:
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    env: test
  name: test-serviceaccount
  namespace: test-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    env: test
  name: test-role
  namespace: test-namespace
rules:
- apiGroups:
  - apps/v1
  resourceNames:
  - test-statefulset
  resources:
  - statefulsets
  verbs:
  - patch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
  name: test-binding
  namespace: test-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: test-role
subjects:
- kind: ServiceAccount
  name: test-serviceaccount
  namespace: test-namespace
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  labels:
  name: test-job
  namespace: test-namespace
spec:
  concurrencyPolicy: Forbid
  failedJobsHistoryLimit: 3
  jobTemplate:
    metadata:
      labels:
        env: test
    spec:
      activeDeadlineSeconds: 900
      backoffLimit: 1
      parallelism: 1
      template:
        metadata:
          labels:
            env: test
        spec:
          containers:
          - args:
            - kubectl -n test-namespace patch statefulset test-statefulset -p '{"spec":{"replicas":0}}'
            - kubectl -n test-namespace patch statefulset test-statefulset -p '{"spec":{"replicas":1}}'
            command:
            - /bin/sh
            - -c
            image: bitnami/kubectl
          restartPolicy: Never
          serviceAccountName: test-serviceaccount
  schedule: '*/5 * * * *'
  startingDeadlineSeconds: 300
  successfulJobsHistoryLimit: 3
  suspend: false
So far, to debug:
I have checked whether the pod and serviceaccount association worked as expected, and it looks like it did. The name of the secret mounted on the Pod the cronjob starts is correct.
I also used a simpler role where apiGroups was "" (i.e. all core groups) and tried to "get pods" from that pod; same error.
role description:
Name:         test-role
Labels:       env=test
Annotations:  <none>
PolicyRule:
  Resources             Non-Resource URLs  Resource Names      Verbs
  ---------             -----------------  --------------      -----
  statefulsets.apps/v1  []                 [test-statefulset]  [patch]
rolebinding description:
Name:         test-binding
Labels:       env=test
Annotations:  <none>
Role:
  Kind:  Role
  Name:  test-role
Subjects:
  Kind            Name                 Namespace
  ----            ----                 ---------
  ServiceAccount  test-serviceaccount  test-namespace
StatefulSets need two verbs to apply a patch: get and patch. patch alone won't work.
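For reference, here is a sketch of the corrected Role from the question with both verbs added. Note that apiGroups expects the group name (apps), not group/version (apps/v1), so that is fixed here as well, on top of what the answer above mentions:
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    env: test
  name: test-role
  namespace: test-namespace
rules:
- apiGroups:
  - apps              # group name only, not apps/v1
  resourceNames:
  - test-statefulset
  resources:
  - statefulsets
  verbs:
  - get               # needed to read the object before patching
  - patch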

Time-based scaling with Kubernetes CronJob: How to avoid deployments overriding minReplicas

I have a HorizontalPodAutoscaler to scale my pods based on CPU. The minReplicas here is set to 5:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-web
  minReplicas: 5
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
I've then added Cron jobs to scale up/down my horizontal pod autoscaler based on time of day:
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: production
  name: cron-runner
rules:
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["patch", "get"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: cron-runner
  namespace: production
subjects:
- kind: ServiceAccount
  name: sa-cron-runner
  namespace: production
roleRef:
  kind: Role
  name: cron-runner
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-cron-runner
  namespace: production
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: django-scale-up-job
  namespace: production
spec:
  schedule: "56 11 * * 1-6"
  successfulJobsHistoryLimit: 0 # Remove after successful completion
  failedJobsHistoryLimit: 1 # Retain failed so that we see it
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-cron-runner
          containers:
          - name: django-scale-up-job
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - kubectl patch hpa myapp-web --patch '{"spec":{"minReplicas":8}}'
          restartPolicy: OnFailure
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: django-scale-down-job
  namespace: production
spec:
  schedule: "30 20 * * 1-6"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 0 # Remove after successful completion
  failedJobsHistoryLimit: 1 # Retain failed so that we see it
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-cron-runner
          containers:
          - name: django-scale-down-job
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - kubectl patch hpa myapp-web --patch '{"spec":{"minReplicas":5}}'
          restartPolicy: OnFailure
This works really well, except that now when I deploy it overwrites this minReplicas value with the minReplicas in the HorizontalPodAutoscaler spec (in my case, this is set to 5)
I'm deploying my HPA using kubectl apply -f ~/autoscale.yaml
Is there a way of handling this situation? Do I need to create some kind of shared logic so that my deployment scripts can work out what the minReplicas value should be? Or is there a simpler way of handling this?
I think you could also consider the following two options:
Use Helm to manage the life-cycle of your application with the lookup function:
The main idea behind this solution is to query the state of a specific cluster resource (here the HPA) before trying to create/recreate it with the helm install/upgrade commands.
Helm.sh: Docs: Chart template guide: Functions and pipelines: Using the lookup function
In other words, check the current minReplicas value each time before you upgrade your application stack.
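A sketch of what that could look like in a Helm template (the chart layout and the fallback value of 5 are illustrative; lookup only returns data during a real helm install/upgrade against the cluster and comes back empty with helm template or --dry-run):
# templates/hpa.yaml
{{- $current := lookup "autoscaling/v2beta2" "HorizontalPodAutoscaler" "production" "myapp-web" }}
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-web
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp-web
  # keep whatever minReplicas the CronJobs last patched in;
  # fall back to 5 on the first install, when the HPA does not exist yet
  minReplicas: {{ if $current }}{{ $current.spec.minReplicas }}{{ else }}5{{ end }}
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50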
Manage the HPA resource separately from the application manifest files
Here you can hand this task over to a dedicated HPA operator, which can coexist with your CronJobs that adjust minReplicas on a specific schedule:
Banzaicloud.com: Blog: K8S HPA Operator

How to correctly update apiVersion of manifests prior to cluster upgrade?

So I updated the manifest and replaced apiVersion: extensions/v1beta1 with apiVersion: apps/v1:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: secretmanager
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: secretmanager
  template:
    metadata:
      labels:
        app: secretmanager
    spec:
      ...
I then applied the change
k apply -f deployment.yaml
Warning: kubectl apply should be used on resource created by either kubectl create --save-config or kubectl apply
deployment.apps/secretmanager configured
I also tried
k replace --force -f deployment.yaml
That recreated the pod (downtime :( ), but the deployment's YAML config still shows the old value:
k get deployments -n kube-system secretmanager -o yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment",
      "metadata":{"annotations":{},"name":"secretmanager","namespace":"kube-system"}....}
  creationTimestamp: "2020-08-21T21:43:21Z"
  generation: 2
  name: secretmanager
  namespace: kube-system
  resourceVersion: "99352965"
  selfLink: /apis/extensions/v1beta1/namespaces/kube-system/deployments/secretmanager
  uid: 3d49aeb5-08a0-47c8-aac8-78da98d4c342
spec:
So I still see this apiVersion: extensions/v1beta1
What am I doing wrong?
I am preparing an EKS Kubernetes v1.15 cluster to be migrated to v1.16.
The Deployment kind exists in multiple API groups, so the request is ambiguous. Specify the group and version explicitly, e.g. apps/v1, with:
kubectl get deployments.v1.apps
and you should see your Deployment, but with the apps/v1 apiVersion.

How to run kubectl within a job in a namespace?

Hi, I saw documentation showing that kubectl can run inside a pod in the default namespace.
Is it possible to run kubectl inside a Job resource in a specified namespace?
Did not see any documentation or examples for the same..
When I tried adding a serviceAccount to the container, I got the error:
Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:my-namespace:internal-kubectl" cannot list resource "pods" in API group "" in the namespace "my-namespace"
This was when I tried SSHing into the container and running kubectl.
Editing the question.....
As I mentioned earlier, based on the documentation I had added the service account. Below is the YAML:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: internal-kubectl
  namespace: my-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: modify-pods
  namespace: my-namespace
rules:
- apiGroups: [""]
  resources:
  - pods
  verbs:
  - get
  - list
  - delete
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: modify-pods-to-sa
  namespace: my-namespace
subjects:
- kind: ServiceAccount
  name: internal-kubectl
roleRef:
  kind: Role
  name: modify-pods
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: batch/v1
kind: Job
metadata:
  name: testing-stuff
  namespace: my-namespace
spec:
  template:
    metadata:
      name: testing-stuff
    spec:
      serviceAccountName: internal-kubectl
      containers:
      - name: tester
        image: bitnami/kubectl
        command:
        - "bin/bash"
        - "-c"
        - "kubectl get pods"
      restartPolicy: Never
On running the job, I get the error:
Error from server (Forbidden): pods is forbidden: User "system:serviceaccount:my-namespace:internal-kubectl" cannot list resource "pods" in API group "" in the namespace "my-namespace"
Is it possible to run kubectl inside a Job resource in a specified namespace? Did not see any documentation or examples for the same..
A Job creates one or more Pods and ensures that a specified number of them successfully terminate. It means the permission aspect is the same as in a normal pod, meaning that yes, it is possible to run kubectl inside a job resource.
TL;DR:
Your YAML file is correct; maybe there was something else going on in your cluster. I recommend deleting and recreating these resources and trying again.
Also check your Kubernetes server version and the job image's kubectl version; if they are more than one minor version apart, you may run into unexpected incompatibilities.
Security Considerations:
Your job role's scope follows the best practice described in the documentation (a specific role, bound to a specific user, in a specific namespace).
If you use a ClusterRoleBinding with the cluster-admin role it will work, but it is over-permissioned and not recommended, since it gives full admin control over the entire cluster.
Test Environment:
I deployed your config on Kubernetes 1.17.3 and ran the job with bitnami/kubectl and bitnami/kubectl:1.17.3. It worked in both cases.
In order to avoid incompatibility, use a kubectl version that matches your server.
Reproduction:
$ cat job-kubectl.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: testing-stuff
  namespace: my-namespace
spec:
  template:
    metadata:
      name: testing-stuff
    spec:
      serviceAccountName: internal-kubectl
      containers:
      - name: tester
        image: bitnami/kubectl:1.17.3
        command:
        - "bin/bash"
        - "-c"
        - "kubectl get pods -n my-namespace"
      restartPolicy: Never
$ cat job-svc-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: internal-kubectl
  namespace: my-namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: modify-pods
  namespace: my-namespace
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: modify-pods-to-sa
  namespace: my-namespace
subjects:
- kind: ServiceAccount
  name: internal-kubectl
roleRef:
  kind: Role
  name: modify-pods
  apiGroup: rbac.authorization.k8s.io
I created two pods just to add output to the log of get pods.
$ kubectl run curl --image=radial/busyboxplus:curl -i --tty --namespace my-namespace
the pod is running
$ kubectl run ubuntu --generator=run-pod/v1 --image=ubuntu -n my-namespace
pod/ubuntu created
Then I apply the job, ServiceAccount, Role and RoleBinding
$ kubectl get pods -n my-namespace
NAME                    READY   STATUS      RESTARTS   AGE
curl-69c656fd45-l5x2s   1/1     Running     1          88s
testing-stuff-ddpvf     0/1     Completed   0          13s
ubuntu                  0/1     Completed   3          63s
Now let's check the testing-stuff pod log to see if it logged the command output:
$ kubectl logs testing-stuff-ddpvf -n my-namespace
NAME                    READY   STATUS    RESTARTS   AGE
curl-69c656fd45-l5x2s   1/1     Running   1          76s
testing-stuff-ddpvf     1/1     Running   0          1s
ubuntu                  1/1     Running   3          51s
As you can see, it has succeeded running the job with the custom ServiceAccount.
Let me know if you have further questions about this case.
Create a service account like this:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: internal-kubectl
Create a ClusterRoleBinding using this:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: modify-pods-to-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: internal-kubectl
Now create the pod with the same config given in the documentation.
When you use kubectl from the pod for any operation, such as getting pods or creating roles and rolebindings, it will use the default service account. By default this service account doesn't have permission to perform those operations, so you need to create a service account, role, and rolebinding using a more privileged account. You should have a kubeconfig file with admin (or admin-like) privileges; use that kubeconfig with kubectl from outside the pod to create the service account, role, rolebinding, etc.
After that is done, create the pod specifying that service account, and you should be able to perform the operations defined in the role from within this pod using kubectl.
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  serviceAccountName: internal-kubectl

How to delete Kubernetes job automatically after job completion

I am running a kubernetes job on GKE and want to delete the job automatically after the job is completed.
Here is my configuration file for the job.
I set ttlSecondsAfterFinished: 0 but the job was not deleted automatically.
Am I missing something?
cluster / node version: 1.12.8-gke.10
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob
spec:
  # automatically clean up finished job
  ttlSecondsAfterFinished: 0
  template:
    metadata:
      name: myjob
    spec:
      containers:
      - name: myjob
        image: gcr.io/GCP_PROJECT/myimage:COMMIT_SHA
        command: ["bash"]
        args: ["deploy.sh"]
      # Do not restart containers after they exit
      restartPolicy: Never
It looks like this feature is still not available on GKE.
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
https://cloud.google.com/kubernetes-engine/docs/concepts/alpha-clusters#about_feature_stages
To ensure stability and production quality, normal GKE clusters only enable features that
are beta or higher. Alpha features are not enabled on normal clusters because they are not
production-ready or upgradeable.
It depends on how you created the job.
If you are using a CronJob, you can set spec.successfulJobsHistoryLimit and spec.failedJobsHistoryLimit to 0. This tells Kubernetes not to keep any previously finished jobs, as sketched below.
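A minimal sketch of that CronJob variant, reusing the container spec from the question above (the schedule is illustrative):
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: myjob
spec:
  schedule: "*/5 * * * *"
  # remove finished Jobs (and their pods) as soon as they complete
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 0
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: myjob
            image: gcr.io/GCP_PROJECT/myimage:COMMIT_SHA
            command: ["bash"]
            args: ["deploy.sh"]
          restartPolicy: Never
Keep in mind that failedJobsHistoryLimit: 0 also removes failed Jobs immediately, which makes debugging failures harder.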
If you are creating Jobs from plain YAML, you have to delete them manually. However, you can also set up a CronJob to execute a cleanup command every 5 minutes:
kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.succeeded==1)].metadata.name}')
It will delete all jobs with status succeeded.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: jp-runner
rules:
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "delete"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: jp-runner
subjects:
- kind: ServiceAccount
  name: sa-jp-runner
roleRef:
  kind: Role
  name: jp-runner
  apiGroup: ""
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: sa-jp-runner
---
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: clean-jobs
spec:
  concurrencyPolicy: Forbid
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: sa-jp-runner
          containers:
          - name: clean-jobs
            image: bitnami/kubectl:latest
            command:
            - /bin/sh
            - -c
            - kubectl delete jobs $(kubectl get jobs -o=jsonpath='{.items[?(@.status.succeeded==1)].metadata.name}')
          restartPolicy: Never
      backoffLimit: 0