How to deploy a StatefulSet when a PodSecurityPolicy is in place in Kubernetes

I am trying to play around with PodSecurityPolicies in Kubernetes so that pods can't be created if they run as the root user.
This is my PSP definition:
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: eks.restrictive
spec:
  hostNetwork: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: MustRunAsNonRoot
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
and this is my StatefulSet definition:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx # has to match .spec.template.metadata.labels
  serviceName: "nginx"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nginx # has to match .spec.selector.matchLabels
    spec:
      securityContext:
        # only takes integers
        runAsUser: 1000
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "my-storage-class"
      resources:
        requests:
          storage: 1Gi
When trying to create this StatefulSet I get:
create Pod web-0 in StatefulSet web failed error: pods "web-0" is forbidden: unable to validate against any pod security policy:
The error doesn't specify which policy I am violating, and since I am specifying that I want to run this as user 1000, I am not running as root (hence my understanding is that this StatefulSet pod definition does not violate any rule defined in the PSP). There is no USER specified in the Dockerfile used for this image.
The other weird part is that this works fine for standard pods (kind: Pod instead of kind: StatefulSet). For example, the following works just fine when the same PSP exists:
apiVersion: v1
kind: Pod
metadata:
  name: my-nodejs
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - name: my-node
    image: node
    ports:
    - name: web
      containerPort: 80
      protocol: TCP
    command:
    - /bin/sh
    - -c
    - |
      npm install http-server -g
      npx http-server
What am I missing / doing wrong?

You seem to have forgotten about binding this PSP to a service account.
You need to apply the following:
cat << EOF | kubectl apply -f-
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: psp-role
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames:
  - eks.restrictive
EOF
cat << EOF | kubectl apply -f-
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: psp-role-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: psp-role
subjects:
- kind: ServiceAccount
  name: default
  namespace: default
EOF
If you don't want to use the default service account you can create a separate service account and bind the role to it.
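For example, a minimal sketch assuming a dedicated service account named web-sa in the default namespace (the name is hypothetical, pick your own), bound to the same psp-role:
# create the service account and bind the ClusterRole to it
kubectl create serviceaccount web-sa --namespace default
kubectl create rolebinding psp-role-binding-web \
  --clusterrole=psp-role \
  --serviceaccount=default:web-sa \
  --namespace default
The StatefulSet would then set serviceAccountName: web-sa in spec.template.spec so its pods are validated against the policy through that account.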
Read more about it in the k8s documentation - pod-security-policy.

Related

Populating a Container's environment values with a mounted ConfigMap in Kubernetes

I'm currently learning Kubernetes and recently learnt about using ConfigMaps for a container's environment variables.
Let's say I have the following simple ConfigMap:
apiVersion: v1
data:
  MYSQL_ROOT_PASSWORD: password
kind: ConfigMap
metadata:
  creationTimestamp: null
  name: mycm
I know that a container of some deployment can consume this environment variable via:
kubectl set env deployment mydb --from=configmap/mycm
or by specifying it manually in the manifest like so:
containers:
- env:
  - name: MYSQL_ROOT_PASSWORD
    valueFrom:
      configMapKeyRef:
        key: MYSQL_ROOT_PASSWORD
        name: mycm
However, this isn't what I am after, since I'd have to manually change the environment variables each time the ConfigMap changes.
I am aware that mounting a ConfigMap to the Pod's volume allows for the auto-updating of ConfigMap values. I'm currently trying to find a way to set a Container's environment variables to those stored in the mounted config map.
So far I have the following YAML manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: mydb
  name: mydb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mydb
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: mydb
    spec:
      containers:
      - image: mariadb
        name: mariadb
        resources: {}
        args: ["export MYSQL_ROOT_PASSWORD=$(cat /etc/config/MYSQL_ROOT_PASSWORD)"]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: temp
      volumes:
      - name: config-volume
        configMap:
          name: mycm
status: {}
I'm attempting to set MYSQL_ROOT_PASSWORD to some temporary value and then update it to the mounted value as soon as the container starts, via args: ["export MYSQL_ROOT_PASSWORD=$(cat /etc/config/MYSQL_ROOT_PASSWORD)"]
As I somewhat expected, this didn't work, resulting in the following error:
/usr/local/bin/docker-entrypoint.sh: line 539: /export MYSQL_ROOT_PASSWORD=$(cat /etc/config/MYSQL_ROOT_PASSWORD): No such file or directory
I assume this is because the volume is mounted after the entrypoint. I tried adding a readiness probe to wait for the mount but this didn't work either:
readinessProbe:
  exec:
    command: ["sh", "-c", "test -f /etc/config/MYSQL_ROOT_PASSWORD"]
  initialDelaySeconds: 5
  periodSeconds: 5
Is there any easy way to achieve what I'm trying to do, or is it impossible?
So I managed to find a solution, with a lot of inspiration from this answer.
Essentially, what I did was create a sidecar container based on the alpine K8s image that mounts the ConfigMap and constantly watches for any changes, since the K8s API automatically updates the mounted ConfigMap when the ConfigMap is changed. This required the following script, watch_passwd.sh, which makes use of inotifywait to watch for changes and then uses the K8s API to roll out the changes accordingly:
update_passwd() {
    kubectl delete secret mysql-root-passwd > /dev/null 2>&1
    kubectl create secret generic mysql-root-passwd --from-file=/etc/config/MYSQL_ROOT_PASSWORD
}

update_passwd
while true
do
    inotifywait -e modify "/etc/config/MYSQL_ROOT_PASSWORD"
    update_passwd
    kubectl rollout restart deployment $1
done
The Dockerfile is then:
FROM docker.io/alpine/k8s:1.25.6
RUN apk update && apk add inotify-tools
COPY watch_passwd.sh .
After building the image (locally in this case) as mysidecar, I create the ServiceAccount, Role, and RoleBinding outlined here, adding rules for deployments so that they can be restarted by the sidecar.
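For reference, the local build step might look roughly like this (a sketch, assuming the Dockerfile and watch_passwd.sh sit in the current directory; the mysidecar tag just has to match the image name used in the manifest below):
# build the sidecar image locally under the tag "mysidecar"
docker build -t mysidecar .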
After this, I piece it all together to create the following YAML Manifest (note that imagePullPolicy is set to Never, since I created the image locally):
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: mydb
  name: mydb
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mydb
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: mydb
    spec:
      serviceAccountName: secretmaker
      containers:
      - image: mysidecar
        name: mysidecar
        imagePullPolicy: Never
        command:
        - /bin/sh
        - -c
        - |
          ./watch_passwd.sh $(DEPLOYMENT_NAME)
        env:
        - name: DEPLOYMENT_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.labels['app']
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config
      - image: mariadb
        name: mariadb
        resources: {}
        envFrom:
        - secretRef:
            name: mysql-root-passwd
      volumes:
      - name: config-volume
        configMap:
          name: mycm
status: {}
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: secretmaker
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  labels:
    app: mydb
  name: secretmaker
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["create", "get", "delete", "list"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    app: mydb
  name: secretmaker
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: secretmaker
subjects:
- kind: ServiceAccount
  name: secretmaker
  namespace: default
---
It all works as expected! Hopefully this is able to help someone out in the future. Also, if anybody comes across this and has a better solution please feel free to let me know :)

Kubernetes API to create a CRD using Minikube, with deployment pod in pending state

I have a problem with the Kubernetes API and a CRD. I am creating a deployment with a single nginx pod, which I would like to access over port 80 from a remote server as well as locally. The pod sits in a pending state, and when I run kubectl get pods I see that after around 40 seconds on average the pod disappears and a new nginx pod with a different name starts up; this seems to happen in a loop.
The error is
* W1214 23:27:19.542477 1 requestheader_controller.go:193] Unable to get configmap/extension-apiserver-authentication in kube-system. Usually fixed by 'kubectl create rolebinding -n kube-system ROLEBINDING_NAME --role=extension-apiserver-authentication-reader --serviceaccount=YOUR_NS:YOUR_SA'
I was following this article about service accounts and roles,
https://thorsten-hans.com/custom-resource-definitions-with-rbac-for-serviceaccounts#create-the-clusterrolebinding
I am not even sure I have created this correctly.
Do I even need to create the ServiceAccount_v1.yaml, PolicyRule_v1.yaml and ClusterRoleBinding.yaml files to resolve my error above?
All of my .yaml files for this are below.
CustomResourceDefinition_v1.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # name must match the spec fields below, and be in the form: <plural>.<group>
  name: webservers.stable.example.com
spec:
  # group name to use for REST API: /apis/<group>/<version>
  group: stable.example.com
  names:
    # kind is normally the CamelCased singular type. Your resource manifests use this.
    kind: WebServer
    # plural name to be used in the URL: /apis/<group>/<version>/<plural>
    plural: webservers
    # shortNames allow shorter string to match your resource on the CLI
    shortNames:
    - ws
    # singular name to be used as an alias on the CLI and for display
    singular: webserver
  # either Namespaced or Cluster
  scope: Cluster
  # list of versions supported by this CustomResourceDefinition
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              cronSpec:
                type: string
              image:
                type: string
              replicas:
                type: integer
    # Each version can be enabled/disabled by Served flag.
    served: true
    # One and only one version must be marked as the storage version.
    storage: true
Deployments_v1_apps.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  # Unique key of the Deployment instance
  name: nginx-deployment
spec:
  # 1 Pods should exist at all times.
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 100
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        # Apply this label to pods and default
        # the Deployment label selector to this value
        app: nginx
    spec:
      containers:
      # Run this image
      - image: nginx:1.14
        name: nginx
        ports:
        - containerPort: 80
      hostname: nginx
      nodeName: webserver01
      securityContext:
        runAsNonRoot: True
#status:
#  availableReplicas: 1
Ingress_v1_networking.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
spec:
  rules:
  - http:
      paths:
      - path: /
        pathType: Exact
        backend:
          resource:
            kind: nginx-service
            name: nginx-deployment
          #service:
          #  name: nginx
          #  port: 80
          #serviceName: nginx
          #servicePort: 80
Service_v1_core.yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
ServiceAccount_v1.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: user
  namespace: example
PolicyRule_v1.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: "example.com:webservers:reader"
rules:
- apiGroups: ["example.com"]
  resources: ["ResourceAll"]
  verbs: ["VerbAll"]
ClusterRoleBinding_v1.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: "example.com:webservers:cdreader-read"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: "example.com:webservers:reader"
subjects:
- kind: ServiceAccount
  name: user
  namespace: example

Unable to create ALBIngressController in Kubernetes environment

I’m creating an Application Load Balancer (AWSALBIngressController-v1.1.6) in the Kubernetes environment. While creating that, for some reason I’m getting the following error -
E1111 06:02:13.117566 1 controller.go:217] kubebuilder/controller "msg"="Reconciler error" "error"="failed to reconcile LB managed SecurityGroup: failed to reconcile managed LoadBalancer securityGroup: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors" "controller"="alb-ingress-controller" "request"={"Namespace":"sampleNamespace","Name":"alb-ingress"}
Following are the ALB config files for reference-
ALB Controller Deployment file-
---
# Application Load Balancer (ALB) Ingress Controller Deployment Manifest.
# This manifest details sensible defaults for deploying an ALB Ingress Controller.
# GitHub: https://github.com/kubernetes-sigs/aws-alb-ingress-controller
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: alb-ingress-controller
  name: alb-ingress-controller
  # Namespace the ALB Ingress Controller should run in. Does not impact which
  # namespaces it's able to resolve ingress resource for. For limiting ingress
  # namespace scope, see --watch-namespace.
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: alb-ingress-controller
  template:
    metadata:
      labels:
        app.kubernetes.io/name: alb-ingress-controller
    spec:
      containers:
      - name: alb-ingress-controller
        args:
        - --ingress-class=alb
        - --watch-namespace=sampleNamespace
        - --cluster-name=ckuster-xl
        - --aws-vpc-id=vpc-3d53e783
        - --aws-region=us-east-1
        - --default-tags=Name=tag1-xl-ALB,mgr=mgrname
        # newer version (v1.1.7) of the alb-ingress-controller image requires iam permission to wafv2
        # even when no wafv2 annotation is used
        image: docker.io/amazon/aws-alb-ingress-controller:v1.1.6
        resources:
          requests:
            cpu: 100m
            memory: 90Mi
          limits:
            cpu: 200m
            memory: 200Mi
      serviceAccountName: alb-ingress-controller
RBAC yaml file-
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/name: alb-ingress-controller
  name: alb-ingress-controller
rules:
- apiGroups:
  - ""
  - extensions
  resources:
  - configmaps
  - endpoints
  - events
  - ingresses
  - ingresses/status
  - services
  verbs:
  - create
  - get
  - list
  - update
  - watch
  - patch
- apiGroups:
  - ""
  - extensions
  resources:
  - nodes
  - pods
  - secrets
  - services
  - namespaces
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/name: alb-ingress-controller
  name: alb-ingress-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alb-ingress-controller
subjects:
- kind: ServiceAccount
  name: alb-ingress-controller
  namespace: kube-system
I tried a few solutions, like adding the --aws-api-debug and --aws-region args in the ALB deployment file and --auto-discover-base-arn and --auto-discover-default-role in the kiam server yaml file, but it didn't work.

Whitelisting sysctls for containers in Kubernetes Kind

I'm trying to deploy a container in a Kubernetes Kind cluster. The container I'm trying to deploy needs a couple of sysctl flags to be set.
The deployment fails with
forbidden sysctl: "kernel.msgmnb" not whitelisted
UPDATE
I have since added a PodSecurityPolicy as suggested, created a ClusterRole that grants usage of it, and bound the ClusterRole to the default service account:
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: sysctl-psp
spec:
  privileged: false  # Don't allow privileged pods!
  # The rest fills in some required fields.
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  runAsUser:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - '*'
  allowedUnsafeSysctls:
  - kernel.msg*
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: role_allow_sysctl
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['*']
  resourceNames:
  - sysctl-psp
- apiGroups: ['']
  resources:
  - replicasets
  - services
  - pods
  verbs: ['*']
- apiGroups: ['apps']
  resources:
  - deployments
  verbs: ['*']
The cluster role binding is like this:
kubectl -n <namespace> create rolebinding default:role_allow_sysctl --clusterrole=role_allow_sysctl --serviceaccount=<namespace>:default
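To double-check that the binding actually grants use of the policy, a quick check along these lines can help (a sketch; substitute your namespace for the placeholder):
kubectl auth can-i use podsecuritypolicy/sysctl-psp --as=system:serviceaccount:<namespace>:default -n <namespace>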
I am then trying to create a deployment and a service in the same namespace:
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
  labels:
    app: test-app
spec:
  selector:
    matchLabels:
      app: test-app
      tier: dev
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: test-app
        tier: dev
    spec:
      securityContext:
        sysctls:
        - name: kernel.msgmnb
          value: "6553600"
        - name: kernel.msgmax
          value: "1048800"
        - name: kernel.msgmni
          value: "32768"
        - name: kernel.sem
          value: "128 32768 128 4096"
      containers:
      - image: registry:5000/<container>:1.0.0
        name: test-app
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 10666
          name: port-1
---
The problem remains the same, however: I'm getting multiple pods spawned, all failing with the same message: forbidden sysctl: "kernel.msgmnb" not whitelisted
I don't think the --allowed-unsafe-sysctls kubelet flag can work with Kind nodes, because Kind nodes are themselves containers, whose sysctl filesystem is read-only.
My workaround is to change the needed sysctl values on my host machine. Kind nodes (and in turn their containers) will reuse these values.
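For instance, a minimal sketch of that workaround on a Linux host, reusing the values from the deployment above (adjust them to what your container actually needs):
# set the message-queue and semaphore limits on the host; Kind node containers inherit them
sudo sysctl -w kernel.msgmnb=6553600
sudo sysctl -w kernel.msgmax=1048800
sudo sysctl -w kernel.msgmni=32768
sudo sysctl -w kernel.sem="128 32768 128 4096"
With the values already present on the host, the sysctls block can likely be dropped from the pod's securityContext.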

Kubernetes puzzle: Populate environment variable from file (mounted volume)

I have a Pod or Job yaml spec file (I can edit it) and I want to launch it from my local machine (e.g. using kubectl create -f my_spec.yaml)
The spec declares a volume mount. There will be a file in that volume that I want to use as the value for an environment variable.
I want to make it so that the volume file's contents end up in the environment variable (without me jumping through hoops by somehow "downloading" the file to my local machine and inserting it in the spec).
P.S. It's obvious how to do that if you have control over the command of the container. But when launching an arbitrary image, I have no control over the command attribute as I do not know it.
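(For completeness, the "control over the command" case mentioned above might look roughly like this; a sketch where my-app is a hypothetical stand-in for the image's real entrypoint, which in my situation is unknown:)
command:
- /bin/sh
- -c
- export my_var="$(cat /mnt/my_var_value.txt)" && exec my-app
The spec I actually want to launch looks like this: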
apiVersion: batch/v1
kind: Job
metadata:
  generateName: puzzle
spec:
  template:
    spec:
      containers:
      - name: main
        image: arbitrary-image
        env:
        - name: my_var
          valueFrom: <Contents of /mnt/my_var_value.txt>
        volumeMounts:
        - name: my-vol
          mountPath: /mnt
      volumes:
      - name: my-vol
        persistentVolumeClaim:
          claimName: my-pvc
You can create a deployment with a kubectl endless loop which will constantly poll the volume and update a ConfigMap from it. After that you can mount the created ConfigMap into your pod. It's a little bit hacky, but it will work and update your ConfigMap automatically. The only requirement is that the PV must be ReadWriteMany or ReadOnlyMany (but in that case you can mount it in read-only mode to all pods).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cm-creator
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: cm-creator
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["create", "update", "get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cm-creator
  namespace: default
subjects:
- kind: User
  name: system:serviceaccount:default:cm-creator
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: cm-creator
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cm-creator
  namespace: default
  labels:
    app: cm-creator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cm-creator
  template:
    metadata:
      labels:
        app: cm-creator
    spec:
      serviceAccountName: cm-creator
      containers:
      - name: cm-creator
        image: bitnami/kubectl
        command:
        - /bin/bash
        - -c
        args:
        - |
          while true; do
            kubectl create cm myconfig --from-file=my_var=/mnt/my_var_value.txt --dry-run=client -o yaml | kubectl apply -f -;
            sleep 60;
          done
        volumeMounts:
        - name: my-vol
          mountPath: /mnt
          readOnly: true
      volumes:
      - name: my-vol
        persistentVolumeClaim:
          claimName: my-pvc
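Once the cm-creator deployment is running, the Job from the question can read the value from the generated ConfigMap instead of the raw file, roughly like this (a sketch; myconfig and my_var match the names used in the loop above):
env:
- name: my_var
  valueFrom:
    configMapKeyRef:
      name: myconfig
      key: my_var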