"Operators make use of the controller pattern"
What is controller pattern in k8s operators ?
The controller pattern has been summarized in these 3 sentences:
A controller tracks at least one Kubernetes resource type. These objects have a spec field that represents the desired state. The controller(s) for that resource are responsible for making the current state come closer to that desired state.
Ref: https://kubernetes.io/docs/concepts/architecture/controller/#controller-pattern
Basically, the Kubernetes controllers watches for the respective resource in a control loop. Once, it find a resource, it reads the desired state from the spec and do some work to make the cluster state same as the desired state.
For example, you have created a Deployment where you have specified in the spec that you want 1 pod that runs your application. Now, Deployment controller sees this and create 1 pod in the cluster to match your desired state. Now, if you update the Deployment spec and say that you now want 2 pods. The Deployment controller will see this change as it is always watching for Deployment and create another pod in the cluster to match the desired state.
You can find more details about these in the following resources:
Inside of Kubernetes Controller
A deep dive into Kubernetes controllers
The Job controller is an example of a Kubernetes built-in controller. Built-in controllers manage state by interacting with the cluster API server.
Ref: https://kubernetes.io/docs/concepts/architecture/controller/#control-via-api-server
In my understanding, you create a xxx-operator, it means you add a Kind for your k8s, the kubectl explain <kind-name> can show the Kind msg you've given.
So, use the xx-operator, you can set your app with a simpler yaml. You can also use tools like Kustomize or Helm based on this way.
e.g. :
run this, at first:
kubectl explain ZookeeperCluster
you will get an err.
you can install a helm chart pravega/zookeeper-operator:
helm install --create-namespace -n op-pravega-zk -- zookeeper-operator
pravega/zookeeper-operator
you will get out msg after installed but it's helpless:
NAME: zookeeper-operator
LAST DEPLOYED: Thu Nov 25 11:20:31 2021
NAMESPACE: op-pravega-zk
STATUS: deployed
REVISION: 1
TEST SUITE: None
but, now, if you run this again:
kubectl explain ZookeeperCluster
you will get these:
KIND: ZookeeperCluster
VERSION: zookeeper.pravega.io/v1beta1
DESCRIPTION:
ZookeeperCluster is the Schema for the zookeeperclusters API
FIELDS:
apiVersion <string>
APIVersion defines the versioned schema of this representation of an
object. Servers should convert recognized schemas to the latest internal
value, and may reject unrecognized values. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
kind <string>
Kind is a string value representing the REST resource this object
represents. Servers may infer this from the endpoint the client submits
requests to. Cannot be updated. In CamelCase. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
metadata <Object>
Standard object's metadata. More info:
https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata
spec <Object>
ZookeeperClusterSpec defines the desired state of ZookeeperCluster
status <Object>
ZookeeperClusterStatus defines the observed state of ZookeeperCluster
now, you can use it like this yaml ( ... means ellipsis) :
apiVersion: "zookeeper.pravega.io/v1beta1"
kind: "ZookeeperCluster"
metadata:
name: zookeeper
namespace: pravega-zk
...
spec:
replicas: 3
image:
repository: pravega/zookeeper
...
pod:
serviceAccountName: zookeeper
storageType: persistence
persistence:
reclaimPolicy: Delete
spec:
storageClassName: local-hostpath
...
or like this ( a ZookeeperCluster Kind in this helm chart ) :
helm install --create-namespace -n pravega-zk --set persistence.storageClassName=local-hostpath -- zookeeper pravega/zookeeper
other e.g. like:
Create a Zookeeper cluster.
kubectl create --namespace zookeeper -f - <<EOF
apiVersion: zookeeper.pravega.io/v1beta1
kind: ZookeeperCluster
metadata:
name: zookeeper
namespace: zookeeper
spec:
replicas: 1
EOF
Ref: https://banzaicloud.com/docs/supertubes/kafka-operator/install-kafka-operator/
btw, after you install a operator, you can use kubectl describe clusterrole zookeeper-operator to show some message, then run kubectl api-resources to find it, then you may find the name of this kind ....
I'm adding an Ingress as follows:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: cheddar
spec:
rules:
- host: cheddar.213.215.191.78.nip.io
http:
paths:
- backend:
service:
name: cheddar
port:
number: 80
path: /
pathType: ImplementationSpecific
but the logs complain:
W0205 15:14:07.482439 1 warnings.go:67] extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
time="2021-02-05T15:14:07Z" level=info msg="Updated ingress status" namespace=default ingress=cheddar
W0205 15:18:19.104225 1 warnings.go:67] networking.k8s.io/v1beta1 IngressClass is deprecated in v1.19+, unavailable in v1.22+; use networking.k8s.io/v1 IngressClassList
Why? What's the correct yaml to use?
I'm currently on microk8s 1.20
I have analyzed you issue and came to the following conclusions:
The Ingress will work and these Warnings you see are just to inform you about the available api versioning. You don't have to worry about this. I've seen the same Warnings:
#microk8s:~$ kubectl describe ing
Warning: extensions/v1beta1 Ingress is deprecated in v1.14+, unavailable in v1.22+; use networking.k8s.io/v1 Ingress
As for the "why" this is happening even when you use apiVersion: networking.k8s.io/v1, I have found the following explanation:
This is working as expected. When you create an ingress object, it can
be read via any version (the server handles converting into the
requested version). kubectl get ingress is an ambiguous request,
since it does not indicate what version is desired to be read.
When an ambiguous request is made, kubectl searches the discovery docs
returned by the server to find the first group/version that contains
the specified resource.
For compatibility reasons, extensions/v1beta1 has historically been
preferred over all other api versions. Now that ingress is the only
resource remaining in that group, and is deprecated and has a GA
replacement, 1.20 will drop it in priority so that kubectl get ingress would read from networking.k8s.io/v1, but a 1.19 server
will still follow the historical priority.
If you want to read a specific version, you can qualify the get
request (like kubectl get ingresses.v1.networking.k8s.io ...) or can
pass in a manifest file to request the same version specified in the
file (kubectl get -f ing.yaml -o yaml)
Long story short: despite the fact of using the proper apiVersion, the deprecated one is still being seen as the the default one and thus generating the Warning you experience.
I also see that changes are still being made recently so I assume that it is still being worked on.
I had the same issue and was unable to update the k8s cluster which was subscribed to release channel.
One of the reasons for this log warning generation is the ClusterRole definition of external-dns. The external-dns keep querying the ingresses in k8s cluster as per the rules defined in the Cluster role
- apiGroups: ["extensions", "networking.k8s.io"]
resources: ["ingresses"]
verbs: ["get","watch","list"]
Found in the helm chart in here
It queries the old extensions of ingress as well which keeps on generating those logs. Please update the cert-manager.
I can easily find when a ConfigMap was created but how do you find out when a ConfigMap was last patched (edited)?
That information does not appear in any of the metadata when I export as YAML.
Deployments have a lastUpdateTime field that we can see, ConfigMaps don't appear to have the same, is that true?
status:
availableReplicas: 2
conditions:
- lastTransitionTime: "2020-01-16T17:48:17Z"
lastUpdateTime: "2020-01-16T17:48:17Z"
By current design config maps are not versioned and so there is no preserving the history and at a given point in time there can be only one version of a ConfigMap in kubernetes. Deployments are versioned and needs history and timestamps for rollout and rollback.
As #ArghyaSadhu said, configmaps are not versioned in kubernetes.
If you want to take control of your configmaps versions, you could create a simple entry in the data field like:
apiVersion: v1
kind: ConfigMap
data:
config.version: "001" <== HERE
...
and increase/modify the version number of this specific data when you do some adjustment or just let your CI/CD process modify it for you.
But it is the only for your control, if you want to implement 'rollback' functionality to your configmaps this must be done using some script from your side.
How do I automatically restart Kubernetes pods and pods associated with deployments when their configmap is changed/updated?
I know there's been talk about the ability to automatically restart pods when a config maps changes but to my knowledge this is not yet available in Kubernetes 1.2.
So what (I think) I'd like to do is a "rolling restart" of the deployment resource associated with the pods consuming the config map. Is it possible, and if so how, to force a rolling restart of a deployment in Kubernetes without changing anything in the actual template? Is this currently the best way to do it or is there a better option?
The current best solution to this problem (referenced deep in https://github.com/kubernetes/kubernetes/issues/22368 linked in the sibling answer) is to use Deployments, and consider your ConfigMaps to be immutable.
When you want to change your config, create a new ConfigMap with the changes you want to make, and point your deployment at the new ConfigMap. If the new config is broken, the Deployment will refuse to scale down your working ReplicaSet. If the new config works, then your old ReplicaSet will be scaled to 0 replicas and deleted, and new pods will be started with the new config.
Not quite as quick as just editing the ConfigMap in place, but much safer.
Signalling a pod on config map update is a feature in the works (https://github.com/kubernetes/kubernetes/issues/22368).
You can always write a custom pid1 that notices the confimap has changed and restarts your app.
You can also eg: mount the same config map in 2 containers, expose a http health check in the second container that fails if the hash of config map contents changes, and shove that as the liveness probe of the first container (because containers in a pod share the same network namespace). The kubelet will restart your first container for you when the probe fails.
Of course if you don't care about which nodes the pods are on, you can simply delete them and the replication controller will "restart" them for you.
The best way I've found to do it is run Reloader
It allows you to define configmaps or secrets to watch, when they get updated, a rolling update of your deployment is performed. Here's an example:
You have a deployment foo and a ConfigMap called foo-configmap. You want to roll the pods of the deployment every time the configmap is changed. You need to run Reloader with:
kubectl apply -f https://raw.githubusercontent.com/stakater/Reloader/master/deployments/kubernetes/reloader.yaml
Then specify this annotation in your deployment:
kind: Deployment
metadata:
annotations:
configmap.reloader.stakater.com/reload: "foo-configmap"
name: foo
...
Helm 3 doc page
Often times configmaps or secrets are injected as configuration files in containers. Depending on the application a restart may be required should those be updated with a subsequent helm upgrade, but if the deployment spec itself didn't change the application keeps running with the old configuration resulting in an inconsistent deployment.
The sha256sum function can be used together with the include function to ensure a deployments template section is updated if another spec changes:
kind: Deployment
spec:
template:
metadata:
annotations:
checksum/config: {{ include (print $.Template.BasePath "/secret.yaml") . | sha256sum }}
[...]
In my case, for some reasons, $.Template.BasePath didn't work but $.Chart.Name does:
spec:
replicas: 1
template:
metadata:
labels:
app: admin-app
annotations:
checksum/config: {{ include (print $.Chart.Name "/templates/" $.Chart.Name "-configmap.yaml") . | sha256sum }}
You can update a metadata annotation that is not relevant for your deployment. it will trigger a rolling-update
for example:
spec:
template:
metadata:
annotations:
configmap-version: 1
If k8>1.15; then doing a rollout restart worked best for me as part of CI/CD with App configuration path hooked up with a volume-mount. A reloader plugin or setting restartPolicy: Always in deployment manifest YML did not work for me. No application code changes needed, worked for both static assets as well as Microservice.
kubectl rollout restart deployment/<deploymentName> -n <namespace>
Had this problem where the Deployment was in a sub-chart and the values controlling it were in the parent chart's values file. This is what we used to trigger restart:
spec:
template:
metadata:
annotations:
checksum/config: {{ tpl (toYaml .Values) . | sha256sum }}
Obviously this will trigger restart on any value change but it works for our situation. What was originally in the child chart would only work if the config.yaml in the child chart itself changed:
checksum/config: {{ include (print $.Template.BasePath "/config.yaml") . | sha256sum }}
Consider using kustomize (or kubectl apply -k) and then leveraging it's powerful configMapGenerator feature. For example, from: https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/configmapgenerator/
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
# Just one example of many...
- name: my-app-config
literals:
- JAVA_HOME=/opt/java/jdk
- JAVA_TOOL_OPTIONS=-agentlib:hprof
# Explanation below...
- SECRETS_VERSION=1
Then simply reference my-app-config in your deployments. When building with kustomize, it'll automatically find and update references to my-app-config with an updated suffix, e.g. my-app-config-f7mm6mhf59.
Bonus, updating secrets: I also use this technique for forcing a reload of secrets (since they're affected in the same way). While I personally manage my secrets completely separately (using Mozilla sops), you can bundle a config map alongside your secrets, so for example in your deployment:
# ...
spec:
template:
spec:
containers:
- name: my-app
image: my-app:tag
envFrom:
# For any NON-secret environment variables. Name is automatically updated by Kustomize
- configMapRef:
name: my-app-config
# Defined separately OUTSIDE of Kustomize. Just modify SECRETS_VERSION=[number] in the my-app-config ConfigMap
# to trigger an update in both the config as well as the secrets (since the pod will get restarted).
- secretRef:
name: my-app-secrets
Then, just add a variable like SECRETS_VERSION into your ConfigMap like I did above. Then, each time you change my-app-secrets, just increment the value of SECRETS_VERSION, which serves no other purpose except to trigger a change in the kustomize'd ConfigMap name, which should also result in a restart of your pod. So then it becomes:
I also banged my head around this problem for some time and wished to solve this in an elegant but quick way.
Here are my 20 cents:
The answer using labels as mentioned here won't work if you are updating labels. But would work if you always add labels. More details here.
The answer mentioned here is the most elegant way to do this quickly according to me but had the problem of handling deletes. I am adding on to this answer:
Solution
I am doing this in one of the Kubernetes Operator where only a single task is performed in one reconcilation loop.
Compute the hash of the config map data. Say it comes as v2.
Create ConfigMap cm-v2 having labels: version: v2 and product: prime if it does not exist and RETURN. If it exists GO BELOW.
Find all the Deployments which have the label product: prime but do not have version: v2, If such deployments are found, DELETE them and RETURN. ELSE GO BELOW.
Delete all ConfigMap which has the label product: prime but does not have version: v2 ELSE GO BELOW.
Create Deployment deployment-v2 with labels product: prime and version: v2 and having config map attached as cm-v2 and RETURN, ELSE Do nothing.
That's it! It looks long, but this could be the fastest implementation and is in principle with treating infrastructure as Cattle (immutability).
Also, the above solution works when your Kubernetes Deployment has Recreate update strategy. Logic may require little tweaks for other scenarios.
How do I automatically restart Kubernetes pods and pods associated
with deployments when their configmap is changed/updated?
If you are using configmap as Environment you have to use the external option.
Reloader
Kube watcher
Configurator
Kubernetes auto-reload the config map if it's mounted as volume (If subpath there it won't work with that).
When a ConfigMap currently consumed in a volume is updated, projected
keys are eventually updated as well. The kubelet checks whether the
mounted ConfigMap is fresh on every periodic sync. However, the
kubelet uses its local cache for getting the current value of the
ConfigMap. The type of the cache is configurable using the
ConfigMapAndSecretChangeDetectionStrategy field in the
KubeletConfiguration struct. A ConfigMap can be either propagated by
watch (default), ttl-based, or by redirecting all requests directly to
the API server. As a result, the total delay from the moment when the
ConfigMap is updated to the moment when new keys are projected to the
Pod can be as long as the kubelet sync period + cache propagation
delay, where the cache propagation delay depends on the chosen cache
type (it equals to watch propagation delay, ttl of cache, or zero
correspondingly).
Official document : https://kubernetes.io/docs/concepts/configuration/configmap/#mounted-configmaps-are-updated-automatically
ConfigMaps consumed as environment variables are not updated automatically and require a pod restart.
Simple example Configmap
apiVersion: v1
kind: ConfigMap
metadata:
name: config
namespace: default
data:
foo: bar
POD config
spec:
containers:
- name: configmaptestapp
image: <Image>
volumeMounts:
- mountPath: /config
name: configmap-data-volume
ports:
- containerPort: 8080
volumes:
- name: configmap-data-volume
configMap:
name: config
Example : https://medium.com/#harsh.manvar111/update-configmap-without-restarting-pod-56801dce3388
Adding the immutable property to the config map totally avoids the problem. Using config hashing helps in a seamless rolling update but it does not help in a rollback. You can take a look at this open-source project - 'Configurator' - https://github.com/gopaddle-io/configurator.git .'Configurator' works by the following using the custom resources :
Configurator ties the deployment lifecycle with the configMap. When
the config map is updated, a new version is created for that
configMap. All the deployments that were attached to the configMap
get a rolling update with the latest configMap version tied to it.
When you roll back the deployment to an older version, it bounces to
configMap version it had before doing the rolling update.
This way you can maintain versions to the config map and facilitate rolling and rollback to your deployment along with the config map.
Another way is to stick it into the command section of the Deployment:
...
command: [ "echo", "
option = value\n
other_option = value\n
" ]
...
Alternatively, to make it more ConfigMap-like, use an additional Deployment that will just host that config in the command section and execute kubectl create on it while adding an unique 'version' to its name (like calculating a hash of the content) and modifying all the deployments that use that config:
...
command: [ "/usr/sbin/kubectl-apply-config.sh", "
option = value\n
other_option = value\n
" ]
...
I'll probably post kubectl-apply-config.sh if it ends up working.
(don't do that; it looks too bad)
I use the Kubernetes ServiceAccount plugin to automatically inject a ca.crt and token in to my pods. This is useful for applications such as kube2sky which need to access the API Server.
However, I run many hundreds of other pods that don't need this token. Is there a way to stop the ServiceAccount plugin from injecting the default-token in to these pods (or, even better, have it off by default and turn it on explicitly for a pod)?
As of Kubernetes 1.6+ you can now disable automounting API credentials for a particular pod as stated in the Kubernetes Service Accounts documentation
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
serviceAccountName: build-robot
automountServiceAccountToken: false
...
Right now there isn't a way to enable a service account for some pods but not others, although you can use ABAC to for some service accounts to restrict access to the apiserver.
This issue is being discussed in https://github.com/kubernetes/kubernetes/issues/16779 and I'd encourage you to add your use can to that issue and see when it will be implemented.