Google Cloud Kubernetes Persistent Volume Claim error in deployment YAML

I have a persistent volume claim file that Buildkite previously applied without problems in the deployment stage. Recently the build has started failing with this error:
error: error validating "kube/common/01-redis-volume-claim.yml": error validating data: field
spec.dataSource for v1.PersistentVolumeClaimSpec is required; if you choose to ignore these
errors, turn validation off with --validate=false
I've seen this issue crop up twice recently, and the immediate fix is to add the missing field (spec.dataSource) and set it to null.
My question is, if it was absent in the first instance, then will setting it to null be any different than what it was previously?
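For reference, a minimal sketch of the workaround described above. The claim name, access mode, and size are assumptions made up for illustration; only the file path and the dataSource field come from the error message.

```yaml
# Hypothetical contents of kube/common/01-redis-volume-claim.yml.
# Name, access mode, and size are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  # Explicit null: the field is present for the validator,
  # but no source volume is named.
  dataSource: null
```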

Based on the documentation, spec.dataSource should have:
name: existing-src-pvc-name
kind: PersistentVolumeClaim
In my opinion, all you need to do is add name and kind to your YAML file, and the error should go away.
My question is, if it was absent in the first instance, then will setting it to null be any different than what it was previously?
To answer this question: as far as I can tell, this is happening because you are not creating a new PVC but are probably cloning an existing one.
The volume cloning feature was added to support CSI volume plugins only. For details, see volume cloning.
The CSI Volume Cloning feature adds support for specifying existing PVCs in the dataSource field to indicate that a user would like to clone a volume.
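To make the cloning case concrete, here is a rough sketch of a claim that uses dataSource to clone another claim. The claim name, storage class, and size are hypothetical; existing-src-pvc-name is the placeholder used above.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data-clone          # hypothetical name for the new claim
spec:
  storageClassName: csi-cloning   # hypothetical; needs a CSI driver that supports cloning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi               # must be at least the size of the source volume
  dataSource:
    kind: PersistentVolumeClaim
    name: existing-src-pvc-name   # source claim, in the same namespace
```

Note that filling in name and kind asks for a copy of an existing volume. As far as I understand the API, dataSource is optional, so a claim that omits it or sets it to null should simply request a new empty volume; it is worth confirming that against your cluster version.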

Related

How to install keycloak operator on IBM Cloud Kubernetes Service?

The operator is https://operatorhub.io/operator/keycloak-operator version 11.0.0.
The cluster is Kubernetes version 1.18.12.
I was able to follow the steps from OperatorHub.io to install the Operator Lifecycle Manager and the Keycloak "OperatorGroup" and "Subscription".
It took much longer than I was expecting (maybe 20 minutes?), but eventually the corresponding "ClusterServiceVersion" was created.
However, now when I try to use it by creating the following resource, it doesn't seem to be doing anything at all:
apiVersion: keycloak.org/v1alpha1
kind: Keycloak
metadata:
  name: example-keycloak
  namespace: keycloak
  labels:
    app: sso
spec:
  instances: 1
  externalAccess:
    enabled: true
  extensions:
    - https://github.com/aerogear/keycloak-metrics-spi/releases/download/1.0.4/keycloak-metrics-spi-1.0.4.jar
It accepts the new resource, so I know the CRD is in place. The documentation states that it should create a stateful set, an ingress, and more, but it just doesn't seem to create anything.
I checked the cluster logs and this is the error that is jumping out to me:
olm-operator ERROR controllers.operator Could not update Operator status {"request": "/keycloak-operator.my-keycloak-operator", "error": "Operation cannot be fulfilled on operators.operators.coreos.com \"keycloak-operator.my-keycloak-operator\": the object has been modified; please apply your changes to the latest version and try again"}
I have quite a bit of experience with plain kubernetes, but I'm brand new to "operators" and so I'm really not sure where to look next wrt what might be going wrong.
Any hints/suggestions/explanations?
UPDATE: I was creating the keycloak resource in a namespace OTHER than the one I installed the operator into. Since it allowed me to create the custom resource (Kind: Keycloak) in this namespace, I thought this was supported. However, when I created the keycloak resource in the same namespace where the operator was installed (my-keycloak-operator), it actually tried to do something. It's still failing to bring up the pod, mind you, but at least it's trying to do something.
Will leave this question open for a bit to see if the "Could not update Operator status" is something I should be concerned about or not...
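For anyone following along, the change described in the update amounts to something like the sketch below: the same custom resource, with metadata.namespace set to the namespace the operator was installed into. The extensions list is left out here for brevity.

```yaml
apiVersion: keycloak.org/v1alpha1
kind: Keycloak
metadata:
  name: example-keycloak
  namespace: my-keycloak-operator   # same namespace as the operator itself
  labels:
    app: sso
spec:
  instances: 1
  externalAccess:
    enabled: true
```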
It looks like the operator and/or the components that it wants to bring up cannot write (POST/PUT) to the kube-apiserver.
From what you describe, it appears that the first time, when the operator and the resource were in different namespaces, it just didn't have permission to bring up anything at all. The second time, in the right namespace, it looks like the operator was able to talk to the kube-apiserver but the components that it's bringing up (Keycloak, etc.) are not.
I would check the logs on the kube-apiserver (control plane) to see if you have any unauthorized requests, and also check the logs of the components (pods, deployments, etc.) that the operator is trying to bring up.
If you have unauthorized requests, you may have to manually update the RBAC rules. Finally, I would check with IBM Cloud to see what specific permissions its K8s control plane has that could be preventing applications from talking to it (the kube-apiserver).
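If the logs do show forbidden requests, a manual RBAC update might look roughly like the sketch below. Everything here is an assumption for illustration: the Role name, the keycloak-operator service account name, and the resources and verbs listed. The real values have to come from whatever the apiserver logs report as denied.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: keycloak-operator-extra        # hypothetical
  namespace: my-keycloak-operator
rules:
  - apiGroups: ["apps"]
    resources: ["statefulsets"]        # replace with the resources that were denied
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: keycloak-operator-extra        # hypothetical
  namespace: my-keycloak-operator
subjects:
  - kind: ServiceAccount
    name: keycloak-operator            # assumed service account name
    namespace: my-keycloak-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: keycloak-operator-extra
```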
✌️

How do you reuse a volume in Kubernetes?

Let's say that you wanted to create a Jenkins Deployment. As Jenkins uses a local XML file for configuration and state, you would want to create a PersistentVolume so that your data could be saved across Pod evictions and Deployment deletions. I know that the Retain reclaimPolicy will result in the data persisting on the detached PersistentVolume, but the documentation says this is just so that you can manually reclaim the data on it later on, and seems to say nothing about the volume being automatically reused if its mounting Pods are ever brought back up.
It is difficult to articulate what I am even trying to ask, so forgive me if this seems like a nebulous question, but:
If you delete the Jenkins deployment, then later decide to recreate it where you left off, how do you get it to re-mount that exact PersistentVolume on which that specific XML configuration is still stored?
Is this a case where you would want to use a StatefulSet? It seems like, in this case, Jenkins would be considered "stateful."
Is the PersistentVolumeClaim the basis of a volume's "identity"? In other words, is the expectation for the PersistentVolumeClaim to be the stable identifier by which an application can bind to a specific volume with specific data on it?
You can use StatefulSets. Scaling down deletes the pod but leaves the claims alone. Persistent volume claims can only be deleted manually, which is what releases the underlying PersistentVolume.
A scale-up can reattach the same claim, along with the bound PersistentVolume and its contents, to the newly created pod instance.
If you have accidentally scaled down a StatefulSet, you can scale up again and the new pod will have the same persisted state.
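A minimal sketch of that approach, assuming the public jenkins/jenkins:lts image, its usual /var/jenkins_home data directory, a 10Gi volume, and a headless Service named jenkins that already exists (none of these come from the question):

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jenkins
spec:
  serviceName: jenkins            # assumed headless Service, defined elsewhere
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
  # One claim is created per pod and survives scale-down and pod deletion.
  volumeClaimTemplates:
    - metadata:
        name: jenkins-home
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

With these names the generated claim would be jenkins-home-jenkins-0, and a pod recreated with the same ordinal binds to it again.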
If you delete the Jenkins deployment, then later decide to recreate it
where you left off, how do you get it to re-mount that exact
PersistentVolume on which that specific XML configuration is still
stored?
By using the PersistentVolumeClaim that was bound to that PersistentVolume, assuming the PersistentVolumeClaim and its PersistentVolume haven't been deleted. You should be able to try it :-)
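As a rough illustration, the claim can live in its own manifest and the Deployment can refer to it by name. The names, image, mount path, and size below are assumptions, not taken from the question.

```yaml
# The claim is created once and kept; deleting the Deployment does not delete it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: jenkins-home              # hypothetical claim name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:lts
          volumeMounts:
            - name: jenkins-home
              mountPath: /var/jenkins_home
      volumes:
        - name: jenkins-home
          persistentVolumeClaim:
            claimName: jenkins-home   # re-mounts whatever volume this claim is bound to
```

Deleting and re-applying only the Deployment part should bring Jenkins back on the same volume, as long as the claim itself is left in place.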
Is this a case where you would want to use a StatefulSet? It seems
like, in this case, Jenkins would be considered "stateful."
Yes, you could use StatefulSet for its stable storage. With no need for persistent identities and stable hostnames, though, I'm not sure of the benefits compared to a master and dynamic slaves Deployment. Unless the idea is to partition the work (e.g. "areas" of the source control repo) across several Jenkins masters and their slaves...
Is the PersistentVolumeClaim the basis of a volume's "identity"? In
other words, is the expectation for the PersistentVolumeClaim to be
the stable identifier by which an application can bind to a specific
volume with specific data on it?
Yes (see my answer to the first question) - the PersistentVolumeClaim is like a stable identifier by which an application can mount the specific volume the claim is bound to.

Recommended way to provide kubernetes metrics-server with its key pair?

Are there any known issues with metrics-server and configmap? I've tried a zillion things to get it to work but have been unable to. If in my deployment manifest I simply replace "image: k8s.gcr.io/metrics-server-amd64:v0.3.3" with "image: docker.io/alpine", it can read configmap files. But metrics-server throws the following error:
"no such file or directory" when attempting to reference a configmap file. This makes me suspect the problem is in metrics-server rather than the k8s environment.
My purpose in doing this is to make the server's public and private keys (--tls-cert-file) available to the container. If a configmap is not the recommended way to provide the metrics-server with its keys, please let me know what the recommended way is. (In this case I would still be curious why metrics-server cannot mount configmap volumes.)
I figured this out. The problem was a combination of a misleading error message from metrics-server and zero insight into whether or not the container was able to see the files in the volume. In fact the files were there, but the error message made me think they weren't. If you pass "--tls-cert-file" without also giving "--tls-private-key-file" (which I was doing just for testing), the error message is "No such file or directory" instead of something more informative, like "Please specify both options together." The metrics-server developers need to change this and save "No such file" for cases when the file actually does not exist or cannot be opened for reading.
Since I thought there was no file, I wanted to check from within the container, but there was no way to do so because the image only has one binary and no shell. Running "docker export" on the non-running container (not running because metrics-server would bomb out with the error) revealed an empty volume, because kubelet had unmounted the volumes when stopping the container.
The kubelet logs showed everything was OK with the volume, and I could see the files under /var/lib/kubelet/pods/…/. But all indications were that something was wrong, because I had no insight into what the container itself was seeing.
Once I started passing both the command line options for the certs, everything was working.
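A sketch of what the working pod spec might look like with both flags passed together. The mount path, volume name, and Secret name are hypothetical, and mounting from a Secret is my own swap for the configmap used in the question, since a private key is involved.

```yaml
# Fragment of a Deployment's pod spec (spec.template.spec), not a complete manifest.
containers:
  - name: metrics-server
    image: k8s.gcr.io/metrics-server-amd64:v0.3.3
    args:
      # Both flags are given together; passing only one produced the
      # misleading "No such file or directory" error described above.
      - --tls-cert-file=/etc/metrics-server/tls/tls.crt
      - --tls-private-key-file=/etc/metrics-server/tls/tls.key
    volumeMounts:
      - name: tls
        mountPath: /etc/metrics-server/tls   # hypothetical path
        readOnly: true
volumes:
  - name: tls
    secret:
      secretName: metrics-server-tls         # hypothetical Secret with tls.crt and tls.key
```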

Kubernetes: security context and IPC_LOCK capability

I'm trying to install a helm package that needs IPC_LOCK capability. So, I'm getting this message:
Error creating: pods "pod..." is forbidden: unable to validate against any security context constraint: [capabilities.add: Invalid value: "IPC_LOCK": capability may not be added capabilities.add: Invalid value: "IPC_LOCK": capability may not be added]
You can see the DeploymentConfig here.
I'm installing Vault using a Helm chart, so I'm not able to change DeploymentConfig.
I guess the only way to get it would be to use a service account with an associated SCC that allows the container to add this capability.
How could I solve that?
I haven't worked with Vault yet, so my answer might not be accurate.
But I think you can remove that capability and disable mlock in the Vault config:
https://www.vaultproject.io/docs/configuration/index.html#disable_mlock
Having said that, I don't think Kubernetes supports memory swapping anyway (someone needs to verify this), so a syscall to mlock might not be needed.
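As a sketch only: disable_mlock is a documented Vault configuration option, but how a given Helm chart loads its Vault config is chart-specific, so the ConfigMap name and file name below are assumptions.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: vault-config        # hypothetical; depends on how the chart mounts its config
data:
  vault.hcl: |
    # Tell Vault not to call mlock, so it no longer depends on IPC_LOCK.
    disable_mlock = true
```

This only helps if the pod spec also stops requesting IPC_LOCK under securityContext.capabilities.add; as long as the capability is requested, the security context constraint would presumably still reject the pod.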

kubernetes petset on google cloud

I am running a Kubernetes cluster on Google Cloud (version 1.3.5).
I found a redis.yaml that uses PetSet to create a Redis cluster, but when I run kubectl create -f redis.yaml I get the following error:
error validating "redis.yaml": error validating data: the server could not find the requested resource (get .apps); if you choose to ignore these errors, turn validation off with --validate=false
I can't find why I get this error or how to solve it.
PetSet is currently an alpha feature (which you can tell because the apiVersion in the linked yaml file is apps/v1alpha1). It may not be obvious, but alpha features are not supported in Google Container Engine.
As described in api_changes.md, alpha level API objects are disabled by default, have no guarantees that they will exist in future versions, can break compatibility with older versions at any time, and may destabilize the cluster.
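To show what to look for, here is the kind of header such a manifest starts with. The metadata name is made up; the apiVersion and kind are the ones discussed above.

```yaml
# Top of a PetSet manifest as used on Kubernetes 1.3.
apiVersion: apps/v1alpha1   # "alpha" in the version string marks an alpha-level API
kind: PetSet
metadata:
  name: redis               # hypothetical
```

On Kubernetes 1.5 and later the same concept is available as StatefulSet, so the alpha PetSet kind is only relevant on older clusters.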
I'm using PetSet with some success, for example https://github.com/Yolean/kubernetes-mysql-cluster, in zone europe-west1-d but when I tried europe-west1-c I got the aforementioned error.
Google just enabled Alpha Clusters for GKE as announced here: https://cloud.google.com/container-engine/docs/alpha-clusters
You can now use all alpha features within an alpha cluster (though without SLA coverage), which was previously disabled.