How can I give grafana user appropriate permission so that it can start successfully? - kubernetes

env:
kubernetes provider: gke
kubernetes version: v1.13.12-gke.25
grafana version: 6.6.2 (official image)
grafana deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
app: grafana
template:
metadata:
name: grafana
labels:
app: grafana
spec:
containers:
- name: grafana
image: grafana/grafana:6.6.2
ports:
- name: grafana
containerPort: 3000
# securityContext:
# runAsUser: 104
# allowPrivilegeEscalation: true
resources:
limits:
memory: "1Gi"
cpu: "500m"
requests:
memory: "500Mi"
cpu: "100m"
volumeMounts:
- mountPath: /var/lib/grafana
name: grafana-storage
volumes:
- name: grafana-storage
persistentVolumeClaim:
claimName: grafana-pvc
Problem
when I deployed this grafana dashboard first time, its working fine. after sometime I restarted the pod to check whether volume mount is working or not. after restarting, I getting below error.
mkdir: can't create directory '/var/lib/grafana/plugins': Permission denied
GF_PATHS_DATA='/var/lib/grafana' is not writable.
You may have issues with file permissions, more information here: http://docs.grafana.org/installation/docker/#migration-from-a-previous-version-of-the-docker-container-to-5-1-or-later
what I understand from this error, user could create these files. How can I give this user appropriate permission to start grafana successfully?

I recreated your deployment with appropriate PVC and noticed that grafana pod was failing.
Output of command: $ kubectl get pods -n monitoring
NAME READY STATUS RESTARTS AGE
grafana-6466cd95b5-4g95f 0/1 Error 2 65s
Further investigation pointed the same errors as yours:
mkdir: can't create directory '/var/lib/grafana/plugins': Permission denied
GF_PATHS_DATA='/var/lib/grafana' is not writable.
You may have issues with file permissions, more information here: http://docs.grafana.org/installation/docker/#migration-from-a-previous-version-of-the-docker-container-to-5-1-or-later
This error showed on first creation of a pod and the deployment. There was no need to recreate any pods.
What I did to make it work was to edit your deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
name: grafana
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
app: grafana
template:
metadata:
name: grafana
labels:
app: grafana
spec:
securityContext:
runAsUser: 472
fsGroup: 472
containers:
- name: grafana
image: grafana/grafana:6.6.2
ports:
- name: grafana
containerPort: 3000
resources:
limits:
memory: "1Gi"
cpu: "500m"
requests:
memory: "500Mi"
cpu: "100m"
volumeMounts:
- mountPath: /var/lib/grafana
name: grafana-storage
volumes:
- name: grafana-storage
persistentVolumeClaim:
claimName: grafana-pvc
Please take a specific look on part:
securityContext:
runAsUser: 472
fsGroup: 472
It is a setting described in official documentation: Kubernetes.io: set the security context for a pod
Please take a look on this Github issue which is similar to yours and pointed me to solution that allowed pod to spawn correctly:
https://github.com/grafana/grafana-docker/issues/167
Grafana had some major updates starting from version 5.1. Please take a look: Grafana.com: Docs: Migrate to v5.1 or later
Please let me know if this helps.

On v8.0, I do that setting runAsUser: 0.
It works.
---
apiVersion: v1
kind: Service
metadata:
name: grafana
spec:
ports:
- name: grafana-tcp
port: 3000
protocol: TCP
targetPort: 3000
selector:
project: grafana
type: LoadBalancer
status:
loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
project: grafana
name: grafana
spec:
replicas: 1
selector:
matchLabels:
project: grafana
strategy:
type: RollingUpdate
template:
metadata:
labels:
project: grafana
name: grafana
spec:
securityContext:
runAsUser: 0
containers:
- image: grafana/grafana
name: grafana
ports:
- containerPort: 3000
protocol: TCP
resources: {}
volumeMounts:
- mountPath: /var/lib/grafana
name: grafana-volume
volumes:
- name: grafana-volume
hostPath:
# directory location on host
path: /opt/grafana
# this field is optional
type: DirectoryOrCreate
restartPolicy: Always
status: {}

Related

Nexus on k3s on restart does not persist Users and data

I have installed on K3S raspberry pi cluster nexus with the following setups for kubernetes learning purposes. First I created a StatefulSet:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: nexus
namespace: dev-ops
spec:
serviceName: "nexus"
replicas: 1
selector:
matchLabels:
app: nexus-server
template:
metadata:
labels:
app: nexus-server
spec:
containers:
- name: nexus
image: klo2k/nexus3:latest
env:
- name: MAX_HEAP
value: "800m"
- name: MIN_HEAP
value: "300m"
resources:
limits:
memory: "4Gi"
cpu: "1000m"
requests:
memory: "2Gi"
cpu: "500m"
ports:
- containerPort: 8081
volumeMounts:
- name: nexusstorage
mountPath: /sonatype-work
volumes:
- name: nexusstorage
persistentVolumeClaim:
claimName: nexusstorage
Storage class
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: nexusstorage
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
numberOfReplicas: "3"
staleReplicaTimeout: "30"
fsType: "ext4"
diskSelector: "ssd"
nodeSelector: "ssd"
pvc
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nexusstorage
namespace: dev-ops
spec:
accessModes:
- ReadWriteOnce
storageClassName: nexusstorage
resources:
requests:
storage: 50Gi
Service
apiVersion: v1
kind: Service
metadata:
name: nexus-server
namespace: dev-ops
annotations:
prometheus.io/scrape: 'true'
prometheus.io/path: /
prometheus.io/port: '8081'
spec:
selector:
app: nexus-server
type: LoadBalancer
ports:
- port: 8081
targetPort: 8081
nodePort: 32000
this setup will spin up nexus, but if I restart the pod the data will not persist and I have to create all the setups and users from scratch.
What I'm missing in this case?
UPDATE
I got it working, nexus needs on mount permissions on directory. The working StatefulSet looks as it follow
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: nexus
namespace: dev-ops
spec:
serviceName: "nexus"
replicas: 1
selector:
matchLabels:
app: nexus-server
template:
metadata:
labels:
app: nexus-server
spec:
securityContext:
runAsUser: 200
runAsGroup: 200
fsGroup: 200
containers:
- name: nexus
image: klo2k/nexus3:latest
env:
- name: MAX_HEAP
value: "800m"
- name: MIN_HEAP
value: "300m"
resources:
limits:
memory: "4Gi"
cpu: "1000m"
requests:
memory: "2Gi"
cpu: "500m"
ports:
- containerPort: 8081
volumeMounts:
- name: nexus-storage
mountPath: /nexus-data
volumes:
- name: nexus-storage
persistentVolumeClaim:
claimName: nexus-storage
important snippet to get it working
securityContext:
runAsUser: 200
runAsGroup: 200
fsGroup: 200
I'm not familiar with that image, although checking dockerhub, they mention using a Dockerfile similar to that of Sonatype. Then, I would change the mountpoint for your volume, to /nexus-data
This is the default path storing data (they set this env var, then declare a VOLUME). Which we can confirm, looking at the repository that most likely produced your arm-capable image
And following up on your last comment, let's try to also mount it in /opt/sonatype/sonatype-work/nexus3...
In your statefulset, change volumeMounts, to this:
volumeMounts:
- name: nexusstorage
mountPath: /nexus-data
- name: nexusstorage
mountPath: /opt/sonatype/sonatype-work/nexus3
volumes:
- name: nexusstorage
persistentVolumeClaim:
claimName: nexusstorage
Although the second volumeMount entry should not be necessary, as far as I understand. Maybe something's wrong with your storage provider?
Are you sure your PVC is write-able? Reverting back to your initial configuration, enter your pod (kubectl exec -it) and try to write a file at the root of your PVC.

Configuring yaml file

I'm learning k8s, I found an example in the MS docs. The problem I'm having is that I want to switch what GITHUB repo thats being used. I havent been able to figure out the path within this yaml example
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-back
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-back
template:
metadata:
labels:
app: azure-vote-back
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-back
image: mcr.microsoft.com/oss/bitnami/redis:6.0.8
env:
- name: ALLOW_EMPTY_PASSWORD
value: "yes"
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 6379
name: redis
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-back
spec:
ports:
- port: 6379
selector:
app: azure-vote-back
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: azure-vote-front
spec:
replicas: 1
selector:
matchLabels:
app: azure-vote-front
template:
metadata:
labels:
app: azure-vote-front
spec:
nodeSelector:
"kubernetes.io/os": linux
containers:
- name: azure-vote-front
image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 250m
memory: 256Mi
ports:
- containerPort: 80
env:
- name: REDIS
value: "azure-vote-back"
---
apiVersion: v1
kind: Service
metadata:
name: azure-vote-front
spec:
type: LoadBalancer
ports:
- port: 80
selector:
app: azure-vote-front
This YAML example doesn't have a Github Repo field at all. That's why you can't find a path.
If you're trying to change the container image source, it has to be from a container registry (or your own filesystem), which is located at
containers: image: mcr.microsoft.com/azuredocs/azure-vote-front:v1
where mcr.microsoft.com is the container registry.
You won't be able to connect this directly to a Github Repository, but any container registry will work, and I believe Github has one at https://ghcr.io (that link itself will direct you back to Github)

Time Zone in Kubernetes Pods Using Environment Variable

I am trying to update my pod time to Asia/Kolkata zone as per kubernetes timezone in POD with command and argument. However, the time still remains the same UTC time. Only the time zone is getting updated from UTC to Asia.
I was able to fix it using the volume mounts as below. Create a config map and apply the deployment yaml.
kubectl create configmap tz --from-file=/usr/share/zoneinfo/Asia/Kolkata -n <required namespace>
Why is the environmental variable method not working? Will a pod eviction occur from one host to another if we use volume mount time and will if affect the volume mount time after pod eviction?
The EV deployment YAML is below which does not update the time
apiVersion: apps/v1
kind: Deployment
metadata:
name: connector
labels:
app: connector
namespace: clients
spec:
replicas: 1
selector:
matchLabels:
app: connector
template:
metadata:
labels:
app: connector
spec:
containers:
- image: connector
name: connector
resources:
requests:
memory: "32Mi" # "64M"
cpu: "250m"
limits:
memory: "64Mi" # "128M"
cpu: "500m"
ports:
- containerPort: 3307
protocol: TCP
env:
- name: TZ
value: Asia/Kolkata
volumeMounts:
- name: connector-rd
mountPath: /home/mongobi/mongosqld.conf
subPath: mongosqld.conf
volumes:
- name: connector-rd
configMap:
name: connector-rd
items:
- key: mongod.conf
Volume Mount yaml is below.
apiVersion: apps/v1
kind: Deployment
metadata:
name: connector
labels:
app: connector
namespace: clients
spec:
replicas: 1
selector:
matchLabels:
app: connector
template:
metadata:
labels:
app: connector
spec:
containers:
- image: connector
name: connector
resources:
requests:
memory: "32Mi" # "64M"
cpu: "250m"
limits:
memory: "64Mi" # "128M"
cpu: "500m"
ports:
- containerPort: 3307
protocol: TCP
volumeMounts:
- name: tz-config
mountPath: /etc/localtime
- name: connector-rd
mountPath: /home/mongobi/mongosqld.conf
subPath: mongosqld.conf
volumes:
- name: connector-rd
configMap:
name: connector-rd
items:
- key: mongod.conf
path: mongosqld.conf
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Asia/Kolkata
In this scenario you need to mention type attribute as File for hostPath in the deployment configuration. The below configuration should work for you.
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Asia/Kolkata
type: File
Simply setting TZ env variable in deployment works for me

Deploy pods in different nodes

I have a namespace called airflow that has 2 pods: webserver and scheduler. I want to deploy scheduler on node A and webserver on node B.
And here you can see deployment files:
scheduler:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
namespace: airflow
name: airflow-scheduler
labels:
name: airflow-scheduler
spec:
replicas: 1
template:
metadata:
labels:
app: airflow-scheduler
spec:
terminationGracePeriodSeconds: 60
containers:
- name: scheduler
image: 123423.dkr.ecr.us-east-1.amazonaws.com/airflow:$COMMIT_SHA1
volumeMounts:
- name: logs
mountPath: /logs
command: ["airflow"]
args: ["scheduler"]
imagePullPolicy: Always
resources:
limits:
memory: "3072Mi"
requests:
cpu: "500m"
memory: "2048Mi"
volumes:
- name: logs
persistentVolumeClaim:
claimName: logs
webserver:
apiVersion: v1
kind: Service
metadata:
name: airflow-webserver
namespace: airflow
labels:
run: airflow-webserver
spec:
ports:
- port: 80
targetPort: 8080
selector:
run: airflow-webserver
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: airflow-webserver
namespace: airflow
annotations:
kubernetes.io/ingress.class: nginx
certmanager.k8s.io/cluster-issuer: letsencrypt-prod
spec:
tls:
- hosts:
- airflow.awesome.com.br
secretName: airflow-crt
rules:
- host: airflow.awesome.com.br
http:
paths:
- path: /
backend:
serviceName: airflow-webserver
servicePort: 80
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
namespace: airflow
name: airflow-webserver
labels:
run: airflow-webserver
spec:
replicas: 1
template:
metadata:
labels:
run: airflow-webserver
spec:
terminationGracePeriodSeconds: 60
containers:
- name: webserver
image: 123423.dkr.ecr.us-east-1.amazonaws.com/airflow:$COMMIT_SHA1
volumeMounts:
- name: logs
mountPath: /logs
ports:
- containerPort: 8080
command: ["airflow"]
args: ["webserver"]
imagePullPolicy: Always
resources:
limits:
cpu: "200m"
memory: "3072Mi"
requests:
cpu: "100m"
memory: "2048Mi"
volumes:
- name: logs
persistentVolumeClaim:
claimName: logs
What's the proper way to ensure that pods will be deployed on different nodes?
edit1:
antiaffinity is not working:
I've tried to set podAntiAffinity on scheduler but it's not working:
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: name
operator: In
values:
- airflow-webserver
topologyKey: "kubernetes.io/hostname"
If you want to have these pods run on different nodes but you don't care about which nodes exactly, you can use the Pod anti-affinity feature. It basically defines that the pod X should not run on the same node (it can be also used with failure domain / zones etc., not just with nodes) as pod Y and uses labels to specify the pods. So you will need to add some labels and specify them in the spec sections. More info about it is in Kube docs.
If in addition you want to also specify on which node it should run, you can use the Node affinity feature. See Kube docs for more details.

kubernetes StorageClass does not retain existing data

My Kubernetes StorageClass volume doesn't retain existing data when the pod is deleted and deployed back with my postgresql database. When I delete the pod, the new pod is created but the database is empty.
I have followed variations of the different versions of the tutorials (https://kubernetes.io/docs/concepts/storage/persistent-volumes/) but nothing seems to work.
I paste all the YAML files cause the problem might be in the combination.
storage-google.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: spingular-pvc
spec:
storageClassName: standard
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 7Gi
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: standard
provisioner: kubernetes.io/gce-pd
parameters:
type: pd-standard
zone: us-east4-a
jhipsterpress-postgresql.yml
apiVersion: v1
kind: Secret
metadata:
name: jhipsterpress-postgresql
namespace: default
labels:
app: jhipsterpress-postgresql
type: Opaque
data:
postgres-password: NjY0NXJxd24=
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: jhipsterpress-postgresql
namespace: default
spec:
replicas: 1
template:
metadata:
labels:
app: jhipsterpress-postgresql
spec:
volumes:
- name: data
persistentVolumeClaim:
claimName: spingular-pvc
containers:
- name: postgres
image: postgres:10.4
env:
- name: POSTGRES_USER
value: jhipsterpress
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: jhipsterpress-postgresql
key: postgres-password
ports:
- containerPort: 5432
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/
---
apiVersion: v1
kind: Service
metadata:
name: jhipsterpress-postgresql
namespace: default
spec:
selector:
app: jhipsterpress-postgresql
ports:
- name: postgresqlport
port: 5432
type: LoadBalancer
jhipsterpress-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
name: jhipsterpress
namespace: default
spec:
replicas: 1
selector:
matchLabels:
app: jhipsterpress
version: "v1"
template:
metadata:
labels:
app: jhipsterpress
version: "v1"
spec:
initContainers:
- name: init-ds
image: busybox:latest
command:
- '/bin/sh'
- '-c'
- |
while true
do
rt=$(nc -z -w 1 jhipsterpress-postgresql 5432)
if [ $? -eq 0 ]; then
echo "DB is UP"
break
fi
echo "DB is not yet reachable;sleep for 10s before retry"
sleep 10
done
containers:
- name: jhipsterpress-app
image: galore/jhipsterpress
env:
- name: SPRING_PROFILES_ACTIVE
value: prod
- name: SPRING_DATASOURCE_URL
value: jdbc:postgresql://jhipsterpress-postgresql.default.svc.cluster.local:5432/jhipsterpress
- name: SPRING_DATASOURCE_USERNAME
value: jhipsterpress
- name: SPRING_DATASOURCE_PASSWORD
valueFrom:
secretKeyRef:
name: jhipsterpress-postgresql
key: postgres-password
- name: JAVA_OPTS
value: " -Xmx256m -Xms256m"
resources:
requests:
memory: "256Mi"
cpu: "500m"
limits:
memory: "512Mi"
cpu: "1"
ports:
- name: http
containerPort: 8080
readinessProbe:
httpGet:
path: /management/health
port: http
initialDelaySeconds: 20
periodSeconds: 15
failureThreshold: 6
livenessProbe:
httpGet:
path: /management/health
port: http
initialDelaySeconds: 120
jhipsterpress-service.yml
apiVersion: v1
kind: Service
metadata:
name: jhipsterpress
namespace: default
labels:
app: jhipsterpress
spec:
selector:
app: jhipsterpress
type: LoadBalancer
ports:
- name: http
port: 8080
When I included a Retain Policy I was getting this error:
#cloudshell:~ (academic-veld-230622)$ kubectl apply -f storage-google.yaml
error: error validating "storage-google.yaml": error validating data:
ValidationError(PersistentVolumeClaim.spec): unknown field "persistentVolumeReclaimPolicy" in io.k8s.api.core.v1.PersistentVolumeClaimSpec; if you choose to ignore these errors, turn validation off with --validate=false
Please, if you know of a complete example on a public image that works (in postgresql, I can make it work with Mongo), I will really appreciate it.
Thanks to all.
Note that for this to work you need to have your PVC dynamically provision a PV to satisfy its requirements, then there will be a permanent binding between the PVC and PV and every time your workload uses the PVC then it will use the same PV. Specifically indicated by this excerpt:
If a PV was dynamically provisioned for a new PVC, the loop will always bind that PV to the PVC
If in your case the Google Persistent Disk is being provisioned by the PVC, and you can verify that on GCP it's the same PV used every time, then it's probably an issue with the pod startup process where it's removing all the data. (Is there any reason why you are using /var/lib/postgresql/ vs /var/lib/postgresql?)
Also, persistentVolumeReclaimPolicy: Retain applies to a PV, not a PVC. For dynamically provisioned PVs the value is Delete. In your case, it wouldn't apply because your dynamically provisioned volume should be bound to your PVC. In other words, you are not reclaiming the volume.
Having said all that the recommended way to deploy a DB is using StatefulSets similar to this mysql example using a volumeClaimTemplate.