I'm trying to run a DNS server (dnsmasq) in a Kubernetes cluster. The cluster has only one node. Everything works fine until I need to restart the dnsmasq container (kubectl rollout restart daemonsets dnsmasq-daemonset) to apply changes made to the hosts ConfigMap. As far as I can tell, this restart is needed because a dnsmasq that is already running will not otherwise pick up changes to the hosts ConfigMap.
As soon as the container is restarted, it fails because it cannot pull the dnsmasq image. That is expected: with no other DNS server running, the image name cannot be resolved. But I wonder what the best way around this is, and what the best practices are for running a DNS server in Kubernetes in general. Is this something CoreDNS is meant for, and what other alternatives are there? Maybe some high-availability setup?
hosts ConfigMap:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dnsmasq-hosts
  namespace: core
data:
  hosts: |
    127.0.0.1 localhost
    10.x.x.x example.com
    ...
Dnsmasq DaemonSet:
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dnsmasq-daemonset
  namespace: core
spec:
  selector:
    matchLabels:
      app: dnsmasq-app
  template:
    metadata:
      labels:
        app: dnsmasq-app
      namespace: core
    spec:
      containers:
        - name: dnsmasq
          image: registry.gitlab.com/path/to/dnsmasqImage:tag
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              cpu: "1"
              memory: "32Mi"
            requests:
              cpu: "150m"
              memory: "16Mi"
          ports:
            - name: dns
              containerPort: 53
              hostPort: 53
              protocol: UDP
          volumeMounts:
            - name: conf-dnsmasq
              mountPath: /etc/dnsmasq.conf
              subPath: dnsmasq.conf
              readOnly: true
            - name: dnsconf-dnsmasq
              mountPath: /etc/dnsmasq.d/dns.conf
              subPath: dns.conf
              readOnly: true
            - name: hosts-dnsmasq
              mountPath: /etc/dnsmasq.d/hosts
              subPath: hosts
              readOnly: true
      volumes:
        - name: conf-dnsmasq
          configMap:
            name: dnsmasq-conf
        - name: dnsconf-dnsmasq
          configMap:
            name: dnsmasq-dnsconf
        - name: hosts-dnsmasq
          configMap:
            name: dnsmasq-hosts
      imagePullSecrets:
        - name: gitlab-registry-credentials
      nodeSelector:
        kubernetes.io/hostname: master
      restartPolicy: Always
I tried to use imagePullPolicy: Never, but it seems to fail anyway.
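One possible way around the restart, offered as a sketch rather than a definitive fix: avoid the restart entirely and have the running dnsmasq reload the hosts file in place. Two caveats apply. First, a container that mounts a ConfigMap through subPath never sees updates to that ConfigMap, so the hosts file would have to be mounted as a whole directory (for example referenced via addn-hosts in dns.conf) instead of via subPath: hosts. Second, dnsmasq re-reads its hosts files on SIGHUP, but it does not re-read dnsmasq.conf. Assuming dnsmasq runs as PID 1 and the image ships a kill binary, the reload would then be:

# After editing the ConfigMap, wait for the kubelet to sync the volume, then:
kubectl -n core exec ds/dnsmasq-daemonset -- kill -HUP 1

As for alternatives: CoreDNS (the cluster DNS shipped with recent Kubernetes versions) has a hosts plugin that serves entries from a hosts file and can reload it, which covers this use case without running a custom dnsmasq DaemonSet.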
I am getting a strange result from using nginx and an IIS server together in a single Kubernetes pod. It seems to be an issue with nginx.conf. If I bypass nginx and go directly to IIS, I see the standard landing page, but when I go through the reverse proxy, only a partial page is rendered (screenshots omitted).
Here are the files:
nginx.conf:
events {
    worker_connections 4096;  ## Default: 1024
}

http {
    server {
        listen 81;
        # Using a variable to prevent nginx from checking the hostname at startup,
        # which leads to a container failure / restart loop, due to nginx starting
        # faster than the IIS server.
        set $target "http://127.0.0.1:80/";
        location / {
            proxy_pass $target;
        }
    }
}
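One detail worth checking, offered as a guess based on how nginx documents proxy_pass with variables: when the proxied address comes from a variable and that variable contains a URI part (the trailing "/" in $target), that URI replaces the original request URI, so every request, including the ones for stylesheets, scripts and images, may be forwarded to "/" on IIS, which would produce exactly this kind of partially rendered page. A sketch of the relevant part of the server block that preserves the original path:

set $target "http://127.0.0.1:80";
location / {
    # Append the original URI (and query string) explicitly.
    proxy_pass $target$request_uri;
}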
deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    ...
  name: ...
spec:
  replicas: 1
  selector:
    matchLabels:
      pod: ...
  template:
    metadata:
      labels:
        pod: ...
      name: ...
    spec:
      containers:
        - image: claudiubelu/nginx:1.15-1-windows-amd64-1809
          name: nginx-reverse-proxy
          volumeMounts:
            - mountPath: "C:/usr/share/nginx/conf"
              name: nginx-conf
          imagePullPolicy: Always
        - image: some-repo/proprietary-server-including-iis
          name: ...
          imagePullPolicy: Always
      nodeSelector:
        kubernetes.io/os: windows
      imagePullSecrets:
        - name: secret1
      volumes:
        - name: nginx-conf
          persistentVolumeClaim:
            claimName: pvc-nginx
Mapping the nginx.conf file from a volume is just a convenient way to rapidly test different configs. New configs can be swapped in using kubectl cp ./nginx/conf nginx-busybox-pod:/mnt/nginx/.
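As a side note (an assumption on my part that nginx.exe is on the PATH in that Windows image and can locate its pid file and prefix), the running proxy could pick up a swapped-in config without a pod restart:

kubectl exec <nginx-pod-name> -c nginx-reverse-proxy -- nginx -s reload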
Busybox pod (used to access the PVC):
apiVersion: v1
kind: Pod
metadata:
  name: nginx-busybox-pod
  namespace: default
spec:
  containers:
    - image: busybox
      command:
        - sleep
        - "360000"
      imagePullPolicy: Always
      name: busybox
      volumeMounts:
        - name: nginx-conf
          mountPath: "/mnt/nginx/conf"
  restartPolicy: Always
  volumes:
    - name: nginx-conf
      persistentVolumeClaim:
        claimName: pvc-nginx
  nodeSelector:
    kubernetes.io/os: linux
And lastly the PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-nginx
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Mi
  storageClassName: azurefile
Any ideas why?
I am trying to connect to Firestore from code running in a GKE container. A simple REST GET API works fine, but when I read from or write to Firestore, I get "Missing or insufficient permissions."
An unhandled exception was thrown by the application.
Info
2021-06-06 21:21:20.283 EDT
Grpc.Core.RpcException: Status(StatusCode="PermissionDenied", Detail="Missing or insufficient permissions.", DebugException="Grpc.Core.Internal.CoreErrorDetailException: {"created":"#1623028880.278990566","description":"Error received from peer ipv4:172.217.193.95:443","file":"/var/local/git/grpc/src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Missing or insufficient permissions.","grpc_status":7}")
at Google.Api.Gax.Grpc.ApiCallRetryExtensions.<>c__DisplayClass0_0`2.<<WithRetry>b__0>d.MoveNext()
Update: I am trying to provide the pod with a secret containing the service account credentials.
Here is the Kubernetes manifest, which deploys a pod to the cluster with no issues when no secrets are provided; GET operations that don't hit Firestore work fine.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: foo-worldmanagement-production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
      role: worldmanagement
      env: production
  template:
    metadata:
      name: worldmanagement
      labels:
        app: foo
        role: worldmanagement
        env: production
    spec:
      containers:
        - name: worldmanagement
          image: gcr.io/foodev/foo/master/worldmanagement.21
          resources:
            limits:
              memory: "500Mi"
              cpu: "300m"
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /api/worldManagement/policies
              port: 80
          ports:
            - name: worldmgmt
              containerPort: 80
Now, if I try to mount the secret, the pod never gets created fully and eventually fails:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: foo-worldmanagement-production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
      role: worldmanagement
      env: production
  template:
    metadata:
      name: worldmanagement
      labels:
        app: foo
        role: worldmanagement
        env: production
    spec:
      volumes:
        - name: google-cloud-key
          secret:
            secretName: firestore-key
      containers:
        - name: worldmanagement
          image: gcr.io/foodev/foo/master/worldmanagement.21
          volumeMounts:
            - name: google-cloud-key
              mountPath: /var/
          env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/key.json
          resources:
            limits:
              memory: "500Mi"
              cpu: "300m"
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /api/worldManagement/earth
              port: 80
          ports:
            - name: worldmgmt
              containerPort: 80
I tried to deploy the sample application and it works fine.
If I keep only the following in the YAML file, the container gets deployed properly:
- name: google-cloud-key
  secret:
    secretName: firestore-key
But once I add the following to the YAML, it fails:
volumeMounts:
  - name: google-cloud-key
    mountPath: /var/
env:
  - name: GOOGLE_APPLICATION_CREDENTIALS
    value: /var/key.json
And I can see in the GCP events that the container is not able to find the google-cloud-key. Any idea how to troubleshoot this, i.e. why I am not able to mount the secret? I can bash into the pod if needed.
I am using a multi-stage Dockerfile based on:
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
FROM mcr.microsoft.com/dotnet/aspnet:5.0 AS runtime
Thanks
Looks like the key itself might not be correctly visible to the pod. I would start by getting into the pod with kubectl exec --stdin --tty <podname> -- /bin/bash and ensuring that /var/key.json (per your config) is accessible and has the correct credentials.
The following would be a good way to mount the secret:
volumeMounts:
  - name: google-cloud-key
    mountPath: /var/run/secret/cloud.google.com
env:
  - name: GOOGLE_APPLICATION_CREDENTIALS
    value: /var/run/secret/cloud.google.com/key.json
The above assumes your secret was created with a command like:
kubectl --namespace <namespace> create secret generic firestore-key --from-file key.json
It is also important to check your Workload Identity setup; the Workload Identity page of the Google Kubernetes Engine documentation has a good section on this.
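If the cluster is set up with Workload Identity, a minimal sketch of wiring it up (the names worldmanagement-ksa, firestore-sa, and PROJECT_ID are placeholders, not taken from your manifests) is to annotate a Kubernetes service account with the Google service account that has Firestore access:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: worldmanagement-ksa
  namespace: default
  annotations:
    iam.gke.io/gcp-service-account: firestore-sa@PROJECT_ID.iam.gserviceaccount.com

and then allow that Kubernetes service account to impersonate the Google one:

gcloud iam service-accounts add-iam-policy-binding \
  firestore-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[default/worldmanagement-ksa]"

The Deployment would then set serviceAccountName: worldmanagement-ksa and would no longer need the mounted key or the GOOGLE_APPLICATION_CREDENTIALS variable.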
I'm trying to deploy Drupal 7 in Kubernetes. It fails with the error: Fatal error: require_once(): Failed opening required '/var/www/html/modules/system/system.install' (include_path='.:/usr/local/lib/php') in /var/www/html/includes/install.core.inc on line 241.
Here is the K8s deployment manifest:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: drupal-pvc
  annotations:
    pv.beta.kubernetes.io/gid: "drupal-gid"
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Service
metadata:
  name: drupal-service
spec:
  ports:
    - name: http
      port: 80
      protocol: TCP
  selector:
    app: drupal
  type: LoadBalancer
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: drupal
  name: drupal
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: drupal
    spec:
      initContainers:
        - name: init-sites-volume
          image: drupal:7.72
          command: ['/bin/bash', '-c']
          args: ['cp -r /var/www/html/sites/ /data/; chown www-data:www-data /data/ -R']
          volumeMounts:
            - mountPath: /data
              name: vol-drupal
      containers:
        - image: drupal:7.72
          name: drupal
          ports:
            - containerPort: 80
          volumeMounts:
            - mountPath: /var/www/html/modules
              name: vol-drupal
              subPath: modules
            - mountPath: /var/www/html/profiles
              name: vol-drupal
              subPath: profiles
            - mountPath: /var/www/html/sites
              name: vol-drupal
              subPath: sites
            - mountPath: /var/www/html/themes
              name: vol-drupal
              subPath: themes
      volumes:
        - name: vol-drupal
          persistentVolumeClaim:
            claimName: drupal-pvc
However, when I remove the volumeMounts from the drupal container, it works. I need to use volumes in order to persist the website data. Can anyone suggest a fix?
Update: I have also added the manifest for the persistent volume claim.
Check whether you can write to the mounted volume:
# kubectl exec -it drupal-zxxx -- sh
$ ls -alhtr /var/www/html/modules
$ cd /var/www/html/modules
$ touch test.txt
This matters because storage configured with a group ID (GID) allows writing only by pods using the same GID; mismatched or missing GIDs cause permission-denied errors.
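If a GID mismatch turns out to be the cause, one sketch of a fix (assuming the official drupal image's www-data user, whose GID is normally 33; adjust to whatever GID your storage expects) is to set an fsGroup in the pod spec so the mounted volume becomes group-writable:

spec:
  securityContext:
    fsGroup: 33   # assumed GID of www-data in the drupal:7.72 image
  containers:
    - image: drupal:7.72
      name: drupal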
Alternatively, you could try an operator for Drupal:
https://github.com/geerlingguy/drupal-operator
A Helm chart is another option:
https://bitnami.com/stack/drupal/helm
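If you go the Bitnami chart route, a typical install (the release name my-drupal is a placeholder) looks like:

helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-drupal bitnami/drupal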
In Google Kubernetes Engine I created a POC cluster for our company, which worked flawlessly. But now, when I try to create our production environment, I cannot seem to get imagePullSecrets to work; it is the exact same credentials as in the POC, the same Helm chart, and the exact same regcred YAML file.
Yet I keep getting the classic:
Back-off pulling image "registry.company.co/frontend/company-web/upload": ImagePullBackOff
Pulling manually on the node works with the same credentials as those I supplied in the imagePullSecrets.
I've tried defining the imagePullSecrets both at the chart level and on the service account.
I've verified the secret format and directly copied the credentials from it when trying the manual pulls.
GKE picks up regcred and shows it in the deployment.
regcred was generated by kubectl create secret docker-registry regcred --docker-server="registry.company.co" --docker-username="gitlab" --docker-password="[PASSWORD]"
regcred secret
kind: Secret
apiVersion: v1
metadata:
  name: regcred
  namespace: default
data:
  .dockerconfigjson: eyJhdXRocyI6eyJyZWdpc3RyeS5jb21wYW55LmNvIjp7InVzZXJuYW1lIjoiZ2l0bGFiIiwicGFzc3dvcmQiOiJbUkVEQUNURURdIiwiYXV0aCI6IloybDBiR0ZpT2x0QmJITnZJRkpsWkdGamRHVmtYUT09In19fQ==
type: kubernetes.io/dockerconfigjson
Service Account
kind: ServiceAccount
apiVersion: v1
metadata:
  name: default
  namespace: default
secrets:
  - name: default-token-jktj5
imagePullSecrets:
  - name: regcred
Deployment.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nfs-server
spec:
  replicas: 1
  selector:
    matchLabels:
      role: nfs-server
  template:
    metadata:
      labels:
        role: nfs-server
    spec:
      containers:
        - name: nfs-server
          image: gcr.io/google_containers/volume-nfs:latest
          ports:
            - name: nfs
              containerPort: 2049
            - name: mountd
              containerPort: 20048
            - name: rpcbind
              containerPort: 111
          securityContext:
            privileged: true
          volumeMounts:
            - mountPath: /exports
              name: mypvc
      initContainers:
        - name: init-volume-perms
          imagePullPolicy: Always
          image: alpine
          command: ["/bin/sh", "-c"]
          args: ["mkdir /mnt/company-logos; mkdir /mnt/uploads; chown -R 1337:1337 /mnt"]
          volumeMounts:
            - mountPath: /mnt
              name: mypvc
        - name: company-web-uploads
          image: registry.company.co/frontend/company-web/uploads
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: /var/lib/company/web/uploads
              subPath: uploads
              name: mypvc
        - name: company-logos
          image: registry.company.co/backend/pdf-service/company-logos
          imagePullPolicy: Always
          volumeMounts:
            - mountPath: /var/lib/company/shared/company-logos
              subPath: company-logos
              name: mypvc
      volumes:
        - name: mypvc
          gcePersistentDisk:
            pdName: gke-nfs-disk
            fsType: ext4
I've looked around, following different guides from the ground up, with no success, so I'm at a total loss as to what to do.
Everything is in the default namespace.
It may be a namespace issue. Can you verify a few things:
Are you using the default namespace in both places?
Is there a K8s version difference between POC and prod?
Can you recreate the working secret with something like kubectl get secret default-token-jktj5 -o yaml > imagepullsecret.yaml, edit the YAML to remove the revision (resourceVersion) and other status information, and apply the same to prod?
I have seen this issue in GKE because of a multiline secret being converted to base64. Ensure the secrets match between environments.
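To rule out the base64/multiline problem, one quick check is to decode the secret in both clusters and diff the output (the jsonpath expression escapes the dotted key as documented for kubectl):

kubectl -n default get secret regcred \
  -o jsonpath='{.data.\.dockerconfigjson}' | base64 --decode

The decoded JSON from production should be byte-for-byte identical to the one in the working POC cluster.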
I have a Go REST service and an etcd database running as two containers in one pod, deployed to a Kubernetes cluster using a Deployment. Whenever I restart the service's pod, the service loses connectivity to etcd. I have tried using a StatefulSet instead of a Deployment, but it still didn't help. My deployment looks something like the below.
etcd fails to restart due to this issue: https://github.com/etcd-io/etcd/issues/10487
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: XXXX
  namespace: XXXX
  annotations:
    volume.beta.kubernetes.io/storage-class: glusterfs-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: XXX
  namespace: XXX
spec:
  replicas: X
  XXXXXXX
  template:
    metadata:
      labels:
        app: rest-service
        version: xx
    spec:
      hostAliases:
        - ip: 127.0.0.1
          hostnames:
            - "etcd.xxxxx"
      containers:
        - name: rest-service
          image: xxxx
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: xxx
          securityContext:
            readOnlyRootFilesystem: false
            capabilities:
              add:
                - IPC_LOCK
        - name: etcd-db
          image: quay.io/coreos/etcd:v3.3.11
          imagePullPolicy: IfNotPresent
          command:
            - etcd
            - --name=etcd-db
            - --listen-client-urls=https://0.0.0.0:2379
            - --advertise-client-urls=https://etcd.xxxx:2379
            - --data-dir=/var/etcd/data
            - --client-cert-auth
            - --trusted-ca-file=xxx/ca.crt
            - --cert-file=xxx/tls.crt
            - --key-file=xxx/tls.key
          volumeMounts:
            - mountPath: /var/etcd/data
              name: etcd-data
          XXXX
          ports:
            - containerPort: 2379
      volumes:
        - name: etcd-data
          persistentVolumeClaim:
            claimName: XXXX
I would expect the service to be able to reconnect to the DB even when the pod restarts.
Keeping the application and database in one pod is one of the worst practices in Kubernetes. If you update the application code, you have to restart the pod to apply the changes, so you restart the database as well for nothing.
The solution is simple: run the application in one Deployment and the database in another. That way you can update the application without restarting the database. You can also scale the app and the DB separately, for example adding more replicas to the app while keeping the DB at one replica, or vice versa.
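A minimal sketch of that split (illustrative names, plain-HTTP etcd for brevity; the TLS flags and secrets from the original manifest would carry over): etcd runs as a StatefulSet behind its own headless Service, and the REST service, in its own Deployment, connects to http://etcd:2379 instead of relying on a hostAlias pointing at 127.0.0.1.

apiVersion: v1
kind: Service
metadata:
  name: etcd
spec:
  clusterIP: None          # headless: the name "etcd" resolves straight to the etcd pod
  selector:
    app: etcd
  ports:
    - name: client
      port: 2379
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
spec:
  serviceName: etcd
  replicas: 1
  selector:
    matchLabels:
      app: etcd
  template:
    metadata:
      labels:
        app: etcd
    spec:
      containers:
        - name: etcd-db
          image: quay.io/coreos/etcd:v3.3.11
          command:
            - etcd
            - --name=etcd-db
            - --listen-client-urls=http://0.0.0.0:2379
            - --advertise-client-urls=http://etcd:2379
            - --data-dir=/var/etcd/data
          ports:
            - containerPort: 2379
          volumeMounts:
            - name: etcd-data
              mountPath: /var/etcd/data
  volumeClaimTemplates:
    - metadata:
        name: etcd-data
      spec:
        accessModes: ["ReadWriteMany"]
        storageClassName: glusterfs-storage   # matches the storage class from the original PVC
        resources:
          requests:
            storage: 1Gi

With this layout, restarting the REST service's Deployment never touches the etcd pod or its data.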