Service loses connection to Etcd DB when pod restarts - kubernetes

I have a Go REST service and an etcd database running in the same pod, deployed to a Kubernetes cluster as a Deployment. Whenever I restart the service pod, the service loses connectivity to etcd. I tried using a StatefulSet instead of a Deployment, but that did not help. My manifests look something like the ones below.
etcd fails to restart because of this issue: https://github.com/etcd-io/etcd/issues/10487
PVC:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: XXXX
  namespace: XXXX
  annotations:
    volume.beta.kubernetes.io/storage-class: glusterfs-storage
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: XXX
  namespace: XXX
spec:
  replicas: X
  XXXXXXX
  template:
    metadata:
      labels:
        app: rest-service
        version: xx
    spec:
      hostAliases:
        - ip: 127.0.0.1
          hostnames:
            - "etcd.xxxxx"
      containers:
        - name: rest-service
          image: xxxx
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: xxx
          securityContext:
            readOnlyRootFilesystem: false
            capabilities:
              add:
                - IPC_LOCK
        - name: etcd-db
          image: quay.io/coreos/etcd:v3.3.11
          imagePullPolicy: IfNotPresent
          command:
            - etcd
            - --name=etcd-db
            - --listen-client-urls=https://0.0.0.0:2379
            - --advertise-client-urls=https://etcd.xxxx:2379
            - --data-dir=/var/etcd/data
            - --client-cert-auth
            - --trusted-ca-file=xxx/ca.crt
            - --cert-file=xxx/tls.crt
            - --key-file=xxx/tls.key
          volumeMounts:
            - mountPath: /var/etcd/data
              name: etcd-data
            XXXX
          ports:
            - containerPort: 2379
      volumes:
        - name: etcd-data
          persistentVolumeClaim:
            claimName: XXXX
I would expect the service to be able to reconnect to the DB even after the pod restarts.

Keeping the application and the database in one pod is bad practice in Kubernetes. If you update the application code, you have to restart the pod to apply the change, which also restarts the database for no reason.
The solution is simple: run the application in one Deployment and the database in another. That way you can update the application without restarting the database. You can also scale the app and the DB separately, for example by adding replicas to the app while keeping the DB at one replica, or vice versa.
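A minimal sketch of that split, reusing the etcd container from the question as-is. The names `etcd` (Service, Deployment, label) are hypothetical, the `Recreate` strategy is an assumption so that two etcd pods never share the data directory during a rollout, and the elided values (`xxx`, `XXXX`) are left exactly as they appear in the question:

```yaml
# Hypothetical names; elided values are kept as in the question.
apiVersion: v1
kind: Service
metadata:
  name: etcd                 # assumed name; gives etcd a stable in-cluster address
spec:
  selector:
    app: etcd
  ports:
    - port: 2379
      targetPort: 2379
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: etcd
spec:
  replicas: 1
  strategy:
    type: Recreate           # assumption: stop the old pod before starting the new one
  selector:
    matchLabels:
      app: etcd
  template:
    metadata:
      labels:
        app: etcd
    spec:
      containers:
        - name: etcd-db
          image: quay.io/coreos/etcd:v3.3.11
          command:
            - etcd
            - --name=etcd-db
            - --listen-client-urls=https://0.0.0.0:2379
            - --advertise-client-urls=https://etcd.xxxx:2379
            - --data-dir=/var/etcd/data
            - --client-cert-auth
            - --trusted-ca-file=xxx/ca.crt
            - --cert-file=xxx/tls.crt
            - --key-file=xxx/tls.key
          ports:
            - containerPort: 2379
          volumeMounts:
            - name: etcd-data
              mountPath: /var/etcd/data
      volumes:
        - name: etcd-data
          persistentVolumeClaim:
            claimName: XXXX
```

With this layout the rest-service Deployment would contain only the application container. The `hostAliases` entry that points `etcd.xxxxx` at 127.0.0.1 would no longer apply, since etcd is in a different pod; the application would have to reach etcd through the Service, and the TLS certificate would need to be valid for whichever name it connects to.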

Related

container level securityContext fsGroup

I'm experimenting with a single-pod, multi-container setup.
The problem is that one of my containers (directus) is a Node app that runs as user 'node' with UID 1000.
On my first try I used hostPath as the storage backend. With that, I had to change the host directory's mode manually with chmod.
Now I'm trying Longhorn.
Basically, I don't want to change a host directory's mode or ownership each time I deploy this Deployment.
Here is my manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: lh-directus
  namespace: lh-directus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: lh-directus
  template:
    metadata:
      labels:
        app: lh-directus
    spec:
      nodeSelector:
        kubernetes.io/os: linux
        isGeneralDeployment: "true"
      volumes:
        - name: lh-directus-uploads-volume
          persistentVolumeClaim:
            claimName: lh-directus-uploads-pvc
        - name: lh-directus-dbdata-volume
          persistentVolumeClaim:
            claimName: lh-directus-dbdata-pvc
      containers:
        # Redis Cache
        - name: redis
          image: redis:6
        # Database
        - name: database
          image: postgres:12
          volumeMounts:
            - name: lh-directus-dbdata-volume
              mountPath: /var/lib/postgresql/data
        # Directus
        - name: directus
          image: directus/directus:latest
          securityContext:
            fsGroup: 1000
          volumeMounts:
            - name: lh-directus-uploads-volume
              mountPath: /directus/uploads
When I apply the manifest, I get this error:
error: error validating "lh-directus.yaml": error validating data: ValidationError(Deployment.spec.template.spec.containers[2].securityContext): unknown field "fsGroup" in io.k8s.api.core.v1.SecurityContext; if you choose to ignore these errors, turn validation off with --validate=false
I have read about initContainers.
But please tell me how to fix this problem without an initContainer and without manually changing the host path's ownership or mode.
Sincerely
-bino-
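For reference, the validation error comes from where the field sits: `fsGroup` is a field of the pod-level `securityContext` (under `spec.template.spec`), not of a container's `securityContext`. A minimal sketch of the relevant part of the manifest above with the field moved, assuming the volume driver supports ownership changes through `fsGroup`. Note that a pod-level setting applies to every container in the pod, including redis and postgres, and whether it alone makes `/directus/uploads` writable for UID 1000 is not verified here:

```yaml
spec:
  template:
    spec:
      securityContext:
        fsGroup: 1000        # pod-level: supported volumes are made group-owned by GID 1000
      containers:
        # Directus (other containers unchanged and omitted here)
        - name: directus
          image: directus/directus:latest
          volumeMounts:
            - name: lh-directus-uploads-volume
              mountPath: /directus/uploads
```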

How to run DNS Server (dnsmasq) in Kubernetes?

I'm trying to run a DNS server (Dnsmasq) in a Kubernetes cluster. The cluster has only one node. Everything works fine until I need to restart the dnsmasq container (kubectl rollout restart daemonsets dnsmasq-daemonset) to apply changes made to the hosts ConfigMap. As far as I can tell this is required, because an already running Dnsmasq does not otherwise pick up changes made to the hosts ConfigMap.
As soon as the container is restarted, it cannot pull the dnsmasq image and fails. This is expected, since the image name cannot be resolved while no other DNS server is running, but I wonder what the best way around it is, and what the best practices are for running a DNS server in Kubernetes in general. Is this what CoreDNS is used for, or what other alternatives are there? Maybe some high-availability solution?
hosts ConfigMap:
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: dnsmasq-hosts
  namespace: core
data:
  hosts: |
    127.0.0.1 localhost
    10.x.x.x example.com
    ...
Dnsmasq deployment:
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: dnsmasq-daemonset
  namespace: core
spec:
  selector:
    matchLabels:
      app: dnsmasq-app
  template:
    metadata:
      labels:
        app: dnsmasq-app
      namespace: core
    spec:
      containers:
        - name: dnsmasq
          image: registry.gitlab.com/path/to/dnsmasqImage:tag
          imagePullPolicy: IfNotPresent
          resources:
            limits:
              cpu: "1"
              memory: "32Mi"
            requests:
              cpu: "150m"
              memory: "16Mi"
          ports:
            - name: dns
              containerPort: 53
              hostPort: 53
              protocol: UDP
          volumeMounts:
            - name: conf-dnsmasq
              mountPath: /etc/dnsmasq.conf
              subPath: dnsmasq.conf
              readOnly: true
            - name: dnsconf-dnsmasq
              mountPath: /etc/dnsmasq.d/dns.conf
              subPath: dns.conf
              readOnly: true
            - name: hosts-dnsmasq
              mountPath: /etc/dnsmasq.d/hosts
              subPath: hosts
              readOnly: true
      volumes:
        - name: conf-dnsmasq
          configMap:
            name: dnsmasq-conf
        - name: dnsconf-dnsmasq
          configMap:
            name: dnsmasq-dnsconf
        - name: hosts-dnsmasq
          configMap:
            name: dnsmasq-hosts
      imagePullSecrets:
        - name: gitlab-registry-credentials
      nodeSelector:
        kubernetes.io/hostname: master
      restartPolicy: Always
I tried to use imagePullPolicy: Never, but it seems to fail anyway.

GCP Firestore: Server request fails with Missing or insufficient permissions from GKE

I am trying to connect to Firestore from code running in a GKE container. A simple REST GET API works fine, but when I read from or write to Firestore, I get "Missing or insufficient permissions".
An unhandled exception was thrown by the application.
Info
2021-06-06 21:21:20.283 EDT
Grpc.Core.RpcException: Status(StatusCode="PermissionDenied", Detail="Missing or insufficient permissions.", DebugException="Grpc.Core.Internal.CoreErrorDetailException: {"created":"#1623028880.278990566","description":"Error received from peer ipv4:172.217.193.95:443","file":"/var/local/git/grpc/src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Missing or insufficient permissions.","grpc_status":7}")
at Google.Api.Gax.Grpc.ApiCallRetryExtensions.<>c__DisplayClass0_0`2.<<WithRetry>b__0>d.MoveNext()
Update: I am trying to provide a secret with the service account credentials to the pod.
Here is the k8s file. It deploys a pod to the cluster without issues when no secrets are provided, and GET operations that don't hit Firestore work fine.
kind: Deployment
apiVersion: apps/v1
metadata:
  name: foo-worldmanagement-production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
      role: worldmanagement
      env: production
  template:
    metadata:
      name: worldmanagement
      labels:
        app: foo
        role: worldmanagement
        env: production
    spec:
      containers:
        - name: worldmanagement
          image: gcr.io/foodev/foo/master/worldmanagement.21
          resources:
            limits:
              memory: "500Mi"
              cpu: "300m"
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /api/worldManagement/policies
              port: 80
          ports:
            - name: worldmgmt
              containerPort: 80
Now, if I try to mount the secret, the pod never gets fully created and eventually fails:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: foo-worldmanagement-production
spec:
  replicas: 1
  selector:
    matchLabels:
      app: foo
      role: worldmanagement
      env: production
  template:
    metadata:
      name: worldmanagement
      labels:
        app: foo
        role: worldmanagement
        env: production
    spec:
      volumes:
        - name: google-cloud-key
          secret:
            secretName: firestore-key
      containers:
        - name: worldmanagement
          image: gcr.io/foodev/foo/master/worldmanagement.21
          volumeMounts:
            - name: google-cloud-key
              mountPath: /var/
          env:
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /var/key.json
          resources:
            limits:
              memory: "500Mi"
              cpu: "300m"
          imagePullPolicy: Always
          readinessProbe:
            httpGet:
              path: /api/worldManagement/earth
              port: 80
          ports:
            - name: worldmgmt
              containerPort: 80
I tried to deploy the sample application and it works fine.
If I keep only the following in the YAML file, the container gets deployed properly:
- name: google-cloud-key
  secret:
    secretName: firestore-key
But once I add the following to the YAML, it fails:
volumeMounts:
  - name: google-cloud-key
    mountPath: /var/
env:
  - name: GOOGLE_APPLICATION_CREDENTIALS
    value: /var/key.json
I can see in the GCP events that the container is not able to find google-cloud-key. Any idea how to troubleshoot this, i.e. why I am not able to mount the secret? I can open a bash shell in the pod if needed.
I am using a multi-stage Dockerfile built from:
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
FROM mcr.microsoft.com/dotnet/aspnet:5.0 AS runtime
Thanks
It looks like the key itself might not be correctly visible to the pod. I would start by getting into the pod with kubectl exec --stdin --tty <podname> -- /bin/bash and checking that /var/key.json (per your config) is accessible and contains the correct credentials.
The following would be a good way to mount the secret:
volumeMounts:
  - name: google-cloud-key
    mountPath: /var/run/secret/cloud.google.com
env:
  - name: GOOGLE_APPLICATION_CREDENTIALS
    value: /var/run/secret/cloud.google.com/key.json
The above assumes your secret was created with a command like:
kubectl --namespace <namespace> create secret generic firestore-key --from-file key.json
It is also important to check your Workload Identity setup. The Workload Identity page of the Kubernetes Engine documentation has a good section on this.

unable to mount a specific directory from couchdb pod kubernetes

Hi, I am trying to mount a directory from the pod where CouchDB is running. The directory is /opt/couchdb/data, and I am using this config for the deployment:
apiVersion: v1
kind: Service
metadata:
  name: couchdb0-peer0org1
spec:
  ports:
    - port: 5984
      targetPort: 5984
  type: NodePort
  selector:
    app: couchdb0-peer0org1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: couchdb0-peer0org1
spec:
  selector:
    matchLabels:
      app: couchdb0-peer0org1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: couchdb0-peer0org1
    spec:
      containers:
        - image: hyperledger/fabric-couchdb
          imagePullPolicy: IfNotPresent
          name: couchdb0
          env:
            - name: COUCHDB_USER
              value: admin
            - name: COUCHDB_PASSWORD
              value: admin
          ports:
            - containerPort: 5984
              name: couchdb0
          volumeMounts:
            - name: datacouchdbpeer0org1
              mountPath: /opt/couchdb/data
              subPath: couchdb0
      volumes:
        - name: datacouchdbpeer0org1
          persistentVolumeClaim:
            claimName: worker1-incoming-volumeclaim
After applying these deployments, I always get this result for the pods:
couchdb0-peer0org1-b89b984cf-7gjfq 0/1 CrashLoopBackOff 1 9s
couchdb0-peer0org2-86f558f6bb-jzrwf 0/1 CrashLoopBackOff 1 9s
The strange thing is that if I change the mounted directory from /opt/couchdb/data to /var/lib/couchdb, it works fine. But the issue is that I have to store the CouchDB data in a stateful manner.
Edit your /etc/exports so that it contains the following:
"path/exported/directory *(rw,sync,no_subtree_check,no_root_squash)"
and then restart the NFS server:
sudo /etc/init.d/nfs-kernel-server restart
When no_root_squash is used, remote root users are able to change any file on the shared file system. This is a quick solution, but it raises some security concerns.

Spring boot application pod fails to find the mongodb pod on Kubernetes cluster

I have a Spring Boot Application backed by MongoDB. Both are deployed on a Kubernetes cluster on Azure. My Application throws "Caused by: java.net.UnknownHostException: mongo-dev-0 (pod): Name or service not known" when it tries to connect to MongoDB.
I am able to connect to the mongo-dev-0 pod and run queries on MongoDB, so there is no issue with Mongo itself, and it looks like the Spring Boot application is able to connect to the Mongo Service and discover the pod behind it.
How do I ensure the pods are discoverable by my Spring Boot Application?
How do I go about debugging this issue?
Any help is appreciated. Thanks in advance.
Here is my config:
---
apiVersion: v1
kind: Service
metadata:
  name: mongo-dev
  labels:
    name: mongo-dev
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo-dev
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo-dev
spec:
  serviceName: "mongo-dev"
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo-dev
        environment: dev
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo-dev
          image: mongo:3.4
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--smallfiles"
            - "--noprealloc"
            - "--auth"
            - "--bind_ip"
            - 0.0.0.0
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-dev-persistent-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=mongo-dev,environment=dev"
            - name: KUBERNETES_MONGO_SERVICE_NAME
              value: "mongo-dev"
  volumeClaimTemplates:
    - metadata:
        name: mongo-dev-persistent-storage
        annotations:
          volume.beta.kubernetes.io/storage-class: "devdisk"
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 100Gi
---
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: devdisk
provisioner: kubernetes.io/azure-disk
parameters:
  skuName: Premium_LRS
  location: abc
  storageAccount: xyz
To reach your MongoDB pod through its Service from your Spring Boot application, you have to start the MongoDB pod and the corresponding Service first, and then start your Spring Boot application pod (let's call it sb-pod).
You can enforce this order by using an initContainer in sb-pod that waits for the database Service to be available before the application starts. Something like:
initContainers:
  - name: init-mongo-dev
    image: busybox
    command: ['sh', '-c', 'until nslookup mongo-dev; do echo waiting for mongo-dev; sleep 2; done;']
If you connect to your sb-pod using:
kubectl exec -it sb-pod bash
and type the env command, make sure you can see the environment variables
MONGO_DEV_SERVICE_HOST and MONGO_DEV_SERVICE_PORT
How about mongo-dev-0.mongo-dev.default.svc.cluster.local?
The pattern is <pod-name>.<service-name>.<namespace>.svc.cluster.local,
as described under Stable Network ID in the StatefulSet documentation.
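As a sketch of how the application could use those stable names, the fragment below passes a connection string to the Spring Boot pod through an environment variable. It assumes the `default` namespace used in the hostname above, the `rs0` replica set name from the question, and that the app reads `spring.data.mongodb.uri` (which Spring Boot also binds from `SPRING_DATA_MONGODB_URI`). The container name, image and database name `mydb` are hypothetical, and credentials are left out even though mongod runs with `--auth`:

```yaml
containers:
  - name: sb-app               # hypothetical container name
    image: example/sb-app      # hypothetical image
    env:
      - name: SPRING_DATA_MONGODB_URI
        # one entry per StatefulSet replica: <pod>.<service>.<namespace>.svc.cluster.local
        value: "mongodb://mongo-dev-0.mongo-dev.default.svc.cluster.local:27017,mongo-dev-1.mongo-dev.default.svc.cluster.local:27017,mongo-dev-2.mongo-dev.default.svc.cluster.local:27017/mydb?replicaSet=rs0"
```

If the application runs in a different namespace than MongoDB, the fully qualified names are needed; within the same namespace the shorter form `mongo-dev-0.mongo-dev` should resolve as well.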