HDFS Namenode format issue with AWS EBS in EKS cluster - kubernetes

I have an EKS cluster with an EBS storage class/volume. I have an Elasticsearch cluster running fine with this EBS storage (as a persistent volume/PVC).
I am trying to deploy the HDFS NameNode image (bde2020/hadoop-namenode) using a StatefulSet, but it always gives me the error below:
2020-05-09 08:59:02,400 INFO util.GSet: capacity = 2^15 = 32768 entries
2020-05-09 08:59:02,415 INFO common.Storage: Lock on /hadoop/dfs/name/in_use.lock acquired by nodename 87#hdfs-name-0.hdfs-name.pulse.svc.cluster.local
2020-05-09 08:59:02,417 WARN namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:252)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1105)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:720)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:926)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1692)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1759)
I checked run.sh of this image, and it does seem to format the NameNode if the directory is empty. But this does not work in my case (with EBS as the PVC). Any help would be much appreciated.
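For context, the emptiness check in that entrypoint is roughly of this shape (a paraphrase for illustration, not the image's actual run.sh):

namedir=/hadoop/dfs/name
# the format step only runs when the name directory is completely empty
if [ -z "$(ls -A "$namedir")" ]; then
  echo "Formatting namenode name directory: $namedir"
  hdfs namenode -format
fi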
My deployment yml is:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: hdfs-name
  labels:
    component: hdfs-name
spec:
  serviceName: hdfs-name
  replicas: 1
  selector:
    matchLabels:
      component: hdfs-name
  template:
    metadata:
      labels:
        component: hdfs-name
    spec:
      containers:
        - name: hdfs-name
          image: bde2020/hadoop-namenode
          env:
            - name: CLUSTER_NAME
              value: hdfs-k8s
          ports:
            - containerPort: 8020
              name: nn-rpc
            - containerPort: 50070
              name: nn-web
          volumeMounts:
            - name: hdfs-name-pv-claim
              mountPath: /hadoop/dfs/name
  volumeClaimTemplates:
    - metadata:
        name: hdfs-name-pv-claim
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: ebs
        resources:
          requests:
            storage: 1Gi

With the EBS storage class, a lost+found folder is created on the volume automatically. Because of this the name directory is never empty, so the NameNode format does not occur.
Adding an initContainer that deletes the lost+found folder before the NameNode starts seems to work:
initContainers:
  - name: delete-lost-found
    image: busybox
    command: ["sh", "-c", "rm -rf /hadoop/dfs/name/lost+found"]
    volumeMounts:
      - name: hdfs-name-pv-claim
        mountPath: /hadoop/dfs/name
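To verify the fix after redeploying, one can check that the name directory actually got formatted (namespace pulse, as in the lock message above), for example:

kubectl -n pulse logs hdfs-name-0 | grep -i format
kubectl -n pulse exec hdfs-name-0 -- ls /hadoop/dfs/name/current

The current/ directory should now contain fsimage and VERSION files instead of the NameNode failing with "NameNode is not formatted."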

Related

Debezium server offset file storage claim or cleanup

So, for a non-Kafka deployment, we specify the offset storage file with debezium.source.offset.storage.file.filename as per the documentation. How is that storage reclaimed, or how does cleanup happen?
I am planning a k8s deployment with a mounted volume as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: debezium-gke
  labels:
    name: debezium
spec:
  serviceName: debezium
  selector:
    matchLabels:
      name: debezium
  template:
    spec:
      containers:
        - name: debezium
          image: debezium_server:1.10
          imagePullPolicy: Always
          volumeMounts:
            - name: debezium-config-volume
              mountPath: /debezium/conf
            - name: debezium-data-volume
              mountPath: /debezium/data
      volumes:
        - name: debezium-config-volume
          configMap:
            name: debezium
  volumeClaimTemplates:
    - metadata:
        name: debezium-data-volume
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10G
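For reference, the offset file would be pointed at the mounted data volume through the application.properties delivered via the ConfigMap, roughly like this (the offsets.dat name is just an example):

# fragment of application.properties from the "debezium" ConfigMap
# /debezium/data is the volumeClaimTemplate mount from the manifest above;
# the offsets.dat filename itself is arbitrary
debezium.source.offset.storage.file.filename=/debezium/data/offsets.dat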
But I am wondering whether the disk will fill up if that offset storage is never reclaimed.
Thanks

NFS mount within K8s pods failing

I am trying to get an NFS share mounted into a k8s pod, but it is failing with the error below:
mount.nfs: rpc.statd is not running but is required for remote locking. mount.nfs: Either use '-o nolock' to keep locks local, or start statd.
I tried to start rpcbind using a CMD command in the Docker container, but that also did not work.
My deployment YAML is as below:
kind: Deployment
apiVersion: apps/v1
metadata:
  name: nfs-client-provisioner
spec:
  selector:
    matchLabels:
      app: nfs-client-provisioner
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-client-provisioner
    spec:
      serviceAccountName: nfs-client-provisioner
      containers:
        - name: nfs-client-provisioner
          image: gcr.io/k8s-staging-sig-storage/nfs-subdir-external-provisioner:v4.0.0
          volumeMounts:
            - name: nfs-client-root
              mountPath: /persistentvolumes
          env:
            - name: PROVISIONER_NAME
              value: nfs-provisioner
            - name: NFS_SERVER
              value: NFS SERVER PATH
            - name: NFS_PATH
              value: /filesharepath
      volumes:
        - name: nfs-client-root
          nfs:
            server: <NFS IP>
            path: /filesharepath
I saw on GitHub that there is an identical issue, which says rpcbind needs to be running on the base system:
https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner/issues/224
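From that issue, the gist seems to be that the worker node itself needs the NFS client utilities with rpc.statd/rpcbind running; on a Debian/Ubuntu node that would be roughly the following (adjust for your distro):

# on each node that mounts the share
sudo apt-get update
sudo apt-get install -y nfs-common    # provides mount.nfs and rpc.statd
sudo systemctl enable --now rpcbind

Alternatively, as the error message itself suggests, mounting with the nolock option avoids the need for statd at the cost of keeping locks local.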
Please assist.

What command is needed to start restoring the MariaDB database when declaring the Kubernetes manifest

Dockerfile
FROM mariadb
COPY all-databases.sql /base/all-databases.sql
Manifest
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mariadb
spec:
  serviceName: mariadb
  selector:
    matchLabels:
      app: mariadb
  template:
    metadata:
      labels:
        app: mariadb
    spec:
      containers:
        - name: mariadb
          image: pawwwel/mariadb01:v5
          ports:
            - containerPort: 3306
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          volumeMounts:
            - name: pvclocal
              mountPath: /var/lib/mysql
          command: ["mysql", "-uroot", "-ppassword", "<", "/base/all-databases.sql"]
  volumeClaimTemplates:
    - metadata:
        name: pvclocal
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: local-storage
        resources:
          requests:
            storage: 1Gi
The presence of this command leads to an error in the pod:
command: ["mysql", "-uroot", "-ppassword", "<", "/base/all-databases.sql"]
Without it, an empty database is created.
Tell me how to automate restoring the database from a backup in Kubernetes.
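The "<" in that command array is never interpreted, because no shell is involved, and overriding command also replaces the server process, so mysqld never starts. Assuming pawwwel/mariadb01:v5 keeps the entrypoint of the official mariadb base image (the Dockerfile above suggests it does), a common pattern is to copy the dump into /docker-entrypoint-initdb.d, which the entrypoint replays automatically when the data directory is first initialized, and to drop the command override entirely. A sketch:

FROM mariadb
# scripts in /docker-entrypoint-initdb.d are executed by the image's entrypoint
# on first start, while /var/lib/mysql is still empty
COPY all-databases.sql /docker-entrypoint-initdb.d/all-databases.sql

Then remove the command: line from the StatefulSet so the default entrypoint starts mysqld; on later restarts the existing data on the pvclocal volume is reused and the dump is not replayed.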

How to have data persist in GKE kubernetes StatefulSet with postgres?

So I'm just trying to get a web app running on GKE experimentally to familiarize myself with Kubernetes and GKE.
I have a StatefulSet (Postgres) with a persistent volume/persistent volume claim which is mounted to the Postgres pod as expected. The problem I'm having is getting the Postgres data to persist. If I mount the PV at /var/lib/postgresql, the data gets overridden with each pod update. If I mount it at /var/lib/postgresql/data, I get the warning:
initdb: directory "/var/lib/postgresql/data" exists but is not empty
It contains a lost+found directory, perhaps due to it being a mount point.
Using a mount point directly as the data directory is not recommended.
Create a subdirectory under the mount point.
Using Docker alone, having the volume mount point at /var/lib/postgresql/data works as expected and the data persists, but I don't know what to do now in GKE. How does one set this up properly?
Setup file:
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: sm-pd-volume-claim
spec:
  storageClassName: "standard"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1G
---
apiVersion: "apps/v1"
kind: "StatefulSet"
metadata:
  name: "postgis-db"
  namespace: "default"
  labels:
    app: "postgis-db"
spec:
  serviceName: "postgis-db"
  replicas: 1
  selector:
    matchLabels:
      app: "postgis-db"
  template:
    metadata:
      labels:
        app: "postgis-db"
    spec:
      terminationGracePeriodSeconds: 25
      containers:
        - name: "postgis"
          image: "mdillon/postgis"
          ports:
            - containerPort: 5432
              name: postgis-port
          volumeMounts:
            - name: sm-pd-volume
              mountPath: /var/lib/postgresql/data
      volumes:
        - name: sm-pd-volume
          persistentVolumeClaim:
            claimName: sm-pd-volume-claim
You are getting this error because the Postgres pod is using the root of the volume mount directly as its data directory, and that root contains the lost+found directory, so initdb refuses to use it (as the warning says, use a subdirectory under the mount point instead).
You can resolve this in the StatefulSet manifest by mounting a subdirectory of the volume with subPath:
volumeMounts:
  - name: sm-pd-volume
    mountPath: /var/lib/postgresql/data
    subPath: data
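An alternative with the same effect, assuming the image honours the standard PGDATA variable of the official postgres base image, is to keep the mount as it is and move the data directory one level below the mount point:

env:
  - name: PGDATA
    # a subdirectory of the mount, so the volume root (with lost+found) is not used as the data dir
    value: /var/lib/postgresql/data/pgdata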

K8s pod has unbound immediate PersistentVolumeClaims - mongoDB

I'm trying to deploy MongoDB in my Kubernetes cluster (Google Cloud Platform). Right now I'm focused on a local minikube version of the deployment. The thing is that I'm getting an error like this:
pod has unbound immediate PersistentVolumeClaims
These are my config files which set up MongoDB inside minikube:
Service file:
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
Stateful set file:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  selector:
    matchLabels:
      role: mongo
  serviceName: "mongo"
  replicas: 1
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: mongo
          image: mongo
          command:
            - mongod
            - "--replSet"
            - rs0
            - "--smallfiles"
            - "--noprealloc"
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongo-persistent-storage
              mountPath: /data/db
        - name: mongo-sidecar
          image: cvallance/mongo-k8s-sidecar
          env:
            - name: MONGO_SIDECAR_POD_LABELS
              value: "role=mongo,environment=test"
  volumeClaimTemplates:
    - metadata:
        name: mongo-persistent-storage
        annotations:
          volume.beta.kubernetes.io/storage-class: "fast"
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 2Gi
Storage class file:
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
What am I missing here?
I think the problem is that you are trying to use
provisioner: kubernetes.io/gce-pd
This will not work locally, as it is intended for GCE Persistent Disks.
For minikube, you can create a hostPath PVC instead.
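If you would rather not manage PersistentVolumes by hand, minikube also ships a default hostPath-backed StorageClass (typically named standard), so pointing the claim template at that class should be enough, something like:

volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
      annotations:
        # minikube's bundled default class; switch back to "fast" on GKE
        volume.beta.kubernetes.io/storage-class: "standard"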
I work with minikube during feature development and I am running MongoDB as well. I would recommend using hostPath volumes when working with minikube; here is my volume definition:
volume
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
name: "blog-repoflow-resources-data-volume"
namespace: repoflow-blog-namespace
labels:
service: "resources-data-service"
fsOwner: "1001"
fsGroup: "0"
fsMode: "775"
spec:
capacity:
storage: "500Mi"
accessModes:
- "ReadWriteOnce"
storageClassName: local-storage
hostPath:
path: /home/docker/production/blog.repoflow.com/volumes/blog-repoflow-resources-data-volume
The complete cluster is up and running and is open source; see the repository.
Check also the StatefulSet, volume claim, and service definitions there.
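For completeness, a claim that binds to a hostPath PersistentVolume like the one above would look roughly like this (the claim name is only an example; storageClassName, access mode and size have to match the PV):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # example name, not taken from the repository
  name: blog-repoflow-resources-data-claim
  namespace: repoflow-blog-namespace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-storage
  resources:
    requests:
      storage: 500Mi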