Kubernetes MongoDB autoscaling - mongodb

I've deployed a stateful MongoDB setup in my k8s cluster. Every time I scale up a new pod, I need to add the pod from the MongoDB console using the rs.add() command. Is there any way I can orchestrate this? Also, how can I expose my MongoDB service outside my k8s cluster? Changing the service type to NodePort didn't work for me. Please help.
Below is the StatefulSet YAML file I used to deploy MongoDB.
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: mongo:3.4
        command:
        - mongod
        - "--replSet"
        - rs0
        - "--bind_ip"
        - 0.0.0.0
        - "--smallfiles"
        - "--noprealloc"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      - name: mongo-sidecar
        image: cvallance/mongo-k8s-sidecar
        env:
        - name: MONGO_SIDECAR_POD_LABELS
          value: "role=mongo,environment=test"
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
      annotations:
        volume.beta.kubernetes.io/storage-class: "managed-nfs-storage"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 2Gi

As you are using Kubernetes, which is a container orchestration platform, you can always scale your Deployment/StatefulSet using $ kubectl scale deployment [deployment_name] --replicas=X
or $ kubectl scale statefulset [statefulset-name] --replicas=X
where X is the total number of pods you want in the Deployment. It will automatically create pods based on your Deployment settings.
If you don't want to scale manually, you should read about Kubernetes autoscaling - the Horizontal Pod Autoscaler (HPA).
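For reference, a minimal sketch of an HPA targeting the StatefulSet above; this assumes a metrics source (metrics-server or Heapster) is running in the cluster, and the name mongo-hpa and the thresholds are placeholders, not something from your setup:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: mongo-hpa        # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: StatefulSet
    name: mongo
  minReplicas: 3
  maxReplicas: 5
  targetCPUUtilizationPercentage: 80
# roughly equivalent to: kubectl autoscale statefulset mongo --min=3 --max=5 --cpu-percent=80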
To expose an application outside Kubernetes you have to use a Service. More information can be found here. I am not sure if NodePort is right in this scenario; you can check the ServiceType description.
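If the cluster runs on a cloud provider that can provision load balancers, a minimal sketch of a separate externally reachable Service for the same pods could look like this (mongo-external is a placeholder name; the selector matches the role=mongo label used in the StatefulSet above):

apiVersion: v1
kind: Service
metadata:
  name: mongo-external   # placeholder name
spec:
  type: LoadBalancer     # a nodePort is still allocated, so nodeIP:nodePort works even without a cloud LB
  selector:
    role: mongo
  ports:
  - port: 27017
    targetPort: 27017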
However, I am not very familiar with MongoDB on Kubernetes, but maybe these tutorials will help you:
Scaling MongoDB on Kubernetes, Running MongoDB as a Microservice with Docker and Kubernetes, Running MongoDB on Kubernetes with StatefulSets.
Hope it helps.

As @PjoterS suggests, you can scale the MongoDB replicas or pods inside Kubernetes using HPA.
But with that you also have to take care of the volume mounting, as well as the data latency between replicas.
I would suggest first checking the native cluster scaling options provided by MongoDB itself and configuring those. You can use an operator for MongoDB,
like: https://docs.mongodb.com/kubernetes-operator/master/tutorial/install-k8s-operator/
Otherwise, if your current config already follows the native cluster setup and supports scaling replicas and copying data between them, you can go for HPA.
You can also have a look at this: https://medium.com/faun/scaling-mongodb-on-kubernetes-32e446c16b82
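As a rough illustration of the operator approach linked above: the linked docs cover the Enterprise operator, but with the simpler Community operator a replica set is declared roughly like this. The field names are recalled from the Community operator's sample CR, so treat them as assumptions and check the samples shipped with whichever operator version you install; the user/password-secret wiring is omitted here:

apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongo-replica-set      # placeholder name
spec:
  members: 3                   # the operator handles adding/removing replica set members as this changes
  type: ReplicaSet
  version: "4.4.6"
  security:
    authentication:
      modes: ["SCRAM"]
  # users: [...]               # user and password-secret definitions omitted in this sketch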

Related

GKE sticky connections make autoscaling ineffective because of limited pod ports (API to database)

I have an API to which I send requests, and this API connects to MongoDB through MongoClient in PyMongo. Here is a scheme of my system that I deployed in GKE:
The major part of the calculations needed for each request is made in MongoDB, so I want the MongoDB pods to be autoscaled based on CPU usage. Thus I have an HPA for the MongoDB deployment, with minReplicas: 1.
When I send many requests to the Nginx Ingress, I see that my only MongoDB pod has 100% CPU usage, so the HPA creates a second pod. But this second pod isn't used.
After looking in the logs of my first MongoDB pod, I see that all the requests have this :
"remote":"${The_endpoint_of_my_API_Pod}:${PORT}", and the PORT only takes 12 different values (I counted them, they started repeating so I guessed that there aren't others).
So my guess is that the second pod isn't used because of sticky connections, as suggested in this answer https://stackoverflow.com/a/73028316/19501779 to one of my previous questions, where there is more detail on my MongoDB deployment.
I have 2 questions :
Is the second pod not used in fact because of sticky connections between my API Pod and my first MongoDB Pod?
If this is the case, how can I overcome this issue to make the autoscaling effective?
Thanks, and if you need more info please ask me.
EDIT
Here is my MongoDB configuration:
The Dockerfile, from which I create my MongoDB image on the VM where my original MongoDB runs. A single deployment of this image works in k8s.
FROM mongo:latest
EXPOSE 27017
COPY /mdb/ /data/db
The deployment.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongodb
  labels:
    app: mongodb
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongodb
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
      - name: mongodb
        image: $My_MongoDB_image
        ports:
        - containerPort: 27017
        resources:
          requests:
            memory: "1000Mi"
            cpu: "1000m"
      imagePullSecrets: # for pulling from my Docker Hub
      - name: regcred
and the service.yml and hpa.yml:
apiVersion: v1
kind: Service
metadata:
  name: mongodb-service
  labels:
    app: mongodb
spec:
  selector:
    app: mongodb
  ports:
  - protocol: TCP
    port: 27017
    targetPort: 27017
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: mongodb-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mongodb
  minReplicas: 1
  maxReplicas: 70
  targetCPUUtilizationPercentage: 85
And I access this service from my API Pod with PyMongo:
from pymongo import MongoClient

def get_db(database: str):
    client = MongoClient(host="$Cluster_IP_of_{mongodb-service}",
                         port=27017,
                         username="...",
                         password="...",
                         authSource="admin")
    return client.get_database(database)
Moreover, when a second MongoDB Pod is created by the autoscaler, its endpoint appears in my mongodb-service:
(Screenshots: the HPA created a second Pod; the new Pod's endpoint appears in the mongodb-service.)

How to apply configuration to a down Mongo pod in Kubernetes cluster?

I have a running cluster in Kubernetes (Google Cloud), with 2 pods for 2 frontend apps (Angular), 1 pod for the backend app (NodeJS) and 1 pod for MongoDB (currently down).
In the last code update from git, the Mongo version was updated unintentionally: the image tag was not specified in the Replication Controller, so it pulled the latest. This version does not seem to work.
I get CrashLoopBackOff error in Kubernetes, and the details are:
Failed to start up WiredTiger under any compatibility version. This may be due to an unsupported upgrade or downgrade."}
I have updated the Replication Controller to specify a Mongo version, but when I commit/push the change, the workload in Google Cloud is not updated because it's down. The last "Created on" date is from some days ago, not now (see the attached images if it's not clear what I'm trying to explain).
My 2 biggest doubts are:
How to force the (re)start of the Mongo Pod (with the added tag specifying the version), in order to fix the down pod issue?
How could I recover the data of the Mongo database in GKE, in order to migrate it quickly?
New Mongo.yaml
apiVersion: v1
kind: Service
metadata:
  name: mongo
  namespace: sgw-production
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  selector:
    name: mongo
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: mongo-controller
  namespace: sgw-production
  labels:
    name: mongo
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: mongo
    spec:
      containers:
      - image: mongo:4.2.10
        name: mongo
        ports:
        - name: mongo
          containerPort: 27017
          hostPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage
        gcePersistentDisk:
          pdName: mongo-disk-$CI_ENVIRONMENT_SLUG
          fsType: ext4
I answered my own question here with the solution to this issue (copy/paste below):
CrashLoopBackOff (Mongo in Docker/Kubernetes) - Failed to start up WiredTiger under any compatibility version
I solved this issue editing the Replication Controller online from the Google Cloud Console.
Access to: "Kubernetes Engine" > "Workload" > "mongo-controller" > "Managed pods" > "mongo-controller-XXXXX"
...and press the EDIT button (in the top navbar). You can edit the configuration online in real time. I simply specified the Mongo version (4.2.10) in the image, and everything worked as expected.
spec:
  replicas: 1
  selector:
    name: mongo
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: mongo
    spec:
      containers:
      - image: mongo:4.2.10
        (...)
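If you prefer the command line over the console editor, a hedged sketch of the same change with kubectl (names and namespace taken from the manifest above). Note that a ReplicationController does not roll its pods automatically, so the crashing pod may also need to be deleted so it is recreated with the fixed image:

kubectl set image rc/mongo-controller mongo=mongo:4.2.10 -n sgw-production
# or edit the live object directly:
kubectl edit rc mongo-controller -n sgw-production
# then delete the crashing pod so the RC recreates it with the new image:
kubectl delete pod -l name=mongo -n sgw-production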

How can I fix MongoError: no mongos proxy available on GKE

I am trying to deploy an Express API on GKE, with a Mongo StatefulSet.
googlecloud_ssd.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1beta1
metadata:
  name: fast
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
mongo-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: "mongo"
  replicas: 2
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: mongo
        command:
        - mongod
        - "--replSet"
        - rs0
        - "--smallfiles"
        - "--noprealloc"
        ports:
        - containerPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      - name: mongo-sidecar
        image: cvallance/mongo-k8s-sidecar
        env:
        - name: MONGO_SIDECAR_POD_LABELS
          value: "role=mongo,environment=test"
  volumeClaimTemplates:
  - metadata:
      name: mongo-persistent-storage
      annotations:
        volume.beta.kubernetes.io/storage-class: "fast"
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
I deployed my Express app and it works perfectly. I then deployed Mongo using the above YAML config.
Having set the connection string in express as:
"mongodb://mongo-0.mongo,mongo-1.mongo:27017/"
I can see the updated pod(s) not starting.
Looking at the logs for that container I see
{
  insertId: "a9tu83g211w2a6"
  labels: {…}
  logName: "projects/<my-project-id>/logs/express"
  receiveTimestamp: "2019-06-03T14:19:14.142238836Z"
  resource: {…}
  severity: "ERROR"
  textPayload: "[ ERROR ] MongoError: no mongos proxy available
  "
  timestamp: "2019-06-03T14:18:56.132989616Z"
}
I am unsure how to debug / fix MongoError: no mongos proxy available
Edit
So I scaled down my replicas to 1 on each and it's now working.
I'm confused as to why this won't work with more than 1 replica.
The connection to your MongoDB database doesn't work for two reasons:
You cannot connect to a highly-available MongoDB deployment running inside your Kubernetes cluster using Pod DNS names. These unique Pod names, mongo-0.mongo and mongo-1.mongo, with corresponding FQDNs mongo-0.mongo.default.svc.cluster.local and mongo-1.mongo.default.svc.cluster.local, can only be reached within the K8S cluster. You have an Express web application that runs on the client side (web browser), and it needs to connect to your MongoDB from outside the cluster.
Connection string: you should connect to the primary node via a Kubernetes Service name, which abstracts access to the Pods behind the replica set.
Solution:
Create a separate Kubernetes Service of LoadBalancer or NodePort type for your Primary ReplicaSet, and use <ExternalIP_of_LoadBalancer>:27017 in your connection string.
I would encourage you to take a look at the official mongodb helm chart, to see what kind of manifest files are required to satisfy your case.
Hint: use '--set service.type=LoadBalancer' with this helm chart
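As a rough sketch of the helm route following the hint above; the chart and repo names, and the exact values path, differ between chart versions, so treat them as assumptions and check helm show values for the chart you use:

helm repo add bitnami https://charts.bitnami.com/bitnami
# values path varies by chart version; verify with: helm show values bitnami/mongodb
helm install my-mongo bitnami/mongodb --set service.type=LoadBalancer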

How I can use MongoDB GUI tool like mongo-express or RockMongo on Kubernetes cluster

I have MongoDB running on a Kubernetes cluster and I am looking for a MongoDB GUI tool, like phpMyAdmin, to run as a pod on the cluster. I have RockMongo running as a pod, but it doesn't connect to MongoDB and I also couldn't expose it. I need a microservice I can run on the Kubernetes cluster that can administer the MongoDB pod running in the default namespace.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rockmongo
spec:
  selector:
    matchLabels:
      app: rockmongo
  replicas: 1
  template:
    metadata:
      labels:
        app: rockmongo
    spec:
      containers:
      - name: rockmongo
        image: webts/rockmongo
        ports:
        - containerPort: 8050
        env:
        - name: MONGO_HOSTS
          value: '27017'
        - name: ROCKMONGO_PORT
          value: '8050'
        - name: MONGO_HIDE_SYSTEM_COLLECTIONS
          value: 'false'
        - name: MONGO_AUTH
          value: 'false'
        - name: ROCKMONGO_USER
          value: 'admin'
        - name: ROCKMONGO_PASSWORD
          value: 'admin'
Services running on the cluster
rockmongo ClusterIP 10.107.52.82 <none> 8050/TCP 13s
As Vishal Biyani suggested, you may consider using a Kubernetes Ingress (with an ingress controller) to access internal resources of MongoDB or the GUI for PHP operations.
Distributed databases such as MongoDB require a little extra attention when being deployed with orchestration frameworks such as Kubernetes.
I found interesting documentation regarding your needs of MongoDB as a microservice with Docker and Kubernetes.
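As a rough sketch of the Ingress route suggested above, assuming an NGINX ingress controller is installed and a ClusterIP Service named rockmongo exposing port 8050 (as in the service listing above); the hostname is a placeholder, and older clusters would use the extensions/v1beta1 Ingress API instead:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rockmongo
  annotations:
    kubernetes.io/ingress.class: nginx   # or spec.ingressClassName on newer controllers
spec:
  rules:
  - host: rockmongo.example.com          # placeholder host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rockmongo
            port:
              number: 8050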

Cannot connect to a mongodb service in a Kubernetes cluster

I have a Kubernetes cluster on Google Cloud with a database service, which is running in front of a MongoDB deployment. I also have a series of microservices which are attempting to connect to that datastore.
However, they can't seem to find the host.
apiVersion: v1
kind: Service
metadata:
  labels:
    name: mongo
  name: mongo
spec:
  ports:
  - port: 27017
    targetPort: 27017
  selector:
    name: mongo
Here's my mongo deployment...
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: mongo-deployment
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: mongo
    spec:
      containers:
      - image: mongo:latest
        name: mongo
        ports:
        - name: mongo
          containerPort: 27017
          hostPort: 27017
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db
      volumes:
      - name: mongo-persistent-storage
        gcePersistentDisk:
          pdName: mongo-disk
          fsType: ext4
And an example of one of my services...
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: bandzest-artists
spec:
  replicas: 1
  template:
    metadata:
      labels:
        name: bandzest-artists
    spec:
      containers:
      - name: artists-container
        image: gcr.io/<omitted>/artists:41040e8
        ports:
        - containerPort: 7000
        imagePullPolicy: Always
        env:
        - name: DB_HOST
          value: mongo
        - name: AWS_BUCKET_NAME
          value: <omitted>
        - name: AWS_ACCESS_KEY_ID
          value: <omitted>
        - name: AWS_SECRET_KEY
          value: <omitted>
First, check that the service is created
kubectl describe svc mongo
You should see it show that it is both created and routing to your pod's IP. If you're wondering what your pod's IP is you can check it out via
kubectl get po | grep mongo
Which should return something like: mongo-deployment-<guid>-<guid>, then do
kubectl describe po mongo-deployment-<guid>-<guid>
You should make sure the pod is started correctly and says Running not something like ImagePullBackoff. It looks like you're mounting a volume from a gcePersistentDisk. If you're seeing your pod just hanging out in the ContainerCreating state it's very likely you're not mounting the disk correctly. Make sure you create the disk before you try and mount it as a volume.
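For example, a disk referenced by gcePersistentDisk has to exist before the pod starts. A hedged sketch of creating one (the disk name, size and zone are placeholders; use the name your pod spec references):

gcloud compute disks create mongo-disk --size=10GB --zone=us-central1-a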
If it looks like your service is routing correctly, then you can check the logs of your pod to make sure it started mongo correctly:
kubectl logs mongo-deployment-<guid>-<guid>
If it looks like the pod and logs are correct, you can exec into the pod and make sure mongo is actually starting and working:
kubectl exec -it mongo-deployment-<guid>-<guid> sh
Which should get you into the container (Pod) and then you can try something like this to see if your DB is running.
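A hedged sketch of what that last check might look like from inside the container (the bundled shell is mongo on images from this era; newer MongoDB 6+ images ship mongosh instead):

# inside the container: ping the server and list databases
mongo --eval 'db.adminCommand({ ping: 1 })'        # should print { "ok" : 1 }
mongo --eval 'db.adminCommand("listDatabases")'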