I'm running a Kubernetes cluster on AWS and need to configure a replicated MongoDB 4.2 Database.
I'm using StatefulSets in order for other Pods (e.g., REST API NodeJS Pod) to easily connect to the mongo instances (example dsn: "mongodb://mongo-0.mongo,mongo-1.mongo,mongo-2.mongo:27017/app").
mongo-configmap.yaml (provides a shell script to perform the replication initialization upon mongo container creation):
apiVersion: v1
kind: ConfigMap
metadata:
name: mongo-init
data:
init.sh: |
#!/bin/bash
# wait for the readiness health check to pass
until ping -c 1 ${HOSTNAME}.mongo; do
echo "waiting for DNS (${HOSTNAME}.mongo)..."
sleep 2
done
until /usr/bin/mongo --eval 'printjson(db.serverStatus())'; do
echo "connecting to local mongo..."
sleep 2
done
echo "connected to local."
HOST=mongo-0.mongo:27017
until /usr/bin/mongo --host=${HOST} --eval 'printjson(db.serverStatus())'; do
echo "connecting to remote mongo..."
sleep 2
done
echo "connected to remote."
if [[ "${HOSTNAME}" != 'mongo-0' ]]; then
until /usr/bin/mongo --host=${HOST} --eval="printjson(rs.status())" \
| grep -v "no replset config has been received"; do
echo "waiting for replication set initialization"
sleep 2
done
echo "adding self to mongo-0"
/usr/bin/mongo --host=${HOST} --eval="printjson(rs.add('${HOSTNAME}.mongo'))"
fi
if [[ "${HOSTNAME}" == 'mongo-0' ]]; then
echo "initializing replica set"
/usr/bin/mongo --eval="printjson(rs.initiate(\
{'_id': 'rs0', 'members': [{'_id': 0, \
'host': 'mongo-0.mongo:27017'}]}))"
fi
echo "initialized"
while true; do
sleep 3600
done
mongo-service.yaml:
apiVersion: v1
kind: Service
metadata:
name: mongo
labels:
app: mongo
spec:
clusterIP: None
ports:
- port: 27017
selector:
app: mongo
mongo-statefulset.yaml (2 containers inside one Pod, 1 for the actual DB, the other for initialization of the replication):
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mongo
labels:
app: mongo
spec:
selector:
matchLabels:
app: mongo
serviceName: "mongo"
replicas: 3
template:
metadata:
labels:
app: mongo
spec:
terminationGracePeriodSeconds: 10
containers:
- name: mongodb
image: mongo:4.2
command:
- mongod
args:
- --replSet
- rs0
- "--bind_ip_all"
ports:
- containerPort: 27017
name: web
volumeMounts:
- name: database
mountPath: /data/db
livenessProbe:
exec:
command:
- /usr/bin/mongo
- --eval
- db.serverStatus()
initialDelaySeconds: 10
timeoutSeconds: 10
- name: init-mongo
image: mongo:4.2
command:
- bash
- /config/init.sh
volumeMounts:
- name: config
mountPath: /config
volumes:
- name: config
configMap:
name: "mongo-init"
volumeClaimTemplates:
- metadata:
name: database
annotations:
volume.beta.kubernetes.io/storage-class: mongodb-storage
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 5Gi
After applying these configurations, the 3 mongo pods start running (mongo-0, mongo-1, mongo-2).
However, other pods can't connect to these mongo pods.
Further look into the mongo-0 pods (should be primary instance) reveals that the replication did not work.
kubectl exec -it mongo-0 -- /bin/bash
Then running 'mongo' to start the mongo shell, and entering 'rs.status()' into the mongo shell results in the following output:
{
"info" : "run rs.initiate(...) if not yet done for the set",
"ok" : 0,
"errmsg" : "no replset config has been received",
"code" : 94,
"codeName" : "NotYetInitialized"
}
Apparently, mongo-4.x images do not come with 'ping' installed, therefore, the rest of the script was not executed.
Adding these two lines to the script in mongo-configmap.yaml fixes the problem:
apt-get update
apt-get install iputils-ping --yes
After start running all pods, then hit this command
kubectl exec -it mongo-0 -- /bin/bash
(here mongo-0 is pod name)
now start mongo shell,
mongo
now check pod is initiate or not
rs.status()
If not then, initiate and make it primary by hitting these commands one by one
rs.initiate()
var cfg = rs.conf()
cfg.members[0].host=”mongo-0.mongo:27017”
(here is mongo-0 is pod name and mongo is serive name.)
now reconfig primary node
rs.reconfig(cfg)
Add all slaves to primary node
rs.add(“mongo-1.mongo:27017”)
rs.add(“mongo-2.mongo:27017”)
(here is mongo-1 and mongo-2 is pod name and mongo is serive name.)
now check status
rs.status()
now exit from primary shell and go to secondary (slave) node
exit
exit
kubectl exec -it mongo-1 -- /bin/bash
mongo
rs.secondaryOk()
check status and exit
rs.status()
exit
exit
now do same for other secondary (slave) nodes
Related
I am trying to set up a load test for an endpoint. This is what I have followed so far:
Dockerfile
FROM python:3.8
# Add the external tasks directory into /tasks
WORKDIR /src
ADD requirements.txt .
RUN pip install --no-cache-dir --upgrade locust==2.10.1
ADD run.sh .
ADD load_test.py .
ADD load_test.conf .
# Expose the required Locust ports
EXPOSE 5557 5558 8089
# Set script to be executable
RUN chmod 755 run.sh
# Start Locust using LOCUS_OPTS environment variable
CMD ["bash", "run.sh"]
# Modified from:
# https://github.com/scrollocks/locust-loadtesting/blob/master/locust/docker/Dockerfile
run.sh
#!/bin/bash
LOCUST="locust"
LOCUS_OPTS="--config=load_test.conf"
LOCUST_MODE=${LOCUST_MODE:-standalone}
if [[ "$LOCUST_MODE" = "master" ]]; then
LOCUS_OPTS="$LOCUS_OPTS --master"
elif [[ "$LOCUST_MODE" = "worker" ]]; then
LOCUS_OPTS="$LOCUS_OPTS --worker --master-host=$LOCUST_MASTER_HOST"
fi
echo "${LOCUST} ${LOCUS_OPTS}"
$LOCUST $LOCUS_OPTS
# Copied from
# https://github.com/scrollocks/locust-loadtesting/blob/master/locust/docker/locust/run.sh
This is how I have written the load test locust script:
import json
from locust import HttpUser, constant, task
class CategorizationUser(HttpUser):
wait_time = constant(1)
#task
def predict(self):
payload = json.dumps(
{
"text": "Face and Body Paint washable Rubies Halloween item 91#385"
}
)
_ = self.client.post("/predict", data=payload)
I am invoking that with a configuration:
locustfile = load_test.py
headless = false
users = 1000
spawn-rate = 1
run-time = 5m
host = IP
html = locust_report.html
So, after building and pushing the Docker image and creating a k8s cluster on GKE, I am deploying it. This is how the deployment.yaml looks like:
# Copied from
# https://github.com/scrollocks/locust-loadtesting/blob/master/locust/kubernetes/templates/deployment.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
namespace: default
name: locust-master-deployment
labels:
name: locust
role: master
spec:
replicas: 1
selector:
matchLabels:
name: locust
role: master
template:
metadata:
labels:
name: locust
role: master
spec:
containers:
- name: locust
image: gcr.io/PROJECT_ID/IMAGE_URI
imagePullPolicy: Always
env:
- name: LOCUST_MODE
value: master
- name: LOCUST_LOG_LEVEL
value: DEBUG
ports:
- name: loc-master-web
containerPort: 8089
protocol: TCP
- name: loc-master-p1
containerPort: 5557
protocol: TCP
- name: loc-master-p2
containerPort: 5558
protocol: TCP
---
kind: Deployment
apiVersion: apps/v1
metadata:
namespace: default
name: locust-worker-deployment
labels:
name: locust
role: worker
spec:
replicas: 2
selector:
matchLabels:
name: locust
role: worker
template:
metadata:
labels:
name: locust
role: worker
spec:
containers:
- name: locust
image: gcr.io/PROJECT_ID/IMAGE_URI
imagePullPolicy: Always
env:
- name: LOCUST_MODE
value: worker
- name: LOCUST_MASTER
value: locust-master-service
- name: LOCUST_LOG_LEVEL
value: DEBUG
After deployment, I am exposing the required ports like so:
kubectl expose pod locust-master-deployment-f9d4c4f59-8q6wk \
--type NodePort \
--port 5558 \
--target-port 5558 \
--name locust-5558
kubectl expose pod locust-master-deployment-f9d4c4f59-8q6wk \
--type NodePort \
--port 5557 \
--target-port 5557 \
--name locust-5557
kubectl expose pod locust-master-deployment-f9d4c4f59-8q6wk \
--type LoadBalancer \
--port 80 \
--target-port 8089 \
--name locust-web
The cluster and the nodes provision successfully. But the moment I hit the IP of locust-web, I am getting:
Any suggestions on how to resolve the bug?
Since you are exposing your pods and you are trying to access to them outside the cluster (your web application), you have to port-forward them or add an Ingress in order to access to your locust pods.
My first approach will be trying to ping or send requests to your locust pods with a simple port-forward.
More infos about the port forwarding here.
Probably the environment variables set by k8s is colliding with locust’s (LOCUST_WEB_PORT specifically). Change your setup so that no containers are named ”locust”.
See https://github.com/locustio/locust/issues/1226 for a similar issue.
I have created a mongodb pod in my Minikube local cluster with the following configuration, now I would like to migrate the data of my existing mongodb database(run in AWS EC2 instance) to this database, how can I accomplish this?
apiVersion: v1
kind: Pod
metadata:
name: mongodb
labels:
app: mongodb
spec:
volumes:
- name: mongo-vol
persistentVolumeClaim:
claimName: mongo-pvc
containers:
- image: mongo
name: container1
command:
- "mongod"
- "--bind_ip"
- "0.0.0.0"
ports:
- containerPort: 27017
volumeMounts:
- name: mongo-vol
mountPath: /data/db
Can't added a comment because I do not have more than 50 reputation but here is the thing:
You created a Pod using MongoDB and I see the volumeMount set to /data/db. What is this path? Is this volume a hostPath, NFS, external storage?
The migration itself could be accomplished in two ways:
by running a mongodump on your EC2 instance:
mongodump -host hostname --port 27017 --out /tmp/mongodb-dump
Then a mongorestore on your /data/db path.
You can manually log in on the pod
#copy your dump from host to pod using
kubectl cp /tmp/foo_dir <some-pod>:/tmp/bar_dir
#log on the pod
kubectl exec mongodb -it /bin/bash
#run restore
mongorestore /tmp/mongodb-dump
or
You can copy all files/the entire /data from your EC2 instance and dropped on the volumeMount your pod is mounting /data/db.
I am in the process of deploying a mongodb ReplicaSet on GKE.
My deployment works, however I would like to enable auth on Mongo.
I have connected to my pod
kubectl exec -it {pod_name} mongo admin
Created an Admin user and also a user for my database. I was then thinking I could update mongo-statefulset.yaml with the --auth flag and apply the updated yaml.
Something like
.....
spec:
terminationGracePeriodSeconds: 10
containers:
- name: mongod-container
image: mongo:3.6
command:
- mongod
- "--bind_ip"
- "0.0.0.0"
- "--replSet"
- rs0
- "--smallfiles"
- "--noprealloc"
- "--auth"
ports:
- containerPort: 27017
volumeMounts:
- name: mongo-persistent-storage
mountPath: /data/db
.....
But running kubectl apply -f mongo-statefulset.yaml just produces
service/mongo-svc unchanged
statefulset.apps/mongo configured
Should I restart my pods for this to now take effect?
Try to do rolling update:
The RollingUpdate update strategy will update all Pods in a StatefulSet, in reverse ordinal order, while respecting the StatefulSet guarantees.
Patch the web StatefulSet to apply the RollingUpdate update strategy.
$ kubectl patch statefulset your_statefulset_name -p '{"spec":{...}}}'
Don't to forget to add env label with credentials you have created on your pod like:
env:
- name: MONGODB_USERNAME
value: admin
- name: MONGODB_PASSWORD
value: password
I hope it helps.
You could try
kubectl delete -f mongo-statefulset.yaml && kubectl apply -f mongo-statefulset.yaml
The data required by my container is too large to fit on one local SSD. I also need to access the SSD's as one filesystem from my container. So I would need to attach multiple ones. How do I combine them (single partition, RAID0, etc) and make them accessible as one volume mount in my container?
This link shares how to mount an SSD https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/local-ssd to a mount path. I am not sure how you would merge multiple.
edit
The question asks how one would "combine" multiple SSD devices, individually mounted, on a single node in GKE.
WARNING
this is experimental and not intended for production use without
knowing what you are doing and only tested on gke version 1.16.x.
The approach includes a daemonset using a configmap to use nsenter (with wait tricks) for host namespace and privileged access so you can manage the devices. Specifically for GKE Local SSDs, we can unmount those devices and then raid0 them. InitContainer for the dirty work as this type of task seems most apparent for something you'd need to mark complete, and to then kill privileged container access (or even the Pod). Here is how it is done.
The example assumes 16 SSDs, however, you'll want to adjust the hardcoded values as necessary. Also, ensure your OS image reqs, I use Ubuntu. Also make sure the version of GKE you use starts local-ssd's at sd[b]
ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
name: local-ssds-setup
namespace: search
data:
setup.sh: |
#!/bin/bash
# returns exit codes: 0 = found, 1 = not found
isMounted() { findmnt -rno SOURCE,TARGET "$1" >/dev/null;} #path or device
# existing disks & mounts
SSDS=(/dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq)
# install mdadm utility
apt-get -y update && apt-get -y install mdadm --no-install-recommends
apt-get autoremove
# OPTIONAL: determine what to do with existing, I wipe it here
if [ -b "/dev/md0" ]
then
echo "raid array already created"
if isMounted "/dev/md0"; then
echo "already mounted - unmounting"
umount /dev/md0 &> /dev/null || echo "soft error - assumed device was mounted"
fi
mdadm --stop /dev/md0
mdadm --zero-superblock "${SSDS[#]}"
fi
# unmount disks from host filesystem
for i in {0..15}
do
umount "${SSDS[i]}" &> /dev/null || echo "${SSDS[i]} already unmounted"
done
if isMounted "/dev/sdb";
then
echo ""
echo "unmount failure - prevent raid0" 1>&2
exit 1
fi
# raid0 array
yes | mdadm --create /dev/md0 --force --level=0 --raid-devices=16 "${SSDS[#]}"
echo "raid array created"
# format
mkfs.ext4 -F /dev/md0
# mount, change /mnt/ssd-array to whatever
mkdir -p /mnt/ssd-array
mount /dev/md0 /mnt/ssd-array
chmod a+w /mnt/ssd-array
wait.sh: |
#!/bin/bash
while sudo fuser /var/{lib/{dpkg,apt/lists},cache/apt/archives}/lock >/dev/null 2>&1; do sleep 1; done
DeamonSet pod spec
spec:
hostPID: true
nodeSelector:
cloud.google.com/gke-local-ssd: "true"
volumes:
- name: setup-script
configMap:
name: local-ssds-setup
- name: host-mount
hostPath:
path: /tmp/setup
initContainers:
- name: local-ssds-init
image: marketplace.gcr.io/google/ubuntu1804
securityContext:
privileged: true
volumeMounts:
- name: setup-script
mountPath: /tmp
- name: host-mount
mountPath: /host
command:
- /bin/bash
- -c
- |
set -e
set -x
# Copy setup script to the host
cp /tmp/setup.sh /host
# Copy wait script to the host
cp /tmp/wait.sh /host
# Wait for updates to complete
/usr/bin/nsenter -m/proc/1/ns/mnt -- chmod u+x /tmp/setup/wait.sh
# Give execute priv to script
/usr/bin/nsenter -m/proc/1/ns/mnt -- chmod u+x /tmp/setup/setup.sh
# Wait for Node updates to complete
/usr/bin/nsenter -m/proc/1/ns/mnt /tmp/setup/wait.sh
# If the /tmp folder is mounted on the host then it can run the script
/usr/bin/nsenter -m/proc/1/ns/mnt /tmp/setup/setup.sh
containers:
- image: "gcr.io/google-containers/pause:2.0"
name: pause
For high performance use cases, use the Ephemeral storage on local SSDs GKE feature. All local SSDs will be configures as a (striped) raid0 array and mounted into the pod.
Quick summary:
Create the node pool or cluster with the option: --ephemeral-storage local-ssd-count=X
Schedule to nodes with cloud.google.com/gke-ephemeral-storage-local-ssd.
Add an emptyDir volume.
Mount it with volumeMounts.
Here's how I used it with a DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: myapp
labels:
app: myapp
spec:
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
nodeSelector:
cloud.google.com/gke-ephemeral-storage-local-ssd: "true"
volumes:
- name: localssd
emptyDir: {}
containers:
- name: myapp
image: <IMAGE>
volumeMounts:
- mountPath: /scratch
name: localssd
You can use DaemonSet yaml file to deploy the pod will run on startup, assuming already created a cluster with 2 local-SSD (this pod will be in charge of creating the Raid0 disk)
kind: DaemonSet
apiVersion: extensions/v1beta1
metadata:
name: ssd-startup-script
labels:
app: ssd-startup-script
spec:
template:
metadata:
labels:
app: ssd-startup-script
spec:
hostPID: true
containers:
- name: ssd-startup-script
image: gcr.io/google-containers/startup-script:v1
imagePullPolicy: Always
securityContext:
privileged: true
env:
- name: STARTUP_SCRIPT
value: |
#!/bin/bash
sudo curl -s https://get.docker.com/ | sh
echo Done
The pod that will have access to the disk array in the above example is “/mnt/disks/ssd-array”
apiVersion: v1
kind: Pod
metadata:
name: test-pod
spec:
containers:
- name: test-container
image: ubuntu
volumeMounts:
- mountPath: /mnt/disks/ssd-array
name: ssd-array
args:
- sleep
- "1000"
nodeSelector:
cloud.google.com/gke-local-ssd: "true"
tolerations:
- key: "local-ssd"
operator: "Exists"
effect: "NoSchedule"
volumes:
- name: ssd-array
hostPath:
path: /mnt/disks/ssd-array
After deploying the test-pod, SSH to the pod from your cloud-shell or any instance.
Then run :
kubectl exec -it test-pod -- /bin/bash
After that you should be able to see the created file in the ssd-array disk.
cat test-file.txt
I am struggling to find any answer to this in the Kubernetes documentation. The scenario is the following:
Kubernetes version 1.4 over AWS
8 pods running a NodeJS API (Express) deployed as a Kubernetes Deployment
One of the pods gets restarted for no apparent reason late at night (no traffic, no CPU spikes, no memory pressure, no alerts...). Number of restarts is increased as a result of this.
Logs don't show anything abnormal (ran kubectl -p to see previous logs, no errors at all in there)
Resource consumption is normal, cannot see any events about Kubernetes rescheduling the pod into another node or similar
Describing the pod gives back TERMINATED state, giving back COMPLETED reason and exit code 0. I don't have the exact output from kubectl as this pod has been replaced multiple times now.
The pods are NodeJS server instances, they cannot complete, they are always running waiting for requests.
Would this be internal Kubernetes rearranging of pods? Is there any way to know when this happens? Shouldn't be an event somewhere saying why it happened?
Update
This just happened in our prod environment. The result of describing the offending pod is:
api:
Container ID: docker://7a117ed92fe36a3d2f904a882eb72c79d7ce66efa1162774ab9f0bcd39558f31
Image: 1.0.5-RC1
Image ID: docker://sha256:XXXX
Ports: 9080/TCP, 9443/TCP
State: Running
Started: Mon, 27 Mar 2017 12:30:05 +0100
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 24 Mar 2017 13:32:14 +0000
Finished: Mon, 27 Mar 2017 12:29:58 +0100
Ready: True
Restart Count: 1
Update 2
Here it is the deployment.yaml file used:
apiVersion: "extensions/v1beta1"
kind: "Deployment"
metadata:
namespace: "${ENV}"
name: "${APP}${CANARY}"
labels:
component: "${APP}${CANARY}"
spec:
replicas: ${PODS}
minReadySeconds: 30
revisionHistoryLimit: 1
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
maxSurge: 1
template:
metadata:
labels:
component: "${APP}${CANARY}"
spec:
serviceAccount: "${APP}"
${IMAGE_PULL_SECRETS}
containers:
- name: "${APP}${CANARY}"
securityContext:
capabilities:
add:
- IPC_LOCK
image: "134078050561.dkr.ecr.eu-west-1.amazonaws.com/${APP}:${TAG}"
env:
- name: "KUBERNETES_CA_CERTIFICATE_FILE"
value: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
- name: "NAMESPACE"
valueFrom:
fieldRef:
fieldPath: "metadata.namespace"
- name: "ENV"
value: "${ENV}"
- name: "PORT"
value: "${INTERNAL_PORT}"
- name: "CACHE_POLICY"
value: "all"
- name: "SERVICE_ORIGIN"
value: "${SERVICE_ORIGIN}"
- name: "DEBUG"
value: "http,controllers:recommend"
- name: "APPDYNAMICS"
value: "true"
- name: "VERSION"
value: "${TAG}"
ports:
- name: "http"
containerPort: ${HTTP_INTERNAL_PORT}
protocol: "TCP"
- name: "https"
containerPort: ${HTTPS_INTERNAL_PORT}
protocol: "TCP"
The Dockerfile of the image referenced in the above Deployment manifest:
FROM ubuntu:14.04
ENV NVM_VERSION v0.31.1
ENV NODE_VERSION v6.2.0
ENV NVM_DIR /home/app/nvm
ENV NODE_PATH $NVM_DIR/v$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/v$NODE_VERSION/bin:$PATH
ENV APP_HOME /home/app
RUN useradd -c "App User" -d $APP_HOME -m app
RUN apt-get update; apt-get install -y curl
USER app
# Install nvm with node and npm
RUN touch $HOME/.bashrc; curl https://raw.githubusercontent.com/creationix/nvm/${NVM_VERSION}/install.sh | bash \
&& /bin/bash -c 'source $NVM_DIR/nvm.sh; nvm install $NODE_VERSION'
ENV NODE_PATH $NVM_DIR/versions/node/$NODE_VERSION/lib/node_modules
ENV PATH $NVM_DIR/versions/node/$NODE_VERSION/bin:$PATH
# Create app directory
WORKDIR /home/app
COPY . /home/app
# Install app dependencies
RUN npm install
EXPOSE 9080 9443
CMD [ "npm", "start" ]
npm start is an alias for a regular node app.js command that starts a NodeJS server on port 9080.
Check the version of docker you run, and whether the docker daemon was restarted during that time.
If the docker daemon was restarted, all the container would be terminated (unless you use the new "live restore" feature in 1.12). In some docker versions, docker may incorrectly reports "exit code 0" for all containers terminated in this situation. See https://github.com/docker/docker/issues/31262 for more details.
If this is still relevant, we just had a similar problem in our cluster.
We have managed to find more information by inspecting the logs from docker itself. ssh onto your k8s node and run the following:
sudo journalctl -fu docker.service
I hade similar problem when we upgraded to version 2.x Pos get restarted even after the Dags ran successfully.
I later resolved it after a long time of debugging by overriding the pod template and specifying it in the airflow.cfg file.
[kubernetes]
....
pod_template_file = {{ .Values.airflow.home }}/pod_template.yaml
---
# pod_template.yaml
apiVersion: v1
kind: Pod
metadata:
name: dummy-name
spec:
serviceAccountName: default
restartPolicy: Never
containers:
- name: base
image: dummy_image
imagePullPolicy: IfNotPresent
ports: []
command: []