Running a MongoDB StatefulSet on Kubernetes with Istio

I am trying to set up MongoDB on Kubernetes with Istio. My StatefulSet is as follows:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: treeservice
namespace: staging
spec:
serviceName: tree-service-service
replicas: 1
selector:
matchLabels:
app: treeservice
template:
metadata:
labels:
app: treeservice
spec:
containers:
- name: mongodb-cache
image: mongo:latest
imagePullPolicy: Always
ports:
- containerPort: 30010
volumeMounts:
- name: mongodb-cache-data
mountPath: /data/db
resources:
requests:
memory: "4Gi" # 4 GB
cpu: "1000m" # 1 CPUs
limits:
memory: "4Gi" # 4 GB
cpu: "1000" # 1 CPUs
readinessProbe:
exec:
command:
- mongo
- --port
- "30010"
- --eval
- db.stats()
initialDelaySeconds: 60 # wait this long after the container first starts
periodSeconds: 30 # poll every 30 seconds
timeoutSeconds: 60
livenessProbe:
exec:
command:
- mongo
- --port
- "30010"
- --eval
- db.stats()
initialDelaySeconds: 60 # wait this long after the container first starts
periodSeconds: 30 # poll every 30 seconds
timeoutSeconds: 60
command: ["/bin/bash"]
args: ["-c","mongod --port 30010 --replSet test"] #bind to localhost
volumeClaimTemplates:
- metadata:
name: mongodb-cache-data
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: fast
resources:
requests:
storage: 300Gi
However, the pod is not created and I see the following error:
kubectl describe statefulset treeservice -n staging
Warning FailedCreate 1m (x159 over 1h) statefulset-controller create Pod treeservice-0 in StatefulSet treeservice failed error: Pod "treeservice-0" is invalid: spec.containers[1].env[7].name: Invalid value: "ISTIO_META_statefulset.kubernetes.io/pod-name": a valid environment variable name must consist of alphabetic characters, digits, '_', '-', or '.', and must not start with a digit (e.g. 'my.env-name', or 'MY_ENV.NAME', or 'MyEnvName1', regex used for validation is '[-._a-zA-Z][-._a-zA-Z0-9]*')
I assume treeservice is a valid pod name. Am I missing something?

I guess it's due to this issue, which is still open: https://github.com/istio/istio/issues/9571
I made it work temporarily using the following annotation:
annotations:
sidecar.istio.io/inject: "false"
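For completeness, a minimal sketch of where that annotation sits in the StatefulSet above: it belongs on the pod template's metadata (not on the StatefulSet's own metadata), so that the injector skips the pods the controller generates.
spec:
  serviceName: tree-service-service
  template:
    metadata:
      labels:
        app: treeservice
      annotations:
        sidecar.istio.io/inject: "false"   # disables Istio sidecar injection for these pods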

Related

Setup a Single Solr exporter, for all instances of Solr Statefulsets running in different namespaces in a Kubernetes Cluster

In our project, we use the Solr exporter to fetch Solr metrics such as heap usage and send them to Prometheus. We have configured Alertmanager to fire an alert when Solr heap usage exceeds 80%. This is configured for every Solr instance.
Is there a way to configure a single, common Solr exporter that fetches Solr metrics from all instances across the different namespaces in the cluster? (This is to reduce the resource consumption in Kubernetes caused by running multiple Solr exporter instances.)
The YAML config for the Solr exporter is given below:
apiVersion: v1
kind: Service
metadata:
name: solr-exporter
labels:
app: solr-exporter
spec:
ports:
- port: 9983
name: client
selector:
app: solr
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: solr
name: solr-exporter
namespace: default
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
app: solr
pod: solr-exporter
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
creationTimestamp: null
labels:
app: solr
pod: solr-exporter
spec:
containers:
- command:
- /bin/bash
- -c
- /opt/solr/contrib/prometheus-exporter/bin/solr-exporter -p 9983 -z zk-cs:2181
-n 7 -f /opt/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
image: solr:8.1.1
imagePullPolicy: Always
livenessProbe:
failureThreshold: 3
httpGet:
path: /metrics
port: 9983
scheme: HTTP
initialDelaySeconds: 480
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 5
name: solr-exporter
ports:
- containerPort: 9983
protocol: TCP
readinessProbe:
failureThreshold: 2
httpGet:
path: /metrics
port: 9983
scheme: HTTP
initialDelaySeconds: 60
periodSeconds: 5
successThreshold: 1
timeoutSeconds: 5
resources:
limits:
cpu: 500m
requests:
cpu: 50m
memory: 64Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /opt/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
name: solr-exporter-config
readOnly: true
subPath: solr-exporter-config.xml
dnsPolicy: ClusterFirst
initContainers:
- command:
- sh
- -c
- |-
apt-get update
apt-get install curl -y
PROTOCOL="http://"
COUNTER=0;
while [ $COUNTER -lt 30 ]; do
curl -v -k -s --connect-timeout 10 "${PROTOCOL}solrcluster:8983/solr/admin/info/system" && exit 0
COUNTER=$((COUNTER+1));
sleep 2
done;
echo "Did NOT see a Running Solr instance after 60 secs!";
exit 1;
image: solr:8.1.1
imagePullPolicy: Always
name: mpi-demo-solr-exporter-init
resources: {}
securityContext:
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 420
name: solr-exporter-config
name: solr-exporter-config
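Not an authoritative answer, but one consolidation that works within Kubernetes DNS rules: a Service is reachable from any namespace via its fully qualified name, so the exporter Deployments can at least all live in a single monitoring namespace, one Deployment per SolrCloud, each pointing -z at the ZooKeeper of the cluster it scrapes. A sketch of the container command, assuming a hypothetical target namespace team-a with a ZooKeeper client service named zk-cs:
# Sketch: exporter running in a central monitoring namespace, scraping the
# SolrCloud that lives in the hypothetical namespace "team-a".
containers:
- command:
  - /bin/bash
  - -c
  - /opt/solr/contrib/prometheus-exporter/bin/solr-exporter -p 9983
    -z zk-cs.team-a.svc.cluster.local:2181
    -n 7 -f /opt/solr/contrib/prometheus-exporter/conf/solr-exporter-config.xml
  image: solr:8.1.1
  name: solr-exporter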

How to specify RollingUpdate settings for Celery Beat on Kubernetes

We have Celery Beat set up using the following deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
name: celery-beat
labels:
deployment: celery-beat
spec:
replicas: 1
minReadySeconds: 120
selector:
matchLabels:
app: celery-beat
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 0 # Would rather have downtime than an additional instance in service?
maxUnavailable: 1
template:
metadata:
labels:
app: celery-beat
spec:
containers:
- name: celery-beat
image: image_url
command: ["/var/app/scripts/kube_worker_beat_start.sh"]
imagePullPolicy: Always
ports:
- containerPort: 8000
name: http-server
livenessProbe: #From https://github.com/celery/celery/issues/4079#issuecomment-437415370
exec:
command:
- /bin/sh
- -c
- celery -A app_name status | grep ".*OK"
initialDelaySeconds: 3600
periodSeconds: 3600
readinessProbe:
exec:
command:
- /bin/sh
- -c
- celery -A app_name status | grep ".*OK"
initialDelaySeconds: 60
periodSeconds: 30
resources:
limits:
cpu: "0.5" #500mcores - only really required on install
requests:
cpu: "30m"
I have found the RollingUpdate settings tricky because with Celery Beat you really don't want two instances, otherwise you might get duplicate tasks being completed. This is super important for us to avoid since we're using it to send out push notifications.
With the current settings, when a deployment rolls out there are 3-5 minutes of downtime, because the existing instance is terminated immediately and we have to wait for the new one to set itself up.
Is there a better way of configuring this to reduce the downtime while ensuring that at most one instance is ever in service?
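For what it's worth, a hedged sketch of one way to shrink that window: switch to the Recreate strategy (similar in effect to maxSurge: 0 with a single replica, but explicit about never overlapping two beats) and move the fixed 60-second wait out of initialDelaySeconds into a startupProbe, which needs Kubernetes 1.16 or newer. Readiness is then reported as soon as the new pod actually responds rather than after a fixed delay. This is a fragment only (selector and template labels as in the deployment above are omitted), and the probe values are illustrative, not tuned:
spec:
  replicas: 1
  strategy:
    type: Recreate                 # the old beat pod is fully terminated before the new one is created
  template:
    spec:
      containers:
      - name: celery-beat
        image: image_url
        command: ["/var/app/scripts/kube_worker_beat_start.sh"]
        startupProbe:              # absorbs a slow start without delaying readiness once up
          exec:
            command: ["/bin/sh", "-c", "celery -A app_name status | grep -q OK"]
          periodSeconds: 10
          failureThreshold: 30     # up to ~5 minutes of startup budget
        readinessProbe:
          exec:
            command: ["/bin/sh", "-c", "celery -A app_name status | grep -q OK"]
          periodSeconds: 30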

replicaset connections requires

Currently I'm trying to set up the Countly application on Kubernetes.
I have a problem that I can't understand; maybe it's because I've been at it too long and don't have enough distance.
When my Countly pods (frontend & api) start, I get the error below:
ERROR [db:read] Error reading apps {"name":"find","args":[{}]} MongoError: seed list
contains no mongos proxies, replicaset connections requires the parameter replicaSet to be supplied in the URI or options object, mongodb://server:port/db?replicaSet=name {"name":"MongoError"}
No apps to upgrade
I checked on the MongoDB website how to set up the replica set connection string.
At the bottom I have put the output of kubectl get all so you can see the service and pod names.
I've tried a lot of different ways, but every time I've made a mistake.
Deployment.yaml :
# Source: countly-chart/templates/frontend.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: RELEASE-NAME-frontend
labels:
name: RELEASE-NAME-frontend
app: RELEASE-NAME
product: dgt
environment: dev
version: HEAD
component: frontend
spec:
replicas: 1
revisionHistoryLimit: 1
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
name: RELEASE-NAME-frontend
app: RELEASE-NAME
product: dgt
environment: dev
version: HEAD
component: frontend
template:
metadata:
labels:
name: RELEASE-NAME-frontend
app: RELEASE-NAME
product: dgt
environment: dev
version: HEAD
component: frontend
spec:
affinity:
podAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: app
operator: In
values:
- mongodb
- mongodb
topologyKey: "kubernetes.io/hostname"
terminationGracePeriodSeconds: 10
containers:
- name: countly-frontend
image: countly/frontend:19.08.1
resources:
limits:
cpu: 100m
memory: 512Mi
requests:
cpu: 100m
memory: 512Mi
imagePullPolicy: Always
ports:
- containerPort: 6001
readinessProbe:
httpGet:
path: /ping
port: 6001
initialDelaySeconds: 60
periodSeconds: 10
timeoutSeconds: 3
env:
- name: COUNTLY_PLUGINS
value: "mobile,web,desktop,density,locale,browser,sources,views,enterpriseinfo,logger,systemlogs,errorlogs,populator,reports,crashes,push,star-rating,slipping-away-users,compare,server-stats,dbviewer,assistant,plugin-upload,times-of-day,compliance-hub,video-intelligence-monetization,alerts,onboarding"
- name: COUNTLY_CONFIG_API_FILESTORAGE
value: "gridfs"
- name: COUNTLY_CONFIG_API_MONGODB
value: "mongodb://mongodb-0.mongodb:27017,mongodb-1.mongodb:27017/countly?replicaSet=rs0"
- name: COUNTLY_CONFIG_FRONTEND_MONGODB
value: "mongodb://mongodb-0.mongodb:27017,mongodb-1.mongodb:27017/countly?replicaSet=rs0"
- name: COUNTLY_CONFIG_HOSTNAME
value: dgt-iointc-countly.xxx.xx
Service.yaml :
# Source: countly-chart/templates/frontend.yaml
apiVersion: v1
kind: Service
metadata:
name: RELEASE-NAME-frontend
labels:
app: RELEASE-NAME
product: dgt
environment: dev
version: HEAD
component: frontend
spec:
type: NodePort
ports:
- port: 6001
targetPort: 6001
protocol: TCP
selector:
app: RELEASE-NAME
component: frontend
StatefulSet.yaml :
apiVersion: apps/v1
kind: StatefulSet
metadata:
creationTimestamp: "2020-04-09T15:06:27Z"
generation: 7
labels:
app: mongodb
component: mongo
environment: iointc
product: mongodb
name: mongodb
namespace: dgt-iointc
resourceVersion: "510868664"
selfLink: /apis/apps/v1/namespaces/dgt-iointc/statefulsets/mongodb
uid: af9eb929-7a73-11ea-a219-0679473f89f6
spec:
podManagementPolicy: OrderedReady
replicas: 2
revisionHistoryLimit: 1
selector:
matchLabels:
app: mongodb
component: mongo
serviceName: mongodb
template:
metadata:
creationTimestamp: null
labels:
app: mongodb
component: mongo
environment: iointc
product: mongodb
spec:
containers:
- command:
- mongod
- --replSet
- rs0
- --bind_ip_all
env:
- name: MONGODB_REPLICA_SET_NAME
value: rs0
image: docker.xxx.xx/run/mongo:3.6.12
imagePullPolicy: IfNotPresent
livenessProbe:
exec:
command:
- mongo
- --eval
- db.adminCommand('ping')
failureThreshold: 3
initialDelaySeconds: 30
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: mongo
ports:
- containerPort: 27017
protocol: TCP
readinessProbe:
exec:
command:
- mongo
- --eval
- db.adminCommand('ping')
failureThreshold: 3
initialDelaySeconds: 5
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 1
resources:
limits:
cpu: 500m
memory: 1000Mi
requests:
cpu: 50m
memory: 1000Mi
securityContext:
runAsUser: 999
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /data/db
name: mongo-storage
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 10
updateStrategy:
type: RollingUpdate
volumeClaimTemplates:
- metadata:
creationTimestamp: null
name: mongo-storage
spec:
accessModes:
- ReadWriteOnce
dataSource: null
resources:
requests:
storage: 600Mi
storageClassName: nfs-external
status:
phase: Pending
kubectl get all :
NAME READY STATUS RESTARTS AGE
pod/countly-api-7b878d6fc5-ktlmg 0/1 Running 17 46m
pod/countly-frontend-6c4fc86c96-g47hx 0/1 Running 0 39m
pod/mongodb-0 1/1 Running 0 5h19m
pod/mongodb-1 1/1 Running 0 5h19m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/countly-api NodePort 10.80.11.164 <none> 3001:32612/TCP 37h
service/countly-frontend NodePort 10.80.104.62 <none> 6001:31176/TCP 37h
service/mongodb ClusterIP None <none> 27017/TCP 18h
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
deployment.apps/countly-api 1 1 1 0 12h
deployment.apps/countly-frontend 1 1 1 0 12h
NAME DESIRED CURRENT READY AGE
replicaset.apps/countly-api-7b878d6fc5 1 1 0 46m
replicaset.apps/countly-frontend-6c4fc86c96 1 1 0 39m
NAME DESIRED CURRENT AGE
statefulset.apps/mongodb 2 2 18h
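A hedged observation rather than a confirmed fix: this particular MongoError comes from the Node.js driver when it is handed multiple seed hosts but the replicaSet option never reaches it, which may mean the ?replicaSet=rs0 query string is being dropped somewhere between the env var and the driver. It is also worth confirming that the hosts resolve from the Countly pods: mongodb-0.mongodb and mongodb-1.mongodb only resolve through the headless service in the mongodb namespace (dgt-iointc above), so if the Countly Deployments run in a different namespace they need the fully qualified form. A sketch of the env entries with fully qualified hosts, offered as something to try, not as the confirmed Countly configuration:
- name: COUNTLY_CONFIG_API_MONGODB
  value: "mongodb://mongodb-0.mongodb.dgt-iointc.svc.cluster.local:27017,mongodb-1.mongodb.dgt-iointc.svc.cluster.local:27017/countly?replicaSet=rs0"
- name: COUNTLY_CONFIG_FRONTEND_MONGODB
  value: "mongodb://mongodb-0.mongodb.dgt-iointc.svc.cluster.local:27017,mongodb-1.mongodb.dgt-iointc.svc.cluster.local:27017/countly?replicaSet=rs0"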

Failed to attach volume ... already being used by

I am running Kubernetes in a GKE cluster using version 1.6.6 and another cluster with 1.6.4. Both are experiencing issues with failing over GCE compute disks.
I have been simulating failures using kill 1 inside the container or killing the GCE node directly. Sometimes I get lucky and the pod will get created on the same node again. But most of the time this isn't the case.
Looking at the event log, it shows the mount error three times and then it fails to do anything more. Without human intervention it never corrects itself. I am forced to kill the pod multiple times until it works. During maintenance this is a giant pain.
How do I get Kubernetes to fail over with volumes properly? Is there a way to tell the deployment to try a new node on failure? Is there a way to remove the three-retry limit?
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: dev-postgres
namespace: jolene
spec:
revisionHistoryLimit: 0
template:
metadata:
labels:
app: dev-postgres
namespace: jolene
spec:
containers:
- image: postgres:9.6-alpine
imagePullPolicy: IfNotPresent
name: dev-postgres
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgres-data
env:
[ ** Removed, irrelevant environment variables ** ]
ports:
- containerPort: 5432
livenessProbe:
exec:
command:
- sh
- -c
- exec pg_isready
initialDelaySeconds: 30
timeoutSeconds: 5
failureThreshold: 6
readinessProbe:
exec:
command:
- sh
- -c
- exec pg_isready --host $POD_IP
initialDelaySeconds: 5
timeoutSeconds: 3
periodSeconds: 5
volumes:
- name: postgres-data
persistentVolumeClaim:
claimName: dev-jolene-postgres
I have tried this with and without PersistentVolume / PersistentVolumeClaim.
apiVersion: "v1"
kind: "PersistentVolume"
metadata:
name: dev-jolene-postgres
spec:
capacity:
storage: "1Gi"
accessModes:
- "ReadWriteOnce"
claimRef:
namespace: jolene
name: dev-jolene-postgres
gcePersistentDisk:
fsType: "ext4"
pdName: "dev-jolene-postgres"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: dev-jolene-postgres
namespace: jolene
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
By default, every node is schedulable, so there is no need to mention it explicitly in the deployment. The feature for configuring retry limits is still in progress; it can be tracked here: https://github.com/kubernetes/kubernetes/issues/16652
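One additional hedged note, not part of the answer above: with a ReadWriteOnce GCE PD, the Deployment's default RollingUpdate strategy starts the replacement pod while the old pod may still hold the disk, often on another node, which is one common way to hit "already being used by" during updates and maintenance. A Recreate strategy sidesteps that particular race (it does not help when a node dies while still holding the disk):
spec:
  strategy:
    type: Recreate   # the old pod releases the disk before the replacement pod is created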

How to run command after initialization

I would like to run a specific command after the deployment has initialized successfully.
This is my yaml file:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: auth
spec:
replicas: 1
template:
metadata:
labels:
app: auth
spec:
containers:
- name: auth
image: {{my-service-image}}
env:
- name: NODE_ENV
value: "docker-dev"
resources:
requests:
cpu: 100m
memory: 100Mi
ports:
- containerPort: 3000
However, I would like to run a command for the DB migration after (not before) the deployment is successfully initialized and the pods are running.
I can do it manually for every pod (with kubectl exec), but this is not very scalable.
I resolved it using a lifecycle postStart hook:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: auth
spec:
replicas: 1
template:
metadata:
labels:
app: auth
spec:
containers:
- name: auth
image: {{my-service-image}}
env:
- name: NODE_ENV
value: "docker-dev"
resources:
requests:
cpu: 100m
memory: 100Mi
ports:
- containerPort: 3000
lifecycle:
postStart:
exec:
command: ["/bin/sh", "-c", {{cmd}}]
You can use Helm to deploy a set of Kubernetes resources, and then use a Helm hook, e.g. post-install or post-upgrade, to run a Job in a separate Docker container. Set that Job to invoke the DB migration. A Job runs one or more Pods to completion, so it fits here quite well.
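A minimal sketch of that hook-based approach, reusing the {{my-service-image}} placeholder from the question; the Job name and migration command are hypothetical:
apiVersion: batch/v1
kind: Job
metadata:
  name: auth-db-migrate
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": hook-succeeded   # clean up the Job once it finishes successfully
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: db-migrate
        image: {{my-service-image}}
        command: ["/bin/sh", "-c", "npm run migrate"]   # hypothetical migration command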
I chose to use a readinessProbe instead.
My application requires configuration after the process has completely started, and the postStart command was running before the app was ready.
readinessProbe:
exec:
command: [healthcheck]
initialDelaySeconds: 30
periodSeconds: 2
timeoutSeconds: 1
successThreshold: 3
failureThreshold: 10