Why does my GKE node pool not auto-scale down?

I've got a preemptible node pool which is clearly under-utilized:
The node pool hosts a deployment with HPA with the following setup:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
  labels:
    app: backend
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      initContainers:
      - name: wait-for-database
        image: ### IMAGE ###
        command: ['bash', 'init.sh']
      containers:
      - name: backend
        image: ### IMAGE ###
        command: ["bash", "entrypoint.sh"]
        imagePullPolicy: Always
        resources:
          requests:
            memory: "200M"
            cpu: "50m"
        ports:
        - name: probe-port
          containerPort: 8080
          hostPort: 8080
        volumeMounts:
        - name: static-shared-data
          mountPath: /static
        readinessProbe:
          httpGet:
            path: /readiness/
            port: probe-port
          failureThreshold: 5
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 5
      - name: nginx
        image: nginx:alpine
        resources:
          requests:
            memory: "400M"
            cpu: "20m"
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-proxy-config
          mountPath: /etc/nginx/conf.d/default.conf
          subPath: app.conf
        - name: static-shared-data
          mountPath: /static
      volumes:
      - name: nginx-proxy-config
        configMap:
          name: backend-nginx
      - name: static-shared-data
        emptyDir: {}
      nodeSelector:
        cloud.google.com/gke-nodepool: app-dev
      tolerations:
      - effect: NoSchedule
        key: workload
        operator: Equal
        value: dev
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: backend
  namespace: default
spec:
  maxReplicas: 12
  minReplicas: 8
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: backend
  metrics:
  - resource:
      name: cpu
      targetAverageUtilization: 50
    type: Resource
---
The node pool also has the toleration label.
The HPA utilization shows this:
NAME              REFERENCE                    TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
backend-develop   Deployment/backend-develop   10%/50%   8         12        8          38d
But the node pool has not scaled down for about a day, even though there is no heavy load on this deployment:
NAME                            STATUS   ROLES    AGE     VERSION
gke-dev-app-dev-fee1a901-fvw9   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-gls7   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-lf3f   Ready    <none>   24h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-lgw9   Ready    <none>   3d10h   v1.14.10-gke.36
gke-dev-app-dev-fee1a901-qxkz   Ready    <none>   3h35m   v1.14.10-gke.36
gke-dev-app-dev-fee1a901-s10l   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-sj4d   Ready    <none>   22h     v1.14.10-gke.36
gke-dev-app-dev-fee1a901-vdnw   Ready    <none>   27h     v1.14.10-gke.36
There are no affinity settings for this deployment or node pool. Some of the nodes easily pack several identical pods, but others hold just one pod for hours, and no scale-down happens.
What could be wrong?

The issue was:
hostPort: 8080
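This led to FailedScheduling events ("didn't have free ports"): a given hostPort can be bound by only one pod per node, so the replicas could not be packed onto fewer nodes.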
That's why the nodes were kept online.
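A minimal sketch of the fix, assuming the application does not need to be reachable directly on the node's own port 8080: drop hostPort from the backend container and keep only containerPort. Only the ports section of the backend container is shown; the rest of the Deployment is assumed to stay as posted in the question.

```yaml
# Fragment of the backend container spec (sketch, not the full manifest).
ports:
- name: probe-port
  containerPort: 8080
  # hostPort: 8080 removed, so pods no longer compete for port 8080 on each node
```

If a fixed external port is still required, exposing the pods through a Service (for example of type NodePort) is the usual alternative to hostPort.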

Kubernetes : RabbitMQ pod is spammed with connections from kube-system

I'm currently learning Kubernetes and all its quirks.
I'm using a RabbitMQ Deployment, Service and pod in my cluster to exchange messages between apps in the cluster. However, I saw an abnormal number of RabbitMQ pod restarts.
After installing Prometheus and Grafana to investigate, I saw that the RabbitMQ pod consumes more and more memory and CPU until it gets killed by the OOM killer every two hours or so. The graph looks like this:
Graph of CPU consumption in my cluster (rabbitmq in red)
After that I looked into the RabbitMQ management UI and saw that an app in my cluster (IP 10.224.0.5) was constantly creating new connections. This IP corresponds to my kube-system pods and my Prometheus node exporter, as shown by the following output:
k get all -A -o wide | grep 10.224.0.5
E1223 12:13:48.231908 23198 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E1223 12:13:48.311831 23198 memcache.go:255] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
kube-system pod/azure-ip-masq-agent-xh9jk 1/1 Running 0 25d 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/cloud-node-manager-h5ff5 1/1 Running 0 25d 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/csi-azuredisk-node-sf8sn 3/3 Running 0 3d15h 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/csi-azurefile-node-97nbt 3/3 Running 0 19d 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
kube-system pod/kube-proxy-2s5tn 1/1 Running 0 3d15h 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
monitoring pod/prometheus-prometheus-node-exporter-dztwx 1/1 Running 0 20h 10.224.0.5 aks-agentpool-37892177-vmss000001 <none> <none>
Also, I noticed that these connections seem to be blocked by RabbitMQ, as the field connection.blocked in the client properties is set to true, as shown in the following image:
Print screen of a connection details from rabbitMQ pod's UI
I saw in the documentation that RabbitMQ starts to block connections when it runs low on resources, but I set the CPU and memory limits to 1 CPU and 1 GiB of RAM, and the connections are blocked from the start anyway.
On the cluster, I'm also using KEDA, which polls the RabbitMQ pod every second to see if there are any messages in a queue (I set pollingInterval to 1 in the YAML). But as I said earlier, it's not KEDA that's creating all the connections, it's kube-system. Unless KEDA uses one of the components listed in the output above to poll RabbitMQ, and KEDA's polling interval is not expressed in seconds (which is highly unlikely, as the docs state that this polling interval is given in seconds), I have no idea what's going on with all these connections.
The following section contains the YAML of all the components that might be involved in this problem (KEDA and RabbitMQ):
rabbitMQ Replica Count.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  labels:
    component: rabbitmq
  name: rabbitmq-controller
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: taskQueue
        component: rabbitmq
    spec:
      containers:
      - image: rabbitmq:3.11.5-management
        name: rabbitmq
        ports:
        - containerPort: 5672
          name: amqp
        - containerPort: 15672
          name: http
        resources:
          limits:
            cpu: 1
            memory: 1Gi
rabbitMQ Service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    component: rabbitmq
  name: rabbitmq-service
spec:
  type: LoadBalancer
  ports:
  - port: 5672
    targetPort: 5672
    name: amqp
  - port: 15672
    targetPort: 15672
    name: http
  selector:
    app: taskQueue
    component: rabbitmq
KEDA ScaledJob, Secret and TriggerAuthentication (sample data is just a replacement for fields that I do not want to reveal :) ):
apiVersion: v1
kind: Secret
metadata:
  name: keda-rabbitmq-secret
data:
  host: sample-host # base64 encoded value of format amqp://guest:password@localhost:5672/vhost
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: keda-trigger-auth-rabbitmq-conn
  namespace: default
spec:
  secretTargetRef:
  - parameter: host
    name: keda-rabbitmq-secret
    key: host
---
apiVersion: keda.sh/v1alpha1
kind: ScaledJob
metadata:
  name: builder-job-scaler
  namespace: default
spec:
  jobTargetRef:
    parallelism: 1
    completions: 1
    activeDeadlineSeconds: 600
    backoffLimit: 5
    template:
      spec:
        volumes:
        - name: shared-storage
          emptyDir: {}
        initContainers:
        - name: sourcesfetcher
          image: sample image
          volumeMounts:
          - name: shared-storage
            mountPath: /mnt/shared
          env:
          - name: SHARED_STORAGE_MOUNT_POINT
            value: /mnt/shared
          - name: RABBITMQ_ENDPOINT
            value: sample host
          - name: RABBITMQ_QUEUE_NAME
            value: buildOrders
        containers:
        - name: builder
          image: sample image
          volumeMounts:
          - name: shared-storage
            mountPath: /mnt/shared
          env:
          - name: SHARED_STORAGE_MOUNT_POINT
            value: /mnt/shared
          - name: MINIO_ENDPOINT
            value: sample endpoint
          - name: MINIO_PORT
            value: sample port
          - name: MINIO_USESSL
            value: "false"
          - name: MINIO_ROOT_USER
            value: sample user
          - name: MINIO_ROOT_PASSWORD
            value: sample password
          - name: BUCKET_NAME
            value: "hex"
          - name: SERVER_NAME
            value: sample url
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: 500m
              memory: 512Mi
        restartPolicy: OnFailure
  pollingInterval: 1
  maxReplicaCount: 2
  minReplicaCount: 0
  rollout:
    strategy: gradual
  triggers:
  - type: rabbitmq
    metadata:
      protocol: amqp
      queueName: buildOrders
      mode: QueueLength
      value: "1"
    authenticationRef:
      name: keda-trigger-auth-rabbitmq-conn
Any help would be very much appreciated!

Kubernetes and mongo, PV, PVC

Hi, just a newbie question.
I managed(?) to implement a PV and PVC for MongoDB. I'm using a local PV, not one in the cloud.
Is there a way to keep the data after a container restart when k8s runs on my PC?
I'm not sure I got this right, but what I need is for the Mongo data to survive a restart. What is the best way? (No MongoDB Atlas.)
UPDATE:
I managed to make the tickets service DB work great, but I have 2 other services where it just won't work! I updated the YAML files so you can see the current state. The auth-mongo is just the same as tickets-mongo, so why won't it work?
The tickets-mongo-depl YAML file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tickets-mongo-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tickets-mongo
  template:
    metadata:
      labels:
        app: tickets-mongo
    spec:
      containers:
      - name: tickets-mongo
        image: mongo
        args: ["--dbpath", "data/auth"]
        livenessProbe:
          exec:
            command:
            - mongo
            - --disableImplicitSessions
            - --eval
            - "db.adminCommand('ping')"
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 6
        volumeMounts:
        - mountPath: /data/auth
          name: tickets-data
      volumes:
      - name: tickets-data
        persistentVolumeClaim:
          claimName: tickets-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: tickets-mongo-srv
spec:
  selector:
    app: tickets-mongo
  ports:
  - name: db
    protocol: TCP
    port: 27017
    targetPort: 27017
auth-mongo-depl.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: auth-mongo-depl
spec:
  replicas: 1
  selector:
    matchLabels:
      app: auth-mongo
  template:
    metadata:
      labels:
        app: auth-mongo
    spec:
      containers:
      - name: auth-mongo
        image: mongo
        args: ["--dbpath", "data/db"]
        livenessProbe:
          exec:
            command:
            - mongo
            - --disableImplicitSessions
            - --eval
            - "db.adminCommand('ping')"
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 6
        volumeMounts:
        - mountPath: /data/db
          name: auth-data
      volumes:
      - name: auth-data
        persistentVolumeClaim:
          claimName: auth-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: auth-mongo-srv
spec:
  selector:
    app: auth-mongo
  ports:
  - name: db
    protocol: TCP
    port: 27017
    targetPort: 27017
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE
pv-auth      1Gi        RWO            Retain           Bound    default/auth-pvc      auth                    78m
pv-orders    1Gi        RWO            Retain           Bound    default/orders-pvc    orders                  78m
pv-tickets   1Gi        RWO            Retain           Bound    default/tickets-pvc   tickets                 78m
I'm using mongo containers with tickets, orders, and auth services.
Just adding some info to make it clear.
NAME                                     READY   STATUS    RESTARTS   AGE
auth-depl-66c5d54988-ffhwc               1/1     Running   0          36m
auth-mongo-depl-594b98fcc5-k9hj8         1/1     Running   0          36m
client-depl-787cf6c7c6-xxks9             1/1     Running   0          36m
expiration-depl-864d846445-b95sh         1/1     Running   0          36m
expiration-redis-depl-64bd9fdb95-sg7fc   1/1     Running   0          36m
nats-depl-7d6c7dc46-m6mcg                1/1     Running   0          36m
orders-depl-5478cf4dfd-zmngj             1/1     Running   0          36m
orders-mongo-depl-5f974847d7-bz9s4       1/1     Running   0          36m
payments-depl-78f85d94fd-4zs55           1/1     Running   0          36m
payments-mongo-depl-5d5c47494b-7zjrl     1/1     Running   0          36m
tickets-depl-84d59fd47c-cs4k5            1/1     Running   0          36m
tickets-mongo-depl-66798d9874-cfbqb      1/1     Running   0          36m
Example PV:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-tickets
  labels:
    type: local
spec:
  storageClassName: tickets
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/tmp"
All I had to do was change the hostPath path in each PV. Using the same path for several PVs makes the app fail.
pv1:
hostPath:
  path: "/path/x1"
pv2:
hostPath:
  path: "/path/x2"
Like so; the paths just must not be the same.
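As a sketch of what this could look like in full, here are two PersistentVolumes that differ in their hostPath. The names pv-tickets and pv-auth and the storage classes come from the listing above; the paths /path/x1 and /path/x2 are the placeholders from the answer, not real directories. The likely reason the shared path fails is that every mongod instance then writes its data files into the same host directory.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-tickets
  labels:
    type: local
spec:
  storageClassName: tickets
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/path/x1"   # placeholder; unique directory for the tickets database
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-auth
  labels:
    type: local
spec:
  storageClassName: auth
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/path/x2"   # placeholder; must differ from the path used by pv-tickets
```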

ConfigMaps Not Found while Deploying an Application to Kubernetes Cluster

I am trying to deploy an app to a Kubernetes cluster. My deployment uses three configMaps as volumeMounts.
However when I apply the deployment it can't seem to find the configMaps.
My deployment.yml looks like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: dev-space
  name: my-app-dev
spec:
  replicas: 1
  revisionHistoryLimit: 1
  selector:
    matchLabels:
      name: my-app-dev
  strategy:
    rollingUpdate:
      maxSurge: 100%
      maxUnavailable: 30%
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: my-app-dev
        version: v1
      annotations:
        sla: high
        tier: application
        role: frontend-api
        quality: dev
    spec:
      containers:
      - name: my-app
        env:
        - name: ENVIRONMENT
          value: dev
        - name: SAMPLE_FILE
          value: sample.yml
        - name: SAMPLE_FILE2
          value: sample2.yml
        image: my-app:1.0
        ports:
        - containerPort: 8000
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: 9000
          initialDelaySeconds: 11
          timeoutSeconds: 3
        readinessProbe:
          httpGet:
            path: /health
            port: 9000
          initialDelaySeconds: 11
          timeoutSeconds: 3
        volumeMounts:
        - name: sample-volume
          mountPath: /path
          readOnly: true
        - name: sample-volume1
          mountPath: /path1
          readOnly: true
        - name: sample-volume2
          mountPath: /path2
          readOnly: true
      nodeSelector:
        tier: app
      imagePullSecrets:
      - name: img-secret
      volumes:
      - name: "sample-volume"
        configMap:
          name: "sample-volume-dev-my-app"
      - name: "sample-volume1"
        configMap:
          name: "sample-volume1-dev-my-app"
      - name: "sample-volume2"
        configMap:
          name: "sample-volume2-dev-my-app"
When I apply the deployment I get the following errors:
Warning FailedMount 4m (x6 over 5m) kubelet, server.org.local MountVolume.SetUp failed for volume "sample-volume" : configmaps "sample-volume-dev-my-app" not found
Warning FailedMount 4m (x6 over 5m) kubelet, server.org.local MountVolume.SetUp failed for volume "sample-volume1" : configmaps "sample-volume1-dev-my-app" not found
Warning FailedMount 4m (x6 over 5m) kubelet, server.org.local MountVolume.SetUp failed for volume "sample-volume2" : configmaps "sample-volume2-dev-my-app" not found
Is there something wrong with my configuration? What could be the issue?
You either have not created the config maps or you have created them in a different namespace than where you are deploying the application.
kubectl get cm -A
The above command lists all config maps in all namespaces. Check whether a config map named sample-volume-dev-my-app exists and, if so, in which namespace.
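If the config map turns out to be missing, a minimal sketch of one of the three could look like the following. The name and the dev-space namespace are taken from the question; the data key sample.yml and its content are assumptions for illustration only, since the question does not show what the files contain.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sample-volume-dev-my-app
  namespace: dev-space        # must be the same namespace as the Deployment
data:
  # Hypothetical key and content; each key becomes a file under the mountPath.
  sample.yml: |
    example: value
```

A pod can only mount config maps from its own namespace, so the namespace field here matters as much as the name.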

HPA showing unknown in k8s

I configured HPA using a command as shown below
kubectl autoscale deployment isamruntime-v1 --cpu-percent=20 --min=1 --max=3 --namespace=default
horizontalpodautoscaler.autoscaling/isamruntime-v1 autoscaled
However, the HPA cannot identify the CPU load.
pranam#UNKNOWN kubernetes % kubectl get hpa
NAME             REFERENCE                   TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
isamruntime-v1   Deployment/isamruntime-v1   <unknown>/20%   1         3         0          3s
I read a number of articles which suggested installing metrics server. So, I did that.
pranam#UNKNOWN kubernetes % kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.6/components.yaml
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator configured
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io configured
serviceaccount/metrics-server configured
deployment.apps/metrics-server configured
service/metrics-server configured
clusterrole.rbac.authorization.k8s.io/system:metrics-server configured
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server configured
I can see the metrics server.
pranam#UNKNOWN kubernetes % kubectl get pods -o wide --namespace=kube-system
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
calico-kube-controllers-7d88b45844-lz8zw 1/1 Running 0 22d 10.164.27.28 10.164.27.28 <none> <none>
calico-node-bsx6p 1/1 Running 0 8d 10.164.27.39 10.164.27.39 <none> <none>
calico-node-g229m 1/1 Running 0 8d 10.164.27.46 10.164.27.46 <none> <none>
calico-node-slwrh 1/1 Running 0 22d 10.164.27.28 10.164.27.28 <none> <none>
calico-node-tztjg 1/1 Running 0 8d 10.164.27.44 10.164.27.44 <none> <none>
coredns-7d6bb98ccc-d8nrs 1/1 Running 0 25d 172.30.93.205 10.164.27.28 <none> <none>
coredns-7d6bb98ccc-n28dm 1/1 Running 0 25d 172.30.93.204 10.164.27.28 <none> <none>
coredns-7d6bb98ccc-zx5jx 1/1 Running 0 25d 172.30.93.197 10.164.27.28 <none> <none>
coredns-autoscaler-848db65fc6-lnfvf 1/1 Running 0 25d 172.30.93.201 10.164.27.28 <none> <none>
dashboard-metrics-scraper-576c46d9bd-k6z85 1/1 Running 0 25d 172.30.93.195 10.164.27.28 <none> <none>
ibm-file-plugin-7c57965855-494bz 1/1 Running 0 22d 172.30.93.216 10.164.27.28 <none> <none>
ibm-iks-cluster-autoscaler-7df84fb95c-fhtgv 1/1 Running 0 2d23h 172.30.137.98 10.164.27.46 <none> <none>
ibm-keepalived-watcher-9w4gb 1/1 Running 0 8d 10.164.27.39 10.164.27.39 <none> <none>
ibm-keepalived-watcher-ps5zm 1/1 Running 0 8d 10.164.27.46 10.164.27.46 <none> <none>
ibm-keepalived-watcher-rzxbs 1/1 Running 0 8d 10.164.27.44 10.164.27.44 <none> <none>
ibm-keepalived-watcher-w6mxb 1/1 Running 0 25d 10.164.27.28 10.164.27.28 <none> <none>
ibm-master-proxy-static-10.164.27.28 2/2 Running 0 25d 10.164.27.28 10.164.27.28 <none> <none>
ibm-master-proxy-static-10.164.27.39 2/2 Running 0 8d 10.164.27.39 10.164.27.39 <none> <none>
ibm-master-proxy-static-10.164.27.44 2/2 Running 0 8d 10.164.27.44 10.164.27.44 <none> <none>
ibm-master-proxy-static-10.164.27.46 2/2 Running 0 8d 10.164.27.46 10.164.27.46 <none> <none>
ibm-storage-watcher-67466b969f-ps55m 1/1 Running 0 22d 172.30.93.217 10.164.27.28 <none> <none>
kubernetes-dashboard-c6b4b9d77-27zwb 1/1 Running 2 22d 172.30.93.218 10.164.27.28 <none> <none>
metrics-server-79d847cf58-6frsf 2/2 Running 0 3m23s 172.30.93.226 10.164.27.28 <none> <none>
public-crbro6um6l04jalpqrsl5g-alb1-8465f75bb4-88vl5 4/4 Running 0 11h 172.30.93.225 10.164.27.28 <none> <none>
public-crbro6um6l04jalpqrsl5g-alb1-8465f75bb4-vx68d 4/4 Running 0 11h 172.30.137.104 10.164.27.46 <none> <none>
vpn-58b48cdc7c-4lp9c 1/1 Running 0 25d 172.30.93.193 10.164.27.28 <none> <none>
I am using Istio and sysdig. Not sure if that breaks anything. My k8s versions are shown below.
pranam#UNKNOWN kubernetes % kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.7", GitCommit:"be3d344ed06bff7a4fc60656200a93c74f31f9a4", GitTreeState:"clean", BuildDate:"2020-02-11T19:34:02Z", GoVersion:"go1.13.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.7+IKS", GitCommit:"3305158dfe9ee1f89f596ef260135dcba881848c", GitTreeState:"clean", BuildDate:"2020-06-17T18:32:22Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
My YAML file is
#Assumes create-docker-store-secret.sh used to create dockerlogin secret
#Assumes create-secrets.sh used to create key file, sam admin, and cfgsvc secrets
apiVersion: storage.k8s.io/v1beta1
# Create StorageClass with gidallocate=true to allow non-root user access to mount
# This is used by PostgreSQL container
kind: StorageClass
metadata:
name: ibmc-file-bronze-gid
labels:
kubernetes.io/cluster-service: "true"
provisioner: ibm.io/ibmc-file
parameters:
type: "Endurance"
iopsPerGB: "2"
sizeRange: "[1-12000]Gi"
mountOptions: nfsvers=4.1,hard
billingType: "hourly"
reclaimPolicy: "Delete"
classVersion: "2"
gidAllocate: "true"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ldaplib
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ldapslapd
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ldapsecauthority
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgresqldata
spec:
storageClassName: ibmc-file-bronze-gid
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50M
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: isamconfig
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 50M
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: openldap
labels:
app: openldap
spec:
selector:
matchLabels:
app: openldap
replicas: 1
template:
metadata:
labels:
app: openldap
spec:
volumes:
- name: ldaplib
persistentVolumeClaim:
claimName: ldaplib
- name: ldapslapd
persistentVolumeClaim:
claimName: ldapslapd
- name: ldapsecauthority
persistentVolumeClaim:
claimName: ldapsecauthority
- name: openldap-keys
secret:
secretName: openldap-keys
containers:
- name: openldap
image: ibmcom/isam-openldap:9.0.7.0
ports:
- containerPort: 636
env:
- name: LDAP_DOMAIN
value: ibm.com
- name: LDAP_ADMIN_PASSWORD
value: Passw0rd
- name: LDAP_CONFIG_PASSWORD
value: Passw0rd
volumeMounts:
- mountPath: /var/lib/ldap
name: ldaplib
- mountPath: /etc/ldap/slapd.d
name: ldapslapd
- mountPath: /var/lib/ldap.secAuthority
name: ldapsecauthority
- mountPath: /container/service/slapd/assets/certs
name: openldap-keys
# This line is needed when running on Kubernetes 1.9.4 or above
args: [ "--copy-service"]
# useful for debugging startup issues - can run bash, then exec to the container and poke around
# command: [ "/bin/bash"]
# args: [ "-c", "while /bin/true ; do sleep 5; done" ]
# Just this line to get debug output from openldap startup
# args: [ "--loglevel" , "trace","--copy-service"]
---
# for external service access, see https://console.bluemix.net/docs/containers/cs_apps.html#cs_apps_public_nodeport
apiVersion: v1
kind: Service
metadata:
name: openldap
labels:
app: openldap
spec:
ports:
- port: 636
name: ldaps
protocol: TCP
selector:
app: openldap
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgresql
labels:
app: postgresql
spec:
selector:
matchLabels:
app: postgresql
replicas: 1
template:
metadata:
labels:
app: postgresql
spec:
securityContext:
runAsNonRoot: true
runAsUser: 70
fsGroup: 0
volumes:
- name: postgresqldata
persistentVolumeClaim:
claimName: postgresqldata
- name: postgresql-keys
secret:
secretName: postgresql-keys
containers:
- name: postgresql
image: ibmcom/isam-postgresql:9.0.7.0
ports:
- containerPort: 5432
env:
- name: POSTGRES_USER
value: postgres
- name: POSTGRES_PASSWORD
value: Passw0rd
- name: POSTGRES_DB
value: isam
- name: POSTGRES_SSL_KEYDB
value: /var/local/server.pem
- name: PGDATA
value: /var/lib/postgresql/data/db-files/
volumeMounts:
- mountPath: /var/lib/postgresql/data
name: postgresqldata
- mountPath: /var/local
name: postgresql-keys
# useful for debugging startup issues - can run bash, then exec to the container and poke around
# command: [ "/bin/bash"]
# args: [ "-c", "while /bin/true ; do sleep 5; done" ]
---
# for external service access, see https://console.bluemix.net/docs/containers/cs_apps.html#cs_apps_public_nodeport
apiVersion: v1
kind: Service
metadata:
name: postgresql
spec:
ports:
- port: 5432
name: postgresql
protocol: TCP
selector:
app: postgresql
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: isamconfig
labels:
app: isamconfig
spec:
selector:
matchLabels:
app: isamconfig
replicas: 1
template:
metadata:
labels:
app: isamconfig
spec:
securityContext:
runAsNonRoot: true
runAsUser: 6000
volumes:
- name: isamconfig
persistentVolumeClaim:
claimName: isamconfig
- name: isamconfig-logs
emptyDir: {}
containers:
- name: isamconfig
image: ibmcom/isam:9.0.7.1_IF4
volumeMounts:
- mountPath: /var/shared
name: isamconfig
- mountPath: /var/application.logs
name: isamconfig-logs
env:
- name: SERVICE
value: config
- name: CONTAINER_TIMEZONE
value: Europe/London
- name: ADMIN_PWD
valueFrom:
secretKeyRef:
name: samadmin
key: adminpw
readinessProbe:
tcpSocket:
port: 9443
initialDelaySeconds: 5
periodSeconds: 10
livenessProbe:
tcpSocket:
port: 9443
initialDelaySeconds: 120
periodSeconds: 20
# command: [ "/sbin/bootstrap.sh" ]
imagePullSecrets:
- name: dockerlogin
---
# for external service access, see https://console.bluemix.net/docs/containers/cs_apps.html#cs_apps_public_nodeport
apiVersion: v1
kind: Service
metadata:
name: isamconfig
spec:
# To make the LMI internet facing, make it a NodePort
type: NodePort
ports:
- port: 9443
name: isamconfig
protocol: TCP
# make this one statically allocated
nodePort: 30442
selector:
app: isamconfig
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: isamwrprp1-v1
labels:
app: isamwrprp1
spec:
selector:
matchLabels:
app: isamwrprp1
version: v1
replicas: 1
template:
metadata:
labels:
app: isamwrprp1
version: v1
spec:
securityContext:
runAsNonRoot: true
runAsUser: 6000
volumes:
- name: isamconfig
emptyDir: {}
- name: isamwrprp1-logs
emptyDir: {}
containers:
- name: isamwrprp1
image: ibmcom/isam:9.0.7.1_IF4
ports:
- containerPort: 443
volumeMounts:
- mountPath: /var/shared
name: isamconfig
- mountPath: /var/application.logs
name: isamwrprp1-logs
env:
- name: SERVICE
value: webseal
- name: INSTANCE
value: rp1
- name: CONTAINER_TIMEZONE
value: Europe/London
- name: AUTO_RELOAD_FREQUENCY
value: "5"
- name: CONFIG_SERVICE_URL
value: https://isamconfig:9443/shared_volume
- name: CONFIG_SERVICE_USER_NAME
value: cfgsvc
- name: CONFIG_SERVICE_USER_PWD
valueFrom:
secretKeyRef:
name: configreader
key: cfgsvcpw
livenessProbe:
exec:
command:
- /sbin/health_check.sh
- livenessProbe
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
readinessProbe:
exec:
command:
- /sbin/health_check.sh
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
imagePullSecrets:
- name: dockerlogin
---
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: isamwrprp1-v2
labels:
app: isamwrprp1
spec:
selector:
matchLabels:
app: isamwrprp1
version: v2
replicas: 1
template:
metadata:
labels:
app: isamwrprp1
version: v2
spec:
securityContext:
runAsNonRoot: true
runAsUser: 6000
volumes:
- name: isamconfig
emptyDir: {}
- name: isamwrprp1-logs
emptyDir: {}
containers:
- name: isamwrprp1
image: ibmcom/isam:9.0.7.1_IF4
ports:
- containerPort: 443
volumeMounts:
- mountPath: /var/shared
name: isamconfig
- mountPath: /var/application.logs
name: isamwrprp1-logs
env:
- name: SERVICE
value: webseal
- name: INSTANCE
value: rp1
- name: CONTAINER_TIMEZONE
value: Europe/London
- name: AUTO_RELOAD_FREQUENCY
value: "5"
- name: CONFIG_SERVICE_URL
value: https://isamconfig:9443/shared_volume
- name: CONFIG_SERVICE_USER_NAME
value: cfgsvc
- name: CONFIG_SERVICE_USER_PWD
valueFrom:
secretKeyRef:
name: configreader
key: cfgsvcpw
livenessProbe:
exec:
command:
- /sbin/health_check.sh
- livenessProbe
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
readinessProbe:
exec:
command:
- /sbin/health_check.sh
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
imagePullSecrets:
- name: dockerlogin
---
# for external service access, see https://console.bluemix.net/docs/containers/cs_apps.html#cs_apps_public_nodeport
apiVersion: v1
kind: Service
metadata:
name: isamwrprp1
spec:
type: NodePort
sessionAffinity: ClientIP
ports:
- port: 443
name: isamwrprp1
protocol: TCP
nodePort: 30443
selector:
app: isamwrprp1
---
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: isamwrpmobile
labels:
app: isamwrpmobile
spec:
selector:
matchLabels:
app: isamwrpmobile
replicas: 1
template:
metadata:
labels:
app: isamwrpmobile
spec:
securityContext:
runAsNonRoot: true
runAsUser: 6000
volumes:
- name: isamconfig
emptyDir: {}
- name: isamwrpmobile-logs
emptyDir: {}
containers:
- name: isamwrpmobile
image: ibmcom/isam:9.0.7.1_IF4
ports:
- containerPort: 443
volumeMounts:
- mountPath: /var/shared
name: isamconfig
- mountPath: /var/application.logs
name: isamwrpmobile-logs
env:
- name: SERVICE
value: webseal
- name: INSTANCE
value: mobile
- name: CONTAINER_TIMEZONE
value: Europe/London
- name: AUTO_RELOAD_FREQUENCY
value: "5"
- name: CONFIG_SERVICE_URL
value: https://isamconfig:9443/shared_volume
- name: CONFIG_SERVICE_USER_NAME
value: cfgsvc
- name: CONFIG_SERVICE_USER_PWD
valueFrom:
secretKeyRef:
name: configreader
key: cfgsvcpw
livenessProbe:
exec:
command:
- /sbin/health_check.sh
- livenessProbe
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
readinessProbe:
exec:
command:
- /sbin/health_check.sh
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
imagePullSecrets:
- name: dockerlogin
---
# for external service access, see https://console.bluemix.net/docs/containers/cs_apps.html#cs_apps_public_nodeport
apiVersion: v1
kind: Service
metadata:
name: isamwrpmobile
spec:
type: NodePort
sessionAffinity: ClientIP
ports:
- port: 443
name: isamwrpmobile
protocol: TCP
nodePort: 30444
selector:
app: isamwrpmobile
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: isamruntime-v1
labels:
app: isamruntime
spec:
selector:
matchLabels:
app: isamruntime
version: v1
replicas: 1
template:
metadata:
labels:
app: isamruntime
version: v1
spec:
securityContext:
runAsNonRoot: true
runAsUser: 6000
volumes:
- name: isamconfig
emptyDir: {}
- name: isamruntime-logs
emptyDir: {}
containers:
- name: isamruntime
image: ibmcom/isam:9.0.7.1_IF4
ports:
- containerPort: 443
volumeMounts:
- mountPath: /var/shared
name: isamconfig
- mountPath: /var/application.logs
name: isamruntime-logs
env:
- name: SERVICE
value: runtime
- name: CONTAINER_TIMEZONE
value: Europe/London
- name: AUTO_RELOAD_FREQUENCY
value: "5"
- name: CONFIG_SERVICE_URL
value: https://isamconfig:9443/shared_volume
- name: CONFIG_SERVICE_USER_NAME
value: cfgsvc
- name: CONFIG_SERVICE_USER_PWD
valueFrom:
secretKeyRef:
name: configreader
key: cfgsvcpw
livenessProbe:
exec:
command:
- /sbin/health_check.sh
- livenessProbe
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
readinessProbe:
exec:
command:
- /sbin/health_check.sh
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
imagePullSecrets:
- name: dockerlogin
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: isamruntime-v2
labels:
app: isamruntime
spec:
selector:
matchLabels:
app: isamruntime
version: v2
replicas: 1
template:
metadata:
labels:
app: isamruntime
version: v2
spec:
securityContext:
runAsNonRoot: true
runAsUser: 6000
volumes:
- name: isamconfig
emptyDir: {}
- name: isamruntime-logs
emptyDir: {}
containers:
- name: isamruntime
image: ibmcom/isam:9.0.7.1_IF4
ports:
- containerPort: 443
volumeMounts:
- mountPath: /var/shared
name: isamconfig
- mountPath: /var/application.logs
name: isamruntime-logs
env:
- name: SERVICE
value: runtime
- name: CONTAINER_TIMEZONE
value: Europe/London
- name: AUTO_RELOAD_FREQUENCY
value: "5"
- name: CONFIG_SERVICE_URL
value: https://isamconfig:9443/shared_volume
- name: CONFIG_SERVICE_USER_NAME
value: cfgsvc
- name: CONFIG_SERVICE_USER_PWD
valueFrom:
secretKeyRef:
name: configreader
key: cfgsvcpw
livenessProbe:
exec:
command:
- /sbin/health_check.sh
- livenessProbe
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
readinessProbe:
exec:
command:
- /sbin/health_check.sh
initialDelaySeconds: 10
periodSeconds: 10
timeoutSeconds: 2
imagePullSecrets:
- name: dockerlogin
---
apiVersion: v1
kind: Service
metadata:
name: isamruntime
spec:
ports:
- port: 443
name: isamruntime
protocol: TCP
selector:
app: isamruntime
---
I am not sure why the CPU load is shown as unknown. Have I missed a step or made a mistake? Can someone help?
Regards
Pranam
Based on the issue shown, it appears that you have not set the resource limits and requests in the deployment YAML file.
If you run kubectl explain deployment, you will see in the container spec:
resources:
  limits:
    cpu:
    memory:
  requests:
    cpu:
    memory:
If you add values to the keys mentioned above, the HPA issue should be solved.
Did you specify the resources block when you defined your app's deployment?
I don't remember where it was stated, but I encountered this case once when I forgot that.
More information about managing resources for containers: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
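As a rough sketch of what both answers suggest, the isamruntime container from the question could get a resources block like the one below. The container name and image are from the posted manifest; all CPU and memory values are placeholders chosen for illustration, not recommendations for this workload. The HPA expresses CPU utilization as a percentage of the requested CPU, which is why a missing request leaves the target at unknown.

```yaml
# Fragment of the isamruntime-v1 Deployment (sketch, other fields omitted).
spec:
  template:
    spec:
      containers:
      - name: isamruntime
        image: ibmcom/isam:9.0.7.1_IF4
        resources:
          requests:
            cpu: 100m       # placeholder; the HPA's 20% target is measured against this
            memory: 256Mi   # placeholder
          limits:
            cpu: 500m       # placeholder
            memory: 512Mi   # placeholder
```

After applying the change, the pods are recreated, and it can take a short while for metrics-server to report values before kubectl get hpa shows a percentage.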

K8s ingress: nginx ingress controller is not in running mode

I have a Jenkins image and exposed it with a NodePort service, which works well. Since I will be adding more services, I need to use the NGINX ingress controller to route traffic to the different services.
For now, I have set up two VMs (CentOS 7.5) on my Windows 10 machine. One VM is master1, which has two internal IPv4 addresses (10.0.2.9 and 192.168.56.103), and the other is the worker node4 (10.0.2.6 and 192.168.56.104).
All images are local; I have downloaded them into the local Docker image repository. The problem is that the NGINX ingress controller does not run.
My configuration as follows:
ingress-nginx-ctl.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: ingress-nginx
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: ingress-nginx
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
          name: ingress-nginx
          imagePullPolicy: Never
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 30
            timeoutSeconds: 5
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          args:
            - /nginx-ingress-controller
            - --default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
ingress-nginx-res.yaml:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: my-ingress
  namespace: default
spec:
  rules:
    - host:
      http:
        paths:
          - path: /
            backend:
              serviceName: shinyinfo-jenkins-svc
              servicePort: 8080
nginx-default-backend.yaml:
kind: Service
apiVersion: v1
metadata:
  name: nginx-default-backend
  namespace: default
spec:
  ports:
    - port: 80
      targetPort: http
  selector:
    app: nginx-default-backend
---
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: nginx-default-backend
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx-default-backend
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: default-http-backend
          image: chenliujin/defaultbackend
          imagePullPolicy: Never
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            timeoutSeconds: 5
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
            requests:
              cpu: 10m
              memory: 10Mi
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
shinyinfo-jenkins-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: shinyinfo-jenkins
  labels:
    app: shinyinfo-jenkins
spec:
  containers:
    - name: shinyinfo-jenkins
      image: shinyinfo_jenkins
      imagePullPolicy: Never
      ports:
        - containerPort: 8080
        - containerPort: 50000
      volumeMounts:
        - mountPath: /devops/password
          name: jenkins-password
        - mountPath: /var/jenkins_home
          name: jenkins-home
  volumes:
    - name: jenkins-password
      hostPath:
        path: /jenkins/password
    - name: jenkins-home
      hostPath:
        path: /jenkins
shinyinfo-jenkins-svc.yaml:
apiVersion: v1
kind: Service
metadata:
  name: shinyinfo-jenkins-svc
  labels:
    name: shinyinfo-jenkins-svc
spec:
  selector:
    app: shinyinfo-jenkins
  type: NodePort
  ports:
    - name: tcp
      port: 8080
      nodePort: 30003
Something is wrong with the NGINX ingress controller. The console output is as follows:
[master@master1 config]$ sudo kubectl apply -f ingress-nginx-ctl.yaml
service/ingress-nginx created
deployment.extensions/ingress-nginx created
[master@master1 config]$ sudo kubectl apply -f ingress-nginx-res.yaml
ingress.extensions/my-ingress created
The ingress-nginx pod is in CrashLoopBackOff. Why?
[master@master1 config]$ sudo kubectl get po
NAME                                     READY   STATUS             RESTARTS   AGE
ingress-nginx-66df6b6d9-mhmj9            0/1     CrashLoopBackOff   1          9s
nginx-default-backend-645546c46f-x7s84   1/1     Running            0          6m
shinyinfo-jenkins                        1/1     Running            0          20m
describe pod:
[master@master1 config]$ sudo kubectl describe po ingress-nginx-66df6b6d9-mhmj9
Name: ingress-nginx-66df6b6d9-mhmj9
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: node4/192.168.56.104
Start Time: Thu, 08 Nov 2018 16:45:46 +0800
Labels: app=ingress-nginx
pod-template-hash=228926285
Annotations: <none>
Status: Running
IP: 100.127.10.211
Controlled By: ReplicaSet/ingress-nginx-66df6b6d9
Containers:
ingress-nginx:
Container ID: docker://2aba164d116758585abef9d893a5fa0f0c5e23c04a13466263ce357ebe10cb0a
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
Image ID: docker://sha256:a3f21ec4bd119e7e17c8c8b2bf8a3b9e42a8607455826cd1fa0b5461045d2fa9
Ports: 80/TCP, 443/TCP
Host Ports: 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--default-backend-service=$(POD_NAMESPACE)/nginx-default-backend
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Thu, 08 Nov 2018 16:46:09 +0800
Finished: Thu, 08 Nov 2018 16:46:09 +0800
Ready: False
Restart Count: 2
Liveness: http-get http://:10254/healthz delay=30s timeout=5s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-66df6b6d9-mhmj9 (v1:metadata.name)
POD_NAMESPACE: default (v1:metadata.namespace)
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-24hnm (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-24hnm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-24hnm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40s default-scheduler Successfully assigned default/ingress-nginx-66df6b6d9-mhmj9 to node4
Normal Pulled 18s (x3 over 39s) kubelet, node4 Container image "quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0" already present on machine
Normal Created 18s (x3 over 39s) kubelet, node4 Created container
Normal Started 17s (x3 over 39s) kubelet, node4 Started container
Warning BackOff 11s (x5 over 36s) kubelet, node4 Back-off restarting failed container
logs of pod:
[master@master1 config]$ sudo kubectl logs ingress-nginx-66df6b6d9-mhmj9
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: 0.20.0
Build: git-e8d8103
Repository: https://github.com/kubernetes/ingress-nginx.git
-------------------------------------------------------------------------------
nginx version: nginx/1.15.5
W1108 08:47:16.081042 6 client_config.go:552] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I1108 08:47:16.081234 6 main.go:196] Creating API client for https://10.96.0.1:443
I1108 08:47:16.122315 6 main.go:240] Running in Kubernetes cluster version v1.11 (v1.11.3) - git (clean) commit a4529464e4629c21224b3d52edfe0ea91b072862 - platform linux/amd64
F1108 08:47:16.123661 6 main.go:97] ✖ The cluster seems to be running with a restrictive Authorization mode and the Ingress controller does not have the required permissions to operate normally.
Could experts here drop me some hints?
You need to run ingress-nginx under a separate ServiceAccount and grant that ServiceAccount the necessary privileges.
Here is an example:
apiVersion: v1
kind: ServiceAccount
metadata:
  name: lb
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: nginx-ingress-normal
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - endpoints
      - nodes
      - pods
      - secrets
    verbs:
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - "extensions"
    resources:
      - ingresses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - events
    verbs:
      - create
      - patch
  - apiGroups:
      - "extensions"
    resources:
      - ingresses/status
    verbs:
      - update
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: nginx-ingress-minimal
  namespace: kube-system
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
      - pods
      - secrets
      - namespaces
    verbs:
      - get
  - apiGroups:
      - ""
    resources:
      - configmaps
    resourceNames:
      - "ingress-controller-leader-dev"
      - "ingress-controller-leader-prod"
    verbs:
      - get
      - update
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - create
  - apiGroups:
      - ""
    resources:
      - endpoints
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: nginx-ingress-minimal
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: nginx-ingress-minimal
subjects:
  - kind: ServiceAccount
    name: lb
    namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: nginx-ingress-normal
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: nginx-ingress-normal
subjects:
  - kind: ServiceAccount
    name: lb
    namespace: kube-system
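Note that the RBAC objects above have no effect until the controller pod actually runs as that ServiceAccount. A minimal sketch of the change to the Deployment from the question follows. It assumes the ServiceAccount lb (together with the Role and RoleBinding) lives in the same namespace as the controller; the example above uses kube-system while the question's Deployment is in default, so one side has to be adjusted.

```yaml
# Fragment of ingress-nginx-ctl.yaml; only the serviceAccountName line is new.
spec:
  template:
    spec:
      serviceAccountName: lb   # must exist in the Deployment's own namespace
      terminationGracePeriodSeconds: 60
      containers:
        - name: ingress-nginx
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.20.0
```

Also note that the Role only allows get and update on ConfigMaps named ingress-controller-leader-dev and ingress-controller-leader-prod. As far as I know, the controller names its leader-election ConfigMap <election-id>-<ingress-class>, which with default flags would be ingress-controller-leader-nginx, so you may need to add that name to resourceNames or start the controller with a matching --ingress-class.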