Installing kong-ingress-controller to manage ingress on kubernetes - postgresql

I am installing the Kong ingress controller on my AKS cluster, but I don't want to run the Postgres StatefulSet and Service inside the cluster. Instead, I have a Postgres database in my Azure infrastructure, and I want to connect to it from my kong-ingress-controller deployment by creating the Postgres credentials as secrets in my AKS cluster and exposing them as environment variables.
I've created the secret:
⟩ kubectl create secret generic az-pg-db-user-pass --from-literal=username='az-pg-username' --from-literal=password='az-pg-password' --namespace kong
secret/az-pg-db-user-pass created
In my kongwithingress.yaml file I have the deployment manifest declarations, which I've shared via this gist link so as not to fill the question body with a lot of YAML.
The gist is based on this AKS all-in-one deployment, but with the Postgres StatefulSet and Service removed for the reasons above; my objective is to set up the connection to my own Azure managed Postgres service.
I've referenced the az-pg-db-user-pass generic secret in the kong-ingress-controller deployment, the kong deployment and the kong-migrations job (all present in the gist) in order to create the following environment variables:
KONG_PG_USERNAME
KONG_PG_PASSWORD
These environment variables have been created from the secret in the kong-ingress-controller deployment, the kong deployment and the kong-migrations job, all of which need to connect to the Postgres database.
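For reference, the secret keys are wired into the containers as environment variables roughly like this (a minimal sketch of the env section; the full container specs are in the gist):
env:
  - name: KONG_PG_USERNAME
    valueFrom:
      secretKeyRef:
        name: az-pg-db-user-pass
        key: username
  - name: KONG_PG_PASSWORD
    valueFrom:
      secretKeyRef:
        name: az-pg-db-user-pass
        key: password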
When I execute the kubectl apply -f kongwithingres.yaml command, the kong-ingress-controller deployment, kong deployment and kong-migrations job are created successfully, and I get the following output:
⟩ kubectl apply -f kongwithingres.yaml
namespace/kong unchanged
customresourcedefinition.apiextensions.k8s.io/kongplugins.configuration.konghq.com unchanged
customresourcedefinition.apiextensions.k8s.io/kongconsumers.configuration.konghq.com unchanged
customresourcedefinition.apiextensions.k8s.io/kongcredentials.configuration.konghq.com unchanged
customresourcedefinition.apiextensions.k8s.io/kongingresses.configuration.konghq.com unchanged
serviceaccount/kong-serviceaccount unchanged
clusterrole.rbac.authorization.k8s.io/kong-ingress-clusterrole unchanged
role.rbac.authorization.k8s.io/kong-ingress-role unchanged
rolebinding.rbac.authorization.k8s.io/kong-ingress-role-nisa-binding unchanged
clusterrolebinding.rbac.authorization.k8s.io/kong-ingress-clusterrole-nisa-binding unchanged
service/kong-ingress-controller created
deployment.extensions/kong-ingress-controller created
service/kong-proxy created
deployment.extensions/kong created
job.batch/kong-migrations created
But their respective pods have the CrashLoopBackOff status
NAME READY STATUS RESTARTS AGE
pod/kong-d8b88df99-j6hvl 0/1 Init:CrashLoopBackOff 5 4m24s
pod/kong-ingress-controller-984fc9666-cd2b5 0/2 Init:CrashLoopBackOff 5 4m24s
pod/kong-migrations-t6n7p 0/1 CrashLoopBackOff 5 4m24s
I checked the logs of each pod and found this:
The pod/kong-d8b88df99-j6hvl:
⟩ kubectl logs pod/kong-d8b88df99-j6hvl -p -n kong
Error from server (BadRequest): previous terminated container "kong-proxy" in pod "kong-d8b88df99-j6hvl" not found
And in its describe output, the pod is getting the environment variables and the image:
⟩ kubectl describe pod/kong-d8b88df99-j6hvl -n kong
Name: kong-d8b88df99-j6hvl
Namespace: kong
Status: Pending
IP: 10.244.1.18
Controlled By: ReplicaSet/kong-d8b88df99
Init Containers:
wait-for-migrations:
Container ID: docker://7007a89ada215daf853ec103d79dca60ccc5fb3a14c51ac6c5c56655da6da62f
Image: kong:1.0.0
Image ID: docker-pullable://kong@sha256:8fd6a312d7715a9cc85c49625a4c2f53951f6e4422926091e4d2ae67c480b6d5
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
kong migrations list
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 26 Feb 2019 16:25:01 +0100
Finished: Tue, 26 Feb 2019 16:25:01 +0100
Ready: False
Restart Count: 6
Environment:
KONG_ADMIN_LISTEN: off
KONG_PROXY_LISTEN: off
KONG_PROXY_ACCESS_LOG: /dev/stdout
KONG_ADMIN_ACCESS_LOG: /dev/stdout
KONG_PROXY_ERROR_LOG: /dev/stderr
KONG_ADMIN_ERROR_LOG: /dev/stderr
KONG_PG_HOST: zcrm365-postgresql1.postgres.database.azure.com
KONG_PG_USERNAME: <set to the key 'username' in secret 'az-pg-db-user-pass'> Optional: false
KONG_PG_PASSWORD: <set to the key 'password' in secret 'az-pg-db-user-pass'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-gnkjq (ro)
Containers:
kong-proxy:
Container ID:
Image: kong:1.0.0
Image ID:
Ports: 8000/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment:
KONG_PG_USERNAME: <set to the key 'username' in secret 'az-pg-db-user-pass'> Optional: false
KONG_PG_PASSWORD: <set to the key 'password' in secret 'az-pg-db-user-pass'> Optional: false
KONG_PG_HOST: zcrm365-postgresql1.postgres.database.azure.com
KONG_PROXY_ACCESS_LOG: /dev/stdout
KONG_PROXY_ERROR_LOG: /dev/stderr
KONG_ADMIN_LISTEN: off
KUBERNETES_PORT_443_TCP_ADDR: zcrm365-d73ab78d.hcp.westeurope.azmk8s.io
KUBERNETES_PORT: tcp://zcrm365-d73ab78d.hcp.westeurope.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://zcrm365-d73ab78d.hcp.westeurope.azmk8s.io:443
KUBERNETES_SERVICE_HOST: zcrm365-d73ab78d.hcp.westeurope.azmk8s.io
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-gnkjq (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-gnkjq:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-gnkjq
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m44s default-scheduler Successfully assigned kong/kong-d8b88df99-j6hvl to aks-default-75800594-1
Normal Pulled 7m9s (x5 over 8m40s) kubelet, aks-default-75800594-1 Container image "kong:1.0.0" already present on machine
Normal Created 7m8s (x5 over 8m40s) kubelet, aks-default-75800594-1 Created container
Normal Started 7m7s (x5 over 8m40s) kubelet, aks-default-75800594-1 Started container
Warning BackOff 3m34s (x26 over 8m38s) kubelet, aks-default-75800594-1 Back-off restarting failed container
The pod/kong-ingress-controller-984fc9666-cd2b5:
kubectl logs pod/kong-ingress-controller-984fc9666-cd2b5 -p -n kong
Error from server (BadRequest): a container name must be specified for pod kong-ingress-controller-984fc9666-cd2b5, choose one of: [admin-api ingress-controller] or one of the init containers: [wait-for-migrations]
And its description:
⟩ kubectl describe pod/kong-ingress-controller-984fc9666-cd2b5 -n kong
Name: kong-ingress-controller-984fc9666-cd2b5
Namespace: kong
Status: Pending
IP: 10.244.2.18
Controlled By: ReplicaSet/kong-ingress-controller-984fc9666
Init Containers:
wait-for-migrations:
Container ID: docker://8eb035f755322b3ac72792d922974811933ba9a71afb1f4549cfe7e0a6519619
Image: kong:1.0.0
Image ID: docker-pullable://kong@sha256:8fd6a312d7715a9cc85c49625a4c2f53951f6e4422926091e4d2ae67c480b6d5
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
kong migrations list
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 26 Feb 2019 16:29:56 +0100
Finished: Tue, 26 Feb 2019 16:29:56 +0100
Ready: False
Restart Count: 7
Environment:
KONG_ADMIN_LISTEN: off
KONG_PROXY_LISTEN: off
KONG_PROXY_ACCESS_LOG: /dev/stdout
KONG_ADMIN_ACCESS_LOG: /dev/stdout
KONG_PROXY_ERROR_LOG: /dev/stderr
KONG_ADMIN_ERROR_LOG: /dev/stderr
KONG_PG_HOST: zcrm365-postgresql1.postgres.database.azure.com
KONG_PG_USERNAME: <set to the key 'username' in secret 'az-pg-db-user-pass'> Optional: false
KONG_PG_PASSWORD: <set to the key 'password' in secret 'az-pg-db-user-pass'> Optional: false
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kong-serviceaccount-token-rc4sp (ro)
Containers:
admin-api:
Container ID:
Image: kong:1.0.0
Image ID:
Port: 8001/TCP
Host Port: 0/TCP
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Liveness: http-get http://:8001/status delay=30s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:8001/status delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
KONG_PG_USERNAME: <set to the key 'username' in secret 'az-pg-db-user-pass'> Optional: false
KONG_PG_PASSWORD: <set to the key 'password' in secret 'az-pg-db-user-pass'> Optional: false
KONG_PG_HOST: zcrm365-postgresql1.postgres.database.azure.com
KONG_ADMIN_ACCESS_LOG: /dev/stdout
KONG_ADMIN_ERROR_LOG: /dev/stderr
KONG_ADMIN_LISTEN: 0.0.0.0:8001, 0.0.0.0:8444 ssl
KONG_PROXY_LISTEN: off
KUBERNETES_PORT_443_TCP_ADDR: zcrm365-d73ab78d.hcp.westeurope.azmk8s.io
KUBERNETES_PORT: tcp://zcrm365-d73ab78d.hcp.westeurope.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://zcrm365-d73ab78d.hcp.westeurope.azmk8s.io:443
KUBERNETES_SERVICE_HOST: zcrm365-d73ab78d.hcp.westeurope.azmk8s.io
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kong-serviceaccount-token-rc4sp (ro)
ingress-controller:
Container ID:
Image: kong-docker-kubernetes-ingress-controller.bintray.io/kong-ingress-controller:0.3.0
Image ID:
Port: <none>
Host Port: <none>
Args:
/kong-ingress-controller
--kong-url=https://localhost:8444
--admin-tls-skip-verify
--default-backend-service=kong/kong-proxy
--publish-service=kong/kong-proxy
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Liveness: http-get http://:10254/healthz delay=30s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:10254/healthz delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: kong-ingress-controller-984fc9666-cd2b5 (v1:metadata.name)
POD_NAMESPACE: kong (v1:metadata.namespace)
KUBERNETES_PORT_443_TCP_ADDR: zcrm365-d73ab78d.hcp.westeurope.azmk8s.io
KUBERNETES_PORT: tcp://zcrm365-d73ab78d.hcp.westeurope.azmk8s.io:443
KUBERNETES_PORT_443_TCP: tcp://zcrm365-d73ab78d.hcp.westeurope.azmk8s.io:443
KUBERNETES_SERVICE_HOST: zcrm365-d73ab78d.hcp.westeurope.azmk8s.io
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kong-serviceaccount-token-rc4sp (ro)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
kong-serviceaccount-token-rc4sp:
Type: Secret (a volume populated by a Secret)
SecretName: kong-serviceaccount-token-rc4sp
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 12m default-scheduler Successfully assigned kong/kong-ingress-controller-984fc9666-cd2b5 to aks-default-75800594-2
Normal Pulled 10m (x5 over 12m) kubelet, aks-default-75800594-2 Container image "kong:1.0.0" already present on machine
Normal Created 10m (x5 over 12m) kubelet, aks-default-75800594-2 Created container
Normal Started 10m (x5 over 12m) kubelet, aks-default-75800594-2 Started container
Warning BackOff 2m14s (x49 over 12m) kubelet, aks-default-75800594-2 Back-off restarting failed container
I don't know the reason for the CrashLoopBackOff status, or why the containers stay in Waiting: PodInitializing.
How can I debug this behavior?
Is it possible that Kong cannot talk to the Postgres database?
My AKS cluster and my Postgres database are both on Azure, and they can communicate as services.
UPDATE
These are the logs of the containers in the pods that were created:
⟩ kubectl logs pod/kong-ingress-controller-984fc9666-w4vvn -p -n kong -c ingress-controller
Error from server (BadRequest): previous terminated container "ingress-controller" in pod "kong-ingress-controller-984fc9666-w4vvn" not found
⟩ kubectl logs pod/kong-d8b88df99-qsq4j -p -n kong -c kong-proxy
Error from server (BadRequest): previous terminated container "kong-proxy" in pod "kong-d8b88df99-qsq4j" not found

My kong-ingress-controller deployment pods were in CrashLoopBackOff, and sometimes in Waiting: PodInitializing, because I hadn't kept the following things in mind:
The main reason, as @Amityo says, is that the kong-ingress-controller and kong deployments have an init container called wait-for-migrations, which waits for the kong-migrations job to complete before the main containers start. From this I could see that I needed to perform my Kong migrations first.
But my kong-migrations job was not working, because it was missing the KONG_DATABASE environment variable needed to set up the connection.
Another reason my deployment was not working is that Kong, when connecting to Postgres, apparently expects the user environment variable defined in the container to be called KONG_PG_USER. I had called it KONG_PG_USERNAME, and that was another reason my script failed. (I am not completely sure about this.)
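As a sketch of the fix (the exact container specs live in my gist), the env section of the kong-migrations job and the kong containers ended up looking roughly like this, with KONG_DATABASE added and KONG_PG_USER used instead of KONG_PG_USERNAME:
env:
  - name: KONG_DATABASE
    value: "postgres"
  - name: KONG_PG_HOST
    value: zcrm365-postgresql1.postgres.database.azure.com
  - name: KONG_PG_USER
    valueFrom:
      secretKeyRef:
        name: az-pg-db-user-pass
        key: username
  - name: KONG_PG_PASSWORD
    valueFrom:
      secretKeyRef:
        name: az-pg-db-user-pass
        key: password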
⟩ kubectl create -f kongwithingres.yaml
namespace/kong created
secret/az-pg-db-user-pass created
customresourcedefinition.apiextensions.k8s.io/kongplugins.configuration.konghq.com created
customresourcedefinition.apiextensions.k8s.io/kongconsumers.configuration.konghq.com created
customresourcedefinition.apiextensions.k8s.io/kongcredentials.configuration.konghq.com created
customresourcedefinition.apiextensions.k8s.io/kongingresses.configuration.konghq.com created
serviceaccount/kong-serviceaccount created
clusterrole.rbac.authorization.k8s.io/kong-ingress-clusterrole created
role.rbac.authorization.k8s.io/kong-ingress-role created
rolebinding.rbac.authorization.k8s.io/kong-ingress-role-nisa-binding created
clusterrolebinding.rbac.authorization.k8s.io/kong-ingress-clusterrole-nisa-binding created
service/kong-ingress-controller created
deployment.extensions/kong-ingress-controller created
service/kong-proxy created
deployment.extensions/kong created
job.batch/kong-migrations created
By the way, to get started with Kong I recommend installing Konga, which is a front-end dashboard tool to manage Kong and check the things that we can otherwise only do via YAML files.
We have this konga.yaml manifest to install it as a deployment in our Kubernetes cluster:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: konga
  namespace: kong
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: konga
    spec:
      containers:
      - env:
        - name: NODE_TLS_REJECT_UNAUTHORIZED
          value: "0"
        image: pantsel/konga:latest
        name: konga
        ports:
        - containerPort: 1337
And we can access the service locally on our machines via the kubectl port-forward command:
⟩ kubectl port-forward pod/konga-85b66cffff-mxq85 1337:1337 -n kong
Forwarding from 127.0.0.1:1337 -> 1337
Forwarding from [::1]:1337 -> 1337
We go to http://localhost:1337/, where it is necessary to register an admin user in order to use Konga.
Finally, create a Kong connection (Connections in the dashboard) with the following Kong admin URL: http://kong-ingress-controller:8001/

Related

EKS Fargate pod for Airflow keeps restarting with error code

I am trying to deploy Airflow on EKS Fargate using Helm. I have the EKS cluster, SC, PV, and PVC, along with the namespace and fargate-profile (dev), all set up.
My problem comes when I do helm install:
helm upgrade --install airflow apache-airflow/airflow -n dev --values values.yaml --set volumePermissions.enbled=true --debug
Above is the list of pods (screenshot: https://i.stack.imgur.com/IPocP.png). The last 3 keep going into CrashLoopBackOff.
Here is the describe of webserver pod:
C:\Users\tanma>kubectl describe pods -n dev airflow-webserver-775d548b98-wd5x8
Name: airflow-webserver-775d548b98-wd5x8
Namespace: dev
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: airflow-webserver
Node: fargate-ip-192-168-161-147.us-west-2.compute.internal/192.168.161.147
Start Time: Thu, 13 Oct 2022 17:12:54 -0400
Labels: component=webserver
eks.amazonaws.com/fargate-profile=dev
pod-template-hash=775d548b98
release=airflow
tier=airflow
Annotations: CapacityProvisioned: 0.25vCPU 0.5GB
Logging: LoggingDisabled: LOGGING_CONFIGMAP_NOT_FOUND
checksum/airflow-config: 978d20ff42d3de620bee24f2e35b1769f20ebd948890bf474bd940624e39f150
checksum/extra-configmaps: 2e44e493035e2f6a255d08f8104087ff10d30aef6f63176f1b18f75f73295598
checksum/extra-secrets: bb91ef06ddc31c0c5a29973832163d8b0b597812a793ef911d33b622bc9d1655
checksum/metadata-secret: d9bd679df96f2631a8559d02cc528fd78c3d73c06289be9816d83fb332e05b5e
checksum/pgbouncer-config-secret: da52bd1edfe820f0ddfacdebb20a4cc6407d296ee45bcb500a6407e2261a5ba2
checksum/webserver-config: 4a2281a4e3ed0cc5e89f07aba3c1bb314ea51c17cb5d2b41e9b045054a6b5c72
checksum/webserver-secret-key: a1e18ebcc73a51b6bafe52d95eee84dcdf132559cac0248fff6e58e409b4505e
kubernetes.io/psp: eks.privileged
Status: Running
IP: 192.168.161.147
IPs:
IP: 192.168.161.147
Controlled By: ReplicaSet/airflow-webserver-775d548b98
Init Containers:
wait-for-airflow-migrations:
Container ID: containerd://bf4919f7a268bbeaf1a8f8779e4da1551d76f622d9ce970f18a3f2a1f14c24d7
Image: apache/airflow:2.4.1
Image ID: docker.io/apache/airflow@sha256:e077b68d81d56d773bddbcdc8941b7a2c16a2087a641005dfc5f1b8dcadec90a
Port: <none>
Host Port: <none>
Args:
airflow
db
check-migrations
--migration-wait-timeout=60
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 13 Oct 2022 17:14:40 -0400
Finished: Thu, 13 Oct 2022 17:15:12 -0400
Ready: True
Restart Count: 0
Environment:
AIRFLOW__CORE__FERNET_KEY: <set to the key 'fernet-key' in secret 'airflow-fernet-key'> Optional: false
AIRFLOW__CORE__SQL_ALCHEMY_CONN: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW_CONN_AIRFLOW_DB: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__WEBSERVER__SECRET_KEY: <set to the key 'webserver-secret-key' in secret 'airflow-webserver-secret-key'> Optional: false
Mounts:
/opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pntv6 (ro)
Containers:
webserver:
Container ID: containerd://e479b50af8eefc8c99971cc9cc9b6345f826c09d5f770276b33518340298359d
Image: apache/airflow:2.4.1
Image ID: docker.io/apache/airflow@sha256:e077b68d81d56d773bddbcdc8941b7a2c16a2087a641005dfc5f1b8dcadec90a
Port: 8080/TCP
Host Port: 0/TCP
Args:
bash
-c
exec airflow webserver
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 143
Started: Thu, 13 Oct 2022 17:40:25 -0400
Finished: Thu, 13 Oct 2022 17:42:19 -0400
Ready: False
Restart Count: 9
Liveness: http-get http://:8080/health delay=15s timeout=30s period=5s #success=1 #failure=20
Readiness: http-get http://:8080/health delay=15s timeout=30s period=5s #success=1 #failure=20
Environment:
AIRFLOW__CORE__FERNET_KEY: <set to the key 'fernet-key' in secret 'airflow-fernet-key'> Optional: false
AIRFLOW__CORE__SQL_ALCHEMY_CONN: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW_CONN_AIRFLOW_DB: <set to the key 'connection' in secret 'airflow-airflow-metadata'> Optional: false
AIRFLOW__WEBSERVER__SECRET_KEY: <set to the key 'webserver-secret-key' in secret 'airflow-webserver-secret-key'> Optional: false
Mounts:
/opt/airflow/airflow.cfg from config (ro,path="airflow.cfg")
/opt/airflow/config/airflow_local_settings.py from config (ro,path="airflow_local_settings.py")
/opt/airflow/logs from logs (rw)
/opt/airflow/pod_templates/pod_template_file.yaml from config (ro,path="pod_template_file.yaml")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pntv6 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: airflow-airflow-config
Optional: false
logs:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: af-efs-fargate-1
ReadOnly: false
kube-api-access-pntv6:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning LoggingDisabled 31m fargate-scheduler Disabled logging because aws-logging configmap was not found. configmap "aws-logging" not found
Normal Scheduled 30m fargate-scheduler Successfully assigned dev/airflow-webserver-775d548b98-wd5x8 to fargate-ip-192-168-161-147.us-west-2.compute.internal
Normal Pulling 30m kubelet Pulling image "apache/airflow:2.4.1"
Normal Pulled 28m kubelet Successfully pulled image "apache/airflow:2.4.1" in 1m43.155801441s
Normal Created 28m kubelet Created container wait-for-airflow-migrations
Normal Started 28m kubelet Started container wait-for-airflow-migrations
Normal Pulled 28m kubelet Container image "apache/airflow:2.4.1" already present on machine
Normal Created 28m kubelet Created container webserver
Normal Started 28m kubelet Started container webserver
Warning Unhealthy 27m (x9 over 27m) kubelet Readiness probe failed: Get "http://192.168.161.147:8080/health": dial tcp 192.168.161.147:8080: connect: connection refused
Warning Unhealthy 10m (x156 over 27m) kubelet Liveness probe failed: Get "http://192.168.161.147:8080/health": dial tcp 192.168.161.147:8080: connect: connection refused
Warning BackOff 10s (x44 over 14m) kubelet Back-off restarting failed container
Any thoughts on why the pods keep restarting?
Appreciate your help here.
Thanks
Your host port is 0. I guess that could cause the webserver not to be able to expose its port.
However, you'd have to check the logs of the webserver pod itself to make sure this is the problem.
You need to make sure that this endpoint is available (which it currently is not): http://192.168.161.147:8080/health
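For example, kubectl logs -n dev airflow-webserver-775d548b98-wd5x8 -c webserver (using the pod and container names from the describe output above) should show why the webserver container keeps exiting.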
Ended up increasing the resources for the webserver and this solved the problem.
Thanks

Pod staying in Pending state

I have a Kubernetes pod that is staying in the Pending state. When I describe the pod, I can't see why it fails to start; I can just see Back-off restarting failed container.
This is what I can see when I describe the pod.
kubectl describe po jenkins-68d5474964-slpkj -n infrastructure
Name: jenkins-68d5474964-slpkj
Namespace: infrastructure
Priority: 0
PriorityClassName: <none>
Node: ip-172-20-120-29.eu-west-1.compute.internal/172.20.120.29
Start Time: Fri, 05 Feb 2021 17:10:34 +0100
Labels: app=jenkins
chart=jenkins-0.35.0
component=jenkins-jenkins-master
heritage=Tiller
pod-template-hash=2481030520
release=jenkins
Annotations: checksum/config=fc546aa316b7bb9bd6a7cbeb69562ca9f224dbfe53973411f97fea27e90cd4d7
Status: Pending
IP: 100.125.247.153
Controlled By: ReplicaSet/jenkins-68d5474964
Init Containers:
copy-default-config:
Container ID: docker://a6ce91864c181d4fc851afdd4a6dc2258c23e75bbed6981fe1cafad74a764ff2
Image: jenkins/jenkins:2.248
Image ID: docker-pullable://jenkins/jenkins@sha256:352f10079331b1e63c170b6f4b5dc5e2367728f0da00b6ad34424b2b2476426a
Port: <none>
Host Port: <none>
Command:
sh
/var/jenkins_config/apply_config.sh
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 05 Feb 2021 17:15:16 +0100
Finished: Fri, 05 Feb 2021 17:15:36 +0100
Ready: False
Restart Count: 5
Limits:
cpu: 2560m
memory: 2Gi
Requests:
cpu: 50m
memory: 256Mi
Environment:
ADMIN_PASSWORD: <set to the key 'jenkins-admin-password' in secret 'jenkins'> Optional: false
ADMIN_USER: <set to the key 'jenkins-admin-user' in secret 'jenkins'> Optional: false
Mounts:
/usr/share/jenkins/ref/secrets/ from secrets-dir (rw)
/var/jenkins_config from jenkins-config (rw)
/var/jenkins_home from jenkins-home (rw)
/var/jenkins_plugins from plugin-dir (rw)
/var/run/docker.sock from docker-sock (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5tbbb (rw)
Containers:
jenkins:
Container ID:
Image: jenkins/jenkins:2.248
Image ID:
Ports: 8080/TCP, 50000/TCP
Host Ports: 0/TCP, 0/TCP
Args:
--argumentsRealm.passwd.$(ADMIN_USER)=$(ADMIN_PASSWORD)
--argumentsRealm.roles.$(ADMIN_USER)=admin
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
cpu: 2560m
memory: 2Gi
Requests:
cpu: 50m
memory: 256Mi
Environment:
JAVA_OPTS:
JENKINS_OPTS:
JENKINS_SLAVE_AGENT_PORT: 50000
ADMIN_PASSWORD: <set to the key 'jenkins-admin-password' in secret 'jenkins'> Optional: false
ADMIN_USER: <set to the key 'jenkins-admin-user' in secret 'jenkins'> Optional: false
Mounts:
/usr/share/jenkins/ref/plugins/ from plugin-dir (rw)
/usr/share/jenkins/ref/secrets/ from secrets-dir (rw)
/var/jenkins_config from jenkins-config (ro)
/var/jenkins_home from jenkins-home (rw)
/var/run/docker.sock from docker-sock (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-5tbbb (rw)
Conditions:
Type Status
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
jenkins-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: jenkins
Optional: false
plugin-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
secrets-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
jenkins-home:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: jenkins
ReadOnly: false
default-token-5tbbb:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-5tbbb
Optional: false
docker-sock:
Type: HostPath (bare host directory volume)
Path: /var/run/docker.sock
HostPathType:
QoS Class: Burstable
Node-Selectors: nodePool=ci
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m default-scheduler Successfully assigned infrastructure/jenkins-68d5474964-slpkj to ip-172-20-120-29.eu-west-1.compute.internal
Normal Started 5m (x4 over 7m) kubelet, ip-172-20-120-29.eu-west-1.compute.internal Started container
Normal Pulling 4m (x5 over 7m) kubelet, ip-172-20-120-29.eu-west-1.compute.internal pulling image "jenkins/jenkins:2.248"
Normal Pulled 4m (x5 over 7m) kubelet, ip-172-20-120-29.eu-west-1.compute.internal Successfully pulled image "jenkins/jenkins:2.248"
Normal Created 4m (x5 over 7m) kubelet, ip-172-20-120-29.eu-west-1.compute.internal Created container
Warning BackOff 2m (x14 over 6m) kubelet, ip-172-20-120-29.eu-west-1.compute.internal Back-off restarting failed container
Once I run helm upgrade for that container, I can see:
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
jenkins 5 441d
jenkins-configs 1 441d
jenkins-tests 1 441d
==> v1/Deployment
NAME READY UP-TO-DATE AVAILABLE AGE
jenkins 0/1 1 0 441d
==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
jenkins Bound pvc-8813319f-0d37-11ea-9864-0a7b1d347c8a 4Gi RWO aws-efs 441d
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
jenkins-7b85495f65-2w5mv 0/1 Init:0/1 3 2m9s
==> v1/Secret
NAME TYPE DATA AGE
jenkins Opaque 2 441d
jenkins-secrets Opaque 3 441d
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
jenkins LoadBalancer 100.65.2.235 a881a20a40d37... 8080:31962/TCP 441d
jenkins-agent ClusterIP 100.64.69.113 <none> 50000/TCP 441d
==> v1/ServiceAccount
NAME SECRETS AGE
jenkins 1 441d
==> v1beta1/ClusterRoleBinding
NAME AGE
jenkins-role-binding 441d
Can someone advise?
Right now you cannot get any logs with kubectl logs pod_name because the pod status is still initializing.
When you use the kubectl logs command:
If the pod has multiple containers, you have to specify the container name explicitly.
If it has only one container, there is no need to specify the container name.
If you want to get the logs of an initContainer, you need to specify the initContainer name.
In your case, the pod has one init container, and it seems to be stuck now.
Init Containers:
copy-default-config:
Command:
sh
/var/jenkins_config/apply_config.sh
You can check the log of this container.
kubectl logs jenkins-68d5474964-slpkj copy-default-config
For me, the deployment was in this state because the installPlugins list was incorrectly set in the values passed to the Helm chart.
In case it helps :)

Istio Prometheus pod in CrashLoopBackOff State

I am trying to set up Istio (1.5.4) for the bookinfo example provided on their website. I have used the demo configuration profile. But verifying the Istio installation fails, since the Prometheus pod has entered a CrashLoopBackOff state.
NAME READY STATUS RESTARTS AGE
grafana-5f6f8cbf75-psk78 1/1 Running 0 21m
istio-egressgateway-7f9f45c966-g7k9j 1/1 Running 0 21m
istio-ingressgateway-968d69c8b-bhxk5 1/1 Running 0 21m
istio-tracing-9dd6c4f7c-7fm79 1/1 Running 0 21m
istiod-86884c8c45-sw96x 1/1 Running 0 21m
kiali-869c6894c5-wqgjb 1/1 Running 0 21m
prometheus-589c44dbfc-xkwmj 1/2 CrashLoopBackOff 8 21m
The logs for the prometheus pod:
level=warn ts=2020-05-15T09:07:53.113Z caller=main.go:283 deprecation_notice="'storage.tsdb.retention' flag is deprecated use 'storage.tsdb.retention.time' instead."
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:330 msg="Starting Prometheus" version="(version=2.15.1, branch=HEAD, revision=8744510c6391d3ef46d8294a7e1f46e57407ab13)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:331 build_context="(go=go1.13.5, user=root@4b1e33c71b9d, date=20191225-01:04:15)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:332 host_details="(Linux 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 prometheus-589c44dbfc-xkwmj (none))"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:333 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2020-05-15T09:07:53.114Z caller=main.go:334 vm_limits="(soft=unlimited, hard=unlimited)"
level=error ts=2020-05-15T09:07:53.157Z caller=query_logger.go:107 component=activeQueryTracker msg="Failed to create directory for logging active queries"
level=error ts=2020-05-15T09:07:53.157Z caller=query_logger.go:85 component=activeQueryTracker msg="Error opening query log file" file=data/queries.active err="open data/queries.active: no such file or directory"
panic: Unable to create mmap-ed active query log
goroutine 1 [running]:
github.com/prometheus/prometheus/promql.NewActiveQueryTracker(0x24dda5b, 0x5, 0x14, 0x2c62100, 0xc0005f63c0, 0x2c62100)
/app/promql/query_logger.go:115 +0x48c
main.main()
/app/cmd/prometheus/main.go:362 +0x5229
Describe pod output:
Name: prometheus-589c44dbfc-xkwmj
Namespace: istio-system
Priority: 0
Node: inspiron-7577/192.168.0.9
Start Time: Fri, 15 May 2020 14:21:14 +0530
Labels: app=prometheus
pod-template-hash=589c44dbfc
release=istio
Annotations: sidecar.istio.io/inject: false
Status: Running
IP: 172.17.0.11
IPs:
IP: 172.17.0.11
Controlled By: ReplicaSet/prometheus-589c44dbfc
Containers:
prometheus:
Container ID: docker://b6820a000ab67a5ce31d3a38f6f0d510bd150794b2792147fc17ef8f730c03bb
Image: docker.io/prom/prometheus:v2.15.1
Image ID: docker-pullable://prom/prometheus@sha256:169b743ceb4452266915272f9c3409d36972e41cb52f3f28644e6c0609fc54e6
Port: 9090/TCP
Host Port: 0/TCP
Args:
--storage.tsdb.retention=6h
--config.file=/etc/prometheus/prometheus.yml
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Fri, 15 May 2020 14:37:50 +0530
Finished: Fri, 15 May 2020 14:37:53 +0530
Ready: False
Restart Count: 8
Requests:
cpu: 10m
Liveness: http-get http://:9090/-/healthy delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9090/-/ready delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/etc/istio-certs from istio-certs (rw)
/etc/prometheus from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-cgqbc (ro)
istio-proxy:
Container ID: docker://fa756c93510b6f402d7d88c31a5f5f066d4c254590eab70886e7835e7d3871be
Image: docker.io/istio/proxyv2:1.5.4
Image ID: docker-pullable://istio/proxyv2@sha256:e16e2801b7fd93154e8fcb5f4e2fb1240d73349d425b8be90691d48e8b9bb944
Port: 15090/TCP
Host Port: 0/TCP
Args:
proxy
sidecar
--domain
$(POD_NAMESPACE).svc.cluster.local
--configPath
/etc/istio/proxy
--binaryPath
/usr/local/bin/envoy
--serviceCluster
istio-proxy-prometheus
--drainDuration
45s
--parentShutdownDuration
1m0s
--discoveryAddress
istio-pilot.istio-system.svc:15012
--proxyLogLevel=warning
--proxyComponentLogLevel=misc:error
--connectTimeout
10s
--proxyAdminPort
15000
--controlPlaneAuthPolicy
NONE
--dnsRefreshRate
300s
--statusPort
15020
--trust-domain=cluster.local
--controlPlaneBootstrap=false
State: Running
Started: Fri, 15 May 2020 14:21:31 +0530
Ready: True
Restart Count: 0
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
OUTPUT_CERTS: /etc/istio-certs
JWT_POLICY: first-party-jwt
PILOT_CERT_PROVIDER: istiod
CA_ADDR: istio-pilot.istio-system.svc:15012
POD_NAME: prometheus-589c44dbfc-xkwmj (v1:metadata.name)
POD_NAMESPACE: istio-system (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
SERVICE_ACCOUNT: (v1:spec.serviceAccountName)
HOST_IP: (v1:status.hostIP)
ISTIO_META_POD_NAME: prometheus-589c44dbfc-xkwmj (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: istio-system (v1:metadata.namespace)
ISTIO_META_MESH_ID: cluster.local
ISTIO_META_CLUSTER_ID: Kubernetes
Mounts:
/etc/istio-certs/ from istio-certs (rw)
/etc/istio/proxy from istio-envoy (rw)
/var/run/secrets/istio from istiod-ca-cert (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-cgqbc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus
Optional: false
istio-certs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istio-envoy:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium: Memory
SizeLimit: <unset>
istiod-ca-cert:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: istio-ca-root-cert
Optional: false
prometheus-token-cgqbc:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-token-cgqbc
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned istio-system/prometheus-589c44dbfc-xkwmj to inspiron-7577
Warning FailedMount 17m kubelet, inspiron-7577 MountVolume.SetUp failed for volume "prometheus-token-cgqbc" : failed to sync secret cache: timed out waiting for the condition
Warning FailedMount 17m kubelet, inspiron-7577 MountVolume.SetUp failed for volume "config-volume" : failed to sync configmap cache: timed out waiting for the condition
Normal Pulled 17m kubelet, inspiron-7577 Container image "docker.io/istio/proxyv2:1.5.4" already present on machine
Normal Created 17m kubelet, inspiron-7577 Created container istio-proxy
Normal Started 17m kubelet, inspiron-7577 Started container istio-proxy
Warning Unhealthy 17m kubelet, inspiron-7577 Readiness probe failed: HTTP probe failed with statuscode: 503
Normal Pulled 16m (x4 over 17m) kubelet, inspiron-7577 Container image "docker.io/prom/prometheus:v2.15.1" already present on machine
Normal Created 16m (x4 over 17m) kubelet, inspiron-7577 Created container prometheus
Normal Started 16m (x4 over 17m) kubelet, inspiron-7577 Started container prometheus
Warning BackOff 2m24s (x72 over 17m) kubelet, inspiron-7577 Back-off restarting failed container
It is unable to create the directory for logging active queries. Please help with any ideas.
As Istio 1.5.4 has only just been released, there are some issues with Prometheus on minikube when installed with istioctl manifest apply.
I checked it on GCP and everything works fine there.
As a workaround, you can use the Istio operator, which was tested by me and the OP, and as he mentioned in the comments, it's working.
Thanks a lot @jt97! It did work.
Steps to install istio operator
Install the istioctl command.
Deploy the Istio operator: istioctl operator init.
Install istio
To install the Istio demo configuration profile using the operator, run the following command:
kubectl create ns istio-system
kubectl apply -f - <<EOF
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: example-istiocontrolplane
spec:
  profile: demo
EOF
Could you tell me why the normal installation failed?
As I mentioned in the comments, I don't know yet. If I find the reason I will update this question.

Mysql helm chart not working when we delete existing helm chart and try installing again

I have added the stable/mysql chart to requirements.yaml:
dependencies:
  - name: mysql
    version: 1.4.0
    repository: https://kubernetes-charts.storage.googleapis.com
    condition: global.install_mysql
Created a PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-azure-managed-disk
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-premium
  resources:
    requests:
      storage: 5Gi
And added persistence in my chart's values.yaml:
mysql:
  persistence:
    enabled: false
    storageClass: managed-premium
    existingClaim: mysql-azure-managed-disk
    accessMode: ReadWriteOnce
    size: 1Gi
    annotations: {}
When I install my chart with helm install --name mychart ., the chart gets installed successfully and my other pod is also able to connect to the MySQL pod with the password from secrets.
But when I delete the chart and install it again, the new pod is not able to connect to the MySQL pod.
Has anyone faced this issue?
This is the log from the first install:
C02XF0N6JG5M:mychart komal$ k logs -f mychart-mysql-6f5f77445f-d2sld
Initializing database
Database initialized
MySQL init process in progress...
Warning: Unable to load '/usr/share/zoneinfo/Factory' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/iso3166.tab' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/leap-seconds.list' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/posix/Factory' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/right/Factory' as time zone. Skipping it.
Warning: Unable to load '/usr/share/zoneinfo/zone.tab' as time zone. Skipping it.
MySQL init process done. Ready for start up.
The secret created the first time:
k get secrets mychart-mysql -o yaml
apiVersion: v1
data:
  mysql-password: dmY3NjIybU9FTA==
  mysql-root-password: WWRCVHo4MDFNcQ==
kind: Secret
metadata:
  creationTimestamp: "2020-03-06T07:51:01Z"
  labels:
    app: mychart-mysql
    chart: mysql-1.4.0
    heritage: Tiller
    release: reancore
  name: reancore-mysql
  namespace: default
  resourceVersion: "2450905"
  selfLink: /api/v1/namespaces/default/secrets/reancore-mysql
  uid: a1ec1f81-8986-4aad-b952-af2dad1fab5a
type: Opaque
When I try to get the logs the second time, it doesn't give me anything, but the pod is started and running.
This is the describe output
k describe pods mychart-mysql-6f5f77445f-fv4rj
Name: mychart-mysql-6f5f77445f-fv4rj
Namespace: default
Priority: 0
Node: aks-agentpool-16017342-vmss000000/10.240.0.4
Start Time: Fri, 06 Mar 2020 14:09:12 +0530
Labels: app=mychart-mysql
pod-template-hash=6f5f77445f
release=mychart
Annotations: <none>
Status: Running
IP: 10.244.0.139
IPs: <none>
Controlled By: ReplicaSet/mychart-mysql-6f5f77445f
Init Containers:
remove-lost-found:
Container ID: docker://e8791c6acbac6ac6a37c0e9bf039015a84330feee5a4ad52e98ca530b71df804
Image: busybox:1.29.3
Image ID: docker-pullable://busybox@sha256:8ccbac733d19c0dd4d70b4f0c1e12245b5fa3ad24758a11035ee505c629c0796
Port: <none>
Host Port: <none>
Command:
rm
-fr
/var/lib/mysql/lost+found
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 06 Mar 2020 14:09:34 +0530
Finished: Fri, 06 Mar 2020 14:09:34 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 10m
memory: 10Mi
Environment: <none>
Mounts:
/var/lib/mysql from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-gdgms (ro)
Containers:
mychart-mysql:
Container ID: docker://54e5dc22543fcc028ab000eda1fe08569ed65b31c2072b74068eb861eb4ce119
Image: mysql:5.7.14
Image ID: docker-pullable://mysql@sha256:c8f03238ca1783d25af320877f063a36dbfce0daa56a7b4955e6c6e05ab5c70b
Port: 3306/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 06 Mar 2020 14:09:34 +0530
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 256Mi
Liveness: exec [sh -c mysqladmin ping -u root -p${MYSQL_ROOT_PASSWORD}] delay=30s timeout=5s period=10s #success=1 #failure=3
Readiness: exec [sh -c mysqladmin ping -u root -p${MYSQL_ROOT_PASSWORD}] delay=5s timeout=1s period=10s #success=1 #failure=3
Environment:
MYSQL_ROOT_PASSWORD: <set to the key 'mysql-root-password' in secret 'mychart-mysql'> Optional: false
MYSQL_PASSWORD: <set to the key 'mysql-password' in secret 'mychart-mysql'> Optional: true
MYSQL_USER:
MYSQL_DATABASE:
Mounts:
/var/lib/mysql from data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-gdgms (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: mysql-azure-managed-disk2
ReadOnly: false
default-token-gdgms:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-gdgms
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m4s default-scheduler Successfully assigned default/mychart-mysql-6f5f77445f-fv4rj to aks-agentpool-16017342-vmss000000
Normal Pulled 2m43s kubelet, aks-agentpool-16017342-vmss000000 Container image "busybox:1.29.3" already present on machine
Normal Created 2m43s kubelet, aks-agentpool-16017342-vmss000000 Created container remove-lost-found
Normal Started 2m42s kubelet, aks-agentpool-16017342-vmss000000 Started container remove-lost-found
Normal Pulled 2m42s kubelet, aks-agentpool-16017342-vmss000000 Container image "mysql:5.7.14" already present on machine
Normal Created 2m42s kubelet, aks-agentpool-16017342-vmss000000 Created container mychart-mysql
Normal Started 2m41s kubelet, aks-agentpool-16017342-vmss000000 Started container mychart-mysql
Somehow, the second time, my application's pod is not able to connect to the MySQL pod.
When I add mysqlRootPassword in values.yaml, it connects to the MySQL pod every time, even when I delete the chart and install it again:
mysql:
  mysqlRootPassword: root1
  persistence:
    enabled: false
    storageClass: managed-premium
    existingClaim: mysql-azure-managed-disk
    accessMode: ReadWriteOnce
    size: 1Gi
    annotations: {}

docker-registry deploys to K8S get an issue "CrashLoopBackOff"

I am stuck with the docker-registry deployment to K8S. Here I show in detail what I did. Hope you can give me some ideas.
My K8S version:
ii kubeadm 1.14.1-00 amd64 Kubernetes Cluster Bootstrapping Tool
ii kubectl 1.14.1-00 amd64 Kubernetes Command Line Tool
ii kubelet 1.14.1-00 amd64 Kubernetes Node Agent
ii kubernetes-cni 0.7.5-00 amd64 Kubernetes CNI
What I did:
Create selfcert
$ openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout cert.key -out cert.crt
Import selfcert to K8S
$ kubectl create secret tls registry-cert-secret --key cert.key --cert cert.crt
$ vim chart_values.yaml
ingress:
  enabled: true
  hosts:
    - registry.mgmt.home.local
  annotations:
    kubernetes.io/ingress.class: traefik
  tls:
    - secretName: registry-cert-secret
      hosts:
        - registry.mgmt.home.local
secrets:
  htpasswd: "admin:$2y$05$f95dCd6fRxQdDoPJ6mJIb.YMvR0qfhddSl3NSL1wCk1ZMl4JyFBDW"
  s3:
    accessKey: "admin"
    secretKey: "admin2019"
storage: s3
s3:
  region: us-east-1
  regionEndpoint: http://minio.home.local:9000
  secure: true
  bucket: registry
then install with helm
$ helm install stable/docker-registry -f chart_values.yaml --name docker-registry
NAME: docker-registry
LAST DEPLOYED: Thu Oct 31 16:29:31 2019
NAMESPACE: default
STATUS: DEPLOYED
show the kubectl deployments
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
docker-registry 0/1 1 0 35m
get pods
$ kubectl get pods --namespace default
NAME READY STATUS RESTARTS AGE
docker-registry-6989668db6-78d84 0/1 **CrashLoopBackOff** 7 13m
docker-registry-6989668db6-jttrz 1/1 Terminating 0 37m
describe pod
$ kubectl describe pod docker-registry-6989668db6-78d84 --namespace default
Name: docker-registry-6989668db6-78d84
Namespace: default
Priority: 0
PriorityClassName: <none>
Node: k8s-worker-promox/10.102.11.223
Start Time: Thu, 31 Oct 2019 18:03:13 +0800
Labels: app=docker-registry
pod-template-hash=6989668db6
release=docker-registry
Annotations: checksum/config: 89b20bb43a348d6b8dedacac583a596ccef4e570a935e7c5b464ba746eb88307
Status: Running
IP: 10.244.52.10
Controlled By: ReplicaSet/docker-registry-6989668db6
Containers:
docker-registry:
Container ID: docker://9a40c5e100711b122ddd78439c9fa21790f04f5a442b704140639f8fbfbd8929
Image: registry:2.7.1
Image ID: docker-pullable://registry@sha256:8004747f1e8cd820a148fb7499d71a76d45ff66bac6a29129bfdbfdc0154d146
Port: 5000/TCP
Host Port: 0/TCP
Command:
/bin/registry
serve
/etc/docker/registry/config.yml
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 2
Started: Thu, 31 Oct 2019 18:14:21 +0800
Finished: Thu, 31 Oct 2019 18:15:19 +0800
Ready: False
Restart Count: 7
Liveness: http-get http://:5000/ delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:5000/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment:
REGISTRY_AUTH: htpasswd
REGISTRY_AUTH_HTPASSWD_REALM: Registry Realm
REGISTRY_AUTH_HTPASSWD_PATH: /auth/htpasswd
REGISTRY_HTTP_SECRET: <set to the key 'haSharedSecret' in secret 'docker-registry-secret'> Optional: false
REGISTRY_STORAGE_S3_ACCESSKEY: <set to the key 's3AccessKey' in secret 'docker-registry-secret'> Optional: false
REGISTRY_STORAGE_S3_SECRETKEY: <set to the key 's3SecretKey' in secret 'docker-registry-secret'> Optional: false
REGISTRY_STORAGE_S3_REGION: us-east-1
REGISTRY_STORAGE_S3_REGIONENDPOINT: http://10.102.11.218:9000
REGISTRY_STORAGE_S3_BUCKET: registry
REGISTRY_STORAGE_S3_SECURE: true
Mounts:
/auth from auth (ro)
/etc/docker/registry from docker-registry-config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qfwkm (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
auth:
Type: Secret (a volume populated by a Secret)
SecretName: docker-registry-secret
Optional: false
docker-registry-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: docker-registry-config
ingress:
Optional: false
default-token-qfwkm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-qfwkm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 14m default-scheduler Successfully assigned default/docker-registry-6989668db6-78d84 to k8s-worker-promox
Normal Pulled 12m (x3 over 14m) kubelet, k8s-worker-promox Container image "registry:2.7.1" already present on machine
Normal Created 12m (x3 over 14m) kubelet, k8s-worker-promox Created container docker-registry
Normal Started 12m (x3 over 14m) kubelet, k8s-worker-promox Started container docker-registry
Normal Killing 12m (x2 over 13m) kubelet, k8s-worker-promox Container docker-registry failed liveness probe, will be restarted
Warning Unhealthy 12m (x7 over 14m) kubelet, k8s-worker-promox Liveness probe failed: HTTP probe failed with statuscode: 503
Warning Unhealthy 9m8s (x15 over 13m) kubelet, k8s-worker-promox Readiness probe failed: HTTP probe failed with statuscode: 503
Warning BackOff 4m26s (x18 over 8m40s) kubelet, k8s-worker-promox Back-off restarting failed container
I see the issue is related to the liveness and readiness probes. They make the pod try to start/restart many times, until it gets into "Back-off".
Following the troubleshooting, I see that it should be related to DNS. But DNS should not have any issues. I tried a lookup from the K8S host:
$ nslookup minio.home.local
Server: 10.102.11.201
Address: 10.102.11.201#53
Non-authoritative answer:
Name: minio.home.local
Address: 10.101.12.213
Updated November 1st: I went into another pod and ran nslookup; this pod could not resolve minio.home.local. Is that related to this issue? I also tried replacing minio.home.local with the IP in the *.yaml, but I get the same issue.
$ kubectl exec -it net-utils-5b5f89f777-2cwgq bash
root#net-utils-5b5f89f777-2cwgq:/#
root#net-utils-5b5f89f777-2cwgq:/#
root#net-utils-5b5f89f777-2cwgq:/#
root#net-utils-5b5f89f777-2cwgq:/# nslookup minio.home.local
Server: 10.96.0.10
Address: 10.96.0.10#53
** server can't find minio.skylab.local: NXDOMAIN
root#net-utils-5b5f89f777-2cwgq:/# ping minio.home.local
ping: unknown host
I have searched Google and GitHub discussions, but I still could not fix it. Do you have any ideas?
Thank you so much.