Zonal network endpoint group unhealthy even though the container application is working properly - kubernetes

I've created a Kubernetes cluster on Google Cloud, and even though the application is running properly (which I've verified by sending requests from inside the cluster), it seems that the NEG health check is not working. Any ideas on the cause?
I've tried changing the service from NodePort to LoadBalancer and different ways of adding annotations to the service. I was also thinking that it might be related to the HTTPS requirement on the Django side.
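For anyone debugging the same symptom, it can help to compare what the NEG controller created with what the load balancer reports. A rough sketch of the checks I mean (the backend service name is a placeholder; in practice it will be an autogenerated k8s1-... name in your project):

# Show the NEG status annotation the GKE controller adds to the Service.
kubectl describe service moner-svc

# List the zonal NEGs and the load balancer's backend services.
gcloud compute network-endpoint-groups list
gcloud compute backend-services list

# Ask a backend service which endpoints it currently considers healthy
# (BACKEND_SERVICE_NAME is a placeholder taken from the list above).
gcloud compute backend-services get-health BACKEND_SERVICE_NAME --global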
# [START kubernetes_deployment]
apiVersion: apps/v1
kind: Deployment
metadata:
  name: moner-app
  labels:
    app: moner-app
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: moner-app
  template:
    metadata:
      labels:
        app: moner-app
    spec:
      containers:
        - name: moner-core-container
          image: my-template
          imagePullPolicy: Always
          resources:
            requests:
              memory: "128Mi"
            limits:
              memory: "512Mi"
          startupProbe:
            httpGet:
              path: /ht/
              port: 5000
              httpHeaders:
                - name: "X-Forwarded-Proto"
                  value: "https"
            failureThreshold: 30
            timeoutSeconds: 10
            periodSeconds: 10
            initialDelaySeconds: 90
          readinessProbe:
            initialDelaySeconds: 120
            httpGet:
              path: "/ht/"
              port: 5000
              httpHeaders:
                - name: "X-Forwarded-Proto"
                  value: "https"
            periodSeconds: 10
            failureThreshold: 3
            timeoutSeconds: 10
          livenessProbe:
            initialDelaySeconds: 30
            failureThreshold: 3
            periodSeconds: 30
            timeoutSeconds: 10
            httpGet:
              path: "/ht/"
              port: 5000
              httpHeaders:
                - name: "X-Forwarded-Proto"
                  value: "https"
          volumeMounts:
            - name: cloudstorage-credentials
              mountPath: /secrets/cloudstorage
              readOnly: true
          env:
            # [START_secrets]
            - name: THIS_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: GRACEFUL_TIMEOUT
              value: '120'
            - name: GUNICORN_HARD_TIMEOUT
              value: '90'
            - name: DJANGO_ALLOWED_HOSTS
              value: '*,$(THIS_POD_IP),0.0.0.0'
          ports:
            - containerPort: 5000
          args: ["/start"]
        # [START proxy_container]
        - image: gcr.io/cloudsql-docker/gce-proxy:1.16
          name: cloudsql-proxy
          command: ["/cloud_sql_proxy", "--dir=/cloudsql",
                    "-instances=moner-dev:us-east1:core-db=tcp:5432",
                    "-credential_file=/secrets/cloudsql/credentials.json"]
          resources:
            requests:
              memory: "64Mi"
            limits:
              memory: "128Mi"
          volumeMounts:
            - name: cloudsql-oauth-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
            - name: ssl-certs
              mountPath: /etc/ssl/certs
            - name: cloudsql
              mountPath: /cloudsql
        # [END proxy_container]
      # [START volumes]
      volumes:
        - name: cloudsql-oauth-credentials
          secret:
            secretName: cloudsql-oauth-credentials
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs
        - name: cloudsql
          emptyDir: {}
        - name: cloudstorage-credentials
          secret:
            secretName: cloudstorage-credentials
      # [END volumes]
# [END kubernetes_deployment]
---
# [START service]
apiVersion: v1
kind: Service
metadata:
  name: moner-svc
  annotations:
    cloud.google.com/neg: '{"ingress": true, "exposed_ports": {"5000":{}}}' # Creates an NEG after an Ingress is created
    cloud.google.com/backend-config: '{"default": "moner-backendconfig"}'
  labels:
    app: moner-svc
spec:
  type: NodePort
  ports:
    - name: moner-core-http
      port: 5000
      protocol: TCP
      targetPort: 5000
  selector:
    app: moner-app
# [END service]
---
# [START certificates_setup]
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: managed-cert
spec:
  domains:
    - domain.com
    - app.domain.com
# [END certificates_setup]
---
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: moner-backendconfig
spec:
  customRequestHeaders:
    headers:
      - "X-Forwarded-Proto:https"
  healthCheck:
    checkIntervalSec: 15
    port: 5000
    type: HTTP
    requestPath: /ht/
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: managed-cert-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: moner-ssl
    networking.gke.io/managed-certificates: managed-cert
    kubernetes.io/ingress.class: "gce"
spec:
  defaultBackend:
    service:
      name: moner-svc
      port:
        name: moner-core-http

Apparently, you didn't have a GCP firewall rule allowing traffic on port 5000 to reach your GKE nodes. Creating an ingress firewall rule with source IP range 0.0.0.0/0 and port TCP 5000, targeted at your GKE nodes, should let your original setup work even with the service on port 5000.
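As a rough sketch of such a rule (NETWORK and NODE_TAG are placeholders for your VPC and node tag; restricting the source to Google's health-check ranges 130.211.0.0/22 and 35.191.0.0/16 is a tighter alternative to 0.0.0.0/0):

# NETWORK and NODE_TAG are placeholders; adjust to your VPC and node tag.
gcloud compute firewall-rules create allow-tcp-5000-to-gke-nodes \
  --network=NETWORK \
  --direction=INGRESS \
  --allow=tcp:5000 \
  --source-ranges=0.0.0.0/0 \
  --target-tags=NODE_TAG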

I'm still not sure why, but I managed to get it working by moving the service to port 80 and keeping the health check on 5000.
Service config:
# [START service]
apiVersion: v1
kind: Service
metadata:
  name: moner-svc
  annotations:
    cloud.google.com/neg: '{"ingress": true, "exposed_ports": {"5000":{}}}' # Creates an NEG after an Ingress is created
    cloud.google.com/backend-config: '{"default": "moner-backendconfig"}'
  labels:
    app: moner-svc
spec:
  type: NodePort
  ports:
    - name: moner-core-http
      port: 80
      protocol: TCP
      targetPort: 5000
  selector:
    app: moner-app
# [END service]
Backend config:
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: moner-backendconfig
spec:
  customRequestHeaders:
    headers:
      - "X-Forwarded-Proto:https"
  healthCheck:
    checkIntervalSec: 15
    port: 5000
    type: HTTP
    requestPath: /ht/

Related

Kubernetes: Cannot connect to service when using named targetPort

Here's my config:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      pod: 338f54d2-8f89-4602-a848-efcbcb63233f
  template:
    metadata:
      labels:
        pod: 338f54d2-8f89-4602-a848-efcbcb63233f
        svc: app
    spec:
      imagePullSecrets:
        - name: regcred
      containers:
        - name: server
          image: server
          ports:
            - name: http-port
              containerPort: 3000
          resources:
            limits:
              memory: 128Mi
            requests:
              memory: 36Mi
          envFrom:
            - secretRef:
                name: db-env
            - secretRef:
                name: oauth-env
          startupProbe:
            httpGet:
              port: http
              path: /
            initialDelaySeconds: 1
            periodSeconds: 1
            failureThreshold: 10
          livenessProbe:
            httpGet:
              port: http
              path: /
            periodSeconds: 15
---
apiVersion: v1
kind: Service
metadata:
  name: app-service
spec:
  selector:
    pod: 338f54d2-8f89-4602-a848-efcbcb63233f
  ports:
    - port: 80
      targetPort: http-port
When I try that, I can't connect to my site. When I change targetPort: http-port back to targetPort: 3000, it works fine. I thought the point of naming my port was so that I could use the name in targetPort. Does it not work with Deployments?
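For comparison, named target ports do resolve against the container port names in the pod template; a minimal sketch with hypothetical names (demo, web) where the Service refers to the container port by name:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
        - name: demo
          image: nginx
          ports:
            - name: web           # the name the Service's targetPort refers to
              containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: demo-service
spec:
  selector:
    app: demo
  ports:
    - port: 80
      targetPort: web             # must match a containerPort name in the pod spec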

Why does GKE show backend instances as not healthy?

Following this blog post, I created a GKE Kubernetes cluster.
Then I deployed Keycloak instances, and if I use a load balancer:
apiVersion: v1
kind: Service
metadata:
  labels:
    app: load-balancer
  name: load-balancer
  namespace: keycloak
spec:
  type: LoadBalancer
  ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
      name: keycloak
  selector:
    app: keycloak
I can reach Keycloak.
After that, following the Google documentation, I created an Ingress for accessing Keycloak over HTTPS:
managed-certificate.yaml
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: managed-cert
  namespace: keycloak
spec:
  domains:
    - mydomain.com
    - www.mydomain.com
keycloak-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: keycloak-service
  namespace: keycloak
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: keycloak
  type: NodePort
  ports:
    - protocol: TCP
      port: 443
      targetPort: 8443
  externalTrafficPolicy: Cluster
ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: managed-cert-ingress
  namespace: keycloak
  annotations:
    kubernetes.io/ingress.global-static-ip-name: keycloak
    networking.gke.io/managed-certificates: managed-cert
    kubernetes.io/ingress.class: "gce"
spec:
  defaultBackend:
    service:
      name: keycloak-service
      port:
        number: 443
I added liveness and readiness probes to the Keycloak deployment definition too.
But with this configuration GKE says that the backend instances are unhealthy, even though they are healthy and running.
I've read in some related questions on Stack Overflow that this is an issue with the NEG. Should I add firewall rules for the NEG and the Ingress?
If that is the problem, what should the rules be?
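For reference, GKE health checks come from Google's documented ranges 130.211.0.0/22 and 35.191.0.0/16, so a rule along these lines would cover them (NETWORK and NODE_TAG are placeholders; 8443 matches the targetPort above):

# Allow Google's health-check ranges to reach the Keycloak serving port on the nodes.
gcloud compute firewall-rules create allow-gke-health-checks \
  --network=NETWORK \
  --direction=INGRESS \
  --allow=tcp:8443 \
  --source-ranges=130.211.0.0/22,35.191.0.0/16 \
  --target-tags=NODE_TAG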
EDIT: keycloak-deployment.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: keycloak-deployment
  namespace: keycloak
  labels:
    app: keycloak
spec:
  replicas: 2
  selector:
    matchLabels:
      app: keycloak
  template:
    metadata:
      labels:
        app: keycloak
    spec:
      containers:
        - name: keycloak
          image: quay.io/keycloak/keycloak:latest
          env:
            - name: DB_VENDOR
              value: "POSTGRES"
            - name: DB_ADDR
              value: "postgres"
            - name: DB_DATABASE
              value: "keycloak"
            - name: DB_USER
              value: "keycloak"
            - name: DB_SCHEMA
              value: "public"
            - name: DB_PASSWORD
              value: "password"
            - name: KEYCLOAK_USER
              value: "admin"
            - name: KEYCLOAK_PASSWORD
              value: "password"
            - name: KEYCLOAK_STATISTICS
              value: all
            - name: JDBC_PARAMS
              value: "useSSL=false"
            - name: JGROUPS_DISCOVERY_PROTOCOL
              value: "JDBC_PING"
            - name: JGROUPS_DISCOVERY_PROPERTIES
              value: datasource_jndi_name=java:jboss/datasources/KeycloakDS,info_writer_sleep_time=500,initialize_sql="CREATE TABLE IF NOT EXISTS JGROUPSPING ( own_addr varchar(200) NOT NULL, cluster_name varchar(200) NOT NULL, created timestamp default current_timestamp, ping_data BYTEA, constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name))"
          resources:
            limits:
              memory: 512Mi
              cpu: "1"
            requests:
              memory: 256Mi
              cpu: "0.2"
          startupProbe:
            httpGet:
              path: /health
              port: 9990
            initialDelaySeconds: 120
            failureThreshold: 30
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 9990
            initialDelaySeconds: 0
            periodSeconds: 10
            timeoutSeconds: 1
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /health
              port: 9990
            successThreshold: 3

Kubernetes Dashboard Ingress returning empty response from server

I am trying to set up the Kubernetes dashboard. I have enabled the custom SSL certs from my domain, and I can curl the pod directly with no issues; I can also curl the service and it works with no issues. However, when I try to access it via the ingress I get a (52) empty reply from server. I have an NLB forwarding to the port of the nginx controller service (the ingress works fine with another app). Here is my ingress config:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
  labels:
    app: dashboard
    name: dashboard-ingress
  name: dashboard-ingress
  namespace: kubernetes-dashboard
spec:
  rules:
    - host: k8sdash.domain.com
      http:
        paths:
          - backend:
              serviceName: kubernetes-dashboard
              servicePort: 443
            path: /
Here is the Daemonset config for my ingress controllers.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "3"
  creationTimestamp: "2020-05-19T15:48:13Z"
  generation: 3
  labels:
    app: lb
    app.kubernetes.io/component: controller
    chart: nginx-ingress-1.36.3
    heritage: Tiller
    release: lb
  name: lb-controller
  namespace: kube-system
  resourceVersion: "747622"
  selfLink: /apis/apps/v1/namespaces/kube-system/daemonsets/lb-controller
  uid: 19d830ba-f2d9-4c6f-bc8d-d64667a900c7
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: lb
      release: lb
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: lb
        app.kubernetes.io/component: controller
        component: controller
        release: lb
    spec:
      containers:
        - args:
            - /nginx-ingress-controller
            - --default-backend-service=kube-system/lb-default-backend
            - --publish-service=kube-system/lb-controller
            - --election-id=ingress-controller-leader
            - --ingress-class=nginx
            - --configmap=kube-system/lb-controller
          env:
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: POD_NAMESPACE
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
          image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          name: lb-controller
          ports:
            - containerPort: 80
              hostPort: 80
              name: http
              protocol: TCP
            - containerPort: 443
              hostPort: 443
              name: https
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 10254
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources: {}
          securityContext:
            allowPrivilegeEscalation: true
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - ALL
            runAsUser: 101
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      hostNetwork: true
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: lb
      serviceAccountName: lb
      terminationGracePeriodSeconds: 60
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
status:
  currentNumberScheduled: 3
  desiredNumberScheduled: 3
  numberAvailable: 3
  numberMisscheduled: 0
  numberReady: 3
  observedGeneration: 3

Expose a Redis cluster (with a Kubernetes StatefulSet) to the internet

I created a StatefulSet that deploys a Redis image to GCP on Kubernetes. The challenge I am having is exposing it using a single domain name, such that the pods can be accessed as redis.com/first, redis.com/second, redis.com/third.
Here are the YAML files:
Statefulset
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: app-redis
spec:
  selector:
    matchLabels:
      app: apprenticeship-redis
  serviceName: 'redis-service'
  replicas: 3
  template:
    metadata:
      labels:
        app: app-redis
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: app-redis
          image: redis
          args:
            - /etc/redis/redis.conf
          volumeMounts:
            - mountPath: /etc/redis
              name: redis-config
              readOnly: false
            - name: redis-storage
              mountPath: /data
              readOnly: false
          resources:
            requests:
              cpu: 50m
              memory: 128Mi
            limits:
              cpu: 150m
              memory: 256Mi
          ports:
            - containerPort: 6379
              name: redis
          livenessProbe:
            exec:
              command: ['redis-cli', 'ping']
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 2
      volumes:
        - name: redis-config
          configMap:
            name: redis-config
  volumeClaimTemplates:
    - metadata:
        name: redis-storage
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
Headless service
apiVersion: v1
kind: Service
metadata:
  labels:
    app: app-redis
  name: redis-service
  namespace: default
spec:
  ports:
    - name: server-port
      port: 80
      protocol: TCP
      targetPort: 6379
  clusterIP: None
  selector:
    statefulset.kubernetes.io/pod-name: app-redis-0
Loadbalancer
apiVersion: v1
kind: Service
metadata:
  labels:
    app: redis-service
  name: app-redis
spec:
  externalTrafficPolicy: Local
  ports:
    - port: 80
      protocol: TCP
      targetPort: 6379
  selector:
    app: app-redis
  type: LoadBalancer
  loadBalancerIP: xx.xx.xx.xxx
status:
  loadBalancer:
    ingress:
      - ip: xx.xx.xx.xxx
Config map
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: default
data:
  redis.conf: |
    dbfilename "dump.rdb"
    dir /data
    save 3600 1
    save 300 10
    save 60 100
    appendonly yes
    appendfilename "appendonly.aof"
Storage class
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: redis-storage
provisioner: kubernetes.io/gce-pd
Ingress
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: redis-ingress
  annotations:
    kubernetes.io/ingress.class: 'nginx'
    nginx.ingress.kubernetes.io/force-ssl-redirect: 'false'
spec:
  rules:
    - host: app-redis.tk
      http:
        paths:
          - path: /
            backend:
              serviceName: app-redis
              servicePort: 80
Each pod in the StatefulSet will need to have a Service pointing to it.
Each of these Services will need to be created with a selector of the form:
selector:
  statefulset.kubernetes.io/pod-name: <POD_NAME>
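Spelled out for the first pod, such a per-pod Service might look like this minimal sketch (the name redis-service-0 matches the Ingress paths below):

apiVersion: v1
kind: Service
metadata:
  name: redis-service-0
spec:
  selector:
    statefulset.kubernetes.io/pod-name: app-redis-0   # label added automatically by the StatefulSet controller
  ports:
    - port: 6379
      targetPort: 6379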
Then you will be able to create an Ingress and use it to route traffic based on the path:
...
spec:
  rules:
    - http:
        paths:
          - path: /app-redis-0
            backend:
              serviceName: redis-service-0
              servicePort: 6379
          - path: /app-redis-1
            backend:
              serviceName: redis-service-1
              servicePort: 6379
          - path: /app-redis-2
            backend:
              serviceName: redis-service-2
              servicePort: 6379
...
You can read about Exposing StatefulSets in Kubernetes and Kubernetes NodePort vs LoadBalancer vs Ingress? When should I use what?

Unhealthy load balancer on GCE

I have a couple of services and their load balancers work fine. Now I keep facing an issue with a service that runs fine, but when a load balancer is applied I cannot get it to work, because the service seems to be unhealthy, and I cannot figure out why. How can I get that service healthy?
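One way to see which backend the GCE ingress controller considers unhealthy is to describe the Ingress; its ingress.kubernetes.io/backends annotation lists each backend as HEALTHY or UNHEALTHY once the load balancer has been programmed (resource names here match the manifests below):

# Check the backend health states recorded on the Ingress.
kubectl describe ingress api -n production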
Here are my k8s YAML files.
Deployment:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: api-production
spec:
  replicas: 1
  template:
    metadata:
      name: api
      labels:
        app: api
        role: backend
        env: production
    spec:
      containers:
        - name: api
          image: eu.gcr.io/foobar/api:1.0.0
          livenessProbe:
            httpGet:
              path: /readinez
              port: 8080
            initialDelaySeconds: 45
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
          env:
            - name: ENVIRONMENT
              value: "production"
            - name: GIN_MODE
              value: "release"
          resources:
            limits:
              memory: "500Mi"
              cpu: "100m"
          imagePullPolicy: Always
          ports:
            - name: api
              containerPort: 8080
Service.yaml
kind: Service
apiVersion: v1
metadata:
  name: api
spec:
  selector:
    app: api
    role: backend
  type: NodePort
  ports:
    - name: http
      port: 8080
    - name: external
      port: 80
      targetPort: 80
Ingress.yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: api
  namespace: production
  annotations:
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
    - hosts:
        - foo.bar.io
      secretName: api-tls
  rules:
    - host: foo.bar.io
      http:
        paths:
          - path: /*
            backend:
              serviceName: api
              servicePort: 80
The problem was solved by configuring the ports correctly: the container, Service, and LB (obviously) need to be aligned. I also added initialDelaySeconds to the readiness probe.
LB:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: api
  namespace: production
  annotations:
    # kubernetes.io/ingress.allow-http: "false"
    kubernetes.io/tls-acme: "true"
    kubernetes.io/ingress.class: "gce"
spec:
  tls:
    - hosts:
        - api.foo.io
      secretName: api-tls
  rules:
    - host: api.foo.io
      http:
        paths:
          - path: /*
            backend:
              serviceName: api
              servicePort: 8080
Service:
kind: Service
apiVersion: v1
metadata:
  name: api
spec:
  selector:
    app: api
    role: backend
  type: NodePort
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      name: http
Deployment:
kind: Deployment
apiVersion: extensions/v1beta1
metadata:
  name: api-production
spec:
  replicas: 1
  template:
    metadata:
      name: api
      labels:
        app: api
        role: backend
        env: production
    spec:
      containers:
        - name: api
          image: eu.gcr.io/foobarbar/api:1.0.0
          livenessProbe:
            httpGet:
              path: /readinez
              port: 8080
            initialDelaySeconds: 45
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 45
          env:
            - name: ENVIRONMENT
              value: "production"
            - name: GIN_MODE
              value: "release"
          resources:
            limits:
              memory: "500Mi"
              cpu: "100m"
          imagePullPolicy: Always
          ports:
            - containerPort: 8080