Kubernetes endpoints empty, can I restart the pods?

I have a situation where one of my services has zero available endpoints. To test this, I crafted a YAML descriptor that uses a simple Node.js server to set and retrieve the ready/live status of a pod:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nodejs-deployment
  labels:
    app: nodejs
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nodejs
  template:
    metadata:
      labels:
        app: nodejs
    spec:
      containers:
      - name: nodejs
        image: nodejs_server
        ports:
        - containerPort: 8080
        livenessProbe:
          httpGet:
            path: /is_alive
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 3
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /is_ready
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 3
          periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-service
  labels:
    app: nodejs
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
  selector:
    app: nodejs
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nodejs-ingress
spec:
  backend:
    serviceName: nodejs-service
    servicePort: 80
The Node.js server has methods to set and retrieve the liveness and readiness statuses.
When the app starts I can see that 3 replicas are created and all of them are ready. Then I manually set their readiness status to false [from outside, through the ingress]. One pod is correctly removed from the endpoints, so no traffic is routed to it [that's OK, as this is the expected behavior]. When I set the readiness status to false for all pods, the endpoints list is empty [still the expected behavior].
At that point I cannot set ready=true from outside through the ingress, because traffic is no longer routed to any pod. Is there a way, for example, to trigger a restart of a pod when readiness is not achieved after n tries or n seconds? Or when the endpoints list is empty?

Well, that is perfectly normal and expected behaviour. What you can do, on the side, is to forward traffic from localhost to a particular pod with kubectl port-forward. That way you can access the pod directly, without ingresses etc., and set its readiness back to ok. If you want the pod to restart when it is not ready for too long, just use the same endpoint for the liveness probe, but only trigger it after more failed attempts.
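For example, combining both suggestions could look roughly like this; the pod name and the /set_ready path are made-up placeholders, while failureThreshold is a standard probe field:
# forward a local port straight to one pod, bypassing the service and ingress
kubectl port-forward pod/nodejs-deployment-xxxxx 8080:8080
# in another terminal, flip readiness back on (hypothetical endpoint name)
curl http://localhost:8080/set_ready

# liveness probe reusing the readiness endpoint, but restarting the
# container only after several consecutive failures:
livenessProbe:
  httpGet:
    path: /is_ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 6   # roughly 60s of not-ready before a restart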

Related

Kubernetes service routes traffic to only one of 5 pods

I'm playing around with k8s services. I have created a simple Spring Boot app that displays its version number and pod name when curling the endpoint:
curl localhost:9000/version
1.3_car-registry-deployment-66684dd8c4-r274b
Then I dockerized it, pushed it into my local Kind cluster, and deployed it with 5 replicas. Next I created a service targeting all 5 pods. Lastly, I exposed the service like so:
kubectl port-forward svc/car-registry-service 9000:9000
Now when curling my endpoint I expected to see randomly picked pod names, but instead I only get responses from a single pod. Moreover, if I kill that one pod my service stops working, i.e. I get ERR_EMPTY_RESPONSE, even though there are 4 more pods available. What am I missing? Here are my deployment and service YAMLs:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: car-registry-deployment
spec:
  replicas: 5
  selector:
    matchLabels:
      app: car-registry
  template:
    metadata:
      name: car-registry
      labels:
        app: car-registry
    spec:
      containers:
      - name: car-registry
        image: car-registry-database:v1.3
        ports:
        - containerPort: 9000
          protocol: TCP
          name: rest
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - curl http://localhost:9000/healthz | grep "OK"
          initialDelaySeconds: 15
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: car-registry-service
spec:
  type: ClusterIP
  selector:
    app: car-registry
  ports:
  - protocol: TCP
    port: 9000
    targetPort: 9000
You're using TCP, so you're probably reusing a keep-alive connection. Try hitting it from your browser or from a new tty.
Try:
curl -H "Connection: close" http://your-service:port/path
Otherwise, check the kube-proxy logs to see if there's any additional info; your initial question doesn't provide much detail.
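Before digging into kube-proxy, it may also be worth confirming that all five pods actually show up as endpoints of the service; a couple of generic checks (not from the original thread):
kubectl get endpoints car-registry-service
# expect five <podIP>:9000 entries once every readiness probe passes
kubectl get pods -l app=car-registry -o wide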

I have a Kubernetes Deployment and Service working, but my Ingress somehow only shows up talking to port 80 no matter what

I have a GKE cluster that I'm working to get going on https load balancing.
So far I have:
deployment
service (x 2 -- see below)
ingress
SSL cert -- google managed version
All of these seem to be working, but I'm getting a 502 error when connecting to the hostname via https:
Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.
While trying to trace this down I found a debugging post, but combing through it I saw that the author's ingress shows ports 80,443 ... while I can never get mine to show anything but port 80.
This is even after I split my service into two different services, one on port 443 and one on port 80, and pointed the ingress only at the 443 service; it still shows up with just port 80, and I'm still getting the 502 error.
The YAML for the deployment (requested by a commenter below):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-deployment
  labels:
    app: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: gcr.io/myapp-dev/myapp-container:latest
        ports:
        - containerPort: 8080
The YAML for the '443 service':
apiVersion: v1
kind: Service
metadata:
  name: my-service443
spec:
  type: NodePort
  selector:
    app: myapp
  ports:
  - name: https
    protocol: TCP
    port: 443
    targetPort: 8080
And the Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    # If the class annotation is not specified it defaults to "gce".
    kubernetes.io/ingress.global-static-ip-name: "kubething"
    networking.gke.io/managed-certificates: clearspring-cert
    kubernetes.io/ingress.class: "gce"
spec:
  defaultBackend:
    service:
      name: my-service443
      port:
        name: https
I don't understand (a) why the ingress is showing only port 80 and (b) why I'm still getting 502 errors.
Thanks much for any help whatsoever!
It looks like it was missing readiness and liveness probes; when I changed the deployment like this:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cleardev-deployment
  labels:
    app: clearspring
spec:
  replicas: 2
  selector:
    matchLabels:
      app: clearspring
  template:
    metadata:
      labels:
        app: clearspring
    spec:
      containers:
      - name: clearspring
        image: gcr.io/clearspring-dev/clearspring-container:latest
        ports:
        - containerPort: 8080
        readinessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:
          tcpSocket:
            port: 8080
          initialDelaySeconds: 15
          periodSeconds: 20
Then the status changed from UNHEALTHY to Unknown ... but I was still getting the 502 error.
The liveness probe did its job: the process was not listening on port 8080 on all interfaces, just on 127.0.0.1. I fixed that ... still not working, so I tried EXPOSE 8080 in the Dockerfile, and now I guess I need to look at firewall rules because the liveness/readiness probes can't connect.
Note that I had to delete and recreate the cluster to get this far ... I think. At first I just tried updating the deployment and didn't get any change from UNHEALTHY.
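If the firewall does turn out to be the blocker, GCE load balancer health checks are documented to come from 130.211.0.0/22 and 35.191.0.0/16; a sketch of a rule allowing them through (the rule name and network name here are assumptions):
# 8080 matches the containerPort above; "default" is an assumed network name
gcloud compute firewall-rules create allow-lb-health-checks \
    --network=default \
    --allow=tcp:8080 \
    --source-ranges=130.211.0.0/22,35.191.0.0/16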

Hostname of pods in same statefulset can not be resolved

I am configuring a StatefulSet deploying 2 Jira Data Center nodes. The StatefulSet results in 2 pods. Everything seems fine until the 2 pods try to connect to each other, which they do using their short hostnames, jira-0 and jira-1.
The jira-1 pod reports an UnknownHostException when connecting to jira-0: the hostname cannot be resolved.
I read about adding a headless service, which I didn't have yet. After adding it I can resolve the FQDN, but still no luck with the short name.
Then I read this page: DNS for Services and Pods and added:
dnsConfig:
  searches:
  - jira.default.svc.cluster.local
That solves my issue but I think it shouldn't be necessary to add this?
Some extra info:
Cluster on AKS with CoreDNS
Kubernetes v1.19.9
Network plugin: Kubenet
Network policy: none
My full yaml file:
apiVersion: v1
kind: Service
metadata:
  name: jira
  labels:
    app: jira
spec:
  clusterIP: None
  selector:
    app: jira
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: jira
spec:
  serviceName: jira
  replicas: 2
  selector:
    matchLabels:
      app: jira
  template:
    metadata:
      labels:
        app: jira
    spec:
      containers:
      - name: jira
        image: atlassian/jira-software:8.12.2-jdk11
        readinessProbe:
          httpGet:
            path: /jira/status
            port: 8080
          initialDelaySeconds: 120
          periodSeconds: 10
        livenessProbe:
          httpGet:
            path: /jira/
            port: 8080
          initialDelaySeconds: 600
          periodSeconds: 10
        envFrom:
        - configMapRef:
            name: jira-config
        ports:
        - containerPort: 8080
      dnsConfig:
        searches:
        - jira.default.svc.cluster.local
That solves my issue but I think it shouldn't be necessary to add this?
From the StatefulSet documentation:
StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service.
The example above will create three Pods named web-0,web-1,web-2. A StatefulSet can use a Headless Service to control the domain of its Pods.
The pod identity will be a subdomain of the governing service, e.g. in your case:
jira-0.jira.default.svc.cluster.local
jira-1.jira.default.svc.cluster.local
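To double-check name resolution from inside the cluster, something along these lines should work (assuming the image ships nslookup or a similar DNS tool):
kubectl exec jira-0 -- nslookup jira-1.jira.default.svc.cluster.local
kubectl exec jira-0 -- nslookup jira-1   # the short name only resolves once the extra search domain is in place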

Kubernetes Tomcat deployment problem with liveness and readiness

I am trying to deploy my Tomcat on Kubernetes, but when I run kubectl create -f deploy-tomcat.yaml I always get the same error:
error from server (need to declare liveness (found 0), need to declare readiness (found 0)
deploy-tomcat.yaml :
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-deployment
  labels:
    app: tomcat
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tomcat
  template:
    metadata:
      labels:
        app: tomcat
    spec:
      containers:
      - name: tomcat
        image: tomcat-image
        ports:
        - containerPort: 8080
I suggest you add a livenessProbe and a readinessProbe to your manifest (at the container level), e.g.:
readinessProbe:
  tcpSocket:
    port: 8080
livenessProbe:
  tcpSocket:
    port: 8080
Please note that it is not default K8s behavior to enforce the existence of those probes; I assume this requirement was added to your K8s cluster with a validating admission webhook.
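To check whether such a policy is installed in the cluster, listing the admission webhook configurations may help (generic commands, not from the original answer):
kubectl get validatingwebhookconfigurations
kubectl get mutatingwebhookconfigurations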

GCE Ingress not picking up health check from readiness probe

When I create a GCE ingress, Google Load Balancer does not set the health check from the readiness probe. According to the docs (Ingress GCE health checks) it should pick it up.
Expose an arbitrary URL as a readiness probe on the pods backing the Service.
Any ideas why?
Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: frontend-prod
  labels:
    app: frontend-prod
spec:
  selector:
    matchLabels:
      app: frontend-prod
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 1
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: frontend-prod
    spec:
      imagePullSecrets:
      - name: regcred
      containers:
      - image: app:latest
        readinessProbe:
          httpGet:
            path: /healthcheck
            port: 3000
          initialDelaySeconds: 15
          periodSeconds: 5
        name: frontend-prod-app
      - env:
        - name: PASSWORD_PROTECT
          value: "1"
        image: nginx:latest
        readinessProbe:
          httpGet:
            path: /health
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
        name: frontend-prod-nginx
Service:
apiVersion: v1
kind: Service
metadata:
  name: frontend-prod
  labels:
    app: frontend-prod
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    app: frontend-prod
Ingress:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: frontend-prod-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: frontend-prod-ip
spec:
  tls:
  - secretName: testsecret
  backend:
    serviceName: frontend-prod
    servicePort: 80
So apparently, you need to include the container port in the PodSpec; this does not seem to be documented anywhere. E.g.:
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
Thanks, Brian! https://github.com/kubernetes/ingress-gce/issues/241
This is now possible in the latest GKE (I am on 1.14.10-gke.27, not sure if that matters)
Define a readinessProbe on your container in your Deployment.
Recreate your Ingress.
The health check will point to the path in readinessProbe.httpGet.path of the Deployment yaml config.
Update (see Jonathan Lin's answer): This has been fixed very recently. Define a readinessProbe on the Deployment, recreate your Ingress, and it will pick up the health check path from the readinessProbe.
The GKE Ingress health check path is currently not configurable. You can go to http://console.cloud.google.com (UI) and visit the Load Balancers list to see which health check it uses.
Currently the health check for an Ingress is GET / on each backend specified on the Ingress, so all your apps behind a GKE Ingress must return HTTP 200 OK to GET / requests.
That said, the health checks you specified on your Pods are still being used, by the kubelet, to make sure your Pod is actually functioning and healthy.
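If you prefer the CLI over the console, the health checks and backend services the load balancer created can also be listed with gcloud (generic commands, not part of the original answer):
gcloud compute health-checks list
gcloud compute backend-services list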
Google has recently added support for a CRD that can configure your Backend Services along with health checks:
apiVersion: cloud.google.com/v1beta1
kind: BackendConfig
metadata:
  name: backend-config
  namespace: prod
spec:
  healthCheck:
    checkIntervalSec: 30
    port: 8080
    type: HTTP # case-sensitive
    requestPath: /healthcheck
See here.
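Note that for a BackendConfig to take effect it also has to be referenced from the Service; a minimal sketch, assuming the beta annotation key that shipped with this feature (newer GKE versions use cloud.google.com/backend-config):
metadata:
  name: frontend-prod
  annotations:
    # "default" applies the BackendConfig above to every port of this Service
    beta.cloud.google.com/backend-config: '{"default": "backend-config"}'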
Another reason why the Google Cloud Load Balancer does not pick up the GCE health check configuration from a Kubernetes Pod readiness probe could be that the service is configured as "selectorless" (the selector attribute is empty and you manage endpoints directly).
This is the case with e.g. kube-lego: see https://github.com/jetstack/kube-lego/issues/68#issuecomment-303748457 and https://github.com/jetstack/kube-lego/issues/68#issuecomment-327457982.
The original question does have a selector specified in the service, so this hint doesn't apply there; it serves visitors that have the same problem with a different cause.
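For illustration, a selectorless Service pairs a Service without a selector with a manually managed Endpoints object; this is a generic sketch with made-up names and addresses, not taken from either question:
apiVersion: v1
kind: Service
metadata:
  name: external-backend   # hypothetical name
spec:
  ports:
  - port: 80
    targetPort: 8080
---
apiVersion: v1
kind: Endpoints
metadata:
  name: external-backend   # must match the Service name
subsets:
- addresses:
  - ip: 10.0.0.42          # made-up address, maintained by hand
  ports:
  - port: 8080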