How to define Pod health check port for livenessProbe/readinessProbe - kubernetes

How do I define distinct Pod ports, one for the application and another for the health check (readinessProbe)?
Is the port specification shown below a correct way to make the readinessProbe check the health check port TCP/9090? In other words, will the readinessProbe reach port 9090 (assuming the running container has it open, of course)? Or does one need to specify some other port (nodePort, targetPort, port, or the like)?
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: myapp
        image: <image>
        ports:
        - name: myapp-port
          containerPort: 8080
          protocol: TCP
        - name: healthcheck-port
          containerPort: 9090
          protocol: TCP
        readinessProbe:
          httpGet:
            port: healthcheck-port
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          periodSeconds: 10
          successThreshold: 2
          failureThreshold: 2

Yes, your specification snippet is almost correct. You don't need to specify anything else to make the readiness probe work.
Port names cannot be longer than 15 characters, so the name healthcheck-port (16 characters) won't work. Change it to something shorter, such as healthcheck.
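As a sketch of that fix, the same container section with only the port name shortened might look like this. The name healthcheck is just one choice that fits the 15-character limit; everything else is taken from the question.

```yaml
# Container section of the Deployment from the question, with the
# health check port renamed to fit the 15-character limit.
containers:
  - name: myapp
    image: <image>
    ports:
      - name: myapp-port
        containerPort: 8080
        protocol: TCP
      - name: healthcheck          # was healthcheck-port (16 characters)
        containerPort: 9090
        protocol: TCP
    readinessProbe:
      httpGet:
        port: healthcheck          # must match the port name declared above
        scheme: HTTP
      initialDelaySeconds: 60
      timeoutSeconds: 5
      periodSeconds: 10
      successThreshold: 2
      failureThreshold: 2
```

Since no path is set, the httpGet probe requests / on that port, so the application needs to answer there.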

Your current configuration is almost correct, as mentioned by @shashank-v, except for the port name.
Apart from the name, what I would rather point out is that, as a best practice, you can use the same port as the application (TCP/8080) and expose a /healthz path where your application responds with ok or running. Then, in your httpGet:
readinessProbe:
  httpGet:
    port: 8080
    path: /healthz

You can specify any port and path (assuming it's HTTP) for livenessProbe and readinessProbe but, of course, you need to be serving something there.
It shouldn't be a Service port, so nodePort is not an option: the kubelet is in charge of checking container health, and it has direct access to the containers.

readinessProbe:
  tcpSocket:
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
Good reference:
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-tcp-liveness-probe

Related

startup probes not working with exec as expected

I have a sample webapp and Redis running in Kubernetes.
I am using probes for the basic checks, as shown below.
Now I want to make sure that Redis is up and running before the application starts.
The code snippet below is from the webapp.
When I run the command nc -zv <redis service name> 6379 it works well, but when I use it as the command in a startupProbe it gives me errors. I think the way I am passing the command is not right. Can someone help me understand what is wrong?
The error I get:
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: exec: "nc -zv redis 6379": executable file not found in $PATH: unknown
readinessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 20
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /
    port: 5000
  initialDelaySeconds: 30
  periodSeconds: 5
startupProbe:
  exec:
    command:
    - nc -zv redis 6379
  failureThreshold: 20
  periodSeconds: 5
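The exec command is an array of arguments, not a shell string, so each argument must be its own list item. Kubernetes executes the command directly, without a shell, which is why the single item "nc -zv redis 6379" was looked up as the name of one executable and not found. The code below is in the expected format.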
startupProbe:
  exec:
    command:
    - nc
    - -zv
    - redis
    - "6379"
  failureThreshold: 30
  periodSeconds: 5
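If the check needs shell features (pipes, variable expansion, &&), one common alternative is to invoke a shell explicitly and pass the whole command line as a single argument. This is only a sketch: it assumes the container image ships both sh and nc, and it reuses the service name redis and port 6379 from the question.

```yaml
startupProbe:
  exec:
    command:
      - sh                      # assumes a shell exists in the image
      - -c
      - nc -zv redis 6379       # the whole line is one argument, parsed by sh
  failureThreshold: 30
  periodSeconds: 5
```

Here the array still has one item per argument; it is the shell, not Kubernetes, that splits the final string into words.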

Docker Compose health check of HTTP API using tools outside the container

I am implementing a Docker Compose health check for the Prysm Docker container. Prysm is an Ethereum 2 node.
My goal is to ensure that the RPC APIs (gRPC, JSON-RPC) of Prysm are up before starting other services in the same Docker Compose file, as those services depend on Prysm. I can use depends_on in the Docker Compose file for this, but I need to figure out how to construct a check that verifies the Prysm HTTP ports are ready to accept traffic.
The equivalent Kubernetes health check is:
readinessProbe:
  initialDelaySeconds: 180
  timeoutSeconds: 1
  periodSeconds: 60
  failureThreshold: 3
  successThreshold: 1
  httpGet:
    path: /healthz
    port: 9090
    scheme: HTTP
livenessProbe:
  initialDelaySeconds: 60
  timeoutSeconds: 1
  periodSeconds: 60
  failureThreshold: 60
  successThreshold: 1
  httpGet:
    path: /healthz
    port: 9090
    scheme: HTTP
The problem with the Prysm image is that it lacks the usual UNIX tools (curl, netcat, /bin/sh) that one normally uses to create such checks.
Is there a way to implement an HTTP health check with Docker Compose that uses built-in Compose features (are there any?) or commands from the host system instead of ones inside the container?
I managed to accomplish this by creating another service that uses the Dockerize image.
version: '3'
services:
  # Oracle connects to ETH1 and ETH2 nodes
  # oracle:
  stakewise:
    container_name: stakewise-oracle
    image: stakewiselabs/oracle:v1.0.1
    # Do not start oracle service until beacon health check succeeds
    depends_on:
      beacon_ready:
        condition: service_healthy
  # ETH2 Prysm node
  beacon:
    container_name: eth2-beacon
    image: gcr.io/prysmaticlabs/prysm/beacon-chain:latest
    restart: always
    hostname: beacon-chain
  # An external startup check tool for Prysm
  # Using https://github.com/jwilder/dockerize
  # Simply wait that TCP port of RPC becomes available before
  # starting the Oracle to avoid errors on the startup.
  beacon_ready:
    image: jwilder/dockerize
    container_name: eth2-beacon-ready
    command: "/bin/sh -c 'while true ; do dockerize -wait tcp://beacon-chain:3500 -timeout 300s ; sleep 99 ; done'"
    depends_on:
      - beacon
    healthcheck:
      test: ["CMD", "dockerize", "-wait", "tcp://beacon-chain:3500"]
      interval: 1s
      retries: 999

How to verify certificates for Liveness probe configured to https?

I'm using a readinessProbe on my container and configured it to work over HTTPS with the scheme attribute.
My server expects to receive certificates. How can I configure the readiness probe to support HTTPS with a certificate exchange? I don't want it to skip the certificates.
readinessProbe:
  httpGet:
    path: /eh/heartbeat
    port: 2347
    scheme: HTTPS
  initialDelaySeconds: 210
  periodSeconds: 10
  timeoutSeconds: 5
You can use an exec (command) readiness probe instead of an HTTP request. This gives you complete control over the check, including the certificate exchange.
So, instead of:
readinessProbe:
  httpGet:
    path: /eh/heartbeat
    port: 2347
    scheme: HTTPS
you would have something like:
readinessProbe:
  exec:
    command:
    - python
    - your_script.py
Be sure the script exits with 0 if all is well, and with a non-zero value on failure.
(python your_script.py is, of course, just one example. You know best which approach suits you.)
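As one hypothetical shape for such a command, the probe could call curl with a client certificate. This sketch assumes that curl is available in the container image, that the certificate, key, and CA bundle are mounted at the made-up paths under /etc/probe-certs (for example from a Secret), and that the server certificate is valid for the host name used in the URL. The path, port, and timing values come from the question.

```yaml
readinessProbe:
  exec:
    command:
      - curl
      - --fail                         # exit non-zero on HTTP error responses
      - --silent
      - --cert
      - /etc/probe-certs/tls.crt       # hypothetical client certificate path
      - --key
      - /etc/probe-certs/tls.key       # hypothetical client key path
      - --cacert
      - /etc/probe-certs/ca.crt        # hypothetical CA used to verify the server
      - https://localhost:2347/eh/heartbeat
  initialDelaySeconds: 210
  periodSeconds: 10
  timeoutSeconds: 5
```

Because curl verifies the server certificate against the given CA and exits non-zero when the handshake or the request fails, the probe fails in those cases instead of skipping the check.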

Increase startup threshold for k8s container in v1.12

Following the documentation here, I could set the threshold for container startup like so:
startupProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10
Unfortunately, it seems like startupProbe.failureThreshold is not compatible with our current k8s version (1.12.1):
unknown field "startupProbe" in io.k8s.api.core.v1.Container; if you choose to ignore these errors, turn validation off with --validate=false
Is there a workaround for this? I'd like to give a container a chance of ~40+ minutes to start.
Yes, startupProbe was introduced in Kubernetes 1.16 (as an alpha feature), so you cannot use it with Kubernetes 1.12.
I am guessing you are defining a livenessProbe, so the easiest way around your problem is to remove the livenessProbe. Most applications won't need one (some won't even need a readinessProbe). See also this excellent article: Liveness Probes are Dangerous.
If you have a probe, you could specify initialDelaySeconds and set it to a value large enough for your container to start up.
If you don't care about probes at all, you could just let the probe execute a command that will never fail, e.g. whoami.
Take what you need from the example below:
readinessProbe:
  exec:
    command:
    - whoami
  initialDelaySeconds: 2400
  periodSeconds: 5
You could do the same config for livenessProbe if you require one.
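For the roughly 40 minutes mentioned in the question, the arithmetic is 40 × 60 s = 2400 s. A sketch of the same idea applied to an HTTP livenessProbe, reusing the /healthz path and liveness-port name from the question, could look like this. The periodSeconds and failureThreshold values are illustrative and not from the original.

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  initialDelaySeconds: 2400   # 40 min * 60 s before the first probe runs
  periodSeconds: 10           # illustrative value
  failureThreshold: 3         # illustrative value: restart after 3 consecutive failures
```

The trade-off is that a container that hangs during startup is not restarted until the initial delay has elapsed.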
I know this is not an answer to this question, but it may be useful...
startupProbe is available from k8s 1.16 onward.
If you are using Helm, you can wrap your startupProbe block with this in your template:
{{- if (semverCompare ">=1.16-0" .Capabilities.KubeVersion.GitVersion) }}
startupProbe:
  httpGet:
    path: /healthz
    port: liveness-port
  failureThreshold: 30
  periodSeconds: 10
{{- end }}

What would be the opa policy in .rego for the following examples?

I am new to OPA and k8s and don't have much knowledge or experience in this field. I would like to have policies written in Rego (OPA policies) and run them to see the result.
The examples are:
Always Pull Images - Ensure every container sets its ‘imagePullPolicy’ to ‘Always’
Check for Liveness Probe - Ensure every container sets a livenessProbe
Check for Readiness Probe - Ensure every container sets a readinessProbe
I would like to have an OPA policy for the following:
1. Always Pull Images:
apiVersion: v1
kind: Pod
metadata:
  name: test-image-pull-policy
spec:
  containers:
  - name: nginx
    image: nginx:1.13
    imagePullPolicy: IfNotPresent
2. Check for Liveness Probe
3. Check for Readiness Probe
containers:
- name: opa
  image: openpolicyagent/opa:latest
  ports:
  - name: http
    containerPort: 8181
  args:
  - "run"
  - "--ignore=.*" # exclude hidden dirs created by Kubernetes
  - "--server"
  - "/policies"
  volumeMounts:
  - readOnly: true
    mountPath: /policies
    name: example-policy
  livenessProbe:
    httpGet:
      scheme: HTTP # assumes OPA listens on localhost:8181
      port: 8181
    initialDelaySeconds: 5 # tune these periods for your environment
    periodSeconds: 5
  readinessProbe:
    httpGet:
      path: /health?bundle=true # Include bundle activation in readiness
      scheme: HTTP
      port: 8181
    initialDelaySeconds: 5
    periodSeconds: 5
Is there any way to create OPA policies for the above conditions? Could anyone help, as I am new to OPA? Thanks in advance.
For the liveness and readiness probe checks, you can simply test whether those fields are defined:
package kubernetes.admission

deny["container is missing livenessProbe"] {
    container := input_container[_]
    not container.livenessProbe
}

deny["container is missing readinessProbe"] {
    container := input_container[_]
    not container.readinessProbe
}

input_container[container] {
    container := input.request.object.spec.containers[_]
}

# Always Pull Images
package kubernetes.admission

deny[msg] {
    input.request.kind.kind = "Pod"
    container = input.request.object.spec.containers[_]
    container.imagePullPolicy != "Always"
    msg = sprintf("Forbidden imagePullPolicy value \"%v\"", [container.imagePullPolicy])
}
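To try the rules above, a Pod like the following should produce no deny messages, assuming the policies are evaluated against an AdmissionReview whose request.object is this Pod. The Pod name and the probe settings are made up for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-compliant-pod          # hypothetical name
spec:
  containers:
    - name: nginx
      image: nginx:1.13
      imagePullPolicy: Always       # satisfies the Always Pull Images rule
      livenessProbe:                # present, so the livenessProbe rule does not fire
        httpGet:
          path: /
          port: 80
      readinessProbe:               # present, so the readinessProbe rule does not fire
        httpGet:
          path: /
          port: 80
```

Changing imagePullPolicy back to IfNotPresent, as in the Pod from the question, should trigger the message Forbidden imagePullPolicy value "IfNotPresent".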