Monitor and take action based on pod log event - kubernetes

I have deployed PagerBot https://github.com/stripe-contrib/pagerbot to our internal k8s cluster as a learning opportunity. I had fun writing a helm chart for it!
The bot appears to disconnect from Slack at an unknown time and never reconnects. When I kill the pod, the deployment recreates it and it connects again (we are using the Slack RTM option).
The pod logs the following entry when it disconnects:
2018-02-24 02:31:14.382590 I [9:34765020] PagerBot::SlackRTMAdapter -- Closed connection to chat. --
I want to learn a method of monitoring for this log entry and taking action. Initially I thought a Liveness probe would be the way to go using a command that returns non-zero when this entry is logged. But the logs aren't stored internally to the container (that I can see).
How do you monitor and take action based on logs that can be seen using kubectl logs pod-name?
Can I achieve this in our Prometheus test deployment? Should I be using a known k8s feature?

I would argue the best course of action is to extend PagerBot to surface more than just the string literal pong in its /ping endpoint and then use that endpoint as its livenessProbe. A close second would be to teach it to simply reconnect, as that's almost certainly cheaper than tearing down the Pod.
Having said that, one approach you may consider is a sidecar container that uses the Pod's service account credentials to monitor its sibling container (something along the lines of if kubectl logs -f -c pagerbot $my_pod_name | grep "Closed connection to chat"; then kill -9 $pagerbot_pid; fi). That is a little awkward, but I can't immediately think of why it wouldn't work.
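The sidecar idea can be sketched in shell. This is a minimal sketch, not a tested PagerBot integration: wait_for_marker is a hypothetical helper name, and the wiring in the comments assumes the sidecar image ships kubectl, that POD_NAME is injected into the container (for example through the Downward API), and that the service account is allowed to read pod logs and delete pods. It deletes the Pod instead of killing a PID, because the sibling's PID is only visible from the sidecar if the Pod shares a process namespace.

```shell
#!/usr/bin/env bash
# wait_for_marker MARKER
# Reads lines from stdin and returns 0 as soon as one contains MARKER.
# Returns 1 if the stream ends without the marker appearing.
wait_for_marker() {
  local marker="$1" line
  while IFS= read -r line; do
    case "$line" in
      *"$marker"*) return 0 ;;
    esac
  done
  return 1
}

# Hypothetical wiring inside the sidecar (names are placeholders):
#   kubectl logs -f -c pagerbot "$POD_NAME" \
#     | wait_for_marker "Closed connection to chat" \
#     && kubectl delete pod "$POD_NAME"
```

The function itself has no Kubernetes dependency, so it can be tried locally by piping any text into it.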

I ended up landing on a liveness probe to solve my problem. I've added the following to the PagerBot deployment:
livenessProbe:
  exec:
    command:
    - bash
    - -c
    - "ss -an | grep -q 'EST.*:443 *$'"
  initialDelaySeconds: 120
  periodSeconds: 60
This basically tests whether a connection is established on port 443, which we noticed goes away when the bot disconnects.
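To see what that grep actually matches, here is a small sketch that runs the same pattern against sample input. The has_established_443 name and the sample lines are made up for illustration; real ss -an output on your image may differ in column layout.

```shell
#!/usr/bin/env bash
# Succeeds when stdin (expected: output of `ss -an`) contains an
# established connection whose last column, the peer address, ends in :443.
# "EST" matches the ESTAB state; " *$" tolerates trailing spaces.
has_established_443() {
  grep -q 'EST.*:443 *$'
}

# Illustrative input only (hypothetical addresses):
sample='tcp   ESTAB   0   0   10.0.0.5:41234   203.0.113.7:443
tcp   LISTEN  0   128 0.0.0.0:8080     0.0.0.0:*'

if printf '%s\n' "$sample" | has_established_443; then
  echo "connected"
else
  echo "disconnected"
fi
```

With the sample above this prints connected. Keep in mind that the probe passes for any established connection to a peer on port 443, not only the one to Slack.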

Related

EKS: kubectl exec does not respect streamingConnectionIdleTimeout

Using EKS with Kubernetes 1.21, managed nodegroups in a private subnet. I'm trying to set the cluster up so that kubectl exec times out after inactivity regardless of the workload being execed into, and without any client configuration.
I'm aware of https://github.com/containerd/containerd/issues/5563, except we're on 1.21 with Docker runtime, not containerd yet.
I set streamingConnectionIdleTimeout: 3600s on the kubelet in the launch template:
jq '.streamingConnectionIdleTimeout = "3600s"' /etc/kubernetes/kubelet/kubelet-config.json > /tmp/kubelet-config.json && mv /tmp/kubelet-config.json /etc/kubernetes/kubelet/kubelet-config.json
/etc/eks/bootstrap.sh {{CLUSTER_NAME}}
And confirmed with curl -sSL "http://localhost:8001/api/v1/nodes/(node name)/proxy/configz".
However, kubectl exec still does not time out.
I confirmed /proc/sys/net/ipv4/tcp_keepalive_time = 7200 on both the client and the node, so we should be hitting the streaming connection idle timeout before Linux starts sending keepalive probes.
Reading through How kubectl exec Works, it seems possible that the EKS managed control plane is keeping the connection alive. There are people online who have the opposite problem - their connection times out regardless of streamingConnectionIdleTimeout - and they solve it by adjusting the timeout on the load balancer in front of their k8s API server. However, there are no knobs (that I know of) to tweak in that regard on the EKS managed control plane.
I would appreciate any input on this topic.

GCP Alerting Policy for failed GKE CronJob

What would be the best way to set up a GCP monitoring alert policy for a Kubernetes CronJob failing? I haven't been able to find any good examples out there.
Right now, I have an OK solution based on monitoring logs in the Pod with ERROR severity. I've found this to be quite flaky, however. Sometimes a job will fail for some ephemeral reason outside my control (e.g., an external server returning a temporary 500) and on the next retry, the job runs successfully.
What I really need is an alert that is only triggered when a CronJob is in a persistent failed state. That is, Kubernetes has tried rerunning the whole thing, multiple times, and it's still failing. Ideally, it could also handle situations where the Pod wasn't able to come up either (e.g., downloading the image failed).
Any ideas here?
Thanks.
First of all, confirm the GKE version that you are running. The following commands help you identify GKE's default version and the available versions:
Default version.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.defaultVersion)"
Available versions.
gcloud container get-server-config --flatten="channels" --filter="channels.channel=RAPID" \
--format="yaml(channels.channel,channels.validVersions)"
Now that you know your GKE version: since what you want is an alert that only triggers when a CronJob is in a persistent failed state, note that GKE Workload Metrics used to be GCP's fully managed and highly configurable solution for sending all Prometheus-compatible metrics emitted by GKE workloads (such as a CronJob or a Deployment) to Cloud Monitoring. It is deprecated as of GKE 1.24 and has been replaced by Google Cloud Managed Service for Prometheus, which is now the best option inside GCP: it lets you monitor and alert on your workloads using Prometheus without having to manually manage and operate Prometheus at scale.
Plus, you have two options outside of GCP: Prometheus itself and Ranch’s Prometheus Push Gateway.
Finally, just FYI, it can also be done manually with bash by querying the job, reading its start time, and comparing that to the current time:
START_TIME=$(kubectl -n=your-namespace get job your-job-name -o json | jq '.status.startTime')
echo $START_TIME
Or, you are able to get the job’s current status as a JSON blob, as follows:
kubectl -n=your-namespace get job your-job-name -o json | jq '.status'
You can see the following thread for more reference too.
Taking the “Failed” state as the core of your requirement, a bash script that uses kubectl and sends an email when a job is in the “Failed” state can be useful. Here are some examples:
while true; do if kubectl get jobs myjob -o jsonpath='{.status.conditions[?(@.type=="Failed")].status}' | grep -q True; then mail -s jobfailed email@address; else sleep 1; fi; done
For newer K8s:
while true; do kubectl wait --for=condition=failed job/myjob && mail -s jobfailed email@address; done
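Since the question asks for an alert only on persistent failure, a polling loop like this could be gated on consecutive failures instead of a single one. A minimal sketch, assuming you collect the outcome of each recent Job run (oldest first) as the words Failed or Complete; the persistent_failure helper and the threshold of 3 are illustrative and not part of kubectl or GKE.

```shell
#!/usr/bin/env bash
# persistent_failure THRESHOLD OUTCOME...
# Succeeds when the most recent THRESHOLD outcomes are all "Failed".
# Outcomes are passed oldest first; any other value resets the streak.
persistent_failure() {
  local threshold="$1" streak=0 outcome
  shift
  for outcome in "$@"; do
    if [ "$outcome" = "Failed" ]; then
      streak=$((streak + 1))
    else
      streak=0
    fi
  done
  [ "$streak" -ge "$threshold" ]
}

# A transient failure that is followed by a success does not alert:
persistent_failure 3 Failed Complete Failed && echo "alert" || echo "ok"
# Three failures in a row do:
persistent_failure 3 Complete Failed Failed Failed && echo "alert" || echo "ok"
```

The first call prints ok and the second prints alert. How the outcomes are gathered (for example from the conditions of the Jobs a CronJob created) is left open here.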

use of kubectl log in readiness probe

I have a server which is running inside of a kubernetes pod.
Its log output can be retrieved using "kubectl logs".
The application goes through some start up before it is ready to process incoming messages.
It indicates its readiness through a log message.
The "kubectl logs" command is not available from within the pod. I think it would be insecure to even try to install it.
Is there a way of either:
getting the log from within the container? or
running a readiness probe that is executed outside of the container?
(rather than as a docker exec)
Here are some options I've considered:
Redirecting the output to a log file loses it from "kubectl logs".
Teeing it to a log file avoids that limitation but creates an unnecessary duplicate log.
stdout and stderr of the application are anonymous pipes (to kubernetes) so eavesdropping on /proc/1/fd/1 or /proc/1/fd/2 will not work.
A better option may be to use the http API. For example this question
kubectl proxy --port=8080
And from within the container:
curl -XGET http://127.0.0.1:8080/api
However I get an error:
Starting to serve on 127.0.0.1:8080
I0121 17:05:38.928590 49105 log.go:172] http: Accept error: accept tcp 127.0.0.1:8080: accept4: too many open files; retrying in 5ms
2020/01/21 17:05:38 http: proxy error: dial tcp 127.0.0.1:8080: socket: too many open files
Does anyone have a solution or a better idea?
You can actually do what you want. Create a Kubernetes ServiceAccount object with permissions to do just what you need, use that account for your health check pod, and run kubectl logs as you described. You do install kubectl, but you limit the permissions available to it.
However, there's a reason you don't find examples of that: it's not a great way of building this. Is there really no way to add a health check endpoint to your app? That would be much more convenient for your purposes.
Finally, if the answer to that really is "NO", could you have your app write a ready file? Instead of print "READY", do touch /app/readyfile. Then your health check can just check whether that file exists. (To make this work, you would have to create a volume and mount it at /app in both your app container and the health check container so they can both see the generated file.)
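The ready-file check itself is tiny. Here is a sketch of what the probe command would boil down to; is_ready is a hypothetical helper name, and the temporary directory only stands in for the /app mount so the snippet can be run anywhere.

```shell
#!/usr/bin/env bash
# Succeeds once the application has created its ready file.
# The default path mirrors the /app/readyfile example.
is_ready() {
  [ -f "${1:-/app/readyfile}" ]
}

demo_dir=$(mktemp -d)                      # stands in for the /app volume
is_ready "$demo_dir/readyfile" || echo "not ready yet"
touch "$demo_dir/readyfile"                # what the app does instead of printing READY
is_ready "$demo_dir/readyfile" && echo "ready"
rm -rf "$demo_dir"
```

This prints not ready yet followed by ready. If the check runs as an exec readiness probe inside the app container itself, the file is already visible there and no shared volume should be needed.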
Too many open files was because I did not run kubectl with sudo.
So the log can be retrieved via the http API with:
sudo kubectl proxy --port 8080
And then from inside the app:
curl -XGET http://127.0.0.1:8080/api/v1/namespaces/default/pods/mypodnamehere/log
That said, I agree with @Paul Becotte that having the application create a ready file would be a better design.

Liveliness probe test in google cloud clustered kubernetes environment

I want to test liveliness probe in google cloud clustered kubernetes environment. How can I bring a pod or container down to test liveliness probes ?
The problem is that replica sets will automatically bring the pods back up if I delete any of them.
On Kubernetes, pods are mortal, and the number of live pods at any given time is guaranteed by the replicasets (which are wrapped by the deployments). So, to take your pods down, you can scale down your deployment to the number you need, or even to zero, like this:
kubectl scale deployment your-deployment-name --replicas=0
However, if you are trying to test and verify that the Kubernetes Service resource is not sending packets to a non-live or non-ready pod, here's what you can do: create another pod with the same labels as your real application pods, so that the label selectors in the Service match this new pod as well. Configure the pod with invalid liveness/readiness probes so that it will not be considered live/ready. Then hit your Service with requests to verify that it never reaches the new pod you created.
The question is (quote) "...How can I bring a pod or container down to test liveliness probes ?". The type of probe isn't specified but I'll assume it is HTTP GET or TCP Socket.
Assuming you have proper access to the node/host on which the pod is running:
Start a single pod.
Verify that the liveness probe checks out, that is, that it is working.
Find out on which node the pod is running. This, for example, will return the IP address:
kubectl -n <namespace> get pod <pod-name> -o jsonpath={.status.hostIP}
Log onto the node.
Find the PID of the application process. For example, list all processes (ps aux) and look for the specific process or grep by (part of the) name: ps aux | grep -i <name>. Take the number in the second column. For example, the PID in this ps aux partial output is 13314:
nobody 13314 0.0 0.6 145856 38644 ? Ssl 13:24 0:00 /bin/prometheus --storage....
While on the node, suspend (pause/stop) the process by executing kill -STOP <PID>. For example, for the PID from above:
kill -STOP 13314
At this point:
If there is no liveness probe defined, the pod should still be in Running status and not restarted even though it won't be responding to attempts for connections. To resume the stopped process, execute kill -CONT <PID>.
A properly configured HTTP GET or TCP Socket liveness probe should fail because connection with the application can't be established.
Note that this method may also work for "exec.command" probes, depending on what those commands do.
Note also that most applications run as PID 1 in a (Docker) container. As the Docker docs explain, "...A process running as PID 1 inside a container is treated specially by Linux: it ignores any signal with the default action. So, the process will not terminate on SIGINT or SIGTERM unless it is coded to do so". That is probably why the approach won't work from inside the container.
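If you want to convince yourself of what kill -STOP does before trying it on a node, the effect can be reproduced in any Linux shell with a throwaway process. This sketch uses sleep as a stand-in for the application; the helper names are made up.

```shell
#!/usr/bin/env bash
# Small helpers around the signals used in the steps.
pause_process()  { kill -STOP "$1"; }            # suspend, does not terminate
resume_process() { kill -CONT "$1"; }            # continue a suspended process
is_alive()       { kill -0 "$1" 2>/dev/null; }   # signal 0 only checks existence

sleep 30 &                 # stand-in for the application process
app_pid=$!

pause_process "$app_pid"
if is_alive "$app_pid"; then
  echo "suspended but still present"
fi

resume_process "$app_pid"
kill "$app_pid"            # clean up the stand-in
wait "$app_pid" 2>/dev/null || true
```

The process still exists while suspended, which is why the pod stays in Running status; it just cannot answer connections, and that is what an HTTP GET or TCP Socket probe notices.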

Control order of container termination in a single pod in Kubernetes

I have two containers inside one pod. One is my application container and the second is a CloudSQL proxy container. Basically my application container is dependent on this CloudSQL container.
The problem is that when a pod is terminated, the CloudSQL proxy container is terminated first and only after some seconds my application container is terminated.
So, before my container is terminated, it keeps sending requests to the CloudSQL container, resulting in errors:
could not connect to server: Connection refused Is the server running on host "127.0.0.1" and accepting TCP/IP connections on port 5432
That's why, I thought it would be a good idea to specify the order of termination, so that my application container is terminated first and only then the cloudsql one.
I was unable to find anything that could do this in the documentation. But maybe there is some way.
This is not directly possible with the Kubernetes pod API at present. Containers may be terminated in any order. The Cloud SQL proxy container may die more quickly than your application, for example if it has less cleanup to perform or fewer in-flight requests to drain.
From Termination of Pods:
When a user requests deletion of a pod, the system records the intended grace period before the pod is allowed to be forcefully killed, and a TERM signal is sent to the main process in each container.
You can get around this to an extent by wrapping the Cloud SQL and main containers in different entrypoints, which communicate their exit status between each other using a shared pod-level file system.
This solution will not work with the 1.16 release of the Cloud SQL proxy (see comments) as this release ceased to bundle a shell with the container. The 1.17 release is now available in Alpine or Debian Buster variants, so this version is now a viable upgrade target which is once again compatible with this solution.
A wrapper like the following may help with this:
containers:
- command: ["/bin/bash", "-c"]
  args:
  - |
    trap "touch /lifecycle/main-terminated" EXIT
    <your entry point goes here>
  volumeMounts:
  - name: lifecycle
    mountPath: /lifecycle
- name: cloudsql-proxy
  image: gcr.io/cloudsql-docker/gce-proxy
  command: ["/bin/bash", "-c"]
  args:
  - |
    /cloud_sql_proxy <your flags> &
    PID=$!
    function stop {
      while true; do
        if [[ -f "/lifecycle/main-terminated" ]]; then
          kill $PID
          # Stop polling once the proxy has been signalled.
          break
        fi
        sleep 1
      done
    }
    trap stop EXIT
    # We explicitly call stop to ensure the sidecar will terminate
    # if the main container exits outside a request from Kubernetes
    # to kill the Pod.
    stop &
    wait $PID
  volumeMounts:
  - name: lifecycle
    mountPath: /lifecycle
You'll also need a local scratch space to use for communicating lifecycle events:
volumes:
- name: lifecycle
  emptyDir: {}
How does this solution work? It intercepts in the Cloud SQL proxy container the SIGTERM signal passed by the Kubernetes supervisor to each of your pod's containers on shutdown. The "main process" running in that container is a shell, which has spawned a child process running the Cloud SQL proxy. Thus, the Cloud SQL proxy is not immediately terminated. Rather, the shell code blocks waiting for a signal (by simple means of a file appearing in the file system) from the main container that it has successfully exited. Only at that point is the Cloud SQL proxy process terminated and the sidecar container returns.
Of course, this has no effect on forced termination in the event that your containers take too long to shut down and exceed the configured grace period.
The solution depends on the containers you are running having a shell available to them; this is true of the Cloud SQL proxy (except 1.16, and 1.17 onwards when using the alpine or debian variants), but you may need to make changes to your local container builds to ensure this is true of your own application containers.
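The file-based handshake can be tried outside Kubernetes. The sketch below is a variation on the wrapper, not the wrapper itself: wait_for_file is a hypothetical helper, and the timeout argument is an addition that the polling loop in the sidecar does not have (in the Pod, the grace period plays that role).

```shell
#!/usr/bin/env bash
# wait_for_file PATH TIMEOUT_SECONDS
# Polls once per second until PATH exists; fails after TIMEOUT_SECONDS.
wait_for_file() {
  local path="$1" timeout="$2" waited=0
  while [ ! -f "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

lifecycle=$(mktemp -d)                              # stands in for the emptyDir volume
( sleep 1; touch "$lifecycle/main-terminated" ) &   # the "main container" exiting

if wait_for_file "$lifecycle/main-terminated" 5; then
  echo "main terminated, stopping proxy"
fi
wait
rm -rf "$lifecycle"
```

In the real sidecar the echo would be the point where the proxy process is killed.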