Identify which GKE Node is serving a client request - kubernetes

I have deployed an application on Google Kubernetes Engine. I would like to identify which client request is being serviced by which node/pod in GKE. Is there a way to map a client request to the pod/node it was serviced by?

The answer to your question greatly depends on the amount of monitoring and instrumentation you have at your disposal.
The most common way to go about it is to add a Prometheus client library to the code running in your pods, and use it to write metrics with labels that identify the client requests you are interested in.
Once Prometheus scrapes your metrics, they are enriched with the node and pod that emitted them, and you can get the data you are after.
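As an illustration, here is a minimal sketch of how that enrichment is typically wired up when Prometheus discovers pods through the Kubernetes API; the job name and target label names are illustrative, and it assumes the standard kubernetes_sd_configs pod role:
scrape_configs:
- job_name: kubernetes-pods        # illustrative job name
  kubernetes_sd_configs:
  - role: pod                      # discover pods via the Kubernetes API
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_name]
    target_label: pod              # attach the pod name to every scraped metric
  - source_labels: [__meta_kubernetes_pod_node_name]
    target_label: node             # attach the node name as well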

I think the Downward API is what you need. It allows you to expose Pod and node info to the running container. Your application can simply echo the content of certain env variables containing the information you need. This way you can see which Pod, scheduled on which node, is handling a particular request.
A few words about what it is, from the Kubernetes documentation:
There are two ways to expose Pod and Container fields to a running Container:
- Environment variables
- Volume files
Together, these two ways of exposing Pod and Container fields are called the Downward API.
I would recommend taking a closer look specifically at Exposing Pod Information to Containers Through Environment Variables. The following example Pod exposes its own name and the node name to the container:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-envars-fieldref
spec:
  containers:
  - name: test-container
    image: k8s.gcr.io/busybox
    command: [ "sh", "-c"]
    args:
    - while true; do
        echo -en '\n';
        printenv MY_NODE_NAME MY_POD_NAME;
        sleep 10;
      done;
    env:
    - name: MY_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
    - name: MY_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
  restartPolicy: Never
It's just an example that I hope meets your particular requirements, but keep in mind that you can expose much more relevant information this way. Take a quick look at the list of Capabilities of the Downward API.

Related

K8s configmap for application dynamic configuration

I have a microservice for handling retention policy.
This application has a default configuration for retention, e.g. retention size, file locations, etc.
But we also want to create an API for the user to change this configuration with customized values at runtime.
I created a configmap with the default values, and in the application I used the k8s client library to get/update/watch the configmap.
My question is, is it correct to use a configmap for dynamic business configuration, or is it meant for static configuration that the user is not supposed to touch at runtime?
Thanks in advance
There are no rules against it. A lot of software leverages the kube API for some kind of logic / state, e.g. leader election. All of those require the app to apply changes to a kube resource. With that in mind, do remember that it always puts some additional load on your API, and if you're unlucky that might become an issue. About two years ago we experienced API limit exhaustion on one of the managed k8s services because we were using a lot of deployments that had rather intensive leader election logic (2 requests per pod every 5 sec). The issue is long gone now, but it shows what you have to take into account when designing interactions like this (retries, backoffs, etc.).
Using configMaps is perfectly fine for such use cases. You can use a client library to watch for updates on the given configMap; however, a cleaner solution is to mount the configMap as a file into the pod and have your configuration read from that file. Since you're mounting the configMap as a Volume, changes become visible within the pod without a restart (unlike env variables, which only "refresh" once the pod gets recreated).
Let's say you have this configMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: special-config
  namespace: default
data:
  SPECIAL_LEVEL: very
  SPECIAL_TYPE: charm
And then you mount this configMap as a Volume into your Pod:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-test-pod
spec:
  containers:
  - name: test-container
    image: registry.k8s.io/busybox
    command: [ "/bin/sh", "-c", "ls /etc/config/" ]
    volumeMounts:
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      # Provide the name of the ConfigMap containing the files you want
      # to add to the container
      name: special-config
  restartPolicy: Never
When the pod runs, the command ls /etc/config/ produces the output below:
SPECIAL_LEVEL
SPECIAL_TYPE
This way you also reduce "noise" on the API server, since you can simply read the mounted files to pick up any configuration updates.

StatefulSet - Get starting pod during volumemount

I have a StatefulSet that starts a MySQL cluster. The only downside to it at the moment is that for every replica I need to create a Persistent Volume and a Persistent Volume Claim with a selector that matches label and pod index.
This means I cannot dynamically add replicas without manual interaction.
For this reason I'm searching for a solution that gives me the option to have only 1 Volume and 1 Claim, where during pod creation the pod knows its own name for the subPath during mount (an initContainer would be used to check and create the directories on the volume before the application container starts).
So I'm searching for a correct way to write something like:
volumeMounts:
- name: mysql-datadir
  mountPath: /var/lib/mysql
  subPath: "${PODNAME}/datadir"
You can get the pod name from the metadata (the Downward API) by setting an env var:
env:
- name: MY_POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
But you cannot use env vars in volume declarations (as far as I know), so everything else has to be reached via workarounds. One of the workarounds is described here.
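That said, more recent Kubernetes releases added a subPathExpr field on volumeMounts (stable since v1.17) that does expand environment variables in the subpath, which covers exactly this case. A minimal sketch, reusing the MY_POD_NAME variable defined above:
volumeMounts:
- name: mysql-datadir
  mountPath: /var/lib/mysql
  # subPathExpr (unlike subPath) expands $(VAR) references to container env vars
  subPathExpr: $(MY_POD_NAME)/datadir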

Kubernetes - How to aggregate application logs

I have a microservice deployed in a Tomcat container/pod. There are four different files generated in the container - access.log, tomcat.log, catalina.out and application.log (log4j output). What is the best approach to send these logs to Elasticsearch (or a similar platform)?
I read through the information on this page: Logging Architecture - Kubernetes. Is "Sidecar container with a logging agent" the best option for my use case?
Is it possible to fetch pod labels (e.g.: version) and add them to each line? If it is doable, should I use a logging agent like fluentd? (I just want to know the direction I should take.)
Yes, the best option for your use case is to have one tail -f sidecar per log file (see the sketch below) and then install either a fluentd or a fluent-bit daemonset that will handle shipping and enriching the log events.
The fluentd elasticsearch cluster addon is available at that link. It will install a fluentd daemonset and a minimal ES cluster. The ES cluster is not production ready so please see the README for details on what must be changed.
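For the sidecar part, here is a minimal sketch of one streaming sidecar; the pod/container names and the Tomcat log path are assumptions, and the pattern simply shares the log directory through an emptyDir volume and tails one file to stdout:
apiVersion: v1
kind: Pod
metadata:
  name: tomcat-with-log-sidecar    # hypothetical name
spec:
  containers:
  - name: tomcat
    image: tomcat:9                # stand-in for your application image
    volumeMounts:
    - name: logs
      mountPath: /usr/local/tomcat/logs
  - name: access-log-sidecar
    image: busybox
    args: [ "/bin/sh", "-c", "tail -n+1 -F /logs/access.log" ]
    volumeMounts:
    - name: logs
      mountPath: /logs
      readOnly: true
  volumes:
  - name: logs
    emptyDir: {}
One such sidecar per file (tomcat.log, catalina.out, application.log) gets each log onto stdout, where the node-level fluentd/fluent-bit daemonset picks it up.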
Is it possible to fetch pod labels (e.g.: version) and add them to each line?
You can mount information from the Pod's metadata into its file system, and then configure your agent to use this data. Here is an example:
apiVersion: v1
kind: Pod
metadata:
  name: volume-test
spec:
  containers:
  - name: container-test
    image: busybox
    volumeMounts:
    - name: all-in-one
      mountPath: "/projected-volume"
      readOnly: true
  volumes:
  - name: all-in-one
    projected:
      sources:
      - secret:
          name: mysecret
          items:
          - key: username
            path: my-group/my-username
      - downwardAPI:
          items:
          - path: "labels"
            fieldRef:
              fieldPath: metadata.labels
          - path: "cpu_limit"
            resourceFieldRef:
              containerName: container-test
              resource: limits.cpu
      - configMap:
          name: myconfigmap
          items:
          - key: config
            path: my-group/my-config
If it is doable, should I use a logging agent like fluentd?
Tomcat cannot send logs to Elasticsearch by itself; it needs an agent for that (e.g., Fluentd or Logstash). So, if you want to use the "Exposing logs directly from the application" option, you need to build a Tomcat image with the agent in it. That ends up being almost the same as the "Using a sidecar container with the logging agent" option, only harder to configure. The "Exposing logs directly from the application" option is more relevant for applications you develop yourself.

How do I pass the region information as environment variable to a container running in Kubernetes?

I have a use case where I publish messages from containers deployed in various regions, and I would like to tag those messages with the region they originated from. Also, I want to do this in a container-engine-agnostic way, so I specifically want to access the region info as an environment variable.
You can expose pod information as an environment variable using the Downward API
However, this isn't supported for node labels, as per these github issues.
What you can do is follow this example and label your pods/deployments (and maybe also pin those pods/deployments to specific nodes using a nodeSelector; see the sketch after the example) and then expose that info. An example:
apiVersion: v1
kind: Pod
metadata:
  name: dapi-envars-fieldref
  labels:
    zone: us-west-2
spec:
  containers:
  - name: test-container
    image: k8s.gcr.io/busybox
    command: [ "sh", "-c", "echo $(ZONE)" ]
    env:
    - name: ZONE
      valueFrom:
        fieldRef:
          fieldPath: metadata.labels['zone']
  restartPolicy: Never
Please note, I haven't tested this so YMMV
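If you do pin the deployment with a nodeSelector, here is a minimal sketch of that part, assuming your nodes carry the standard topology.kubernetes.io/zone label (older clusters expose failure-domain.beta.kubernetes.io/zone instead):
spec:
  nodeSelector:
    # schedule only onto nodes in the zone that matches the pod label above
    topology.kubernetes.io/zone: us-west-2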

Is it possible to get the container name from within the container in kubernetes?

Say I have the following pod spec.
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  # Unique key of the Deployment instance
  name: deployment-example
spec:
  # 3 Pods should exist at all times.
  replicas: 3
  template:
    metadata:
      labels:
        # Apply this label to pods and default
        # the Deployment label selector to this value
        app: nginx
    spec:
      containers:
      - name: nginx
        # Run this image
        image: nginx:1.10
Here, the name of the container is nginx. Is there a way to get the "nginx" string from within the running container?
I mean, once I exec into the container with something like
kubectl exec -it <pod-name> -c nginx bash
Is there a programmatic way to get at the container name given in the pod spec?
Note that this is not necessarily the docker container name that gets printed in docker ps. Kubernetes composes a longer name for the spawned docker container.
The Downward API looks promising in this regard. However, the container name is not mentioned in the Capabilities of the Downward API section.
The container name is not available through the Downward API. You can use YAML anchors and aliases (references). Unfortunately, they are not scoped, so you will have to come up with unique names for the anchors - it does not matter what they are, as they are not present in the parsed document.
Subsequent occurrences of a previously serialized node are presented as alias nodes. The first occurrence of the node must be marked by an anchor to allow subsequent occurrences to be presented as alias nodes.
An alias node is denoted by the “*” indicator. The alias refers to the most recent preceding node having the same anchor. It is an error for an alias node to use an anchor that does not previously occur in the document. It is not an error to specify an anchor that is not used by any alias node.
First occurrence: &anchor Foo
Second occurrence: *anchor
Override anchor: &anchor Bar
Reuse anchor: *anchor
Here is a full working example:
apiVersion: v1
kind: Pod
metadata:
  name: reftest
spec:
  containers:
  - name: &container1name first
    image: nginx:1.10
    env:
    - name: MY_CONTAINER_NAME
      value: *container1name
    - name: MY_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
  - name: &container2name second
    image: nginx:1.10
    env:
    - name: MY_CONTAINER_NAME
      value: *container2name
    - name: MY_POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
I'm not sure the container name within the deployment is available anywhere.
For the deployment name, one way that works in OpenShift for deployment configs (and so presumably Kubernetes deployments) is to take the value of the HOSTNAME environment variable, which will be of the form <deployment-name>-<deployment-number>-<random-string>.
Drop everything from the second-last - onwards and the leading component is the deployment name.
It would be a fair bit of mucking around, but one could maybe then infer the container name by querying the REST API for the deployment resource object based on that deployment name.
What specifically are you after the container name for? If I knew what you need it for, I may be able to suggest other options.
How about using the container hostname then chopping off the generated components?
$ kubectl exec alpine-tools-645f786645-vfp82 hostname | cut -d- -f1,2
alpine-tools
Although this is very dependent on how you name Pods/containers..
$ kubectl exec -it alpine-tools-645f786645-vfp82 /bin/sh
/ # hostname
alpine-tools-645f786645-vfp82
/ # hostname | cut -d- -f1,2
alpine-tools