Kubernetes promtail sidecar: how to get labels from the parent pod metadata - kubernetes

I have some kubernetes applications that log to files rather than stdout/stderr, and I collect them with Promtail sidecars. But since the sidecars execute with "localhost" target, I don't have a kubernetes_sd_config that will apply pod metadata to labels for me. So I'm stuck statically declaring my labels.
# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
labels:
app: promtail
name: sidecar-promtail
data:
config.yml: |
client:
url: http://loki.loki.svc.cluster.local:3100/loki/api/v1/push
backoff_config:
max_period: 5m
max_retries: 10
min_period: 500ms
batchsize: 1048576
batchwait: 1s
external_labels: {}
timeout: 10s
positions:
filename: /tmp/promtail-positions.yaml
server:
http_listen_port: 3101
target_config:
sync_period: 10s
scrape_configs:
- job_name: sidecar-logs
static_configs:
- targets:
- localhost
labels:
job: sidecar-logs
__path__: "/sidecar-logs/*.log"
----
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-logger
spec:
selector:
matchLabels:
run: test-logger
template:
metadata:
labels:
run: test-logger
spec:
volumes:
- name: nfs
persistentVolumeClaim:
claimName: nfs-claim
- name: promtail-config
configMap:
name: sidecar-promtail
containers:
- name: sidecar-promtail
image: grafana/promtail:2.1.0
volumeMounts:
- name: nfs
mountPath: /sidecar-logs
- mountPath: /etc/promtail
name: promtail-config
- name: simple-logger
image: foo/simple-logger
volumeMounts:
- name: nfs
mountPath: /logs
What is the best way to label the collected logs based on the parent pod's metadata?

You can do the following:
In the sidecar container, expose the pod name, node name and other information you need as environment variables, then add the flag '-config.expand-env' to enable environment expansion inside promtail config file, e.g.:
...
- name: sidecar-promtail
image: grafana/promtail:2.1.0
# image: grafana/promtail:2.4.1 # use this one if environment expansion is not available in 2.1.0
args:
# Enable environment expansion in promtail config file
- '-config.expand-env'
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
...
Then in your configMap, add the environment variables in your static_config labels as such:
...
scrape_configs:
- job_name: sidecar-logs
static_configs:
- targets:
- localhost
labels:
job: sidecar-logs
pod: ${POD_NAME}
node_name: ${NODE_NAME}
__path__: "/sidecar-logs/*.log"
...

Related

LoadBalancer inside RabbitMQ Cluster

I'm new to RabbitMQ and Kubernetes world, and I'm trying to achieve the following objectives:
I've deployed successfully a RabbitMQ Cluster on Kubernetes(minikube) and exposed via loadbalancer(doing minikube tunnel on local)
I can connect successfully to my Queue with a basic Spring Boot app. I can send and receive message from my cluster.
In my clusters i have 4 nodes. I have applied the mirror queue policy (HA), and the created queue is mirrored also on other nodes.
This is my cluster configuration:
apiVersion: v1
kind: Secret
metadata:
name: rabbit-secret
type: Opaque
data:
# echo -n "cookie-value" | base64
RABBITMQ_ERLANG_COOKIE: V0lXVkhDRFRDSVVBV0FOTE1RQVc=
apiVersion: v1
kind: ConfigMap
metadata:
name: rabbitmq-config
data:
enabled_plugins: |
[rabbitmq_federation,rabbitmq_management,rabbitmq_federation_managementrabbitmq_peer_discovery_k8s].
rabbitmq.conf: |
loopback_users.guest = false
listeners.tcp.default = 5672
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_k8s
cluster_formation.k8s.host = kubernetes.default.svc.cluster.local
cluster_formation.k8s.address_type = hostname
cluster_formation.node_cleanup.only_log_warning = true
##cluster_formation.peer_discovery_backend = rabbit_peer_discovery_classic_config
##cluster_formation.classic_config.nodes.1 = rabbit#rabbitmq-0.rabbitmq.rabbits.svc.cluster.local
##cluster_formation.classic_config.nodes.2 = rabbit#rabbitmq-1.rabbitmq.rabbits.svc.cluster.local
##cluster_formation.classic_config.nodes.3 = rabbit#rabbitmq-2.rabbitmq.rabbits.svc.cluster.local
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: rabbitmq
spec:
serviceName: rabbitmq
replicas: 4
selector:
matchLabels:
app: rabbitmq
template:
metadata:
labels:
app: rabbitmq
spec:
serviceAccountName: rabbitmq
initContainers:
- name: config
image: busybox
command: ['/bin/sh', '-c', 'cp /tmp/config/rabbitmq.conf /config/rabbitmq.conf && ls -l /config/ && cp /tmp/config/enabled_plugins /etc/rabbitmq/enabled_plugins']
volumeMounts:
- name: config
mountPath: /tmp/config/
readOnly: false
- name: config-file
mountPath: /config/
- name: plugins-file
mountPath: /etc/rabbitmq/
containers:
- name: rabbitmq
image: rabbitmq:3.8-management
lifecycle:
postStart:
exec:
command:
- /bin/sh
- -c
- >
until rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} await_startup; do sleep 1; done;
rabbitmqctl --erlang-cookie ${RABBITMQ_ERLANG_COOKIE} set_policy ha-two "" '{"ha-mode":"exactly", "ha-params": 2, "ha-sync-mode": "automatic"}'
ports:
- containerPort: 15672
name: management
- containerPort: 4369
name: discovery
- containerPort: 5672
name: amqp
env:
- name: RABBIT_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: RABBIT_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: RABBITMQ_NODENAME
value: rabbit#$(RABBIT_POD_NAME).rabbitmq.$(RABBIT_POD_NAMESPACE).svc.cluster.local
- name: RABBITMQ_USE_LONGNAME
value: "true"
- name: RABBITMQ_CONFIG_FILE
value: "/config/rabbitmq"
- name: RABBITMQ_ERLANG_COOKIE
valueFrom:
secretKeyRef:
name: rabbit-secret
key: RABBITMQ_ERLANG_COOKIE
- name: K8S_HOSTNAME_SUFFIX
value: .rabbitmq.$(RABBIT_POD_NAMESPACE).svc.cluster.local
volumeMounts:
- name: data
mountPath: /var/lib/rabbitmq
readOnly: false
- name: config-file
mountPath: /config/
- name: plugins-file
mountPath: /etc/rabbitmq/
volumes:
- name: config-file
emptyDir: {}
- name: plugins-file
emptyDir: {}
- name: config
configMap:
name: rabbitmq-config
defaultMode: 0755
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "standard"
resources:
requests:
storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
name: rabbitmq
spec:
type: LoadBalancer
ports:
- port: 15672
targetPort: 15672
name: management
- port: 4369
targetPort: 4369
name: discovery
- port: 5672
targetPort: 5672
name: amqp
selector:
app: rabbitmq
If I understood correctly, rabbitmq with HA receives message on one node(master) and then response is mirrored to slave, right?
But if I want to load balance the workload? For example suppose that I sends 200 messages per second.
These 200 messages are received all from the master node. What I want, instead, is that these 200 messages are distributed across nodes, for example 100 messages are received from node1, 50 messages are received on node 2 and the rest on node 3. It's possible to do that? And if yes, how can I achieve it on kubernetes?

Convert filebeat [#timestamp] from UTC to local timezone

I am trying to collect logs from the running pod in the KOPS cluster. I run filebeat DemonSet in the KOPS cluster to collect logs from my pod(application) and then ship those logs to the outside of the cluster where the logstash service is accepting them and saves them into a file.
I noticed filebeat always producing the logs with UTC timestamp even though all of my nodes and pods are running in SGT timezone.
I set add_locale in filebeat processor but it doesn't help.
add_locale:
format: offset
nodes timezone
pod timezone
Complete filebeat-kubernetes.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: logging
---
apiVersion: v1
kind: ConfigMap
metadata:
name: filebeat-config
namespace: logging
labels:
k8s-app: filebeat
data:
filebeat.yml: |-
# To enable hints based autodiscover, remove `filebeat.inputs` configuration and uncomment this:
filebeat.autodiscover:
providers:
- type: kubernetes
node: ${NODE_NAME}
templates:
- condition:
equals:
kubernetes.namespace: default
- condition:
contains:
kubernetes.pod.name: "application1"
config:
- type: container
paths:
- /var/log/containers/*${data.kubernetes.container.id}*.log
- condition:
contains:
kubernetes.pod.name: "application2"
config:
- type: container
paths:
- /var/log/containers/*${data.kubernetes.container.id}*.log
processors:
- add_locale:
format: offset
- add_kubernetes_metadata:
host: ${NODE_NAME}
matchers:
- logs_path:
logs_path: "/var/log/containers/"
output.logstash:
hosts: ["IP:5044"]
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: filebeat
namespace: logging
labels:
k8s-app: filebeat
spec:
selector:
matchLabels:
k8s-app: filebeat
template:
metadata:
labels:
k8s-app: filebeat
spec:
serviceAccountName: filebeat
terminationGracePeriodSeconds: 30
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
containers:
- name: filebeat
image: docker.elastic.co/beats/filebeat:7.10.1
args: [
"-c", "/etc/filebeat.yml",
"-e",
]
env:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
securityContext:
runAsUser: 0
# If using Red Hat OpenShift uncomment this:
#privileged: true
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
volumeMounts:
- name: config
mountPath: /etc/filebeat.yml
readOnly: true
subPath: filebeat.yml
- name: data
mountPath: /usr/share/filebeat/data
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: varlog
mountPath: /var/log
readOnly: true
- name: tz-config
mountPath: /etc/localtime
volumes:
- name: config
configMap:
defaultMode: 0640
name: filebeat-config
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: varlog
hostPath:
path: /var/log
# data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
- name: data
hostPath:
# When filebeat runs as non-root user, this directory needs to be writable by group (g+w).
path: /var/lib/filebeat-data
type: DirectoryOrCreate
- name: tz-config
hostPath:
path: /usr/share/zoneinfo/Singapore
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: filebeat
subjects:
- kind: ServiceAccount
name: filebeat
namespace: logging
roleRef:
kind: ClusterRole
name: filebeat
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: filebeat
labels:
k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
resources:
- namespaces
- pods
verbs:
- get
- watch
- list
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: filebeat
namespace: logging
labels:
k8s-app: filebeat
---
output log from filebeat
Unfortunately, I don't have enough reputation to add a comment so posting this as an answer. The following reference doc mentions:
The processor adds the a event.timezone value to each event.
so it could be possible that the log timestamp itself is not converted to the local timezone but it adds additional field in the event logs to represent the timezone and that can be used to format the logs by the application consuming the logs.

subPathExpr for parallel pods

We have parallel jobs on EKS and we would like the jobs to write to hostPath.
We are using subPathExpr with environment variable as according to the documentation. However, after the run, the hostPath contains only the one folder probably due to racing condition from the parallel jobs and whichever job get hold of the hostPath.
We are on Kubernetes 1.17. Is subPathExpr meant for this use case of allowing parallel jobs to write to the same hostPath? What are other options to allow parallel jobs to write to host volume?
apiVersion: batch/v1
kind: Job
metadata:
name: gatling-job
spec:
ttlSecondsAfterFinished: 300 # delete after 5 minutes
completions: 5
parallelism: 5
backoffLimit: 0
template:
spec:
restartPolicy: "Never"
containers:
- name: gatling
image: GATLING_IMAGE_NAME
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
volumeMounts:
- name: perftest-results
mountPath: /opt/gatling/results
subPathExpr: $(POD_NAME)
volumes:
- name: perftest-results
hostPath:
path: /data/perftest-results
Tested with a simple job template as below, and files were created in respective folder and worked as expected.
Will investigate the actual project. Closing for now.
apiVersion: batch/v1
kind: Job
metadata:
name: subpath-jobs
labels:
name: subpath-jobs
spec:
completions: 5
parallelism: 5
backoffLimit: 0
template:
spec:
restartPolicy: "Never"
containers:
- name: busybox
image: busybox
workingDir: /outputs
command: [ "touch" ]
args: [ "a_file.txt" ]
env:
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
volumeMounts:
- name: job-output
mountPath: /outputs
subPathExpr: $(POD_NAME)
volumes:
- name: job-output
hostPath:
path: /data/outputs
type: DirectoryOrCreate
# ls -R /data
/data:
outputs
/data/outputs:
subpath-jobs-6968q subpath-jobs-6zp4x subpath-jobs-nhh96 subpath-jobs-tl8fx subpath-jobs-w2h9f
/data/outputs/subpath-jobs-6968q:
a_file.txt
/data/outputs/subpath-jobs-6zp4x:
a_file.txt
/data/outputs/subpath-jobs-nhh96:
a_file.txt
/data/outputs/subpath-jobs-tl8fx:
a_file.txt
/data/outputs/subpath-jobs-w2h9f:
a_file.txt

Pod does not see secrets

The pod that created in the same default namespace as it's secret does not see values from it.
Secret's file contains following:
apiVersion: v1
kind: Secret
metadata:
name: backend-secret
data:
SECRET_KEY: <base64 of value>
DEBUG: <base64 of value>
After creating this secret via kubectl create -f backend-secret.yaml I'm launching pod with the following configuration:
apiVersion: v1
kind: Pod
metadata:
name: backend
spec:
containers:
- image: backend
name: backend
ports:
- containerPort: 8000
imagePullSecrets:
- name: dockerhub-credentials
volumes:
- name: secret
secret:
secretName: backend-secret
But pod crashes after trying to extract this environment variable via python's os.environ['DEBUG'] line.
How to make it work?
If you mount secret as volume, it will be mounted in a defined directory where key name will be the file name. For example click here
If you want to access secrets from the environment into your pod then you need to use secret in an environment variable like following.
apiVersion: v1
kind: Pod
metadata:
name: backend
spec:
containers:
- image: backend
name: backend
ports:
- containerPort: 8000
env:
- name: DEBUG
valueFrom:
secretKeyRef:
name: backend-secret
key: DEBUG
- name: SECRET_KEY
valueFrom:
secretKeyRef:
name: backend-secret
key: SECRET_KEY
imagePullSecrets:
- name: dockerhub-credentials
Ref: https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-environment-variables
Finally, I've used these lines at Deployment.spec.template.spec.containers:
containers:
- name: backend
image: zuber93/wts_backend
imagePullPolicy: Always
envFrom:
- secretRef:
name: backend-secret
ports:
- containerPort: 8000

unable to recognize "filebeat-kubernetes.yaml": no matches for kind "DaemonSet" in version "extensions/v1beta1"

I'm trying to run FileBeat on minikube following this doc with k8s 1.16
https://www.elastic.co/guide/en/beats/filebeat/7.4/running-on-kubernetes.html
I downloaded the manifest file as instructed
curl -L -O https://raw.githubusercontent.com/elastic/beats/7.4/deploy/kubernetes/filebeat-kubernetes.yaml
Contents of the yaml file below
---
apiVersion: v1
kind: ConfigMap
metadata:
name: filebeat-config
namespace: kube-system
labels:
k8s-app: filebeat
data:
filebeat.yml: |-
filebeat.inputs:
- type: container
paths:
- /var/log/containers/*.log
processors:
- add_kubernetes_metadata:
host: ${NODE_NAME}
matchers:
- logs_path:
logs_path: "/var/log/containers/"
# To enable hints based autodiscover, remove `filebeat.inputs` configuration and uncomment this:
#filebeat.autodiscover:
# providers:
# - type: kubernetes
# host: ${NODE_NAME}
# hints.enabled: true
# hints.default_config:
# type: container
# paths:
# - /var/log/containers/*${data.kubernetes.container.id}.log
processors:
- add_cloud_metadata:
- add_host_metadata:
cloud.id: ${ELASTIC_CLOUD_ID}
cloud.auth: ${ELASTIC_CLOUD_AUTH}
output.elasticsearch:
hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
username: ${ELASTICSEARCH_USERNAME}
password: ${ELASTICSEARCH_PASSWORD}
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
name: filebeat
namespace: kube-system
labels:
k8s-app: filebeat
spec:
template:
metadata:
labels:
k8s-app: filebeat
spec:
serviceAccountName: filebeat
terminationGracePeriodSeconds: 30
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
containers:
- name: filebeat
image: docker.elastic.co/beats/filebeat:7.4.0
args: [
"-c", "/etc/filebeat.yml",
"-e",
]
env:
- name: ELASTICSEARCH_HOST
value: elasticsearch
- name: ELASTICSEARCH_PORT
value: "9200"
- name: ELASTICSEARCH_USERNAME
value: elastic
- name: ELASTICSEARCH_PASSWORD
value: changeme
- name: ELASTIC_CLOUD_ID
value:
- name: ELASTIC_CLOUD_AUTH
value:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
securityContext:
runAsUser: 0
# If using Red Hat OpenShift uncomment this:
#privileged: true
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
volumeMounts:
- name: config
mountPath: /etc/filebeat.yml
readOnly: true
subPath: filebeat.yml
- name: data
mountPath: /usr/share/filebeat/data
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: varlog
mountPath: /var/log
readOnly: true
volumes:
- name: config
configMap:
defaultMode: 0600
name: filebeat-config
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: varlog
hostPath:
path: /var/log
# data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
- name: data
hostPath:
path: /var/lib/filebeat-data
type: DirectoryOrCreate
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
name: filebeat
subjects:
- kind: ServiceAccount
name: filebeat
namespace: kube-system
roleRef:
kind: ClusterRole
name: filebeat
apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
name: filebeat
labels:
k8s-app: filebeat
rules:
- apiGroups: [""] # "" indicates the core API group
resources:
- namespaces
- pods
verbs:
- get
- watch
- list
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: filebeat
namespace: kube-system
labels:
k8s-app: filebeat
---
When I try the deploy step,
kubectl create -f filebeat-kubernetes.yaml
I get the output + error:
configmap/filebeat-config created
clusterrolebinding.rbac.authorization.k8s.io/filebeat created
clusterrole.rbac.authorization.k8s.io/filebeat created
serviceaccount/filebeat created
error: unable to recognize "filebeat-kubernetes.yaml": no matches for kind "DaemonSet" in version "extensions/v1beta1"
As we can see there
DaemonSet, Deployment, StatefulSet, and ReplicaSet resources will no longer be served from extensions/v1beta1, apps/v1beta1, or apps/v1beta2 by default in v1.16. Migrate to the apps/v1 API
You need to change apiVersion
apiVersion: extensions/v1beta1 -> apiVersion: apps/v1
Then there is another error
missing required field "selector" in io.k8s.api.apps.v1.DaemonSetSpec;
So we have to add selector field
spec:
selector:
matchLabels:
k8s-app: filebeat
Edited DaemonSet yaml:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: filebeat
namespace: kube-system
labels:
k8s-app: filebeat
spec:
selector:
matchLabels:
k8s-app: filebeat
template:
metadata:
labels:
k8s-app: filebeat
spec:
serviceAccountName: filebeat
terminationGracePeriodSeconds: 30
hostNetwork: true
dnsPolicy: ClusterFirstWithHostNet
containers:
- name: filebeat
image: docker.elastic.co/beats/filebeat:7.4.0
args: [
"-c", "/etc/filebeat.yml",
"-e",
]
env:
- name: ELASTICSEARCH_HOST
value: elasticsearch
- name: ELASTICSEARCH_PORT
value: "9200"
- name: ELASTICSEARCH_USERNAME
value: elastic
- name: ELASTICSEARCH_PASSWORD
value: changeme
- name: ELASTIC_CLOUD_ID
value:
- name: ELASTIC_CLOUD_AUTH
value:
- name: NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
securityContext:
runAsUser: 0
# If using Red Hat OpenShift uncomment this:
#privileged: true
resources:
limits:
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
volumeMounts:
- name: config
mountPath: /etc/filebeat.yml
readOnly: true
subPath: filebeat.yml
- name: data
mountPath: /usr/share/filebeat/data
- name: varlibdockercontainers
mountPath: /var/lib/docker/containers
readOnly: true
- name: varlog
mountPath: /var/log
readOnly: true
volumes:
- name: config
configMap:
defaultMode: 0600
name: filebeat-config
- name: varlibdockercontainers
hostPath:
path: /var/lib/docker/containers
- name: varlog
hostPath:
path: /var/log
# data folder stores a registry of read status for all files, so we don't send everything again on a Filebeat pod restart
- name: data
hostPath:
path: /var/lib/filebeat-data
type: DirectoryOrCreate
Let me know if that help you.