We have parallel jobs on EKS and we would like the jobs to write to hostPath.
We are using subPathExpr with an environment variable, as described in the documentation. However, after the run the hostPath contains only one folder, probably due to a race condition between the parallel jobs, with whichever job gets hold of the hostPath first winning.
We are on Kubernetes 1.17. Is subPathExpr meant for this use case of allowing parallel jobs to write to the same hostPath? What other options are there to allow parallel jobs to write to the host volume?
apiVersion: batch/v1
kind: Job
metadata:
  name: gatling-job
spec:
  ttlSecondsAfterFinished: 300 # delete after 5 minutes
  completions: 5
  parallelism: 5
  backoffLimit: 0
  template:
    spec:
      restartPolicy: "Never"
      containers:
      - name: gatling
        image: GATLING_IMAGE_NAME
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        volumeMounts:
        - name: perftest-results
          mountPath: /opt/gatling/results
          subPathExpr: $(POD_NAME)
      volumes:
      - name: perftest-results
        hostPath:
          path: /data/perftest-results
Tested with a simple job template as below, and files were created in their respective folders as expected.
Will investigate the actual project. Closing for now.
apiVersion: batch/v1
kind: Job
metadata:
  name: subpath-jobs
  labels:
    name: subpath-jobs
spec:
  completions: 5
  parallelism: 5
  backoffLimit: 0
  template:
    spec:
      restartPolicy: "Never"
      containers:
      - name: busybox
        image: busybox
        workingDir: /outputs
        command: [ "touch" ]
        args: [ "a_file.txt" ]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        volumeMounts:
        - name: job-output
          mountPath: /outputs
          subPathExpr: $(POD_NAME)
      volumes:
      - name: job-output
        hostPath:
          path: /data/outputs
          type: DirectoryOrCreate
# ls -R /data
/data:
outputs
/data/outputs:
subpath-jobs-6968q subpath-jobs-6zp4x subpath-jobs-nhh96 subpath-jobs-tl8fx subpath-jobs-w2h9f
/data/outputs/subpath-jobs-6968q:
a_file.txt
/data/outputs/subpath-jobs-6zp4x:
a_file.txt
/data/outputs/subpath-jobs-nhh96:
a_file.txt
/data/outputs/subpath-jobs-tl8fx:
a_file.txt
/data/outputs/subpath-jobs-w2h9f:
a_file.txt
Related
I have some Kubernetes applications that log to files rather than stdout/stderr, and I collect those logs with Promtail sidecars. But since the sidecars scrape a static "localhost" target, there is no kubernetes_sd_config to apply pod metadata as labels for me, so I'm stuck statically declaring my labels.
# ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: promtail
  name: sidecar-promtail
data:
  config.yml: |
    client:
      url: http://loki.loki.svc.cluster.local:3100/loki/api/v1/push
      backoff_config:
        max_period: 5m
        max_retries: 10
        min_period: 500ms
      batchsize: 1048576
      batchwait: 1s
      external_labels: {}
      timeout: 10s
    positions:
      filename: /tmp/promtail-positions.yaml
    server:
      http_listen_port: 3101
    target_config:
      sync_period: 10s
    scrape_configs:
    - job_name: sidecar-logs
      static_configs:
      - targets:
        - localhost
        labels:
          job: sidecar-logs
          __path__: "/sidecar-logs/*.log"
---
# Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-logger
spec:
  selector:
    matchLabels:
      run: test-logger
  template:
    metadata:
      labels:
        run: test-logger
    spec:
      volumes:
      - name: nfs
        persistentVolumeClaim:
          claimName: nfs-claim
      - name: promtail-config
        configMap:
          name: sidecar-promtail
      containers:
      - name: sidecar-promtail
        image: grafana/promtail:2.1.0
        volumeMounts:
        - name: nfs
          mountPath: /sidecar-logs
        - mountPath: /etc/promtail
          name: promtail-config
      - name: simple-logger
        image: foo/simple-logger
        volumeMounts:
        - name: nfs
          mountPath: /logs
What is the best way to label the collected logs based on the parent pod's metadata?
You can do the following:
In the sidecar container, expose the pod name, node name and any other information you need as environment variables, then add the flag '-config.expand-env' to enable environment-variable expansion inside the Promtail config file, e.g.:
...
      - name: sidecar-promtail
        image: grafana/promtail:2.1.0
        # image: grafana/promtail:2.4.1 # use this one if environment expansion is not available in 2.1.0
        args:
        # Enable environment expansion in the Promtail config file
        - '-config.expand-env'
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
...
Then, in your ConfigMap, add the environment variables to your static_config labels, like so:
...
    scrape_configs:
    - job_name: sidecar-logs
      static_configs:
      - targets:
        - localhost
        labels:
          job: sidecar-logs
          pod: ${POD_NAME}
          node_name: ${NODE_NAME}
          __path__: "/sidecar-logs/*.log"
...
The YAML below injects the pod's name into the container as RUN_ID. If this cron job spins up 10 pods (parallelism = 10), each of the 10 pods will have a different run id, but I want all 10 pods to have the same run id. The Downward API doesn't seem to support retrieving the job id. Is there any other way to do it?
In my case it doesn't have to be the job id. Any random id that is set in all 10 pods when a new job is spun up will do, so any ideas for that will also help.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ${CRONJOB_NAME}
  namespace: ${NAMESPACE_NAME}
spec:
  schedule: "0 8 * * *"
  successfulJobsHistoryLimit: 1
  failedJobsHistoryLimit: 1
  jobTemplate:
    spec:
      backoffLimit: 4
      parallelism: ${PARALLEL_JOBS_COUNT}
      completions: ${PARALLEL_JOBS_COUNT}
      template:
        spec:
          containers:
          - name: ${CONTAINER_NAME}
            image: ${DOCKER_IMAGE_NAME}
            imagePullPolicy: IfNotPresent
            env:
            - name: RUN_ID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name # this gets the pod's name
.
.
I used RUN_ID=${POD_NAME%\-*} in the command to strip the trailing random suffix from the pod name, which leaves the job name. Since all pods of a run share the same job name, this solved my use case.
spec:
  containers:
  - name: ${CONTAINER_NAME}
    image: ${ACR_DNS}/${JMETER_DOCKER_IMAGE}
    imagePullPolicy: IfNotPresent
    command:
    - /bin/sh
    - -c
    - 'export RUN_ID=${POD_NAME%\-*}; cd /config; /entrypoint.sh -n -Jserver.rmi.ssl.disable=true -Ljmeter.engine=debug -Jjmeterengine.force.system.exit=true -t \$(JMETER_JMX_FILE)'
    env:
    # The pod name is injected here; RUN_ID is derived from it in the command above.
    - name: POD_NAME
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
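A related approach, not from the original answer and only a sketch: the Job controller labels every pod it creates with job-name, and the Downward API can expose labels as environment variables, so all pods belonging to the same job run would see the same value without any shell string handling. The env snippet below assumes that label key:
    env:
    # Sketch: expose the parent Job's name (identical for all pods of the run) via the job-name label.
    - name: RUN_ID
      valueFrom:
        fieldRef:
          fieldPath: metadata.labels['job-name']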
I am trying to mount my applications' log directory onto NFS dynamically, with the node name included in the path.
No success so far.
I tried the following:
kind: Pod
apiVersion: v1
metadata:
  name: nfs-in-a-pod
spec:
  containers:
  - name: app
    image: alpine
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: NODE_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: spec.nodeName
    volumeMounts:
    - name: nfs-volume
      mountPath: /var/nfs
      subPath: /$(NODE_NAME)
    command: ["/bin/sh"]
    args: ["-c", "sleep 500000"]
  volumes:
  - name: nfs-volume
    nfs:
      server: ip_address_here
      path: /mnt/events
I think instead of subPath you should use subPathExpr, as mentioned in the documentation.
Use the subPathExpr field to construct subPath directory names from Downward API environment variables. This feature requires the VolumeSubpathEnvExpansion feature gate to be enabled. It is enabled by default starting with Kubernetes 1.15. The subPath and subPathExpr properties are mutually exclusive.
In this example, a Pod uses subPathExpr to create a directory pod1 within the hostPath volume /var/log/pods, using the pod name from the Downward API. The host directory /var/log/pods/pod1 is mounted at /logs in the container.
apiVersion: v1
kind: Pod
metadata:
  name: pod1
spec:
  containers:
  - name: container1
    env:
    - name: POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    image: busybox
    command: [ "sh", "-c", "while [ true ]; do echo 'Hello'; sleep 10; done | tee -a /logs/hello.txt" ]
    volumeMounts:
    - name: workdir1
      mountPath: /logs
      subPathExpr: $(POD_NAME)
  restartPolicy: Never
  volumes:
  - name: workdir1
    hostPath:
      path: /var/log/pods
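Applied to the manifest from the question, the mount would become something like this (a sketch of the same fix, reusing the NODE_NAME variable already defined there):
    volumeMounts:
    - name: nfs-volume
      mountPath: /var/nfs
      # subPathExpr (not subPath) expands the environment variable per node
      subPathExpr: $(NODE_NAME)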
Hope that's it.
What I want to do is provide the pod with a unified log store, currently persisted to hostPath, but I also want the path to include the pod's UID so I can easily find it after the pod is destroyed.
For example:
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-logging-support
spec:
  containers:
  - image: python:2.7
    name: web-server
    command:
    - "sh"
    - "-c"
    - "python -m SimpleHTTPServer > /logs/http.log 2>&1"
    volumeMounts:
    - mountPath: /logs
      name: log-dir
  volumes:
  - name: log-dir
    hostPath:
      path: /var/log/apps/{metadata.uid}
      type: DirectoryOrCreate
metadata.uid is what I want to fill in there, but I do not know how to do it.
For logging, it's better to use another strategy.
Your logs are best managed if they are streamed to stdout/stderr and picked up by an agent.
Don't persist your logs on the filesystem; gather them with an agent and ship them to a central store for further analysis.
Fluentd is very popular and deserves to be known.
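As a rough illustration of that pattern (not from the original answer; the image tag and the Fluentd output settings are placeholders you would need to adapt), a node-level agent is typically deployed as a DaemonSet that tails the container log files under /var/log on every node:
# Minimal sketch of a node-level log agent; image tag and output configuration are assumptions.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch # placeholder tag
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
The application containers then only write to stdout/stderr, and the agent picks the logs up from the node, so nothing has to be persisted inside the pod.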
After searching the Kubernetes documentation, I finally found a solution for my specific problem; the subPathExpr feature is exactly what I wanted.
So I can create the pod with:
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-logging-support
spec:
  containers:
  - image: python:2.7
    name: web-server
    command:
    - "sh"
    - "-c"
    - "python -m SimpleHTTPServer > /logs/http.log 2>&1"
    env:
    - name: POD_UID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    volumeMounts:
    - mountPath: /logs
      name: log-dir
      subPathExpr: $(POD_UID)
  volumes:
  - name: log-dir
    hostPath:
      path: /var/log/apps/
      type: DirectoryOrCreate
Currently, I'm writing my init container specs inside an annotation:
metadata:
  annotations:
    pod.beta.kubernetes.io/init-containers: '[
      {
        "name": "sdf",
        "image": "sdf"
    ...
So it forces me to write the init container specs in JSON format.
My question is: is there any way to write init container specs without using this annotation?
From Kubernetes 1.6 on, there's a new syntax available: the same format as regular containers in the pod spec, just under initContainers instead.
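A minimal sketch of that syntax (the image names and commands are just placeholders for illustration):
apiVersion: v1
kind: Pod
metadata:
  name: init-demo
spec:
  initContainers:
  # Runs to completion before the app container starts.
  - name: init-step
    image: busybox
    command: ["sh", "-c", "echo running init step"]
  containers:
  - name: app
    image: nginx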
Since 1.6, you can write it the YAML way. Here is an example that we used to build up a Galera cluster.
spec:
  serviceName: "galera"
  replicas: 3
  template:
    metadata:
      labels:
        app: mysql
    spec:
      initContainers:
      - name: install
        image: gcr.io/google_containers/galera-install:0.1
        imagePullPolicy: Always
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
        - name: config
          mountPath: /etc/mysql/conf.d
      - name: bootstrap
        image: debian:jessie
        command:
        - "hello world"
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        volumeMounts:
        - name: workdir
          mountPath: "/hello"
      containers:
      - name: mysql
        xxxxxx